Single pass workload directed clustering of XML documents
US7512615B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 7, 2003 |
| Grant date | Mar 31, 2009 |
| Priority date | — |
| Expiry date | Feb 1, 2025 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99942
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system for clustering XML documents are disclosed. The method operates under specified memory-use constraints. The system implements the method: it scans an XML document, assigns edge weights according to the application workload, and maps clusters of XML nodes to disk pages, all in a single parser-controlled pass over the XML data. Application workload information is used to generate XML clustering solutions that substantially reduce page faults for the workload under consideration. Several approaches for representing workload information are disclosed. For example, the workload may list the XPath operators invoked by the application along with their invocation frequencies. The application workload can be further refined by incorporating additional features such as query importance or query compilation costs. XML access patterns can also be modeled using stochastic approaches.
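The idea in the abstract can be illustrated with a small Python sketch. It is not the patented method: it assumes a workload given as a mapping from two XPath axes to invocation frequencies, uses those frequencies directly as edge weights, and places each node greedily on the page of its heaviest-weighted neighbour that still has room. The names `WORKLOAD`, `cluster_single_pass`, and `page_capacity` are hypothetical and chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical workload: XPath axis -> invocation frequency (assumed values).
WORKLOAD = {"child": 10, "following-sibling": 3}


def cluster_single_pass(xml_text, workload, page_capacity):
    """Assign every element to a page number during one streaming parse.

    Simplified greedy heuristic (an assumption, not the patent's algorithm):
    a new node joins the page of the neighbour it is connected to by the
    heaviest edge, provided that page is not full; otherwise it opens a
    new page.
    """
    labels = []      # node id -> tag name, in document order
    page_of = {}     # node id -> page number
    page_load = []   # page number -> number of nodes stored on it
    stack = []       # ids of the currently open elements
    last_child = {}  # parent id -> id of its most recently seen child

    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "end":
            stack.pop()
            elem.clear()  # release the element's content once it closes
            continue

        node = len(labels)
        labels.append(elem.tag)

        # Collect (edge weight, page) pairs for the parent and the
        # preceding sibling, weighted by the workload frequencies.
        candidates = []
        if stack:
            parent = stack[-1]
            candidates.append((workload.get("child", 0), page_of[parent]))
            previous = last_child.get(parent)
            if previous is not None:
                candidates.append(
                    (workload.get("following-sibling", 0), page_of[previous])
                )
            last_child[parent] = node

        # Try the heaviest edge first; fall back to a fresh page.
        for weight, page in sorted(candidates, reverse=True):
            if weight > 0 and page_load[page] < page_capacity:
                page_of[node] = page
                page_load[page] += 1
                break
        else:
            page_load.append(1)
            page_of[node] = len(page_load) - 1

        stack.append(node)

    return labels, page_of


labels, pages = cluster_single_pass(
    "<a><b><d/><e/></b><c/></a>", WORKLOAD, page_capacity=2
)
for node_id, tag in enumerate(labels):
    print(tag, "-> page", pages[node_id])
```

With a page capacity of two nodes, `a` and `b` share a page because the child edge is the heaviest, and `e` joins its sibling `d` once the parent's page is full. A real implementation would also have to respect the memory-use constraints mentioned in the abstract, which this sketch does not model.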
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.