Patent · US Expired

Single pass workload directed clustering of XML documents

US7512615B2 · kind B2 · utility

Cited by: 11
References: 8
Claims: 30
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 7, 2003
Grant date: Mar 31, 2009
Priority date
Expiry date: Feb 1, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99942
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system for clustering XML documents are disclosed. The method operates under specified memory-use constraints. The system implements the method: it scans an XML document, assigns edge weights according to the application workload, and maps clusters of XML nodes to disk pages, all in a single parser-controlled pass over the XML data. Application workload information is used to generate XML clustering solutions that lead to a substantial reduction in page faults for the workload under consideration. Several approaches for representing workload information are disclosed. For example, the workload may list the XPath operators invoked during the application along with their invocation frequencies. The application workload can be further refined by incorporating additional features such as query importance or query compilation costs. XML access patterns could also be modeled using stochastic approaches.
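
The single-pass idea in the abstract can be illustrated with a minimal sketch. This is not the patented method: it assumes a hypothetical workload representation (a dictionary mapping a parent-tag/child-tag pair to the frequency of child-axis traversals) and a simple greedy rule that stores a node on its parent's page when the connecting edge has nonzero weight and the page still has room. The names `cluster_single_pass`, `workload`, and `page_capacity` are illustrative and do not come from the patent.

```python
import io
import xml.etree.ElementTree as ET


def cluster_single_pass(xml_text, workload, page_capacity):
    """Assign every element to a page during one streaming parse.

    workload: dict mapping (parent_tag, child_tag) to a traversal frequency
              (a hypothetical stand-in for XPath operator frequencies).
    page_capacity: maximum number of nodes per page (assumed uniform node size).
    Returns a list of pages, each a list of node labels such as "book#1".
    """
    pages = []    # a real system would flush full pages to disk instead
    stack = []    # (tag, page index) for the currently open ancestors only
    counter = 0   # document-order node number

    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            label = f"{elem.tag}#{counter}"
            counter += 1
            target = None
            if stack:
                parent_tag, parent_page = stack[-1]
                # Edge weight comes from the workload; unseen edges weigh 0.
                weight = workload.get((parent_tag, elem.tag), 0)
                if weight > 0 and len(pages[parent_page]) < page_capacity:
                    target = parent_page  # co-locate with the parent
            if target is None:
                # Greedy fallback: open a fresh page (simple, not space-optimal).
                pages.append([])
                target = len(pages) - 1
            pages[target].append(label)
            stack.append((elem.tag, target))
        else:
            stack.pop()
            elem.clear()  # release parsed content to bound memory use
    return pages


example_xml = (
    "<lib><book><title/><author/></book>"
    "<book><title/><author/></book></lib>"
)
# Assumed workload: queries step from book to title often, to author never.
example_pages = cluster_single_pass(example_xml, {("book", "title"): 10}, 2)
```

With this assumed workload, each `title` node lands on the same page as its `book` parent, so a query that steps from `book` to `title` touches one page instead of two. The fallback of opening a fresh page for every cold edge wastes space; it is kept here only to make the effect of the edge weights easy to see.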

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.