Methods, apparatus and computer programs for evaluating and using a resilient data representation
US7254577B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 29, 2004 |
| Grant date | Aug 7, 2007 |
| Priority date | — |
| Expiry date | Sep 1, 2025 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99942
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Provided are methods, apparatus and computer programs for evaluating the resilience, to structural changes in a data source, of a representative label representing a data element within the data source. Also disclosed are applications using a resilient representative label. For example, a representative label may represent a particular data field or other data element within a semi-structured data source—such as within XML or HTML Web pages. An estimate of resilience to changes can be used to determine whether a candidate representative label satisfies a required degree of resilience, or to enable selection of a label with the highest resilience score among a set of representative labels. The validated or selected representative label may then be used for data extraction, remaining usable despite the possibility of future changes to the structure of a Web page, or for template clustering/classification.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.