Document alignment systems for legacy document conversions
US7882119B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 22, 2005 |
| Grant date | Feb 1, 2011 |
| Priority date | — |
| Expiry date | Oct 26, 2028 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/258
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for aligning documents, which may be in different XML formats, includes inputting source and target leaves of source and target documents in first and second tree-structured formats and assigning a cost to each of a plurality of matches. Each match may include a source leaf and a target leaf, or may be an unmatched source or target leaf. Matches are identified for which the total cost is minimal, wherein each of the leaves is in at least one of the identified matches. From the identified matches, groups of two or more matches that have a leaf in common are identified. From the groups, probable matches are identified in which more than one target leaf is matched with at least one source leaf, or more than one source leaf is matched with a target leaf. An alignment between leaves of the target document and leaves of the source document, which includes the probable matches, is output.
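The steps in the abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract does not say how costs are computed or how the minimum is found, so the cost function (`pair_cost`, a Jaccard distance over words), the penalty `UNMATCHED_COST`, and the exhaustive search are all assumptions chosen for readability. Leaves are represented simply as text strings, and the function name `align` is hypothetical.

```python
from itertools import combinations

UNMATCHED_COST = 0.8  # assumed penalty for leaving a leaf unmatched


def pair_cost(source_text, target_text):
    """Assumed cost: Jaccard distance between the two leaves' word sets."""
    a = set(source_text.lower().split())
    b = set(target_text.lower().split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def align(source_leaves, target_leaves):
    """Return (matches, groups) for two lists of leaf texts.

    A match is (i, j), (i, None) or (None, j), where i and j index the
    source and target lists. The search is exhaustive, so this is only
    practical for a handful of leaves.
    """
    n, m = len(source_leaves), len(target_leaves)

    # Assign a cost to every candidate match, including unmatched leaves.
    cost = {(i, j): pair_cost(s, t)
            for i, s in enumerate(source_leaves)
            for j, t in enumerate(target_leaves)}
    cost.update({(i, None): UNMATCHED_COST for i in range(n)})
    cost.update({(None, j): UNMATCHED_COST for j in range(m)})
    candidates = list(cost)

    # Find the cheapest set of matches in which every leaf appears at least once.
    best, best_cost = (), float("inf")
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            covered_sources = {i for i, _ in subset if i is not None}
            covered_targets = {j for _, j in subset if j is not None}
            if len(covered_sources) < n or len(covered_targets) < m:
                continue
            total = sum(cost[match] for match in subset)
            if total < best_cost:
                best, best_cost = subset, total

    # Group matches that share a leaf: these are the one-to-many candidates.
    groups = []
    for i in range(n):
        targets = sorted(j for si, j in best if si == i and j is not None)
        if len(targets) > 1:
            groups.append(("source", i, targets))
    for j in range(m):
        sources = sorted(i for i, tj in best if tj == j and i is not None)
        if len(sources) > 1:
            groups.append(("target", j, sources))
    return set(best), groups


matches, groups = align(["John Smith", "Paris"], ["John", "Smith", "Paris"])
print(groups)  # [('source', 0, [0, 1])]
```

In this example the source leaf "John Smith" is matched with both target leaves "John" and "Smith", because covering the two target leaves through the shared source leaf (0.5 + 0.5) is cheaper than leaving either one unmatched. A realistic system would need a better cost model and a polynomial-time solver in place of the exhaustive search.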
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.