Patent · US Expired

Segmentation of strings into structured records

US7627567B2 · kind B2 · utility

Cited by: 10
References: 1
Claims: 31
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 14, 2004
Grant date: Dec 1, 2009
Priority date
Expiry date: Nov 6, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99935
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system for segmenting strings into component parts for use with a database management system. A reference table of string records is segmented into multiple substrings corresponding to database attributes. The substrings within an attribute are analyzed to provide a state model that assumes a beginning, a middle and an ending token topology for that attribute. A null token takes into account an empty attribute component, and copying of states allows for erroneous token insertions and misordering. Once the model is created from the clean data, the process breaks or parses an input record into a sequence of tokens. The process then determines a most probable segmentation of the input record by comparing the tokens of the input record with the state models derived for attributes from the reference table.
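
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes whitespace tokenization, additive smoothing with made-up constants (ALPHA, VOCAB_SIZE), a fixed attribute order, and a convention that a single-token value counts only as a beginning token. It omits the state copying that the abstract uses to handle erroneous insertions and misordering. All function names and the toy reference table are hypothetical.

```python
import math
from collections import Counter

ALPHA = 0.1        # smoothing constant (assumed, not from the patent)
VOCAB_SIZE = 1000  # nominal vocabulary size used for smoothing (assumed)


def build_model(values):
    """Count tokens by position (begin / middle / end) for one attribute."""
    model = {"begin": Counter(), "middle": Counter(), "end": Counter(),
             "empty": 0, "rows": len(values)}
    for value in values:
        tokens = value.lower().split()
        if not tokens:
            model["empty"] += 1  # feeds the null-token probability
            continue
        model["begin"][tokens[0]] += 1
        if len(tokens) > 1:
            model["end"][tokens[-1]] += 1
        model["middle"].update(tokens[1:-1])
    return model


def token_log_prob(counter, token):
    """Smoothed log-probability of a token in one positional state."""
    total = sum(counter.values())
    return math.log((counter[token] + ALPHA) / (total + ALPHA * VOCAB_SIZE))


def segment_log_prob(model, tokens):
    """Log-probability that `tokens` forms the value of this attribute."""
    p_empty = (model["empty"] + ALPHA) / (model["rows"] + 2 * ALPHA)
    if not tokens:
        return math.log(p_empty)  # null token: the attribute is absent
    score = math.log(1.0 - p_empty)
    score += token_log_prob(model["begin"], tokens[0])
    if len(tokens) > 1:
        score += token_log_prob(model["end"], tokens[-1])
    for tok in tokens[1:-1]:
        score += token_log_prob(model["middle"], tok)
    return score


def segment(text, attributes, models):
    """Most probable split of `text` into attributes, in the given order."""
    tokens = text.lower().split()
    n, k = len(tokens), len(attributes)
    # best[j][i]: best score for giving the first i tokens to the first j attributes
    best = [[-math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        model = models[attributes[j - 1]]
        for i in range(n + 1):
            for s in range(i + 1):
                if best[j - 1][s] == -math.inf:
                    continue
                cand = best[j - 1][s] + segment_log_prob(model, tokens[s:i])
                if cand > best[j][i]:
                    best[j][i], back[j][i] = cand, s
    # Walk the back-pointers from the last attribute to the first.
    result, i = {}, n
    for j in range(k, 0, -1):
        s = back[j][i]
        result[attributes[j - 1]] = " ".join(tokens[s:i])
        i = s
    return result


# Toy reference table (hypothetical clean data).
reference = [
    {"name": "John Smith", "city": "Seattle", "state": "WA"},
    {"name": "Mary Jones", "city": "New York", "state": "NY"},
    {"name": "Alan Turing", "city": "San Francisco", "state": "CA"},
]
attributes = ["name", "city", "state"]
models = {a: build_model([row[a] for row in reference]) for a in attributes}
print(segment("Mary Smith San Francisco WA", attributes, models))
```

The dynamic program scores every way of cutting the token sequence into one contiguous (possibly empty) run per attribute and keeps the highest-scoring cut. Tokens seen in the matching position of the reference column score well, so a record that never appears verbatim in the table can still be split along attribute boundaries.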

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.