Patent · US · Expired

System and method for tokening documents

US7275069B2 · kind B2 · utility

Cited by: 10
References: 3
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 26, 2004
Grant date: Sep 25, 2007
Priority date
Expiry date: Jul 22, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99945
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system for tokenizing a document, such as an XML document. A classifier is configured to assign at least one character from the document to at least one of a plurality of character classes. Each of a plurality of token logic units is configured to concurrently perform a comparison as specified by an instruction; a comparison may comprise comparing the at least one character class to an operand. An execution unit is configured to select an action from the instruction in response to the comparisons and to perform that action. A method of tokenizing a document includes assigning at least one character from a document to at least one of a plurality of character classes and concurrently performing a plurality of comparisons. At least one of the plurality of comparisons comprises comparing the assigned character class to the character from the document. At least one action to be performed is selected based on at least one result produced by performing the comparisons, and the selected action is subsequently performed.
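
The flow described in the abstract (classify a character, run several comparisons for the current instruction, select an action, perform it) can be illustrated with a small software sketch. This is not the patented implementation: the class set, the two-state PROGRAM table, the action names ("append", "flush", "skip"), and the first-match selection rule are all assumptions made for illustration, and the concurrent token logic units are only simulated by evaluating every comparison before any action is chosen.

```python
from dataclasses import dataclass
from enum import Flag, auto


class CharClass(Flag):
    """Hypothetical character classes; a character may belong to several."""
    LT = auto()
    GT = auto()
    NAME_START = auto()
    NAME_CHAR = auto()
    OTHER = auto()


def classify(ch):
    """Classifier: assign one character to one or more character classes."""
    if ch == "<":
        return CharClass.LT
    if ch == ">":
        return CharClass.GT
    if ch.isalpha() or ch == "_":
        return CharClass.NAME_START | CharClass.NAME_CHAR
    if ch.isdigit() or ch in "-.":
        return CharClass.NAME_CHAR
    return CharClass.OTHER


@dataclass(frozen=True)
class Instruction:
    operands: tuple  # one class operand per (simulated) token logic unit
    actions: tuple   # (operation, next_state) selected when that unit matches
    default: tuple   # (operation, next_state) when no unit matches


# Assumed toy program: one instruction per tokenizer state.
PROGRAM = {
    "TEXT": Instruction((CharClass.LT,), (("flush", "TAG"),), ("append", "TEXT")),
    "TAG": Instruction(
        (CharClass.GT, CharClass.NAME_CHAR),
        (("flush", "TEXT"), ("append", "TAG")),
        ("skip", "TAG"),
    ),
}


def tokenize(text):
    tokens, buf, state = [], [], "TEXT"
    for ch in text:
        classes = classify(ch)
        instr = PROGRAM[state]
        # Every comparison is evaluated before an action is chosen, standing
        # in for the units that compare concurrently in hardware.
        results = tuple(bool(classes & operand) for operand in instr.operands)
        # Execution unit: select the action of the first matching unit.
        op, next_state = next(
            (action for hit, action in zip(results, instr.actions) if hit),
            instr.default,
        )
        if op == "append":
            buf.append(ch)
        elif op == "flush" and buf:
            tokens.append((state, "".join(buf)))
            buf = []
        # "skip" (and a flush with an empty buffer) consumes the character only.
        state = next_state
    if buf:
        tokens.append((state, "".join(buf)))
    return tokens
```

With these assumed tables, tokenize("<a>hi</b2>") returns [("TAG", "a"), ("TEXT", "hi"), ("TAG", "b2")]. The "/" of the closing tag is dropped by the "skip" default, which keeps the sketch short but is not how a real XML tokenizer would treat it.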

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.