Patent · US Expired

Automated taxonomy generation

US7266548B2 · kind B2 · utility

Cited by: 25
References: 16
Claims: 30
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jun 30, 2004
Grant date: Sep 4, 2007
Priority date
Expiry date: Sep 21, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In a hierarchical taxonomy of documents, the categories of information may be structured as a binary tree, with the nodes of the binary tree containing information relevant to the search. The binary tree may be ‘trained’ or formed by examining a training set of documents and separating those documents into two child nodes. Each of those sets of documents may then be further split into two nodes to create the binary tree data structure. The nodes may be generated to maximize the likelihood that all of the training documents are in either or both of the two child nodes. In one example, each node of the binary tree may be associated with a list of terms, and each term in each list may be associated with a probability of that term appearing in a document given that node. New documents may be categorized by the nodes of the tree. For example, a new document may be assigned to a particular node based upon the statistical similarity between that document and the associated node.
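
The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the names `Node`, `estimate_term_probs`, `log_likelihood`, `assign`, and the constant `FLOOR` are all hypothetical. It assumes that documents are lists of terms, that "statistical similarity" is measured as the log-likelihood of the document's terms under a node's term probabilities, and that a new document is routed by greedy descent to the more likely child. The likelihood-maximizing training split from the abstract is not reproduced here; term probabilities are simply relative frequencies over documents already assigned to a node.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional

FLOOR = 1e-6  # assumed probability for a term missing from a node's term list


@dataclass
class Node:
    # Maps term -> P(term appears | node); hypothetical per-node term list.
    term_probs: Dict[str, float]
    label: str = ""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def estimate_term_probs(docs: List[List[str]]) -> Dict[str, float]:
    """Relative term frequencies over the documents assigned to one node."""
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def log_likelihood(doc: List[str], node: Node) -> float:
    """Log-probability of the document's terms given the node (terms treated as independent)."""
    return sum(math.log(node.term_probs.get(term, FLOOR)) for term in doc)


def assign(doc: List[str], root: Node) -> Node:
    """Descend from the root, at each level moving to the child that scores higher."""
    node = root
    while node.left is not None and node.right is not None:
        left_score = log_likelihood(doc, node.left)
        right_score = log_likelihood(doc, node.right)
        node = node.left if left_score >= right_score else node.right
    return node


# Toy two-leaf tree built from made-up training documents.
sports = Node(estimate_term_probs([["goal", "team", "match"], ["team", "score"]]), "sports")
finance = Node(estimate_term_probs([["stock", "market", "price"], ["market", "bank"]]), "finance")
root = Node({}, "root", sports, finance)

print(assign(["team", "goal"], root).label)
print(assign(["bank", "stock"], root).label)
```

The greedy descent always ends at a leaf. An implementation closer to the abstract might instead stop at an interior node when neither child is a clearly better match, since the abstract allows assignment to any node of the tree.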

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.