Patent · US Expired

Method for searching a file having a format unsupported by a search engine

US6327589A · kind A · utility

Cited by: 31
References: 11
Claims: 34
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 24, 1998
Grant date: Dec 4, 2001
Priority date
Expiry date: Jun 24, 2018

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for searching a file in a format unsupported by a search engine by creating term-topic links with associated probabilities. The file retrieved is a compressed HTML file or a webpage. The file is parsed to retrieve the data associated with its title tags and body tags. In addition, user queries are received so that the user may associate a query with the title data. Term-topic links are created by linking terms from the retrieved data and the query with a topic. Heuristics are then used to determine the probability associated with each term-topic link: links whose term is a noun are assigned a higher probability than links whose term is a verb, verb links are assigned a higher probability than adjective links, and adjective and adverb links are assigned the same probability. The term-topic links are trained by adjusting the assigned probabilities based on a user-defined query and an associated target topic.
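
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The probability values, the toy part-of-speech table, the `step` size, the use of the title text as the topic, and the names `TitleBodyParser`, `build_links`, and `train` are all assumptions made for illustration; the abstract specifies only the ordering noun > verb > adjective = adverb. Decompression of a compressed HTML file is left out.

```python
from html.parser import HTMLParser

# Hypothetical base probabilities. The abstract fixes only the ordering
# noun > verb > adjective = adverb, not the actual values.
POS_PROBABILITY = {"noun": 0.8, "verb": 0.6, "adjective": 0.4, "adverb": 0.4}

# Toy part-of-speech lookup standing in for a real tagger (assumption).
TOY_POS = {"printer": "noun", "install": "verb",
           "new": "adjective", "quickly": "adverb"}


class TitleBodyParser(HTMLParser):
    """Collect the text found inside <title> and <body> tags."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.text = {"title": [], "body": []}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.text[self.current].append(data)


def build_links(html, query):
    """Link terms from the body text and the query to a topic (here, the title)."""
    parser = TitleBodyParser()
    parser.feed(html)
    topic = " ".join(" ".join(parser.text["title"]).split())
    words = " ".join(parser.text["body"]).lower().split() + query.lower().split()
    links = {}
    for word in words:
        pos = TOY_POS.get(word)  # words unknown to the toy table are skipped
        if pos:
            links[(word, topic)] = POS_PROBABILITY[pos]
    return topic, links


def train(links, query, target_topic, step=0.1):
    """Raise links from query terms to the target topic, lower the others."""
    terms = set(query.lower().split())
    for (term, topic), p in list(links.items()):
        if term in terms:
            delta = step if topic == target_topic else -step
            links[(term, topic)] = min(1.0, max(0.0, p + delta))
    return links
```

Calling `build_links(page_html, "install printer")` returns the topic and a dictionary keyed by `(term, topic)` pairs; a later `train(links, "printer", topic)` call then raises the probability of the `printer` link for that topic.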

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.