Method for searching a file having a format unsupported by a search engine
US6327589A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Jun 24, 1998 |
| Grant date | Dec 4, 2001 |
| Priority date | — |
| Expiry date | Jun 24, 2018 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99943
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A file in a format unsupported by a search engine is searched by creating term-topic links with associated probabilities. The retrieved file is a compressed HTML file or a webpage. The file is parsed to retrieve data associated with title tags and body tags. In addition, user queries are received so that the user may associate a query with the title data. Term-topic links are created by linking terms from the retrieved data and the query with a topic. Heuristics are then used to determine the probability associated with each term-topic link. Links whose term is a noun are assigned a higher probability than links whose term is a verb, verb links are assigned a higher probability than adjective links, and adjective and adverb links are assigned the same probability. The term-topic links are trained by adjusting the assigned probabilities based on a user-defined query and an associated target topic.
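The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names (`TitleBodyParser`, `build_links`, `train`), the numeric weights in `POS_WEIGHT`, the hand-written part-of-speech lexicon, and the update rule in `train` are all assumptions chosen for illustration. The abstract specifies only the ordering noun > verb > adjective = adverb and that probabilities are adjusted from a query and a target topic.

```python
import re
from html.parser import HTMLParser

# Assumed prior weights; only the ordering noun > verb > adj == adv
# comes from the abstract, the numbers are illustrative.
POS_WEIGHT = {"noun": 0.6, "verb": 0.4, "adj": 0.2, "adv": 0.2}


class TitleBodyParser(HTMLParser):
    """Collect lowercase words found inside <title> and <body>."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.text = {"title": [], "body": []}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.text[self._current].extend(re.findall(r"[a-z]+", data.lower()))


def build_links(terms, topic, lexicon):
    """Create (term, topic) links with a prior set by part of speech.

    `lexicon` is a hypothetical term -> part-of-speech mapping; terms
    missing from it (for example stop words) get no link.
    """
    links = {}
    for term in terms:
        pos = lexicon.get(term)
        if pos in POS_WEIGHT:
            links[(term, topic)] = POS_WEIGHT[pos]
    return links


def train(links, query_terms, target_topic, rate=0.1):
    """Assumed update rule: move links for query terms toward 1.0 when
    they point at the target topic and toward 0.0 otherwise."""
    for (term, topic), p in list(links.items()):
        if term in query_terms:
            goal = 1.0 if topic == target_topic else 0.0
            links[(term, topic)] = p + rate * (goal - p)
    return links


# Usage with a made-up page, lexicon, and topic name.
page = ("<html><head><title>Printer setup</title></head>"
        "<body><p>Install the driver quickly.</p></body></html>")
lexicon = {"printer": "noun", "setup": "noun", "driver": "noun",
           "install": "verb", "quickly": "adv"}

parser = TitleBodyParser()
parser.feed(page)
links = build_links(parser.text["title"] + parser.text["body"], "printing", lexicon)
train(links, {"install"}, "printing")
```

A dictionary keyed by `(term, topic)` keeps each link's probability independently adjustable, which is what the training step needs. A real system would replace the toy lexicon with an actual part-of-speech tagger and would first have to decompress a compressed HTML file, neither of which is shown here.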
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.