System and method for extraction of factoids from textual repositories
US8706730B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 29, 2005 |
| Grant date | Apr 22, 2014 |
| Priority date | — |
| Expiry date | Jul 3, 2033 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F16/951
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method (400) is disclosed of extracting factoids from text repositories, with the factoids being associated with a given factoid category. The method (400) starts by training a classifier (230) to recognize factoids relevant to that given factoid category. Documents or document summaries relevant to the given factoid category is next collected (410) from the text repositories. Sentences having a predetermined association to the given factoid category is extracted (420) from the documents or said document summaries. Those sentences are classified (440), in a noisy environment, using the classifier (230) to extract snippets containing phrases relevant to the given factoid category. It is the extracted snippets that are the factoid associated with the given factoid category.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.