Patent · US Active

Accessing documents using predictive word sequences

US9069842B2 · kind B2 · utility

Cited by: 147
References: 8
Claims: 21
Family size: 0

Assignee

Inventor

Key dates

Filing date: Sep 28, 2010
Grant date: Jun 30, 2015
Priority date
Expiry date: Apr 20, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/3338
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems for accessing documents in document collections using predictive word sequences are disclosed. A method for accessing documents using predictive word sequences includes creating a candidate list of word sequences where respective ones of the word sequences comprise one or more elements derived from the document corpus; expanding the candidate list by adding one or more new word sequences, where each new pattern is created by combining one or more elements derived from the document corpus with one of the word sequences currently in the candidate list; determining a predictive power with respect to the subject for respective ones of entries of the candidate list, where the entries include the word sequences and the new word sequences; pruning from the candidate list ones of said entries with the determined predictive power less than a predetermined threshold; and accessing documents from the document corpus based on the pruned candidate list. The expanding of the candidate list can include creating each new pattern as a gapped sequence, where the gapped sequence comprises one of the word sequences and one of said elements separated by zero or more words. Correspond…
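
The steps named in the abstract (create candidates, expand, score, prune, access) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how predictive power is computed, so the sketch assumes it is the fraction of matching documents that are labeled on-subject. The function names, the `max_gap` bound on gap length, the `rounds` parameter, and the toy corpus are all hypothetical.

```python
def matches(pattern, words, max_gap):
    """True if the pattern's elements occur in order in `words`, with at most
    `max_gap` intervening words between consecutive elements (assumed bound)."""
    def search(p_idx, start, limit):
        if p_idx == len(pattern):
            return True
        end = len(words) if limit is None else min(len(words), start + limit + 1)
        for i in range(start, end):
            if words[i] == pattern[p_idx] and search(p_idx + 1, i + 1, max_gap):
                return True
        return False
    # The first element may appear anywhere in the document.
    return search(0, 0, None)


def predictive_power(pattern, docs, labels, max_gap):
    """Assumed measure: share of matching documents that are on-subject."""
    hits = [label for words, label in zip(docs, labels)
            if matches(pattern, words, max_gap)]
    return sum(hits) / len(hits) if hits else 0.0


def mine_sequences(docs, labels, threshold, max_gap=1, rounds=1):
    """Create, expand, score, and prune a candidate list of word sequences."""
    vocab = sorted({w for words in docs for w in words})
    candidates = {(w,) for w in vocab}            # initial candidate list
    for _ in range(rounds):
        # Expand: combine each current sequence with one corpus element.
        candidates |= {seq + (w,) for seq in candidates for w in vocab}
        # Score every entry and prune those below the threshold.
        candidates = {c for c in candidates
                      if predictive_power(c, docs, labels, max_gap) >= threshold}
    return candidates


def access_documents(patterns, docs, max_gap):
    """Return indices of documents matched by any surviving pattern."""
    return [i for i, words in enumerate(docs)
            if any(matches(p, words, max_gap) for p in patterns)]


# Toy corpus: label 1 marks documents about the subject of interest.
docs = [
    "court grants patent injunction".split(),
    "patent office grants license".split(),
    "court delays tax hearing".split(),
    "city grants parking permit".split(),
]
labels = [1, 1, 0, 0]

patterns = mine_sequences(docs, labels, threshold=0.9, max_gap=1)
selected = access_documents(patterns, docs, max_gap=1)
```

In this sketch a gap of zero means the sequence and the added element are adjacent. Raising `max_gap` lets a pattern such as `("patent", "grants")` match text where other words sit between the two elements.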

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.