Method and apparatus for cleaning data sets for a search process
US8930361B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 3, 2011 |
| Grant date | Jan 6, 2015 |
| Priority date | — |
| Expiry date | May 12, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/215
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An approach is provided for cleaning data sets for a search process. The cleanup platform determines one or more reference documents associated with at least one region. Next, the cleanup platform processes and/or facilitates a processing of the one or more reference documents to determine a frequency distribution of one or more candidate stop words with respect to the at least one region. Then, the cleanup platform causes, at least in part, selection of one or more stop words applicable to the at least one region from the one or more candidate stop words based, at least in part, on one or more frequency distribution criteria. Additionally, the cleanup platform processes and/or facilitates a processing of at least one data set associated with a search process to generate at least one enhanced data set by filtering the one or more stop words from the at least one data set.
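The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the tokenizer, the use of document frequency as the "frequency distribution", and the `min_doc_fraction` threshold as the "frequency distribution criterion" are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def candidate_frequencies(reference_docs):
    """Return, for each candidate word, the fraction of reference documents containing it."""
    counts = Counter()
    for doc in reference_docs:
        # Count each word once per document (document frequency).
        counts.update(set(tokenize(doc)))
    total = len(reference_docs)
    return {word: n / total for word, n in counts.items()}


def select_stop_words(reference_docs, min_doc_fraction=0.5):
    """Select candidates whose document frequency meets the assumed threshold."""
    if not reference_docs:
        return set()
    freqs = candidate_frequencies(reference_docs)
    return {word for word, frac in freqs.items() if frac >= min_doc_fraction}


def clean_data_set(records, stop_words):
    """Filter the region's stop words out of every record in a data set."""
    return [
        " ".join(tok for tok in tokenize(record) if tok not in stop_words)
        for record in records
    ]


if __name__ == "__main__":
    # Hypothetical reference documents for a single region.
    region_docs = [
        "Main Street Cafe",
        "Oak Street Pharmacy",
        "Elm Street Bakery",
        "Pine Avenue Diner",
    ]
    stop_words = select_stop_words(region_docs, min_doc_fraction=0.5)
    print(stop_words)
    print(clean_data_set(["Green Street Pizza", "Harbor Street Books"], stop_words))
```

With these hypothetical inputs, "street" appears in three of the four reference documents and is the only word selected, so it is removed from both records. Running the selection separately for each region's reference documents would give each region its own stop word list, which is the region-specific behavior the abstract describes.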
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.