System for categorizing lists of words of arbitrary origin
US9171267B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 14, 2013 |
| Grant date | Oct 27, 2015 |
| Priority date | — |
| Expiry date | Nov 15, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/20
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present disclosure provides for the categorization of lists of words. The method comprises querying DBpedia to find the resources related to a given list of words. Once the resources are found, the corresponding Wikipedia categories can be retrieved, together with their ancestors, generating a graph of categories. A number of graph analysis algorithms can then be applied to the graph, each returning one selected category. For each algorithm, a classifier is trained to decide whether that algorithm's output is indeed the “best” category. Ensemble weighted majority voting can then be used to select the best category based on the votes cast by each classifier. The disclosure demonstrates a more accurate selection of the best category and can include an ensemble majority rated voting algorithm in which all voting members initially cast one vote; the voting members are the highest frequency, most frequently occurring word, least common ancestor, and centrality measures.
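The voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The category graph, the word-to-category lookup, and all function names are hypothetical stand-ins for what a DBpedia query would return. The trained per-algorithm classifiers are replaced here by fixed weights of one vote each, and degree centrality is assumed as one possible centrality measure.

```python
from collections import Counter, deque

# Hypothetical toy data: category -> parent categories (a small DAG).
PARENTS = {
    "Citrus": ["Fruit"],
    "Berries": ["Fruit"],
    "Fruit": ["Food"],
    "Vegetables": ["Food"],
    "Food": [],
}

# Hypothetical lookup result: word -> categories of its matched resource.
WORD_CATEGORIES = {
    "lemon": ["Citrus"],
    "orange": ["Citrus"],
    "strawberry": ["Berries"],
}


def ancestors_with_depth(category):
    """Return {ancestor: distance} for a category, including itself at 0."""
    dist = {category: 0}
    queue = deque([category])
    while queue:
        node = queue.popleft()
        for parent in PARENTS.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def highest_frequency(word_categories):
    """Category attached directly to the largest number of words."""
    counts = Counter(c for cats in word_categories.values() for c in cats)
    return max(sorted(counts), key=counts.get)


def least_common_ancestor(word_categories):
    """Closest category that is an ancestor of every word."""
    per_word = []
    for cats in word_categories.values():
        merged = {}
        for c in cats:
            for anc, d in ancestors_with_depth(c).items():
                merged[anc] = min(d, merged.get(anc, d))
        per_word.append(merged)
    shared = set.intersection(*(set(m) for m in per_word))
    return min(sorted(shared), key=lambda a: sum(m[a] for m in per_word))


def degree_centrality(word_categories):
    """Category with the most edges in the graph induced by the words."""
    nodes = set()
    for cats in word_categories.values():
        for c in cats:
            nodes.update(ancestors_with_depth(c))
    degree = Counter()
    for child in nodes:
        for parent in PARENTS.get(child, []):
            degree[child] += 1
            degree[parent] += 1
    return max(sorted(degree), key=degree.get)


def weighted_majority_vote(proposals, weights):
    """Sum each algorithm's weight onto the category it proposed."""
    tally = Counter()
    for name, category in proposals.items():
        tally[category] += weights[name]
    return max(sorted(tally), key=tally.get)


ALGORITHMS = {
    "highest_frequency": highest_frequency,
    "least_common_ancestor": least_common_ancestor,
    "degree_centrality": degree_centrality,
}
# Assumption: every member starts with one vote; no classifier is trained here.
WEIGHTS = {name: 1.0 for name in ALGORITHMS}

proposals = {name: fn(WORD_CATEGORIES) for name, fn in ALGORITHMS.items()}
best = weighted_majority_vote(proposals, WEIGHTS)
print(proposals, best)
```

In a fuller version, each weight would presumably come from the classifier trained for that algorithm rather than staying fixed at one. Ties are broken alphabetically here purely to keep the sketch deterministic.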
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.