Patent · US Active

Intelligent system that dynamically improves its knowledge and code-base for natural language understanding

US11675977B2 · kind B2 · utility

Cited by	1

References	6

Claims	20

Family size	0

Assignee

Daash Intelligence, Inc. · US

Inventors

Robert J. Munro · Auckland, NZ
Rob Voigt · Palo Alto, US
Schuyler D. Erle · San Francisco, US
Brendan D. Callahan · Philadelphia, US
Gary C. King · San Jose, US
Jessica D. Long · Chicago, US
Jason Brenier · Oakland, US
Tripti Saxena · Sunnyvale, US
Stefan Krawczyk · Menlo Park, US

Key dates

Filing date	Mar 27, 2020
Grant date	Jun 13, 2023
Priority date	—
Expiry date	Oct 22, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
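
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it (`build_model`, `promote`, `rule_segmentations`, `best_statistical_split`, `tokenize`), the 0.2 promotion threshold, and the toy document pool are assumptions made for illustration. The sketch assumes the statistical model is a relative-frequency table over whitespace-delimited sequences, that a "high likelihood" means meeting a fixed threshold, and that the rules are ambiguous whenever they yield zero or several complete segmentations of a chunk, in which case a Viterbi-style search over the model picks the split.

```python
import math
from collections import Counter


def build_model(pool):
    """Relative frequency of each whitespace-delimited sequence in the pool."""
    counts = Counter(word for doc in pool for word in doc.split())
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def promote(model, rules, threshold):
    """Turn high-likelihood model entries into new rules (assumed threshold test)."""
    return rules | {seq for seq, p in model.items() if p >= threshold}


def rule_segmentations(text, rules, limit=2):
    """Collect up to `limit` ways to split `text` entirely into rule tokens."""
    results = []

    def walk(start, acc):
        if len(results) >= limit:
            return
        if start == len(text):
            results.append(list(acc))
            return
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in rules:
                acc.append(text[start:end])
                walk(end, acc)
                acc.pop()

    walk(0, [])
    return results


def best_statistical_split(text, model, floor=1e-6):
    """Viterbi-style search for the split with the highest summed log-likelihood."""
    best = [(0.0, [])] + [(-math.inf, [])] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            # Unseen single characters get a small floor so some split always exists.
            p = model.get(piece, floor if len(piece) == 1 else 0.0)
            if p > 0 and best[start][0] > -math.inf:
                score = best[start][0] + math.log(p)
                if score > best[end][0]:
                    best[end] = (score, best[start][1] + [piece])
    return best[len(text)][1]


def tokenize(document, rules, model):
    """Use the rules when they give exactly one split; otherwise use the model."""
    tokens = []
    for chunk in document.split():
        options = rule_segmentations(chunk, rules)
        if len(options) == 1:
            tokens.extend(options[0])
        else:
            tokens.extend(best_statistical_split(chunk, model))
    return tokens


# Toy data, invented for this example.
pool = ["we cannot go", "you cannot stay", "we can go"]
model = build_model(pool)
rules = promote(model, {"can", "not"}, threshold=0.2)
tokens = tokenize("wecannot go", rules, model)
```

In this toy run, "we", "cannot", and "go" each account for 2 of the 9 pool tokens and are promoted to rules. The chunk "wecannot" then has two rule-based readings ("we can not" and "we cannot"), so the model decides; "go" has exactly one reading and is resolved by the rules alone.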

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.