Cleaning noise words from transaction descriptions
US10546348B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 17, 2017 |
| Grant date | Jan 28, 2020 |
| Priority date | — |
| Expiry date | Aug 5, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/289
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, system, and non-transitory computer readable medium for removing noise ngrams from transaction records. The method may include obtaining noise ngrams; ordering the noise ngrams based on frequency of occurrence; discarding a portion of the noise ngrams below a frequency threshold to obtain a higher frequency subset of the noise ngrams; obtaining a transaction record of interest; and identifying a portion of the higher frequency subset within the transaction record of interest. Identifying the portion of the higher frequency subset may include constructing a regular expression based on the higher frequency subset; constructing a finite state machine based on the regular expression; providing the transaction record of interest as an input to the finite state machine; and executing the finite state machine. The method may also include removing, based on the identification, the portion of the higher frequency subset from the transaction record of interest.
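The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names `build_noise_pattern` and `remove_noise`, the `min_count` threshold, the sample n-grams, and the word-boundary handling are all assumptions made for illustration. Python's `re` module also uses a backtracking matcher rather than an explicitly constructed finite state machine, so the compiled pattern only stands in for the state machine the abstract describes.

```python
import re
from collections import Counter


def build_noise_pattern(noise_ngrams, min_count):
    """Build one regex from the noise n-grams seen at least min_count times.

    Hypothetical helper: min_count stands in for the frequency threshold.
    Returns None when no n-gram survives the threshold.
    """
    # Order the noise n-grams by frequency of occurrence (most frequent first).
    ordered = Counter(noise_ngrams).most_common()

    # Discard n-grams below the threshold to keep the higher-frequency subset.
    kept = [ngram for ngram, count in ordered if count >= min_count]
    if not kept:
        return None

    # Put longer n-grams first so a multi-word n-gram wins over its own prefix.
    kept.sort(key=len, reverse=True)

    # Construct a single alternation; re.escape keeps the n-grams literal.
    alternation = "|".join(re.escape(ngram) for ngram in kept)

    # The lookarounds avoid matching inside a longer word (an assumption,
    # not something the abstract specifies).
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def remove_noise(record, pattern):
    """Remove every matched noise n-gram from a transaction description."""
    if pattern is None:
        return record
    cleaned = pattern.sub(" ", record)
    # Collapse the whitespace left behind by the removed n-grams.
    return " ".join(cleaned.split())


# Made-up observations of noise n-grams; repeats encode frequency.
observed = [
    "pos debit", "pos debit", "pos debit",
    "purchase", "purchase",
    "ref",
]
pattern = build_noise_pattern(observed, min_count=2)
print(remove_noise("POS DEBIT PURCHASE STARBUCKS REF 0042", pattern))
# prints: STARBUCKS REF 0042
```

In this sketch "ref" occurs only once, so it falls below the threshold and stays in the cleaned description. An engine that compiles patterns to automata (for example, an RE2-style library) would be closer to the finite state machine step, at the cost of dropping the lookaround assertions used here.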
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.