Generating and applying data extraction templates
US9785705B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 16, 2014 |
| Grant date | Oct 10, 2017 |
| Priority date | — |
| Expiry date | Apr 29, 2036 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F16/35
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Methods, apparatus, systems, and computer-readable media are provided for generating and applying data extraction templates. In various implementations, a corpus of plain text communications such as emails may be grouped into clusters based on one or more similarities between the plain text communications. One or more segments of communications of a particular cluster may be classified as transient based on textual pattern matching. One or more other segments of the communications of the particular cluster may be classified as transient based on various criteria. One or more transient segments may be assigned a generic and/or specific semantic data type and/or a confidentiality designation based on various signals. A data extraction template may be generated to extract, from subsequent plain text communications, content associated with transient (and in some cases, non-confidential) segments.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.