Patent · US Active

Systems and methods for detecting sensitive information using pattern recognition

US10878124B1 · kind B1 · utility

Cited by: 10
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 5, 2018
Grant date: Dec 29, 2020
Priority date
Expiry date: Apr 10, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/044
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems for identifying sensitive information are provided. The method includes tokenizing labeled data into first word sequences, the labeled data including sensitive information. The method includes associating the labeled sensitive information with tags. The method includes determining that the first word sequences and the tags satisfy conditions defined by feature functions. The method includes calculating a local maximum of a likelihood function to determine a weight. The method includes tokenizing unlabeled data into second word sequences, the unlabeled data including sensitive information. The method includes executing each feature function based on their weights, the second word sequences, and tag sequences. The method includes selecting tag sequences that maximize probabilities of the second word sequences based on the likelihood function. The method includes identifying sensitive information in the unlabeled data based on the selected tag sequences.
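Illustrative sketch (not taken from the patent): the abstract describes weighted feature functions over word and tag sequences, followed by selection of the highest-scoring tag sequence. One common model with that shape is a linear-chain conditional random field, which is an assumption here since the abstract does not name a model. The Python sketch below shows only the labeling half: it tokenizes text, scores tags with weighted feature functions, and picks the best tag sequence with Viterbi decoding. The tag names, the three feature functions, the regular expression, and the weights are all hypothetical. In particular, the weights are hand-set for illustration, whereas the abstract describes determining them by maximizing a likelihood function on labeled data.

```python
import re
from typing import List, Optional

# Hypothetical tag set: "O" for ordinary tokens, "SENSITIVE" for sensitive ones.
TAGS = ["O", "SENSITIVE"]


def tokenize(text: str) -> List[str]:
    """Split text into a word sequence of whitespace-delimited tokens."""
    return re.findall(r"\S+", text)


# Each feature function takes (previous tag, current tag, words, position)
# and returns 1.0 when its condition is satisfied, else 0.0.
def f_ssn_shape(prev_tag: Optional[str], tag: str, words: List[str], i: int) -> float:
    matches = re.fullmatch(r"\d{3}-\d{2}-\d{4}", words[i]) is not None
    return 1.0 if tag == "SENSITIVE" and matches else 0.0


def f_after_keyword(prev_tag: Optional[str], tag: str, words: List[str], i: int) -> float:
    keywords = {"ssn", "password"}
    follows = i > 0 and words[i - 1].lower().rstrip(":") in keywords
    return 1.0 if tag == "SENSITIVE" and follows else 0.0


def f_default_other(prev_tag: Optional[str], tag: str, words: List[str], i: int) -> float:
    return 1.0 if tag == "O" else 0.0


# Assumed (feature, weight) pairs; real weights would be learned from labeled data.
FEATURES = [(f_ssn_shape, 3.0), (f_after_keyword, 2.0), (f_default_other, 1.0)]


def local_score(prev_tag: Optional[str], tag: str, words: List[str], i: int) -> float:
    """Weighted sum of all feature functions at position i."""
    return sum(w * f(prev_tag, tag, words, i) for f, w in FEATURES)


def viterbi(words: List[str]) -> List[str]:
    """Return the tag sequence with the highest total score."""
    if not words:
        return []
    best = [{t: local_score(None, t, words, 0) for t in TAGS}]
    back = []
    for i in range(1, len(words)):
        scores, ptrs = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + local_score(p, t, words, i))
            scores[t] = best[-1][prev] + local_score(prev, t, words, i)
            ptrs[t] = prev
        best.append(scores)
        back.append(ptrs)
    # Trace back from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for ptrs in reversed(back):
        tag = ptrs[tag]
        path.append(tag)
    return path[::-1]


def find_sensitive(text: str) -> List[str]:
    """Return the tokens tagged SENSITIVE in unlabeled text."""
    words = tokenize(text)
    return [w for w, t in zip(words, viterbi(words)) if t == "SENSITIVE"]
```

Because the total score is a sum of per-position terms, maximizing it also maximizes the corresponding normalized probability in a log-linear model, which is why the decoder can skip computing the normalizer.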

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.