Patent · US Active

System and method for identifying poisoned data during data curation using data source characteristics

US12405930B2 · kind B2 · utility

Cited by: 0
References: 9
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 29, 2023
Grant date: Sep 2, 2025
Priority date
Expiry date: Jun 5, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2221/034
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems for curating data from data sources are disclosed. Data may be curated from various data sources before being supplied to downstream consumers that may rely on the trustworthiness of the curated data to facilitate desired computer-implemented services. During data curation, collected data may undergo anomaly detection to identify anomalies in the data. Data anomalies may indicate the presence of poisoned data that, if provided to downstream consumers, may negatively impact the desired computer-implemented services. When poisoned data is detected among the data, a poisoned portion of the data may be identified using an optimization process. The optimization process may consider the degree of anomalousness of the data (e.g., using statistical representations of the anomaly) and/or characteristics of the data source that supplied the anomalous data to identify the poisoned portion. Remedial actions may be identified and/or performed in order to reduce an impact of the poisoned data.
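
The abstract does not specify the anomaly statistic or the optimization process, so the following is only a minimal illustrative sketch under stated assumptions, not the patented method. It assumes a z-score against a trusted baseline as the "degree of anomalousness", a per-source trust value between 0 and 1 as the "data source characteristic", and a simple greedy search as a stand-in for the optimization process. All names (`Reading`, `identify_poisoned`, `source_trust`, `threshold`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Reading:
    source: str   # hypothetical data source identifier
    value: float  # collected data point


def z_score(value, baseline):
    """Assumed anomaly measure: distance from the baseline mean in standard deviations."""
    return abs(value - mean(baseline)) / stdev(baseline)


def identify_poisoned(readings, baseline, source_trust, threshold=3.0):
    """Greedy stand-in for the optimization process (an assumption).

    Readings are set aside in order of suspicion until the mean of the
    remaining batch is no longer anomalous relative to the baseline.
    """
    def suspicion(r):
        # Suspicion grows with anomalousness and shrinks with source trust.
        # Unknown sources get a neutral trust of 0.5 (an arbitrary choice).
        return z_score(r.value, baseline) * (1.0 - source_trust.get(r.source, 0.5))

    remaining = sorted(readings, key=suspicion)  # most suspicious last
    poisoned = []
    while remaining and z_score(mean(r.value for r in remaining), baseline) > threshold:
        poisoned.append(remaining.pop())
    return poisoned, remaining


# Usage with made-up values: source "C" is low-trust and far from the baseline.
baseline = [10.0, 10.5, 9.5, 10.2, 9.8]
readings = [Reading("A", 10.1), Reading("B", 9.9), Reading("C", 25.0), Reading("C", 30.0)]
trust = {"A": 0.9, "B": 0.9, "C": 0.2}
poisoned, clean = identify_poisoned(readings, baseline, trust)
```

In this sketch the remedial action would simply be to withhold the `poisoned` readings from downstream consumers; the abstract leaves the actual remedial actions open.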

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.