Patent · US Active

Clustering and standardizing structured data records generated using an extraction neural network

US12293842B1 · kind B1 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 22, 2024
Grant date: May 6, 2025
Priority date
Expiry date: Aug 22, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G16H10/20
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for normalizing a collection of structured data records generated by an extraction neural network to enable large-scale data processing and analysis for query answering. According to one aspect, there is provided a method that includes extracting text strings from the structured data records, embedding the text strings in a latent space, performing an iterative numerical clustering operation in the latent space to cluster the embeddings of the text strings, and then identifying standardized text strings based on a result of the clustering. The structured data records are normalized using the standardized text strings. The normalized collection of structured data records is then used to generate a response to a user query.
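
The pipeline described in the abstract (extract strings, embed them, cluster iteratively, pick a standardized string per cluster, rewrite the records) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: character-trigram count vectors stand in for a learned latent-space embedding, k-means (Lloyd's algorithm) stands in for the unspecified "iterative numerical clustering operation", the most frequent member of each cluster is used as the standardized string, and the function names, the `diagnosis` field, and the sample records are hypothetical.

```python
import math
from collections import Counter

def trigrams(text):
    """Character trigrams of the lowercased, space-padded string."""
    padded = "  " + text.lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def embed_all(strings):
    """Assumed stand-in for a learned embedding: L2-normalized trigram counts."""
    vocab = {}
    for s in strings:
        for gram in trigrams(s):
            vocab.setdefault(gram, len(vocab))
    vectors = []
    for s in strings:
        vec = [0.0] * len(vocab)
        for gram in trigrams(s):
            vec[vocab[gram]] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def normalize_records(records, field, k):
    """Replace each record's `field` value with its cluster's standardized string."""
    counts = Counter(r[field] for r in records)   # extract the text strings
    strings = list(counts)                        # distinct strings, first-seen order
    labels = kmeans(embed_all(strings), k)
    mapping = {}
    for j in range(k):
        members = [s for s, lab in zip(strings, labels) if lab == j]
        if members:
            # Assumption: the most frequent member serves as the standardized string.
            canonical = max(members, key=lambda s: counts[s])
            for s in members:
                mapping[s] = canonical
    return [{**r, field: mapping[r[field]]} for r in records]

# Hypothetical sample records, invented for this sketch.
records = [
    {"id": 1, "diagnosis": "type 2 diabetes"},
    {"id": 2, "diagnosis": "Type 2 Diabetes"},
    {"id": 3, "diagnosis": "hypertension"},
    {"id": 4, "diagnosis": "type 2 diabetis"},
    {"id": 5, "diagnosis": "type 2 diabetes"},
    {"id": 6, "diagnosis": "Hypertension"},
    {"id": 7, "diagnosis": "hypertension"},
]
normalized = normalize_records(records, "diagnosis", k=2)
```

On this toy input the three spellings of the diabetes string collapse to one value and the two spellings of the hypertension string collapse to another, so a downstream query such as a count per diagnosis sees two groups rather than five. In practice the number of clusters would not be known in advance, and a real system would likely choose it from the data or use a clustering method that does not require it.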

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.