Data normalization using data edge platform
US11157510B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 28, 2019 |
| Grant date | Oct 26, 2021 |
| Priority date | — |
| Expiry date | Apr 17, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/221
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Disclosed are systems and methods for processing and storing data files using a data edge file format. The data edge file separates information about which symbols are in a data file from information about the location of those symbols in the data file. The described technique converts a source file comprising symbols into a data edge index having a manifest portion, a symbol portion, and a locality portion. The symbol portion contains a sorted unique set of the symbols from the source file, and the locality portion contains a plurality of location values referencing the symbol portion. The technique includes normalizing the structured data from the source file by modifying the manifest portion of the data edge file to include a description of at least one nonexistent column, with an empty locality value at a respective position within the locality file representing an omission of data at the associated position in the source file.
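The split described in the abstract can be illustrated with a small sketch. This is not the patented implementation. It is a toy model under stated assumptions: the source is a list of row dictionaries with string values, the manifest is a plain dictionary, and the names `build_index`, `reconstruct`, and the `EMPTY` sentinel are hypothetical. Using a sentinel as the "empty locality value" is one possible reading of the abstract.

```python
# Toy model of a manifest / symbol / locality split.
# All names here are hypothetical and chosen for illustration only.

EMPTY = -1  # assumed sentinel standing in for an "empty locality value"


def build_index(rows, columns):
    """Convert rows (list of dicts) into (manifest, symbols, locality)."""
    # Symbol portion: sorted unique set of every value present in the source.
    symbols = sorted({row[c] for row in rows for c in columns if c in row})
    position = {symbol: i for i, symbol in enumerate(symbols)}

    # Locality portion: one entry per (row, column) cell, in row-major order.
    # A cell missing from the source gets EMPTY instead of a symbol position.
    locality = [
        position[row[c]] if c in row else EMPTY
        for row in rows
        for c in columns
    ]

    # Manifest portion: describes every column, including ones that some
    # rows never supplied, so all rows share one normalized shape.
    manifest = {"columns": list(columns), "rows": len(rows)}
    return manifest, symbols, locality


def reconstruct(manifest, symbols, locality):
    """Rebuild the original rows, skipping cells marked EMPTY."""
    columns = manifest["columns"]
    width = len(columns)
    rows = []
    for r in range(manifest["rows"]):
        row = {}
        for j, column in enumerate(columns):
            loc = locality[r * width + j]
            if loc != EMPTY:
                row[column] = symbols[loc]
        rows.append(row)
    return rows


source = [{"name": "ada", "city": "paris"}, {"name": "bob"}]
manifest, symbols, locality = build_index(source, ["name", "city"])
print(symbols)   # ['ada', 'bob', 'paris']
print(locality)  # [0, 2, 1, -1]
```

In this sketch the second row has no `city` value, so its cell is recorded as `EMPTY` in the locality list while the manifest still lists the `city` column. Reconstructing from the three portions returns the original rows, with the omission preserved.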
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.