System and method for extracting citations from documents and constructing enriched citation databases
US12169516B1 · kind B1 · utility
Inventors
Key dates
| Filing date | Sep 26, 2023 |
| Grant date | Dec 17, 2024 |
| Priority date | — |
| Expiry date | Sep 26, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/335
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for identifying citations from documents and constructing an enriched citation database may include obtaining a document comprising text, constructing pre-processing filters and citation filters, and converting short citations and immediately preceding citations into full citations. The pre-processing filters may include a first set of regular expressions matching non-citation text patterns; applying the pre-processing filters to the document removes text patterns that match at least one of the first set of regular expressions and generates a pre-processed document. The citation filters may include a second set of regular expressions matching citation text patterns and context text patterns; applying the citation filters to the pre-processed document may identify one or more citations and corresponding contexts that match at least one of the second set of regular expressions. The one or more citations and corresponding contexts may then be stored in a citation database.
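The two-stage filtering described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the regular expressions (a U.S. Reports style of legal citation, page footers, star-page markers), the reading of "immediately preceding citations" as `Id.` references, the choice of the text between two citations as the "context", the SQLite table layout, and all function and variable names.

```python
import re
import sqlite3

# Hypothetical first set of regexes: non-citation text to strip.
PRE_FILTERS = [
    re.compile(r"^Page \d+ of \d+\n?", re.MULTILINE),  # page footers
    re.compile(r"\[\*\d+\]\s?"),                        # star-page markers
]

# Hypothetical second set of regexes: full, short, and "Id." citations.
FULL = r"(?P<full>(?P<name>[A-Z]\w+ v\. [A-Z]\w+), \d+ U\.S\. \d+ \(\d{4}\))"
SHORT = r"(?P<short>(?P<party>[A-Z]\w+), \d+ U\.S\. at \d+)"
ID = r"(?P<id>Id\.(?: at \d+)?)"
CITATION_FILTER = re.compile(f"{FULL}|{SHORT}|{ID}")

SAMPLE = (
    "Page 1 of 2\n"
    "Suspects must be warned before questioning. "
    "Miranda v. Arizona, 384 U.S. 436 (1966). "
    "The warning covers the right to silence. Id. at 444. "
    "[*3] Waiver must be knowing. Miranda, 384 U.S. at 475."
)


def preprocess(text):
    """Remove every span that matches a pre-processing filter."""
    for pattern in PRE_FILTERS:
        text = pattern.sub("", text)
    return text


def extract(text):
    """Return (raw citation, resolved full citation, context) tuples."""
    rows = []
    known = {}        # first party name -> full citation seen so far
    last_full = None  # most recent resolved citation, used for "Id."
    prev_end = 0
    for m in CITATION_FILTER.finditer(text):
        # Assumption: the context is the text since the previous citation.
        context = text[prev_end:m.start()].strip(" .;\n")
        if m["full"]:
            full = m["full"]
            known[m["name"].split(" v. ")[0]] = full
        elif m["short"]:
            full = known.get(m["party"])
        else:
            full = last_full
        full = full or m.group(0)  # unresolved: keep the raw text
        rows.append((m.group(0), full, context))
        last_full = full
        prev_end = m.end()
    return rows


def store(rows):
    """Write the extracted rows to an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE citations (raw TEXT, full_citation TEXT, context TEXT)"
    )
    db.executemany("INSERT INTO citations VALUES (?, ?, ?)", rows)
    db.commit()
    return db


if __name__ == "__main__":
    db = store(extract(preprocess(SAMPLE)))
    for row in db.execute("SELECT * FROM citations"):
        print(row)
```

In this sketch the short form and the `Id.` reference both resolve to the earlier full citation, and the pinpoint page is dropped for simplicity; a fuller treatment would presumably carry the pinpoint along with the resolved citation.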
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.