Method of sorting text and string searching
US7734671B1 · kind B1 · utility
Assignee
Inventor
Key dates
| Filing date | Oct 9, 2007 |
| Grant date | Jun 8, 2010 |
| Priority date | — |
| Expiry date | Oct 27, 2028 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99943
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method of sorting text for memory efficient searching is disclosed. A FM-index is created on received text, and a number of rows are marked. The locations of the marked rows are stored in data buckets as well as the last column of the FM-index, which is stored as a wavelet tree. Data blocks containing the data buckets are created; containing the number of times each character appears in the data block before each data bucket. A header block is created comprising an array of the number of times each character appears in the last column of the FM-index before each data blocks, the location of the end of the data blocks and the location of the end of the data, and appended to the data block. The header and data blocks are stored. The search process loads data buckets into memory as needed to find the required text.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.