System and method for identifying and categorizing messages extracted from archived message stores
US7577656B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 24, 2006 |
| Grant date | Aug 18, 2009 |
| Priority date | — |
| Expiry date | Feb 26, 2027 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99948
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for identifying messages in a message store is provided. At least part of metadata associated with and at least part of content contained in each of a plurality of messages in a message store are encoded by generating a metadata sequence and a content sequence for each message. The messages are grouped into sets by similar metadata sequences and similar content sequences. The messages in each set are compared. Each such message not matching any other such message in the set is marked as a unique message. Each such message matching at least one other such message in the set is marked as an exact duplicate message. Each such message including a subset of at least one other such message in the set is marked as a near duplicate message.
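The flow described in the abstract (encode, group, compare, mark) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything here is an assumption made for illustration: the `Message` fields, the use of SHA-256 digests as the "sequences", grouping by a normalized subject line as a stand-in for "similar metadata sequences", and reading "subset" as one normalized body being wholly contained in another. Where a message qualifies as both, the sketch lets "exact duplicate" take precedence over "near duplicate".

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical message shape; the abstract does not name specific fields.
    msg_id: str
    subject: str
    body: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences vanish."""
    return " ".join(text.lower().split())


def metadata_sequence(msg: Message) -> str:
    """Assumed metadata encoding: digest of the subject without reply/forward prefixes."""
    subject = normalize(msg.subject)
    while subject.startswith(("re:", "fw:", "fwd:")):
        subject = subject.split(":", 1)[1].strip()
    return hashlib.sha256(subject.encode("utf-8")).hexdigest()


def content_sequence(msg: Message) -> str:
    """Assumed content encoding: digest of the normalized body."""
    return hashlib.sha256(normalize(msg.body).encode("utf-8")).hexdigest()


def categorize(messages):
    """Return {msg_id: 'unique' | 'exact duplicate' | 'near duplicate'}."""
    # Group messages into sets by their metadata sequence.
    groups = defaultdict(list)
    for msg in messages:
        groups[metadata_sequence(msg)].append(msg)

    labels = {}
    for group in groups.values():
        bodies = {m.msg_id: normalize(m.body) for m in group}
        digests = {m.msg_id: content_sequence(m) for m in group}
        for m in group:
            others = [o for o in group if o.msg_id != m.msg_id]
            if any(digests[o.msg_id] == digests[m.msg_id] for o in others):
                # Same content sequence as another message in the set.
                labels[m.msg_id] = "exact duplicate"
            elif any(bodies[m.msg_id] in bodies[o.msg_id] for o in others):
                # Body is wholly contained in another message's body.
                labels[m.msg_id] = "near duplicate"
            else:
                labels[m.msg_id] = "unique"
    return labels
```

In this sketch, a reply that quotes an earlier message in full stays "unique" because it is the superset, while the shorter quoted message is the one marked as a duplicate. A production system would likely need a fuzzier notion of similarity than exact digest equality and substring containment.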
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.