Method for segmenting communication transcripts using unsupervised and semi-supervised techniques
US7912714B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 1, 2008 |
| Grant date | Mar 22, 2011 |
| Priority date | — |
| Expiry date | Mar 4, 2029 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/355
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters within sequences of the collection.
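The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: Jaccard overlap of word sets as the lexical similarity measure, a small k-medoids routine as the partitional clustering method, separate clustering of caller and responder sentences, and counts of adjacent sentence types as the proximity measure. The function names (`jaccard`, `cluster_sentences`, `merge_types`, `segment`) and the sample transcripts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Lexical similarity of two token sets (assumed measure, not from the patent)."""
    return len(a & b) / len(a | b) if a | b else 1.0


def cluster_sentences(sentences, k, iters=10):
    """Small k-medoids partitional clustering; returns one cluster label per sentence."""
    if not sentences:
        return []
    toks = [frozenset(s.lower().split()) for s in sentences]
    n = len(toks)
    # Farthest-first seeding keeps the sketch deterministic.
    medoids = [0]
    while len(medoids) < min(k, n):
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(
            1 - jaccard(toks[i], toks[m]) for m in medoids)))
    labels = []
    for _ in range(iters):
        # Assign each sentence to its most similar medoid.
        labels = [max(range(len(medoids)),
                      key=lambda c: jaccard(toks[i], toks[medoids[c]]))
                  for i in range(n)]
        # Re-pick each medoid as the member most similar to its own cluster.
        new = []
        for c, m in enumerate(medoids):
            members = [i for i in range(n) if labels[i] == c]
            new.append(max(members, key=lambda i: sum(
                jaccard(toks[i], toks[j]) for j in members)) if members else m)
        if new == medoids:
            break
        medoids = new
    return labels


def merge_types(sequences, n_segments):
    """Successively merge the two groups of sentence types that are adjacent most often."""
    adj = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a != b:
                adj[frozenset((a, b))] += 1
    groups = [{t} for t in sorted({t for seq in sequences for t in seq})]
    while len(groups) > max(n_segments, 1):
        g, h = max(combinations(groups, 2), key=lambda p: sum(
            adj[frozenset((a, b))] for a in p[0] for b in p[1]))
        groups.remove(g)
        groups.remove(h)
        groups.append(g | h)
    return groups


def segment(transcripts, k_caller, k_responder, n_segments):
    """Run the pipeline on transcripts given as lists of (speaker, sentence) pairs."""
    sentence_type = {}
    for spk, k in (("caller", k_caller), ("responder", k_responder)):
        # Step 1: divide the corpus by speaker.
        sents = [s for t in transcripts for who, s in t if who == spk]
        # Steps 2 and 3: cluster, then give each cluster a distinct type such as "C0".
        for s, lab in zip(sents, cluster_sentences(sents, k)):
            sentence_type[(spk, s)] = f"{spk[0].upper()}{lab}"
    sequences = [[sentence_type[(who, s)] for who, s in t] for t in transcripts]
    # Step 4: merge sentence types into the requested number of segment clusters.
    return sequences, merge_types(sequences, n_segments)


transcripts = [
    [("responder", "hello how can i help you"),
     ("caller", "i want to rent a car"),
     ("responder", "which car do you want"),
     ("caller", "a compact car please"),
     ("responder", "thank you goodbye")],
    [("responder", "hello how may i help you"),
     ("caller", "i need to rent a car"),
     ("responder", "which car would you like"),
     ("caller", "a full size car please"),
     ("responder", "thank you goodbye")],
]
sequences, segments = segment(transcripts, k_caller=2, k_responder=3, n_segments=2)
print(sequences)
print(segments)
```

Each transcript comes back as a sequence of type labels, and `segments` holds the merged groups of sentence types. Adjacency counts are only one possible proximity measure; a distance based on relative position within the call could be substituted in `merge_types` without changing the rest of the sketch.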
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.