Patent · US Active

Self-learning based crawling and rule-based data mining for automatic information extraction

US10762437B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 22
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 22, 2016
Grant date: Sep 1, 2020
Priority date
Expiry date: May 28, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems are provided for automatic information extraction through self-learning crawling and rule-based data mining. The method determines whether a crawl policy exists within the input information and performs at least one of front-end crawling, assisted crawling, and recursive crawling. The downloaded data set is pre-processed to remove noisy data, then subjected to classification rules and decision-tree-based data mining to extract meaningful information. These crawling techniques yield smaller, relevant datasets pertaining to a specific domain from the multi-dimensional datasets available in online and offline sources.
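
Illustrative sketch (not taken from the patent text): the pipeline described in the abstract might be outlined in Python roughly as follows. Every name here (CrawlRequest, choose_crawl_mode, preprocess, RULES, classify, extract) is hypothetical. The mapping from crawl policy to crawl mode is an assumption, since the abstract does not state it, and an ordered list of regular-expression rules stands in for the classification rules; the decision-tree step is omitted.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlRequest:
    """Hypothetical container for the input information."""
    seed: str
    crawl_policy: Optional[dict] = None  # assumed shape, e.g. {"max_depth": 2}


def choose_crawl_mode(request: CrawlRequest) -> str:
    """Pick a crawl mode from the presence of a crawl policy.

    Assumption: the abstract only says the method checks whether a policy
    exists; which mode follows from that check is invented for illustration.
    """
    if request.crawl_policy is None:
        return "front-end"
    if request.crawl_policy.get("max_depth", 0) > 0:
        return "recursive"
    return "assisted"


def preprocess(records: list[str]) -> list[str]:
    """Remove noise: strip tag-like markup, collapse whitespace,
    and drop empty or case-insensitively duplicated records."""
    seen = set()
    cleaned = []
    for raw in records:
        text = re.sub(r"<[^>]+>", " ", raw)
        text = re.sub(r"\s+", " ", text).strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


# Hypothetical classification rules, checked in order; first match wins.
RULES = [
    ("price", re.compile(r"\$\s?\d+(?:\.\d{2})?")),
    ("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
]


def classify(text: str) -> str:
    """Return the label of the first matching rule, or "other"."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "other"


def extract(records: list[str]) -> list[tuple[str, str]]:
    """Pre-process downloaded records, then label each one."""
    return [(classify(text), text) for text in preprocess(records)]
```

Under these assumptions, extract() takes the raw strings downloaded by whichever crawl mode was chosen and returns (label, text) pairs, so the noisy or duplicated records never reach the rule stage.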

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.