Patent · US Active

Intelligent signature-based anti-cloaking web recrawling

US11444977B2 · kind B2 · utility

2Cited by
3References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateOct 22, 2019
Grant dateSep 13, 2022
Priority date
Expiry dateJul 23, 2040

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06F2221/033
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Web sites are crawled using multiple browser profiles to avoid malicious cloaking. Based on web page content returned from HTTP requests using the multiple browser profiles, web sites returning substantively different content to HTTP requests for different browser profiles are identified. Web sites are further filtered by common cloaking behavior, and redirect scripts are extracted from web page content that performed cloaking. Signatures comprising tokenized versions of the redirect scripts are generated and compared to a database of known cloaking signatures. URLs corresponding to signatures having approximate matches with signatures in the database are flagged for recrawling. Recrawled URLs are verified for malicious cloaking again using HTTP requests from multiple browser profiles.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.