Patent · US Active

Global, model-agnostic machine learning explanation technique for textual data

US11720751B2 · kind B2 · utility

Cited by	1
References	0
Claims	25
Family size	0

Assignee

Oracle International Corporation · US

Inventors

Zahra Zohrevand · Vancouver, CA
Tayler Hetherington · Vancouver, CA
Karoon Rashedi Nia · Vancouver, CA
Yasha Pushak · Vancouver, CA
Sanjay Jinturkar · Santa Clara, US
Nipun Agarwal · Santa Clara, US

Key dates

Filing date	Jan 11, 2021
Grant date	Aug 8, 2023
Priority date	—
Expiry date	Oct 9, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/20
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A model-agnostic global explainer for textual data processing (NLP) machine learning (ML) models, “NLP-MLX”, is described herein. NLP-MLX explains global behavior of arbitrary NLP ML models by identifying globally-important tokens within a textual dataset containing text data. NLP-MLX accommodates any arbitrary combination of training dataset pre-processing operations used by the NLP ML model. NLP-MLX includes four main stages. A Text Analysis stage converts text in documents of a target dataset into tokens. A Token Extraction stage uses pre-processing techniques to efficiently pre-filter the complete list of tokens into a smaller set of candidate important tokens. A Perturbation Generation stage perturbs tokens within documents of the dataset to help evaluate the effect of different tokens, and combinations of tokens, on the model's predictions. Finally, a Token Evaluation stage uses the ML model and perturbed documents to evaluate the impact of each candidate token relative to predictions for the original documents.
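The four stages named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every function name here (`tokenize`, `candidate_tokens`, `remove_token`, `token_importance`, `toy_model`) is hypothetical, the document-frequency pre-filter and the remove-the-token perturbation are assumptions chosen for simplicity, and only single tokens are evaluated, not the combinations of tokens the abstract also mentions. The model is treated as a black box that maps a list of documents to one numeric score per document.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A black-box model: list of documents in, one numeric prediction per document out.
Model = Callable[[Sequence[str]], List[float]]


def tokenize(doc: str) -> List[str]:
    """Text Analysis (assumed): lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", doc.lower())


def candidate_tokens(docs: Sequence[str], top_k: int) -> List[str]:
    """Token Extraction (assumed): keep the top_k tokens by document frequency."""
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    # Sort by frequency (descending), then alphabetically so ties are deterministic.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, _ in ranked[:top_k]]


def remove_token(doc: str, token: str) -> str:
    """Perturbation Generation (assumed): drop every occurrence of the token."""
    return " ".join(t for t in tokenize(doc) if t != token)


def token_importance(model: Model, docs: Sequence[str], top_k: int = 10) -> Dict[str, float]:
    """Token Evaluation (assumed): mean absolute change in prediction per token."""
    baseline = model(docs)  # predictions for the original documents
    scores: Dict[str, float] = {}
    for token in candidate_tokens(docs, top_k):
        perturbed = [remove_token(doc, token) for doc in docs]
        preds = model(perturbed)
        scores[token] = sum(abs(b - p) for b, p in zip(baseline, preds)) / len(docs)
    # Most influential tokens first; the sort is stable, so ties keep candidate order.
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))


def toy_model(docs: Sequence[str]) -> List[float]:
    """Hypothetical stand-in classifier: scores 1.0 when 'refund' appears."""
    return [1.0 if "refund" in tokenize(doc) else 0.0 for doc in docs]


docs = [
    "Please refund my order",
    "The order arrived today",
    "Refund requested for late order",
    "Great service today",
]
scores = token_importance(toy_model, docs, top_k=5)
for token, score in scores.items():
    print(f"{token}: {score:.2f}")
```

Because the explainer only calls `model(docs)`, any classifier that accepts raw text can be substituted for `toy_model` without changing the other stages, which is the sense in which such an approach is model-agnostic. Note that this sketch normalizes case and punctuation when it rebuilds a perturbed document, so a model sensitive to those features would need a perturbation step that preserves the original surface text.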

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.