SYSTEMS AND METHODS FOR IDENTIFYING TARGETED DATA
The present disclosure provides methods, systems, computing devices, computing entities, and/or the like for identifying and/or retrieving targeted data found in unstructured documents. In accordance with various aspects, a method is provided that comprises: receiving, a targeted data request identi...
Saved in:
Format | Patent |
---|---|
Language | English |
Published |
29.07.2022
|
Online Access | Get full text |
Cover
Summary: | The present disclosure provides methods, systems, computing devices, computing entities, and/or the like for identifying and/or retrieving targeted data found in unstructured documents. In accordance with various aspects, a method is provided that comprises: receiving, a targeted data request identifying a data subject; processing a first feature representation of each document of a plurality of documents using a classifier machine-learning model to generate a prediction that the document contains the targeted data; generating a dataset that comprises each document having a prediction that satisfy a threshold; processing a second feature representation of each document of the dataset using a clustering machine-learning model to identify a document cluster for the document; and providing the document clusters so that an analysis can be performed on each document cluster to eliminate the document cluster as having targeted data and/or identify the targeted data associated with the data subject found in the document cluster. |
---|