SYSTEMS AND METHODS FOR IDENTIFYING TARGETED DATA
| Format | Patent |
|---|---|
| Language | English |
| Published | 29.07.2022 |
| Summary | The present disclosure provides methods, systems, computing devices, computing entities, and/or the like for identifying and/or retrieving targeted data found in unstructured documents. In accordance with various aspects, a method is provided that comprises: receiving a targeted data request identifying a data subject; processing a first feature representation of each document of a plurality of documents using a classifier machine-learning model to generate a prediction that the document contains the targeted data; generating a dataset that comprises each document having a prediction that satisfies a threshold; processing a second feature representation of each document of the dataset using a clustering machine-learning model to identify a document cluster for the document; and providing the document clusters so that an analysis can be performed on each document cluster to eliminate the document cluster as having targeted data and/or identify the targeted data associated with the data subject found in the document cluster. |
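
The classify, filter, and cluster steps named in the summary can be illustrated with a minimal Python sketch. Nothing below comes from the patent itself: the keyword weights, the logistic scoring function standing in for the classifier machine-learning model, the greedy Jaccard grouping standing in for the clustering machine-learning model, and all function names and default thresholds are assumptions chosen only to keep the example self-contained. Matching a cluster to the data subject named in the request is left to the later analysis step and is not modeled here.

```python
import math
import re

# Hypothetical keyword weights standing in for a trained classifier.
# These values are illustrative assumptions, not taken from the patent.
WEIGHTS = {"ssn": 2.0, "email": 1.5, "address": 1.0}
BIAS = -1.0


def tokens(text):
    """Return the lowercase word tokens of a document."""
    return re.findall(r"[a-z0-9]+", text.lower())


def predict(text):
    """Stand-in classifier: logistic score over keyword counts
    (the 'first feature representation' in this sketch)."""
    toks = tokens(text)
    z = BIAS + sum(w * toks.count(term) for term, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, min_similarity=0.3):
    """Stand-in clustering: greedily group documents whose token sets
    (the 'second feature representation' here) resemble a cluster's first member."""
    clusters = []  # each cluster is a list of (document, token_set) pairs
    for doc in docs:
        feats = set(tokens(doc))
        for members in clusters:
            if jaccard(feats, members[0][1]) >= min_similarity:
                members.append((doc, feats))
                break
        else:
            clusters.append([(doc, feats)])
    return [[doc for doc, _ in members] for members in clusters]


def find_candidate_clusters(documents, threshold=0.5):
    """Keep documents whose prediction satisfies the threshold, then cluster them."""
    dataset = [doc for doc in documents if predict(doc) >= threshold]
    return cluster(dataset)
```

Calling `find_candidate_clusters` on a list of document strings returns a list of clusters, each a list of documents, which a reviewer could then inspect cluster by cluster to rule a cluster out or to locate the targeted data.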