Concept-match medical data scrubbing. How pathology text can be used in research

Bibliographic Details
Published in Archives of pathology & laboratory medicine (1976) Vol. 127; no. 6; p. 680
Main Author Berman, Jules J
Format Journal Article
Language English
Published United States College of American Pathologists 01.06.2003
ISSN 0003-9985
EISSN 1543-2165
DOI 10.5858/2003-127-680-CMDS

Abstract
Context: In the normal course of activity, pathologists create and archive immense data sets of scientifically valuable information. Researchers need pathology-based data sets, annotated with clinical information and linked to archived tissues, to discover and validate new diagnostic tests and therapies. Pathology records can be used for research purposes (without obtaining informed patient consent for each use of each record), provided the data are rendered harmless. Large data sets can be made harmless through 3 computational steps: (1) deidentification, the removal or modification of data fields that can be used to identify a patient (name, social security number, etc); (2) rendering the data ambiguous, ensuring that every data record in a public data set has a nonunique set of characterizing data; and (3) data scrubbing, the removal or transformation of words in free text that can be used to identify persons or that contain information that is incriminating or otherwise private. This article addresses the problem of data scrubbing.
Objective: To design and implement a general algorithm that scrubs pathology free text, removing all identifying or private information.
Methods: The Concept-Match algorithm steps through confidential text. When a medical term matching a standard nomenclature term is encountered, the term is replaced by a nomenclature code and a synonym for the original term. When a high-frequency "stop" word, such as a, an, the, or for, is encountered, it is left in place. When any other word is encountered, it is blocked and replaced by asterisks. This produces a scrubbed text. An open-source implementation of the algorithm is freely available.
Results: The Concept-Match scrub method transformed pathology free text into scrubbed output that preserved the sense of the original sentences, while it blocked terms that did not match terms found in the Unified Medical Language System (UMLS). The scrubbed product is safe, in the restricted sense that the output retains only standard medical terms. The software implementation scrubbed more than half a million surgical pathology report phrases in less than an hour.
Conclusions: Computerized scrubbing can render the textual portion of a pathology report harmless for research purposes. Scrubbing and deidentification methods allow pathologists to create and use large pathology databases to conduct medical research.
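The three rules in the Methods (nomenclature term, stop word, everything else) can be illustrated with a minimal Python sketch. This is not the author's open-source implementation. The tiny nomenclature, the placeholder codes (X0001 and so on), the stop-word list, the longest-match-first lookup, and the "synonym then code" output order are all assumptions made for illustration; a real run would load terms and codes from a nomenclature such as UMLS.

```python
import re

# Toy nomenclature: term -> (placeholder code, preferred synonym).
# The codes are invented for this sketch; they are NOT real UMLS identifiers.
NOMENCLATURE = {
    "renal cell carcinoma": ("X0001", "kidney carcinoma"),
    "hypernephroma": ("X0001", "kidney carcinoma"),
    "kidney": ("X0002", "kidney"),
    "biopsy": ("X0003", "biopsy"),
}

# Small illustrative stop-word list (the abstract names a, an, the, for).
STOP_WORDS = {"a", "an", "the", "for", "of", "and", "in", "with", "on"}

# Longest term, in words, that the matcher needs to try.
MAX_TERM_WORDS = max(len(term.split()) for term in NOMENCLATURE)

BLOCK = "***"  # fixed-width mask, so the length of a blocked word is hidden


def concept_match_scrub(text):
    """Keep nomenclature terms and stop words; block every other word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    out = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_TERM_WORDS, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in NOMENCLATURE:
                code, synonym = NOMENCLATURE[phrase]
                out.append(f"{synonym} {code}")
                i += n
                break
        else:
            # No nomenclature match starts at this word.
            word = words[i]
            out.append(word if word in STOP_WORDS else BLOCK)
            i += 1
    return " ".join(out)


print(concept_match_scrub(
    "Mr. Smith had a biopsy of the kidney showing hypernephroma."))
# prints: *** *** *** a biopsy X0003 of the kidney X0002 *** kidney carcinoma X0001
```

Because the output is assembled only from nomenclature entries, stop words, and the mask, a name such as "Smith" cannot survive unless it also happens to be a nomenclature term or stop word, which is the restricted sense of "safe" described in the Results.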
Author Affiliation Pathology Informatics, Cancer Diagnosis Program, National Cancer Institute, National Institutes of Health, Rockville, Md 20892, USA
Author E-mail bermanj@mail.nih.gov
PubMed Record https://www.ncbi.nlm.nih.gov/pubmed/12741890
CODEN APLMAS
Copyright College of American Pathologists Jun 2003
Discipline Medicine
Peer Reviewed true
Scholarly true
PMID 12741890
Publication Place United States (Northfield)
Journal Abbreviation Arch Pathol Lab Med
Subject Terms
Algorithms
Archives & records
Computing Methodologies
Database Management Systems - classification
Database Management Systems - instrumentation
Database Management Systems - supply & distribution
Databases, Factual - classification
Databases, Factual - supply & distribution
Datasets
Diagnostic tests
Humans
Kidney cancer
Medical Record Linkage - instrumentation
Medical Record Linkage - methods
Medical Records Systems, Computerized - classification
Medical Records Systems, Computerized - instrumentation
Medical Records Systems, Computerized - supply & distribution
Medical Records, Problem-Oriented
Medical research
Pathology
Pathology, Clinical - organization & administration
Patients
Subject Headings
Unified Medical Language System - classification
Unified Medical Language System - instrumentation
Unified Medical Language System - supply & distribution