ParaMed: a parallel corpus for English–Chinese translation in the biomedical domain

Bibliographic Details
Published in: BMC medical informatics and decision making, Vol. 21, no. 1, article 258 (11 pages)
Main Authors: Liu, Boxiang; Huang, Liang
Format: Journal Article
Language: English
Published: England, BioMed Central (BMC), 06.09.2021
ISSN: 1472-6947
DOI: 10.1186/s12911-021-01621-8

Abstract Biomedical language translation requires multi-lingual fluency as well as relevant domain knowledge. Such requirements make it challenging to train qualified translators and costly to generate high-quality translations. Machine translation represents an effective alternative, but accurate machine translation requires large amounts of in-domain data. While such datasets are abundant in general domains, they are less accessible in the biomedical domain. Chinese and English are two of the most widely spoken languages, yet to our knowledge, a parallel corpus does not exist for this language pair in the biomedical domain. We developed an effective pipeline to acquire and process an English-Chinese parallel corpus from the New England Journal of Medicine (NEJM). This corpus consists of about 100,000 sentence pairs and 3,000,000 tokens on each side. We showed that training on out-of-domain data and fine-tuning with as few as 4000 NEJM sentence pairs improve translation quality by 25.3 (13.4) BLEU for en→zh (zh→en) directions. Translation quality continues to improve at a slower pace on larger in-domain data subsets, with a total increase of 33.0 (24.3) BLEU for en→zh (zh→en) directions on the full dataset. The code and data are available at https://github.com/boxiangliu/ParaMed.
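For readers unfamiliar with the metric behind the reported gains, BLEU combines clipped n-gram precision (for n = 1 to 4) with a brevity penalty. The following is a minimal sketch of corpus-level BLEU, assuming one reference per hypothesis, pre-tokenized input, and no smoothing. It is not the authors' evaluation script, and the names `ngrams` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (single reference, no smoothing)."""
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Counter intersection keeps the minimum count, i.e. clipping.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize hypotheses that are shorter than the references overall.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyp = [["a", "b", "c", "d", "e"]]
    ref = [["a", "b", "c", "d", "f"]]
    print(round(corpus_bleu(hyp, ref), 2))
```

Scores depend heavily on tokenization, especially for Chinese, so numbers from a sketch like this are comparable only when both systems are scored with the same tokenizer and settings.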
ArticleNumber 258
Author Liu, Boxiang
Huang, Liang
BackLink https://www.ncbi.nlm.nih.gov/pubmed/34488734 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_3390_electronics13071381
crossref_primary_10_1093_llc_fqac089
crossref_primary_10_1145_3626095
crossref_primary_10_3390_jpm14090923
crossref_primary_10_3390_app12126002
crossref_primary_10_3390_app14167088
crossref_primary_10_1016_j_csl_2023_101582
crossref_primary_10_2478_amns_2025_0565
ContentType Journal Article
Copyright 2021. The Author(s).
2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
The Author(s) 2021
DOI 10.1186/s12911-021-01621-8
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
EISSN 1472-6947
EndPage 11
ExternalDocumentID oai_doaj_org_article_ea72f7a4d387413d988d9c3bbe1a9133
PMC8422666
34488734
10_1186_s12911_021_01621_8
Genre Journal Article
GeographicLocations China
ISSN 1472-6947
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Text mining
Machine translation
Natural language processing
Language English
License 2021. The Author(s).
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
ORCID 0000-0002-2595-4463
OpenAccessLink https://www.proquest.com/docview/2574477824?pq-origsite=%requestingapplication%&accountid=15518
PMID 34488734
PQID 2574477824
PQPubID 42572
PageCount 11
PublicationCentury 2000
PublicationDate 2021-09-06
PublicationDateYYYYMMDD 2021-09-06
PublicationDate_xml – month: 09
  year: 2021
  text: 2021-09-06
  day: 06
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: London
PublicationTitle BMC medical informatics and decision making
PublicationTitleAlternate BMC Med Inform Decis Mak
PublicationYear 2021
Publisher BioMed Central
BMC
Publisher_xml – name: BioMed Central
– name: BMC
StartPage 258
SubjectTerms Algorithms
Bilingualism
China
Clinical trials
Datasets
Domains
Editorials
English language
Health informatics
Humans
Interpreters
Language
Language translation
Machine translation
Natural Language Processing
Text mining
Translating
Translation
Translations
Translators
Websites
Title ParaMed: a parallel corpus for English–Chinese translation in the biomedical domain
URI https://www.ncbi.nlm.nih.gov/pubmed/34488734
https://www.proquest.com/docview/2574477824
https://www.proquest.com/docview/2570111711
https://pubmed.ncbi.nlm.nih.gov/PMC8422666
https://doaj.org/article/ea72f7a4d387413d988d9c3bbe1a9133
Volume 21