Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis

Bibliographic Details
Published in	2014 IEEE 28th International Parallel and Distributed Processing Symposium, Vol. 2014, pp. 1063-1072
Main Authors	Teodoro, George; Kurc, Tahsin; Kong, Jun; Cooper, Lee; Saltz, Joel
Format	Conference Proceeding; Journal Article
Language	English
Published	United States: IEEE, 01.05.2014
Subjects	Graphics processing units; Image analysis; Image segmentation; Instruction sets; Microscopy; Microwave integrated circuits; Vegetation
ISBN	1479937991 9781479937998
ISSN	1530-2075 1045-9219 1558-2183
DOI	10.1109/IPDPS.2014.111

Abstract	We study and characterize the performance of operations in an important class of applications on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high-resolution sensors, such as image datasets obtained from whole slide tissue specimens using microscopy scanners. Common operations in these applications involve the detection and extraction of objects (object segmentation), the computation of features of each extracted object (feature computation), and the characterization of objects based on these features (object classification). In this work, we identify the data access and computation patterns of operations in the object segmentation and feature computation categories. We systematically implement and evaluate the performance of these operations on modern CPUs, GPUs, and MIC systems for a microscopy image analysis application. Our results show that the performance on a MIC of operations that perform regular data access is comparable to or sometimes better than that on a GPU. On the other hand, GPUs are significantly more efficient than MICs for operations that access data irregularly, because MICs perform poorly on random data access. We have also examined the coordinated use of MICs and CPUs. Our experiments show that using a performance-aware task strategy for scheduling application operations improves performance by about 1.29× over a first-come-first-served strategy. This allows applications to obtain high performance efficiency on CPU-MIC systems: the example application attained an efficiency of 84% on 192 nodes (3072 CPU cores and 192 MICs).
AuthorAffiliation	1 Department of Computer Science, University of Brasília, Brasília, DF, Brazil; 2 Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA; 3 Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA; 4 Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
Author_xml	1. Teodoro, George (teodoro@unb.br), Dept. of Comput. Sci., Univ. of Brasilia, Brasilia, Brazil; 2. Kurc, Tahsin (tkurc@emory.edu), Dept. of Biomed. Inf., Stony Brook Univ., Stony Brook, NY, USA; 3. Jun Kong (jun.kong@emory.edu), Dept. of Biomed. Inf., Emory Univ., Atlanta, GA, USA; 4. Cooper, Lee (lee.cooper@emory.edu), Dept. of Biomed. Inf., Emory Univ., Atlanta, GA, USA; 5. Saltz, Joel (jhsaltz@emory.edu), Dept. of Biomed. Inf., Stony Brook Univ., Stony Brook, NY, USA
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/25419088 (View this record in MEDLINE/PubMed)
CODEN	IEEPAD
Discipline	Computer Science Engineering
EISBN	9781479938001 1479938009
EISSN	1558-2183
EndPage	1072
ExternalDocumentID	oai:pubmedcentral.nih.gov:4240026 PMC4240026 25419088 6877335
Genre	orig-research Journal Article
GrantInformation_xml	NIBIB NIH HHS, grant P20 EB000591; NLM NIH HHS, grant R01 LM011119; NLM NIH HHS, grant R01 LM009239; NCI NIH HHS, grant U54 CA113001; NHLBI NIH HHS, grant R24 HL085343; NIMHD NIH HHS, grant RC4 MD005964
IsDoiOpenAccess	false
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
OpenAccessLink	https://proxy.k.utb.cz/login?url=http://doi.org/10.1109/IPDPS.2014.111
PMID	25419088
PageCount	10
PublicationTitle	2014 IEEE 28th International Parallel and Distributed Processing Symposium
PublicationTitleAbbrev	IPDPS
PublicationTitleAlternate	IEEE Trans Parallel Distrib Syst
PublicationYear	2014
Publisher	IEEE
Publisher_xml	– name: IEEE
StartPage	1063
URI	https://ieeexplore.ieee.org/document/6877335 https://www.ncbi.nlm.nih.gov/pubmed/25419088 https://www.proquest.com/docview/1826605272 https://pubmed.ncbi.nlm.nih.gov/PMC4240026 http://doi.org/10.1109/IPDPS.2014.111