Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis

Bibliographic Details
Published in 2014 IEEE 28th International Parallel and Distributed Processing Symposium Vol. 2014; pp. 1063-1072
Main Authors Teodoro, George; Kurc, Tahsin; Kong, Jun; Cooper, Lee; Saltz, Joel
Format Conference Proceeding Journal Article
Language English
Published United States IEEE 01.05.2014
ISBN 1479937991
9781479937998
ISSN 1530-2075
1045-9219
1558-2183
DOI 10.1109/IPDPS.2014.111

Abstract We study and characterize the performance of operations in an important class of applications on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high-resolution sensors, such as image datasets obtained from whole slide tissue specimens using microscopy scanners. Common operations in these applications involve the detection and extraction of objects (object segmentation), the computation of features of each extracted object (feature computation), and the characterization of objects based on these features (object classification). In this work, we identify the data access and computation patterns of operations in the object segmentation and feature computation categories. We systematically implement and evaluate the performance of these operations on modern CPUs, GPUs, and MIC systems for a microscopy image analysis application. Our results show that, for operations that perform regular data access, performance on a MIC is comparable to, and sometimes better than, that on a GPU. On the other hand, GPUs are significantly more efficient than MICs for operations that access data irregularly, a result of the low performance of MICs on random data access. We have also examined the coordinated use of MICs and CPUs. Our experiments show that using a performance-aware task strategy for scheduling application operations improves performance by about 1.29× over a first-come-first-served strategy. This allows applications to obtain high performance efficiency on CPU-MIC systems: the example application attained an efficiency of 84% on 192 nodes (3072 CPU cores and 192 MICs).
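Note: the abstract contrasts a performance-aware scheduling strategy with first-come-first-served (FCFS) but does not describe how either works. The following is a minimal sketch of one way such a comparison could be modeled, not the authors' implementation. It assumes one CPU and one accelerator, a hypothetical `simulate` function, and made-up per-task costs; the "aware" policy assumed here lets the accelerator take the task with the highest estimated speedup and the CPU the task with the lowest.

```python
from collections import deque


def simulate(tasks, policy):
    """Return the makespan of running `tasks` on one CPU and one accelerator.

    tasks:  list of (cpu_time, accel_time) pairs (hypothetical costs).
    policy: "fcfs" (arrival order) or "aware" (speedup-ordered, an assumption).
    """
    if policy == "aware":
        # Sort by estimated speedup on the accelerator, lowest first.
        queue = deque(sorted(tasks, key=lambda t: t[0] / t[1]))
    else:
        queue = deque(tasks)

    free_at = {"cpu": 0.0, "accel": 0.0}  # time at which each device is idle
    while queue:
        # The device that becomes idle first requests work
        # (ties go to the accelerator).
        device = min(("accel", "cpu"), key=lambda d: free_at[d])
        if policy == "aware" and device == "accel":
            cpu_time, accel_time = queue.pop()      # highest speedup
        else:
            cpu_time, accel_time = queue.popleft()  # lowest speedup / next in line
        free_at[device] += accel_time if device == "accel" else cpu_time
    return max(free_at.values())


# Toy workload: "irregular" tasks run slower on the accelerator (2 vs 4),
# "regular" tasks run much faster on it (8 vs 1). Values are illustrative only.
tasks = [(2, 4), (8, 1), (2, 4), (8, 1)]
fcfs_makespan = simulate(tasks, "fcfs")
aware_makespan = simulate(tasks, "aware")
print(fcfs_makespan, aware_makespan, fcfs_makespan / aware_makespan)
```

The gain on this toy workload comes entirely from the invented task costs and bears no relation to the 1.29× figure reported in the abstract; the sketch only shows why matching operations to the device that suits their data access pattern can shorten the overall run.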
AuthorAffiliation 1 Department of Computer Science, University of Brasília, Brasília, DF, Brazil
2 Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
3 Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
4 Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
Author_xml – sequence: 1
  givenname: George
  surname: Teodoro
  fullname: Teodoro, George
  email: teodoro@unb.br
  organization: Dept. of Comput. Sci., Univ. of Brasilia, Brasilia, Brazil
– sequence: 2
  givenname: Tahsin
  surname: Kurc
  fullname: Kurc, Tahsin
  email: tkurc@emory.edu
  organization: Dept. of Biomed. Inf., Stony Brook Univ., Stony Brook, NY, USA
– sequence: 3
  surname: Jun Kong
  fullname: Jun Kong
  email: jun.kong@emory.edu
  organization: Dept. of Biomed. Inf., Emory Univ., Atlanta, GA, USA
– sequence: 4
  givenname: Lee
  surname: Cooper
  fullname: Cooper, Lee
  email: lee.cooper@emory.edu
  organization: Dept. of Biomed. Inf., Emory Univ., Atlanta, GA, USA
– sequence: 5
  givenname: Joel
  surname: Saltz
  fullname: Saltz, Joel
  email: jhsaltz@emory.edu
  organization: Dept. of Biomed. Inf., Stony Brook Univ., Stony Brook, NY, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/25419088 (View this record in MEDLINE/PubMed)
CODEN IEEPAD
ContentType Conference Proceeding
Journal Article
Discipline Computer Science
Engineering
EISBN 9781479938001
1479938009
EISSN 1558-2183
EndPage 1072
ExternalDocumentID oai:pubmedcentral.nih.gov:4240026
PMC4240026
25419088
6877335
Genre orig-research
Journal Article
GrantInformation_xml – fundername: NIBIB NIH HHS
  grantid: P20 EB000591
– fundername: NLM NIH HHS
  grantid: R01 LM011119
– fundername: NLM NIH HHS
  grantid: R01 LM009239
– fundername: NCI NIH HHS
  grantid: U54 CA113001
– fundername: NHLBI NIH HHS
  grantid: R24 HL085343
– fundername: NIMHD NIH HHS
  grantid: RC4 MD005964
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
OpenAccessLink https://proxy.k.utb.cz/login?url=http://doi.org/10.1109/IPDPS.2014.111
PMID 25419088
PQID 1826605272
PQPubID 23479
PageCount 10
PublicationCentury 2000
PublicationDate 20140501
PublicationDateYYYYMMDD 2014-05-01
PublicationDate_xml – month: 5
  year: 2014
  text: 20140501
  day: 1
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle 2014 IEEE 28th International Parallel and Distributed Processing Symposium
PublicationTitleAbbrev IPDPS
PublicationTitleAlternate IEEE Trans Parallel Distrib Syst
PublicationYear 2014
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1063
SubjectTerms Graphics processing units
Image analysis
Image segmentation
Instruction sets
Microscopy
Microwave integrated circuits
Vegetation
Title Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis
URI https://ieeexplore.ieee.org/document/6877335
https://www.ncbi.nlm.nih.gov/pubmed/25419088
https://www.proquest.com/docview/1826605272
https://pubmed.ncbi.nlm.nih.gov/PMC4240026
http://doi.org/10.1109/IPDPS.2014.111
UnpaywallVersion submittedVersion
Volume 2014