Landscape of High-Performance Python to Develop Data Science and Machine Learning Applications

Bibliographic Details
Published in ACM Computing Surveys, Vol. 56, no. 3, pp. 1-30
Main Authors Castro, Oscar; Bruneau, Pierrick; Sottet, Jean-Sébastien; Torregrossa, Dario
Format Journal Article
Language English
Published New York, NY: ACM (Association for Computing Machinery), 31.03.2024
ISSN 0360-0300
1557-7341
DOI 10.1145/3617588

Abstract Python has become the prime language for application development in the data science and machine learning domains. However, data scientists are not necessarily experienced programmers. Although Python lets them implement their algorithms quickly, computational efficiency becomes an unavoidable concern when moving to scale. Harnessing high-performance devices such as multi-core processors and graphics processing units to their full potential is, however, generally not trivial. This narrative survey is intended as a reference document for such practitioners, helping them find their way through the wealth of tools and techniques available for the Python language. The document revolves around user scenarios, which are meant to cover most situations practitioners may face. We believe that it may also be of practical use to tool developers, who can use our work to identify potential gaps in existing tools and to motivate their contributions.
ArticleNumber 65
Author Sottet, Jean-Sébastien
Bruneau, Pierrick
Torregrossa, Dario
Castro, Oscar
Author_xml – sequence: 1
  givenname: Oscar
  orcidid: 0000-0003-4025-7903
  surname: Castro
  fullname: Castro, Oscar
  email: oscar.castro@list.lu
  organization: Luxembourg Institute of Science and Technology, Luxembourg
– sequence: 2
  givenname: Pierrick
  orcidid: 0000-0002-7725-512X
  surname: Bruneau
  fullname: Bruneau, Pierrick
  email: pierrick.bruneau@list.lu
  organization: Luxembourg Institute of Science and Technology, Luxembourg
– sequence: 3
  givenname: Jean-Sébastien
  orcidid: 0000-0002-3071-6371
  surname: Sottet
  fullname: Sottet, Jean-Sébastien
  email: jean-sebastien.sottet@list.lu
  organization: Luxembourg Institute of Science and Technology, Luxembourg
– sequence: 4
  givenname: Dario
  orcidid: 0000-0002-5863-1628
  surname: Torregrossa
  fullname: Torregrossa, Dario
  email: dario_torregrossa@goodyear.com
  organization: Goodyear Innovation Center, Luxembourg
ContentType Journal Article
Copyright Copyright held by the owner/author(s).
Copyright Association for Computing Machinery Mar 2024
Discipline Computer Science
DocumentTitleAlternate Landscape of High-Performance Python
EISSN 1557-7341
EndPage 30
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords data science
code acceleration
Python
Language English
License This work is licensed under a Creative Commons Attribution International 4.0 License.
cc-by
ORCID 0000-0002-7725-512X
0000-0003-4025-7903
0000-0002-3071-6371
0000-0002-5863-1628
OpenAccessLink https://dl.acm.org/doi/pdf/10.1145/3617588
PageCount 30
PublicationCentury 2000
PublicationDate 2024-03-31
PublicationDateYYYYMMDD 2024-03-31
PublicationDate_xml – month: 03
  year: 2024
  text: 2024-03-31
  day: 31
PublicationDecade 2020
PublicationPlace New York, NY
PublicationPlace_xml – name: New York, NY
– name: Baltimore
PublicationTitle ACM computing surveys
PublicationTitleAbbrev ACM CSUR
PublicationYear 2024
Publisher ACM
Association for Computing Machinery
Publisher_xml – name: ACM
– name: Association for Computing Machinery
StartPage 1
SubjectTerms Algorithms
Computer science
Computing methodologies
Data science
Documents
Graphics processing units
Machine learning
Microprocessors
Parallel programming languages
Python
Software and its engineering
Software development techniques
SubjectTermsDisplay Computing methodologies -- Machine learning
Computing methodologies -- Parallel programming languages
Software and its engineering -- Software development techniques
Title Landscape of High-Performance Python to Develop Data Science and Machine Learning Applications
URI https://dl.acm.org/doi/10.1145/3617588
https://www.proquest.com/docview/2933529164
https://dl.acm.org/doi/pdf/10.1145/3617588
UnpaywallVersion publishedVersion
Volume 56