Landscape of High-Performance Python to Develop Data Science and Machine Learning Applications
| Published in | ACM Computing Surveys, Vol. 56, No. 3, pp. 1-30 |
|---|---|
| Main Authors | Castro, Oscar; Bruneau, Pierrick; Sottet, Jean-Sébastien; Torregrossa, Dario |
| Format | Journal Article |
| Language | English |
| Published | New York, NY: ACM (Association for Computing Machinery), 31.03.2024 |
| ISSN | 0360-0300 (print); 1557-7341 (online) |
| DOI | 10.1145/3617588 |
| Abstract | Python has become the prime language for application development in the data science and machine learning domains. However, data scientists are not necessarily experienced programmers. Although Python lets them quickly implement their algorithms, when moving at scale, computation efficiency becomes inevitable. Thus, harnessing high-performance devices such as multi-core processors and graphical processing units to their potential is generally not trivial. The present narrative survey can be thought of as a reference document for such practitioners to help them make their way in the wealth of tools and techniques available for the Python language. Our document revolves around user scenarios, which are meant to cover most situations they may face. We believe that this document may also be of practical use to tool developers, who may use our work to identify potential lacks in existing tools and help them motivate their contributions. |
|---|---|
| ArticleNumber | 65 |
| Author | Sottet, Jean-Sébastien; Bruneau, Pierrick; Torregrossa, Dario; Castro, Oscar |
| Author_xml | 1. Castro, Oscar (ORCID: 0000-0003-4025-7903; email: oscar.castro@list.lu; Luxembourg Institute of Science and Technology, Luxembourg); 2. Bruneau, Pierrick (ORCID: 0000-0002-7725-512X; email: pierrick.bruneau@list.lu; Luxembourg Institute of Science and Technology, Luxembourg); 3. Sottet, Jean-Sébastien (ORCID: 0000-0002-3071-6371; email: jean-sebastien.sottet@list.lu; Luxembourg Institute of Science and Technology, Luxembourg); 4. Torregrossa, Dario (ORCID: 0000-0002-5863-1628; email: dario_torregrossa@goodyear.com; Goodyear Innovation Center, Luxembourg) |
| ContentType | Journal Article |
| Copyright | Copyright held by the owner/author(s). Copyright Association for Computing Machinery Mar 2024 |
| Discipline | Computer Science |
| DocumentTitleAlternate | Landscape of High-Performance Python |
| EISSN | 1557-7341 |
| EndPage | 30 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | data science; code acceleration; Python |
| License | This work is licensed under a Creative Commons Attribution International 4.0 License (CC BY). |
| ORCID | 0000-0002-7725-512X 0000-0003-4025-7903 0000-0002-3071-6371 0000-0002-5863-1628 |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=https://dl.acm.org/doi/pdf/10.1145/3617588 |
| PageCount | 30 |
| PublicationDate | 2024-03-31 |
| PublicationPlace | New York, NY |
| PublicationPlace_xml | New York, NY; Baltimore |
| PublicationTitle | ACM computing surveys |
| PublicationTitleAbbrev | ACM CSUR |
| PublicationYear | 2024 |
| Publisher | ACM Association for Computing Machinery |
| StartPage | 1 |
| SubjectTerms | Algorithms; Computer science; Computing methodologies; Data science; Documents; Graphics processing units; Machine learning; Microprocessors; Parallel programming languages; Python; Software and its engineering; Software development techniques |
| SubjectTermsDisplay | Computing methodologies -- Machine learning; Computing methodologies -- Parallel programming languages; Software and its engineering -- Software development techniques |
| Title | Landscape of High-Performance Python to Develop Data Science and Machine Learning Applications |
| URI | https://dl.acm.org/doi/10.1145/3617588 https://www.proquest.com/docview/2933529164 https://dl.acm.org/doi/pdf/10.1145/3617588 |
| Volume | 56 |