GraphPyRec: A novel graph-based approach for fine-grained Python code recommendation

Bibliographic Details
Published in Science of Computer Programming Vol. 238; p. 103166
Main Authors Zong, Xing; Zheng, Shang; Zou, Haitao; Yu, Hualong; Gao, Shang
Format Journal Article
Language English
Published Elsevier B.V. 01.12.2024
ISSN 0167-6423
DOI 10.1016/j.scico.2024.103166


Abstract Artificial intelligence has been widely applied in software engineering areas such as code recommendation. Significant progress has been made in code recommendation for static languages in recent years, but it remains challenging for dynamic languages like Python as accurately determining data flows before runtime is difficult. This limitation hinders data flow analysis, affecting the performance of code recommendation methods that rely on code analysis. In this study, a graph-based Python recommendation approach (GraphPyRec) is proposed by converting source code into a graph representation that captures both semantic and dynamic information. Nodes represent semantic information, with unique rules defined for various code statements. Edges depict control flow and data flow, utilizing a child-sibling-like process and a dedicated algorithm for data transfer extraction. Alongside the graph, a bag of words is created to include essential names, and a pre-trained BERT model transforms it into vectors. These vectors are integrated into a Gated Graph Neural Network (GGNN) process of the code recommendation model, enhancing its effectiveness and accuracy. To validate the proposed method, we crawled over a million lines of code from GitHub. Experimental results show that GraphPyRec outperforms existing mainstream Python code recommendation methods, achieving Top-1, 5, and 10 accuracy rates of 68.52%, 88.92%, and 94.05%, respectively, along with a Mean Reciprocal Rank (MRR) of 0.772.
• Custom node definition rules are proposed to achieve a full representation of the code semantics.
• A data flow analysis algorithm is given to analyze variable transfer pathways.
• A graph-based Python code recommendation model was constructed based on GGNN and BERT.
• A Python code corpus was constructed by collecting a million lines of Python programs from GitHub.
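The abstract reports Top-k accuracy and Mean Reciprocal Rank (MRR). As a reading aid, the following is a minimal sketch of how these two metrics are conventionally computed from ranked recommendation lists. It is not taken from the article: the function names and the toy data are hypothetical, and the authors' exact evaluation protocol may differ.

```python
from typing import List


def top_k_accuracy(ranked: List[List[str]], truth: List[str], k: int) -> float:
    """Fraction of queries whose correct item appears in the first k suggestions."""
    hits = sum(1 for candidates, target in zip(ranked, truth)
               if target in candidates[:k])
    return hits / len(truth)


def mean_reciprocal_rank(ranked: List[List[str]], truth: List[str]) -> float:
    """Average of 1/rank of the correct item; a miss contributes 0."""
    total = 0.0
    for candidates, target in zip(ranked, truth):
        if target in candidates:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(truth)


# Hypothetical toy data: one ranked suggestion list per query,
# plus the token that actually followed in the source code.
ranked = [
    ["append", "extend", "insert"],
    ["open", "read", "close"],
    ["keys", "items", "values"],
    ["join", "split", "strip"],
]
truth = ["append", "read", "values", "format"]

top1 = top_k_accuracy(ranked, truth, 1)
top3 = top_k_accuracy(ranked, truth, 3)
mrr = mean_reciprocal_rank(ranked, truth)
```

Under these definitions Top-k accuracy cannot decrease as k grows, which is consistent with the reported ordering of the Top-1, Top-5, and Top-10 figures, and MRR always lies between 0 and 1.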
ArticleNumber 103166
Author Zou, Haitao
Gao, Shang
Yu, Hualong
Zong, Xing
Zheng, Shang
Author_xml – sequence: 1
  givenname: Xing
  surname: Zong
  fullname: Zong, Xing
– sequence: 2
  givenname: Shang
  surname: Zheng
  fullname: Zheng, Shang
  email: szheng@just.edu.cn
– sequence: 3
  givenname: Haitao
  surname: Zou
  fullname: Zou, Haitao
– sequence: 4
  givenname: Hualong
  surname: Yu
  fullname: Yu, Hualong
– sequence: 5
  givenname: Shang
  surname: Gao
  fullname: Gao, Shang
ContentType Journal Article
Copyright 2024 Elsevier B.V.
Copyright_xml – notice: 2024 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.scico.2024.103166
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
ExternalDocumentID 10_1016_j_scico_2024_103166
S0167642324000893
ISSN 0167-6423
IngestDate Wed Oct 01 02:59:17 EDT 2025
Sat Aug 31 16:02:56 EDT 2024
IsPeerReviewed true
IsScholarly true
Keywords Code graph
Code recommendation
Software development
Abstract syntax tree
Software engineering
Language English
PublicationDate December 2024
2024-12-00
PublicationDateYYYYMMDD 2024-12-01
PublicationDate_xml – month: 12
  year: 2024
  text: December 2024
PublicationDecade 2020
PublicationTitle Science of computer programming
PublicationYear 2024
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
  doi: 10.1007/s10515-022-00326-0
– ident: 10.1016/j.scico.2024.103166_br0370
– volume: 30
  year: 2017
  ident: 10.1016/j.scico.2024.103166_br0220
  article-title: Attention is all you need
  publication-title: Adv. Neural Inf. Process. Syst.
– start-page: 479
  year: 2019
  ident: 10.1016/j.scico.2024.103166_br0450
  article-title: Sorting and transforming program repair ingredients via deep learning code similarities
– ident: 10.1016/j.scico.2024.103166_br0510
– volume: 94
  issue: 1
  year: 2021
  ident: 10.1016/j.scico.2024.103166_br0010
  article-title: Training data selection for imbalanced cross-project defect prediction
  publication-title: Comput. Electr. Eng.
– start-page: 457
  year: 2014
  ident: 10.1016/j.scico.2024.103166_br0290
  article-title: Statistical learning approach for mining api usage mappings for code migration
– start-page: 13
  year: 2019
  ident: 10.1016/j.scico.2024.103166_br0470
  article-title: Multi-modal attention network learning for semantic source code retrieval
– start-page: 705
  year: 2015
  ident: 10.1016/j.scico.2024.103166_br0310
  article-title: Cacheca: a cache language model based code suggestion tool
StartPage 103166
SubjectTerms Abstract syntax tree
Code graph
Code recommendation
Software development
Software engineering
Title GraphPyRec: A novel graph-based approach for fine-grained Python code recommendation
URI https://dx.doi.org/10.1016/j.scico.2024.103166
Volume 238