PARAMETER EFFICIENT FINE-TUNING AND OVERFITTING IN GPT LARGE LANGUAGE MODELS: A METRIC-BASED COMPARISON
| Published in | Електроніка та інформаційні технології Vol. 30; no. 30; pp. 33–42 |
|---|---|
| Main Authors | Pavlyshenko, Bohdan; Bulka, Ivan |
| Format | Journal Article |
| Language | English |
| Published | Ivan Franko National University of Lviv, 01.06.2025 |
| Subjects | fine-tuning; gpt; llama; llms; mixtral; overfitting |
| Online Access | Get full text |
| ISSN | 2224-087X; 2224-0888 |
| DOI | 10.30970/eli.30.3 |
| Abstract | Background. Building upon previous research, this study conducts an exploration into Large Language Models (LLMs), with an emphasis on the fine-tuning and assessment of LLaMA-3.1 for instruction tasks. LLaMA-3.1 is a new-generation model that has gained considerable recognition for its superior performance on various benchmarks. Besides assessing the disparities and improvements between the base and fine-tuned versions of LLaMA-3.1 on an instruction dataset, the study also addresses the concern of overfitting with LLaMA-3.1. Furthermore, it compares LLaMA-3.1 with both its predecessor, LLaMA-2, and another LLM, Mixtral, thereby providing a more comprehensive picture of LLaMA-3.1's capabilities relative to other models. Materials and Methods. The fine-tuning of LLaMA-3.1 employed state-of-the-art techniques, such as Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA), on comprehensive instruction datasets. Acknowledging the resource-intensive nature of LLM fine-tuning, optimization measures were taken. The fine-tuning process was additionally enhanced using Parameter-Efficient Fine-Tuning (PEFT) on NVIDIA A100 Tensor Core GPU (graphics processing unit) instances. All models were fine-tuned using the Hugging Face and PyTorch platforms for optimal performance. Results and Discussion. The results obtained from fine-tuning and evaluating LLaMA-3.1 offer valuable insights into how this model performs on specific tasks. The evaluation framework proved helpful for efficiently assessing LLMs' performance on instruction tasks. The research highlights the importance of evaluation for LLM applications. It shows that fine-tuning is not always a good choice, depending on the nature of the model and the specifics of the task, and it highlights the overfitting problem. Conclusion. The close examination of LLaMA-3.1 contributes to the field of machine learning by offering insights into how this model works and how it can be fine-tuned for specific tasks. The findings of this research create opportunities for more in-depth studies of the application of LLMs and highlight the importance of efficient evaluation with established metrics. |
|---|---|
| ArticleNumber | 802 |
| Author | Pavlyshenko, Bohdan; Bulka, Ivan |
| ContentType | Journal Article |
| DOI | 10.30970/eli.30.3 |
| DatabaseName | CrossRef; Unpaywall for CDI: Periodical Content; Unpaywall; DOAJ Directory of Open Access Journals |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| EISSN | 2224-0888 |
| EndPage | 42 |
| ExternalDocumentID | oai_doaj_org_article_f8b8cd228d564acab4092682c0cbc2fb 10.30970/eli.30.3 10_30970_eli_30_3 |
| IEDL.DBID | DOA |
| ISSN | 2224-087X 2224-0888 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 30 |
| Language | English |
| License | cc-by |
| LinkModel | DirectLink |
| ORCID | 0009-0003-2962-7931 0000-0001-9515-3488 |
| OpenAccessLink | https://doaj.org/article/f8b8cd228d564acab4092682c0cbc2fb |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-06-01 |
| PublicationDateYYYYMMDD | 2025-06-01 |
| PublicationDate_xml | – month: 06 year: 2025 text: 2025-06-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | Електроніка та інформаційні технології |
| PublicationYear | 2025 |
| Publisher | Ivan Franko National University of Lviv |
| Publisher_xml | – name: Ivan Franko National University of Lviv |
| SourceID | doaj unpaywall crossref |
| SourceType | Open Website; Open Access Repository; Index Database |
| StartPage | 33 |
| SubjectTerms | fine-tuning; gpt; llama; llms; mixtral; overfitting |
| Title | PARAMETER EFFICIENT FINE-TUNING AND OVERFITTING IN GPT LARGE LANGUAGE MODELS: A METRIC-BASED COMPARISON |
| URI | https://doi.org/10.30970/eli.30.3 https://doaj.org/article/f8b8cd228d564acab4092682c0cbc2fb |
| UnpaywallVersion | publishedVersion |
| Volume | 30 |
| hasFullText | 1 |
| inHoldings | 1 |
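The Materials and Methods summary in the abstract above mentions fine-tuning LLaMA-3.1 with LoRA and QLoRA through Parameter-Efficient Fine-Tuning (PEFT) on the Hugging Face and PyTorch stack. A minimal sketch of what such a setup typically looks like is given below; the checkpoint name, adapter rank, target modules, and other hyperparameters are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Hypothetical sketch of QLoRA-style fine-tuning with Hugging Face PEFT.
# The model name and all hyperparameters below are assumptions for
# illustration, not the paper's reported configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-3.1-8B"  # assumed checkpoint name

# 4-bit NF4 quantization (the "Q" in QLoRA) keeps the frozen base weights
# small enough to fit on a single A100-class GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters are attached to the attention projections; only these
# adapter weights are trained while the quantized base model stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Because only the low-rank adapter matrices are updated, the optimizer state and gradients stay small, which is what makes single-GPU fine-tuning of a model this size practical.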
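The abstract also stresses metric-based evaluation of the base versus fine-tuned models and the detection of overfitting. A minimal sketch of such a comparison, assuming ROUGE from the Hugging Face `evaluate` library as the text-overlap metric and a simple train/validation loss gap as the overfitting signal, could look like this (the toy data and threshold are assumptions, not the paper's metrics):

```python
# Hypothetical sketch of a metric-based comparison between a base and a
# fine-tuned model on instruction data. ROUGE and the toy examples are
# assumptions for illustration only.
import evaluate

rouge = evaluate.load("rouge")

# Generated instruction responses from each model variant (toy examples).
base_outputs = ["Paris is a city in France."]
tuned_outputs = ["The capital of France is Paris."]
references = ["Paris is the capital of France."]

base_scores = rouge.compute(predictions=base_outputs, references=references)
tuned_scores = rouge.compute(predictions=tuned_outputs, references=references)
print("base  rougeL:", base_scores["rougeL"])
print("tuned rougeL:", tuned_scores["rougeL"])

# A crude overfitting signal: training loss keeps falling while validation
# loss stalls or rises, suggesting the adapter is memorizing the data.
train_loss, val_loss = 0.42, 1.37   # assumed values from a training log
if val_loss - train_loss > 0.5:     # threshold chosen for illustration only
    print("Large train/validation gap: possible overfitting.")
```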