Enhancing Fine-Tuning LLM Evaluation: A Study on Calibration and Metrics for Industry-Specific AI Alignment
Stavarache, Lucia Larise
Published in 2025 IEEE Conference on Artificial Intelligence (CAI) (05.05.2025)
Conference Proceeding
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding
Azime, Israel Abebe, Tonja, Atnafu Lambebo, Belay, Tadesse Destaw, Chanie, Yonas, Balcha, Bontu Fufa, Abadi, Negasi Haile, Ademtew, Henok Biadglign, Nerea, Mulubrhan Abebe, Yadeta, Debela Desalegn, Geremew, Derartu Dagne, Tesfau, Assefa Atsbiha, Slusallek, Philipp, Solorio, Thamar, Klakow, Dietrich
Year of Publication 07.11.2024
Journal Article
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding
Azime, Israel Abebe, Tonja, Atnafu Lambebo, Belay, Tadesse Destaw, Chanie, Yonas, Balcha, Bontu Fufa, Abadi, Negasi Haile, Ademtew, Henok Biadglign, Nerea, Mulubrhan Abebe, Yadeta, Debela Desalegn, Geremew, Derartu Dagne, Tesfau, Assefa Atsbiha, Slusallek, Philipp, Solorio, Thamar, Klakow, Dietrich
Published in arXiv.org (16.11.2024)
Paper
Evaluating Large Language Model Robustness using Combinatorial Testing
Chandrasekaran, Jaganmohan, Patel, Ankita Ramjibhai, Lanus, Erin, Freeman, Laura J.
Published in 2025 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) (31.03.2025)
Conference Proceeding
Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models
Ibrahim, Lujain, Akbulut, Canfer, Elasmar, Rasmi, Rastogi, Charvi, Kahng, Minsuk, Morris, Meredith Ringel, McKee, Kevin R, Rieser, Verena, Shanahan, Murray, Weidinger, Laura
Year of Publication 10.02.2025
Journal Article
Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?
Ghahroodi, Omid, Nouri, Marzia, Sanian, Mohammad Vali, Sahebi, Alireza, Dastgheib, Doratossadat, Asgari, Ehsaneddin, Baghshah, Mahdieh Soleymani, Rohban, Mohammad Hossein
Year of Publication 09.04.2024
Journal Article
Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?
Ghahroodi, Omid, Nouri, Marzia, Sanian, Mohammad Vali, Sahebi, Alireza, Dastgheib, Doratossadat, Asgari, Ehsaneddin, Baghshah, Mahdieh Soleymani, Rohban, Mohammad Hossein
Published in arXiv.org (09.04.2024)
Paper