A Study for Language Models as Agents for Strategic Decision-making Environments

Bibliographic Details
Published in Inteonet jeongbo hakoe nonmunji = Journal of Korean Society for Internet Information Vol. 26; no. 1; pp. 157-169
Main Authors 오지환, Jihwan Oh, 윤세영, Se-young Yun
Format Journal Article
Language Korean
Published 한국인터넷정보학회 28.02.2025
ISSN 1598-0170
2287-1136
DOI 10.7472/jksii.2025.26.1.157

Summary: Language Models (LMs) have proven highly effective in reasoning, understanding, and decision-making tasks. However, Large Language Models (LLMs) like GPT-4 face challenges when deployed in complex real-time environments, including high costs and the need for extensive prompt engineering. These limitations make it difficult to use LLMs in dynamic and adaptive systems such as robotics, autonomous agents, and strategic decision-making environments. To address these challenges, we propose a framework that uses Small Language Models (SLMs) as agents in strategic decision-making environments. By applying fine-tuning methods such as supervised fine-tuning (SFT) and instruction fine-tuning (IFT) on reasoning datasets, we enhance the ability of smaller models like Llama-3.1-8B to handle multi-step decision processes efficiently. This approach significantly reduces computational overhead while maintaining high performance without the need for prompt engineering. Our experiments, conducted in the StarCraft Multi-Agent Challenge (SMAC), a battlefield environment, show that fine-tuned SLMs outperform LLMs in both performance and efficiency. Notably, SLMs performed well in real-time scenarios and handled unseen data effectively, demonstrating their robustness in dynamic environments. Eliminating prompt engineering further simplifies their use, making SLMs a practical, scalable alternative to LLMs for real-time decision-making.
Bibliography: Korean Society for Internet Information
KISTI1.1003/JNL.JAKO202509439605810