A Study for Language Models as Agents for Strategic Decision-making Environments
Published in | Inteonet jeongbo hakoe nonmunji = Journal of Korean Society for Internet Information Vol. 26; no. 1; pp. 157 - 169 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | Korean |
Published | 한국인터넷정보학회 (Korean Society for Internet Information), 28.02.2025 |
Subjects | |
ISSN | 1598-0170 2287-1136 |
DOI | 10.7472/jksii.2025.26.1.157 |
Summary: | Language Models (LMs) have proven highly effective in reasoning, understanding, and decision-making tasks. However, Large Language Models (LLMs) like GPT-4 face challenges when deployed in complex real-time environments, including high costs and the need for extensive prompt engineering. These limitations make it difficult to use LLMs in dynamic and adaptive systems such as robotics, autonomous agents, and strategic decision-making environments. To address these challenges, we propose a framework that uses Small Language Models (SLMs) as agents in strategic decision-making environments. By applying fine-tuning methods such as supervised fine-tuning (SFT) and instruction fine-tuning (IFT) on reasoning datasets, we enhance the ability of smaller models like Llama-3.1-8B to handle multi-step decision processes efficiently. This approach significantly reduces computational overhead while maintaining high performance without the need for prompt engineering. Our experiments, conducted in the StarCraft Multi-Agent Challenge (SMAC), a battlefield environment, show that fine-tuned SLMs outperform LLMs in both performance and efficiency. Notably, SLMs performed well in real-time scenarios and handled unseen data effectively, demonstrating their robustness in dynamic environments. Eliminating prompt engineering further simplifies their use, making SLMs a practical, scalable alternative to LLMs for real-time decision-making. |
---|---|
Bibliography: | Korean Society for Internet Information KISTI1.1003/JNL.JAKO202509439605810 |