AceCoder: An Effective Prompting Technique Specialized in Code Generation
Large language models (LLMs) have shown great success in code generation. LLMs take as the input a prompt and output the code. How to make prompts (i.e., Prompting Techniques) is a key question. Existing prompting techniques are designed for natural language generation and have low accuracy in code...
        Saved in:
      
    
          | Published in | ACM transactions on software engineering and methodology Vol. 33; no. 8; pp. 1 - 26 | 
|---|---|
| Main Authors | , , , , | 
| Format | Journal Article | 
| Language | English | 
| Published | 
        New York, NY
          ACM
    
        21.11.2024
     | 
| Subjects | |
| Online Access | Get full text | 
| ISSN | 1049-331X 1557-7392  | 
| DOI | 10.1145/3675395 | 
Cover
| Summary: | Large language models (LLMs) have shown great success in code generation. LLMs take as the input a prompt and output the code. How to make prompts (i.e., Prompting Techniques) is a key question. Existing prompting techniques are designed for natural language generation and have low accuracy in code generation. In this article, we propose a new prompting technique named AceCoder. Our motivation is that code generation meets two unique challenges (i.e., requirement understanding and code implementation). AceCoder contains two novel mechanisms (i.e., guided code generation and example retrieval) to solve these challenges. ❶ Guided code generation asks LLMs first to analyze requirements and output an intermediate preliminary (e.g., test cases). The preliminary clarifies requirements and tells LLMs “what to write.” ❷ Example retrieval selects similar programs as examples in prompts, which provide lots of relevant content (e.g., algorithms, APIs) and teach LLMs “how to write.” We apply AceCoder to four LLMs (e.g., GPT-3.5, CodeGeeX) and evaluate it on three public benchmarks using the Pass@ \(k\) . Results show that AceCoder can significantly improve the performance of LLMs on code generation. In terms of Pass@1, AceCoder outperforms the SOTA baseline by up to 56.4% in MBPP, 70.7% in MBJP, and 88.4% in MBJSP. AceCoder is effective in LLMs with different sizes (i.e., 6B–13B) and different languages (i.e., Python, Java, and JavaScript). Human evaluation shows human developers prefer programs from AceCoder. | 
|---|---|
| ISSN: | 1049-331X 1557-7392  | 
| DOI: | 10.1145/3675395 |