CodePlan: Repository-level Coding using LLMs and Planning

AI-generated keywords: CodePlan LLMs Planning Repository-level Coding Evaluation

AI-generated Key Points

⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

CodePlan is a task-agnostic framework for repository-level coding using Large Language Models (LLMs) and planning
CodePlan frames repository-level coding as a planning problem by synthesizing a multi-step chain of edits
It combines incremental dependency analysis, change may-impact analysis, and an adaptive planning algorithm to generate effective plans for complex coding tasks
Two repository-level tasks were evaluated: package migration in C# and temporal code edits in Python
Results show that CodePlan outperforms baselines that lack planning but receive the same type of contextual information
CodePlan successfully passes validity checks in 5 out of 6 repositories, demonstrating its ability to build without errors and make correct code edits
Overall, this paper presents CodePlan as an effective approach for automating complex repository-level coding tasks using LLMs and planning techniques.

Authors: Ramakrishna Bairi, Atharv Sonwane, Aditya Kanade, Vageesh D C, Arun Iyer, Suresh Parthasarathy, Sriram Rajamani, B. Ashok, Shashank Shet

arXiv: 2309.12499v1 (cs.SE)

License: NONEXCLUSIVE-DISTRIB 1.0

Abstract: Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase involve pervasively editing the entire repository of code. We formulate these activities as repository-level coding tasks. Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly using LLMs, since code within a repository is inter-dependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic framework, called CodePlan, to solve it. CodePlan synthesizes a multi-step chain of edits (plan), where each step results in a call to an LLM on a code location with context derived from the entire repository, previous code changes and task-specific instructions. CodePlan is based on a novel combination of an incremental dependency analysis, a change may-impact analysis and an adaptive planning algorithm. We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires inter-dependent changes to many files (between 2 and 97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan has a better match with the ground truth compared to baselines. CodePlan is able to get 5/6 repositories to pass the validity checks (e.g., to build without errors and make correct code edits) whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them.

Submitted to arXiv on 21 Sep. 2023

Results of the summarizing process for the arXiv paper: 2309.12499v1

Comprehensive Summary

This paper introduces CodePlan, a task-agnostic framework for repository-level coding using Large Language Models (LLMs) and planning. CodePlan frames repository-level coding as a planning problem by synthesizing a multi-step chain of edits. It combines incremental dependency analysis, change may-impact analysis and an adaptive planning algorithm to generate effective plans for complex coding tasks. To evaluate the effectiveness of CodePlan two repository-level tasks are considered: package migration in C# and temporal code edits in Python. The results show that CodePlan outperforms baselines without planning but with similar contextual information. It successfully passes validity checks in 5 out of 6 repositories demonstrating its ability to build without errors and make correct code edits. Overall, this paper presents CodePlan as an effective approach for automating complex repository-level coding tasks using LLMs and planning techniques.

Layman's Summary

CodePlan is a way to use special computer programs called Large Language Models (LLMs) to help with coding. It helps by breaking down coding tasks into smaller steps and making plans for how to do them. CodePlan uses different techniques to analyze the code and figure out the best way to make changes. It was tested on two types of coding tasks and showed better results than other methods without planning. CodePlan was also able to build code correctly in most cases. Overall, this paper shows that CodePlan is a good way to automate complex coding tasks using LLMs and planning techniques.

Definitions
- CodePlan: A framework for using Large Language Models (LLMs) and planning to help with coding.
- Large Language Models (LLMs): Special computer programs that can understand and generate human-like text.
- Repository-level coding: Coding done on a whole project or collection of files.
- Planning: Making a plan or strategy for how to do something.
- Baselines: Other methods or techniques used for comparison.

Introducing CodePlan: A Task-Agnostic Framework for Repository-Level Coding Using Large Language Models and Planning

Large language models (LLMs) already power tools that solve localized coding problems well, but repository-level tasks are harder: the code in a repository is interdependent, and the whole repository may be too large to fit into a prompt. This paper introduces CodePlan, a task-agnostic framework for repository-level coding using LLMs and planning techniques.

Background

The goal of this research is to develop an effective approach for automating complex repository-level coding tasks using LLMs and planning techniques. The authors propose a novel framework called CodePlan, which frames repository-level coding as a planning problem by synthesizing multi-step chains of edits. It combines incremental dependency analysis, change may-impact analysis, and an adaptive planning algorithm to generate effective plans for complex coding tasks.

Methodology

CodePlan consists of three main components: incremental dependency analysis, change may-impact analysis, and an adaptive planning algorithm. The incremental dependency analysis identifies dependencies between code elements and keeps them up to date as edits are made. The change may-impact analysis estimates which other parts of the code may be affected by a given change, so that knock-on effects are not missed. Finally, the adaptive planning algorithm uses this information to decide which edits to make next, extending the plan as earlier edits reveal further code that must change. Each step of the plan results in a call to an LLM on one code location, with context drawn from the repository, previous code changes, and task-specific instructions.
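To make the interplay of the three components more concrete, here is a minimal Python sketch of a plan that grows as edits are made. It is an illustration under stated assumptions, not the paper's algorithm: the names `may_impact`, `plan_edits`, and `edit_fn`, the file names, and the rule "a change may impact every direct dependent" are all hypothetical, and the LLM call is replaced by a stub function.

```python
from collections import deque


def may_impact(dep_graph, changed):
    """Hypothetical may-impact rule: every block that directly depends on
    `changed` might need an edit. dep_graph maps block -> set of its dependencies."""
    return sorted(b for b, deps in dep_graph.items() if changed in deps)


def plan_edits(dep_graph, seeds, edit_fn):
    """Adaptive planning loop (illustrative only).

    edit_fn(block) stands in for an LLM call and returns (changed, new_deps):
    `changed` says whether the block was modified, and `new_deps` is the block's
    updated dependency set, or None if its dependencies did not change.
    """
    graph = {b: set(d) for b, d in dep_graph.items()}  # work on a copy
    frontier = deque(seeds)   # pending edit obligations
    seen = set(seeds)         # blocks already planned
    order = []                # the plan, in the order it was executed

    while frontier:
        block = frontier.popleft()
        changed, new_deps = edit_fn(block)
        order.append(block)
        if new_deps is not None:
            # Incremental update: only the edited block's edges are refreshed.
            graph[block] = set(new_deps)
        if not changed:
            continue
        # Extend the plan with blocks the change may have affected.
        for dependent in may_impact(graph, block):
            if dependent not in seen:
                seen.add(dependent)
                frontier.append(dependent)
    return order


if __name__ == "__main__":
    # Hypothetical repository: cli.py depends on service.py, which depends on api.py.
    graph = {
        "api.py": set(),
        "service.py": {"api.py"},
        "cli.py": {"service.py"},
        "docs.py": set(),
    }
    # Stub "LLM": every block except cli.py needs a real change.
    stub = lambda block: (block != "cli.py", None)
    print(plan_edits(graph, ["api.py"], stub))
    # prints ['api.py', 'service.py', 'cli.py']
```

In this sketch the plan is not fixed up front. Editing `api.py` pulls in `service.py`, which in turn pulls in `cli.py`, while the unrelated `docs.py` is never visited. A real system would need a much richer dependency model and impact analysis than this direct-dependent rule.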

Evaluation

CodePlan was evaluated on two repository-level tasks: package migration in C# and temporal code edits in Python. Each task was run on multiple code repositories, each requiring interdependent changes to between 2 and 97 files. The baselines use no planning but receive the same type of contextual information as CodePlan. CodePlan matched the ground truth more closely than the baselines, and it got 5 of 6 repositories to pass the validity checks (for example, building without errors and making correct code edits), whereas the baselines could not get any repository to pass.

Conclusion

Overall, this paper presents CodePlan as an effective approach for automating complex repository-level coding tasks using LLMs and planning techniques. Its combination of incremental dependency analysis, change may-impact analysis, and adaptive planning lets it produce multi-step edit plans that account for dependencies across a codebase. This could leave developers more time for other parts of their projects.

Created on 26 Sep. 2023
