Context Matters: Evaluating Context Strategies for Automated ADR Generation Using LLMs

AI-generated keywords: Architecture Decision Records (ADRs)

AI-generated Key Points

Architecture Decision Records (ADRs) are crucial for preserving system design rationale
Large Language Models (LLMs) can help alleviate the burden of creating and maintaining ADRs
Context-aware prompting enhances ADR generation fidelity
Recency-based context selection is recommended for automated ADR generation
Effective ADR automation relies more on context engineering than model scale alone

Authors: Aviral Gupta, Rudra Dhar, Daniel Feitosa, Karthik Vaidhyanathan

arXiv: 2604.03826v2 - DOI (cs.SE)

11 pages, 5 diagrams, Accepted at EASE Conference 2026 Research Track

License: CC BY 4.0

Abstract: Architecture Decision Records (ADRs) play a critical role in preserving the rationale behind system design, yet their creation and maintenance are often neglected due to the associated authoring overhead. This paper investigates whether Large Language Models (LLMs) can mitigate this burden and, more importantly, how different strategies for presenting historical ADRs as context influence generation quality. We curate and validate a large corpus of sequential ADRs drawn from 750 open-source repositories and systematically evaluate five context selection strategies (no context, All-history, First-K, Last-K, and RAFG) across multiple model families. Our results show that context-aware prompting substantially improves ADR generation fidelity, with a small recency window (typically 3-5 prior records) providing the best balance between quality and efficiency. Retrieval-based context selection yields marginal gains primarily in non-sequential or cross-cutting decision scenarios, while offering no statistically significant advantage in typical linear ADR workflows. Overall, our findings demonstrate that context engineering, rather than model scale alone, is the dominant factor in effective ADR automation, and we outline practical defaults for tool builders along with targeted retrieval fallbacks for complex architectural settings.

Submitted to arXiv on 04 Apr. 2026

Comprehensive Summary

Architecture Decision Records (ADRs) are crucial for preserving the rationale behind system design, but their creation and maintenance often face neglect due to authoring overhead. This study explores how Large Language Models (LLMs) can alleviate this burden and examines different strategies for presenting historical ADRs as context to enhance generation quality. By analyzing a vast corpus of sequential ADRs from open-source repositories, five context selection strategies were evaluated across various model families. The results indicate that context-aware prompting significantly enhances ADR generation fidelity, with a small recency window (typically 3-5 prior records) striking the best balance between quality and efficiency.

Retrieval-based context selection offers marginal gains in non-sequential or cross-cutting decision scenarios but does not show significant advantages in linear ADR workflows. The study emphasizes that effective ADR automation relies more on context engineering than model scale alone. Furthermore, the longitudinal analysis reveals that foundational decisions shape system structure, while subsequent decisions evolve based on their immediate predecessors. The RAFG strategy excels in addressing cross-cutting concerns that span multiple components or reactivate dormant architectural patterns, emphasizing the importance of considering architectural scope in context selection.

The study also identifies common documentation issues such as external content dependency and knowledge vaporization affecting ADR quality. Practitioners are advised to prioritize recency-based context selection as a default strategy for automated ADR generation, leveraging simpler approaches like Last-K to reduce implementation barriers. Model scale is found to be less critical than previously assumed, with compact models demonstrating comparable quality when provided with appropriate context.

Organizations are encouraged to maintain self-contained architectural documentation to enhance both automated tool performance and long-term utility. Addressing incomplete documentation through automated generation can help recover undocumented architectural decisions and mitigate documentation debt effectively. In conclusion, this research provides valuable insights for practitioners implementing automated ADR generation, highlighting the significance of strategic factors like context selection, model scale considerations, and comprehensive documentation practices in optimizing the effectiveness of automated tools for architectural knowledge management.

Summary

1. ADRs are important for keeping track of why we design things a certain way.
2. LLMs can make it easier to create and keep ADRs up to date.
3. Context-aware prompting helps make ADRs more accurate.
4. Choosing context based on recent information is good for making ADRs automatically.
5. Making ADR automation work well depends on understanding the situation, not just using a big model.

Definitions

- Architecture Decision Records (ADRs): Important documents that explain why we design systems in specific ways.
- Large Language Models (LLMs): Advanced computer programs that can help with writing and understanding text.
- Fidelity: How accurate or true something is compared to the original.
- Recency-based: Using the most recent or latest information available.
- Automation: Using machines or computers to do tasks automatically without human intervention.
- Context engineering: Understanding the specific situation or environment in which something is happening and using that knowledge effectively.

Introduction

Architecture Decision Records (ADRs) are essential for capturing the rationale behind system design decisions. They serve as a valuable source of information for future reference and aid in understanding the evolution of a system's architecture. However, creating and maintaining ADRs can be a time-consuming task that is often neglected due to its authoring overhead. This research paper explores how Large Language Models (LLMs) can alleviate this burden by automating ADR generation.

The Importance of Context in ADR Generation

The study focuses on the role of context in generating high-quality ADRs. Here, context means the historical ADRs that are presented to the model as background for the decision being written. When ADRs are written by hand, authors must recall or look up the relevant earlier decisions themselves, which is laborious and error-prone. An automated pipeline can instead select that context according to a fixed strategy, and the study evaluates several such strategies.

Context Selection Strategies

Five context selection strategies were evaluated: no context, All-history, First-K, Last-K, and RAFG. As the names suggest, the no-context baseline gives the model no prior ADRs, All-history supplies every prior record, First-K supplies the K earliest records, and Last-K supplies the K most recent ones. RAFG is the retrieval-based strategy, which selects prior records by their relevance to the decision being written rather than by their position in the sequence. Context-aware prompting substantially improved the fidelity of generated ADRs over the no-context baseline, and a small recency window (typically 3-5 prior records) gave the best balance between quality and efficiency. RAFG was most effective for cross-cutting concerns that span multiple components or reactivate dormant architectural patterns. Its gains were marginal and concentrated in those non-sequential or cross-cutting scenarios; in typical linear ADR workflows it showed no statistically significant advantage.
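To make the strategies named in the paper's abstract (no context, All-history, First-K, Last-K, and a retrieval-based option) more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `ADR` class, all function names, and the token-overlap scoring in `retrieve_k` are hypothetical. In particular, `retrieve_k` is only a simple lexical stand-in for a retrieval step; the summary does not describe how RAFG actually ranks records.

```python
from dataclasses import dataclass


@dataclass
class ADR:
    """Hypothetical record type; the field names are assumptions."""
    number: int  # position in the repository's ADR sequence
    title: str
    body: str


def no_context(history):
    """Baseline: the model sees no prior ADRs."""
    return []


def all_history(history):
    """Supply every prior ADR, oldest first."""
    return list(history)


def first_k(history, k):
    """Supply the k earliest ADRs (the foundational decisions)."""
    return list(history[:k])


def last_k(history, k):
    """Supply the k most recent ADRs (the recency window)."""
    # Guard k <= 0 explicitly: history[-0:] would return the whole list.
    return list(history[-k:]) if k > 0 else []


def _tokens(text):
    return set(text.lower().split())


def retrieve_k(history, query, k):
    """Stand-in for retrieval: rank prior ADRs by Jaccard token overlap
    with the new decision's description, keep the top k, and return
    them in their original sequence order."""
    query_tokens = _tokens(query)

    def score(adr):
        adr_tokens = _tokens(adr.title + " " + adr.body)
        union = query_tokens | adr_tokens
        return len(query_tokens & adr_tokens) / len(union) if union else 0.0

    top = sorted(history, key=score, reverse=True)[:k]
    return sorted(top, key=lambda adr: adr.number)
```

With the reported sweet spot of 3-5 prior records, a tool following the study's recommendation would call something like `last_k(history, 3)` by default and reserve a retrieval step for decisions known to cut across components.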

Importance of Context Engineering

The study emphasizes that effective ADR automation relies more on context engineering than model scale alone. This means that the quality and relevance of selected context have a more significant impact on the fidelity of generated ADRs than the size of the LLM used. Therefore, organizations should focus on developing robust strategies for selecting relevant and timely context to optimize automated ADR generation.

The Evolution of Architectural Decisions

The longitudinal analysis conducted in this study reveals interesting patterns in how architectural decisions evolve over time. The results show that foundational decisions shape system structure, while subsequent decisions are influenced by their immediate predecessors. This highlights the importance of considering architectural scope when selecting context for automated ADR generation.

Challenges with Traditional Documentation Practices

The study also identifies common documentation issues that can affect the quality and effectiveness of automated ADR generation. These include external content dependency, where important information is stored outside of the actual record, and knowledge vaporization, where critical information is lost due to incomplete or outdated documentation. To address these challenges, practitioners are advised to prioritize recency-based context selection as a default strategy for automated ADR generation. Additionally, maintaining self-contained architectural documentation can enhance both automated tool performance and long-term utility.
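As a rough illustration of what external content dependency can look like in practice, the sketch below flags ADR text that links elsewhere while containing very little prose of its own. This is a hypothetical heuristic, not a method from the paper: the function name, the returned fields, and the `min_words` threshold of 40 are all assumptions chosen for the example.

```python
import re

# Matches http(s) links; anything more exotic is out of scope for this sketch.
URL_PATTERN = re.compile(r"https?://\S+")


def external_dependency_report(adr_text, min_words=40):
    """Flag an ADR that links out but carries little prose itself.

    min_words is an arbitrary illustrative threshold, not a value
    taken from the study.
    """
    urls = URL_PATTERN.findall(adr_text)
    prose = URL_PATTERN.sub("", adr_text)
    word_count = len(prose.split())
    return {
        "urls": urls,
        "word_count": word_count,
        "likely_external_dependency": bool(urls) and word_count < min_words,
    }
```

A check like this could run before an ADR is used as context, so that records whose rationale lives behind a link are reviewed or expanded rather than passed to the model nearly empty.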

Conclusion

In conclusion, this research paper provides valuable insights for practitioners looking to implement automated ADR generation in their organizations. It highlights the significance of strategic factors such as context selection, model scale considerations, and comprehensive documentation practices in optimizing the effectiveness of these tools for managing architectural knowledge. Organizations are encouraged to invest in developing robust strategies for selecting relevant and timely context, as well as maintaining comprehensive and self-contained architectural documentation. By addressing these challenges, automated ADR generation can help recover undocumented decisions and mitigate documentation debt effectively. With the use of LLMs and proper context engineering, organizations can streamline the process of creating and maintaining ADRs while preserving the rationale behind their system design decisions.

Created on 22 Apr. 2026
