Extracting Knowledge Graphs from User Stories using LangChain

AI-generated keywords: Knowledge graphs; User stories; Large Language Models (LLMs); LangChain framework; Requirements engineering

AI-generated Key Points

- Novel methodology for generating knowledge graphs from user stories using Large Language Models (LLMs)
- Addressing the limitations of user stories in capturing the overall system perspective in software development
- Proposal to extract structured data from user stories and model it as knowledge graphs for enhanced comprehension
- Development of the User Story Graph Transformer module, built on the LangChain framework, to automate the knowledge graph extraction process
- Importance of aligning software functionalities with user expectations for more effective and user-centric software development
- Acknowledgment of key contributors to the research, including Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham


Authors: Thayná Camargo da Silva

arXiv: 2506.11020v1 (cs.SE)

Master thesis work

License: CC BY-NC-SA 4.0

Abstract: This thesis introduces a novel methodology for the automated generation of knowledge graphs from user stories by leveraging the advanced capabilities of Large Language Models. Utilizing the LangChain framework as a basis, the User Story Graph Transformer module was developed to extract nodes and relationships from user stories using an LLM to construct accurate knowledge graphs. This innovative technique was implemented in a script to fully automate the knowledge graph extraction process. Additionally, the evaluation was automated through a dedicated evaluation script, utilizing an annotated dataset for assessment. By enhancing the visualization and understanding of user requirements and domain concepts, this method fosters better alignment between software functionalities and user expectations, ultimately contributing to more effective and user-centric software development processes.

Submitted to arXiv on 14 May 2025


Results of the summarizing process for the arXiv paper: 2506.11020v1

Comprehensive Summary

This Master's thesis presents a methodology for automatically generating knowledge graphs from user stories using Large Language Models (LLMs). It addresses a limitation of user stories in software development: they capture individual requirements but not the overall system perspective. User stories are natural-language descriptions of software requirements, widely used in agile methodologies, that often lack the structured information needed for comprehensive system understanding.

To overcome this, the thesis proposes extracting structured data from user stories and modeling it as knowledge graphs. These visual, structured representations support data storage, analysis, and system comprehension. The LangChain framework serves as the basis for the User Story Graph Transformer module, which uses an LLM to extract nodes and relationships from user stories. A script automates the extraction process, streamlining the visualization and understanding of user requirements and domain concepts, and a separate evaluation script uses an annotated dataset to assess the accuracy of the extracted graphs.

By improving alignment between software functionalities and user expectations, the method contributes to more effective and user-centric software development processes. The work stresses that defining software from the user's perspective is crucial, which makes extracting structured information from user stories significant for creating coherent and manageable systems. The thesis acknowledges Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham for their support throughout the academic journey.

Overall, this research provides insight into using LLMs to generate knowledge graphs from user stories, with the aim of enhancing software development processes and improving system comprehension through structured representations of requirements.
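To make the "overall system perspective" point concrete, here is a minimal sketch of how triples extracted from separate user stories could be merged into one graph, so that entities shared between stories become visible. The story IDs, node names, and relation labels are hypothetical and are not taken from the thesis or its dataset.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, one list per user story.
# All labels are illustrative assumptions, not the thesis's schema.
story_triples = {
    "US1": [("Customer", "PERFORMS", "place order"), ("place order", "TARGETS", "Order")],
    "US2": [("Admin", "PERFORMS", "cancel order"), ("cancel order", "TARGETS", "Order")],
}

def merge_stories(triples_by_story):
    """Merge per-story triples into one edge set and record which stories mention each node."""
    edges = set()
    mentions = defaultdict(set)
    for story_id, triples in triples_by_story.items():
        for subj, rel, obj in triples:
            edges.add((subj, rel, obj))
            mentions[subj].add(story_id)
            mentions[obj].add(story_id)
    return edges, mentions

edges, mentions = merge_stories(story_triples)

# Nodes mentioned by more than one story are where the stories connect.
shared = sorted(node for node, stories in mentions.items() if len(stories) > 1)
print(shared)
```

In this toy input the two stories only meet at the `Order` node, which is the kind of cross-story link that is hard to see when stories are read one at a time.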


Layman's Summary

Summary
- A new way to make knowledge graphs from user stories using large language models.
- Fixing the problem that user stories do not show the whole-system view when making software.
- An idea to pull organized data out of user stories and turn it into knowledge graphs for better understanding.
- Building a User Story Graph Transformer module on top of the LangChain framework to make knowledge graphs automatically.
- It is important to match what users want with what the software does, for better and more user-focused software.

Definitions
- Knowledge graphs: Organized information shown as nodes (points) connected by links (lines).
- User stories: Simple descriptions of what users need or want from a product or service.
- Large Language Models (LLMs): Advanced computer programs that understand and generate human-like text.
- Framework: A structure or plan used to guide work or decisions.
- Automating: Using machines or computers to do tasks without human help.

Blog article

Introduction

In software development, user stories are an essential tool for capturing and communicating requirements. They are natural-language descriptions that outline a specific feature or functionality from the perspective of the end user. However, while user stories provide valuable insight into user expectations, they often lack structured information, which can hinder comprehensive system understanding. To address this limitation, the Master's thesis "Extracting Knowledge Graphs from User Stories using LangChain" presents a methodology for extracting structured data from user stories and modeling it as knowledge graphs. The approach uses Large Language Models (LLMs) to automate the knowledge graph extraction process and to support data storage, analysis, and system comprehension.

Background

User stories have become a popular method for defining software requirements in agile methodologies. They focus on the needs and desires of users rather than technical details, which makes them easy to understand for all stakeholders involved in software development. However, as systems grow more complex, with multiple functionalities and interactions between components, it becomes difficult to capture the overall system perspective through user stories alone. This is where knowledge graphs come into play. A knowledge graph is a structured representation of data that shows the relationships between entities or concepts within a domain. By making these connections explicit, knowledge graphs provide a more comprehensive understanding of complex systems.

Methodology

The LangChain framework serves as the basis for the User Story Graph Transformer module developed in the thesis. The module uses an LLM to automatically extract nodes (entities) and relationships from user stories and model them as knowledge graphs.
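The extraction step can be sketched roughly as follows. This is not the thesis's User Story Graph Transformer and it does not call LangChain; it only illustrates the general pattern of prompting a model for JSON and parsing the answer into nodes and relationships. The prompt wording, the `Node` and `Relationship` classes, the node types, and the `fake_llm` stand-in are all assumptions made for this example.

```python
import json
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Node:
    id: str
    type: str

@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    type: str

# Hypothetical prompt; the thesis's actual prompt is not given in this summary.
PROMPT = (
    "Extract entities and relationships from the user story below. "
    "Answer with a JSON object that has 'nodes' (id, type) and "
    "'relationships' (source, target, type).\nUser story: "
)

def extract_graph(story: str, llm: Callable[[str], str]):
    """Ask the model for a JSON graph and parse it into typed objects."""
    data = json.loads(llm(PROMPT + story))
    nodes = [Node(n["id"], n["type"]) for n in data.get("nodes", [])]
    known = {n.id for n in nodes}
    # Keep only relationships whose endpoints were declared as nodes.
    rels = [
        Relationship(r["source"], r["target"], r["type"])
        for r in data.get("relationships", [])
        if r["source"] in known and r["target"] in known
    ]
    return nodes, rels

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned answer for the demo."""
    return json.dumps({
        "nodes": [
            {"id": "customer", "type": "Persona"},
            {"id": "track order", "type": "Action"},
            {"id": "order", "type": "Entity"},
        ],
        "relationships": [
            {"source": "customer", "target": "track order", "type": "TRIGGERS"},
            {"source": "track order", "target": "order", "type": "TARGETS"},
            {"source": "track order", "target": "warehouse", "type": "TARGETS"},
        ],
    })

story = "As a customer, I want to track my order so that I know when it arrives."
nodes, rels = extract_graph(story, fake_llm)
print(len(nodes), len(rels))
```

In a LangChain-based setup, the `llm` callable would wrap a real chat model. Filtering out relationships with undeclared endpoints is one simple guard against inconsistent model output; whether the thesis applies a similar check is not stated here.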
Using an LLM is intended to keep extraction accurate even when dealing with large amounts of unstructured text. The User Story Graph Transformer module is driven by a script that takes raw user story text as input and outputs structured data in the form of nodes and edges, which represent the entities in the story and the relationships between them. These nodes and edges can then be assembled into a visual representation of the knowledge graph.

Evaluation

To assess the accuracy of the generated knowledge graphs, an evaluation script compares them against an annotated dataset. Automating this step allows the extracted data to be evaluated quickly and consistently. The reported results indicate that the User Story Graph Transformer module extracts nodes and relationships from user stories accurately, supporting its use for creating structured representations of requirements.

Impact

By automating the extraction of structured data from user stories, the thesis contributes to more effective and user-centric software development processes. It improves alignment between software functionalities and user expectations by providing a visual representation of how the components of a system are connected. This can lead to better decision-making during development and, ultimately, to more satisfied end users.

Acknowledgments

The thesis acknowledges Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham for their support throughout the academic journey.

Conclusion

In conclusion, "Extracting Knowledge Graphs from User Stories using LangChain" presents an approach for extracting structured data from user stories with LLMs and modeling it as knowledge graphs.
This method addresses the limitations of user stories in conveying a comprehensive view of the system and contributes to more effective software development processes by aligning functionalities with user expectations. Overall, the research offers insight into using LLMs to generate knowledge graphs from user stories and shows how structured representations of requirements can support software development.
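The evaluation step described in the article, comparing extracted graphs with an annotated dataset, could look something like the sketch below. The summary does not say which metrics the evaluation script uses, so precision, recall, and F1 over exactly matching triples are an assumption here, and the gold and predicted edges are made-up examples.

```python
def prf(predicted, gold):
    """Precision, recall, and F1 between two collections, using exact set matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical annotated (gold) and extracted (predicted) edges for one story.
gold_edges = {
    ("customer", "TRIGGERS", "track order"),
    ("track order", "TARGETS", "order"),
}
pred_edges = {
    ("customer", "TRIGGERS", "track order"),
    ("track order", "TARGETS", "shipment"),
}

precision, recall, f1 = prf(pred_edges, gold_edges)
print(precision, recall, f1)
```

Exact matching is strict: a prediction such as `shipment` instead of `order` counts as both a false positive and a miss. A real evaluation might normalize labels or allow partial matches before scoring.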

Created on 22 Mar. 2026


