This Master's thesis presents a methodology for automatically generating knowledge graphs from user stories using Large Language Models (LLMs). The research addresses a limitation of user stories: they do not capture the overall system perspective in software development. User stories are natural language descriptions of software requirements that are widely used in agile methodologies, but they often lack the structured information needed for comprehensive system understanding. To overcome this, the thesis proposes extracting structured data from user stories and modeling it as knowledge graphs. These structured representations, which can also be visualized, support data storage, analysis, and system comprehension. The LangChain framework serves as the basis for the User Story Graph Transformer module, which uses an LLM to extract nodes and relationships from user stories. A script automates the knowledge graph extraction, streamlining the visualization and understanding of user requirements and domain concepts. An evaluation script with an annotated dataset automates the assessment of knowledge graph accuracy. By improving alignment between software functionalities and user expectations, the method contributes to more effective and user-centric software development processes. The thesis acknowledges Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham for their support throughout the academic journey. Because user stories define software from a user's perspective, extracting structured information from them is significant for creating coherent and manageable systems.
Overall, this research provides valuable insights into utilizing LLMs for knowledge graph generation from user stories to enhance software development processes and improve system comprehension through structured representations of requirements.
- Methodology for generating knowledge graphs from user stories using Large Language Models (LLMs)
- Addressing the limitations of user stories in capturing the overall system perspective in software development
- Proposal to extract structured data from user stories and model it as knowledge graphs for enhanced comprehension
- Development of the User Story Graph Transformer module, built on the LangChain framework, to automate the knowledge graph extraction process
- Importance of aligning software functionalities with user expectations for more effective and user-centric software development
- Acknowledgment of key contributors to the research, including Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham
Summary
- A new way to make knowledge graphs from user stories using large language models.
- Fixing the problem that user stories do not show the whole system when making software.
- Idea to get organized data from user stories and turn it into knowledge graphs for better understanding.
- Building a User Story Graph Transformer part on top of the LangChain framework to make knowledge graphs automatically.
- It's important to match what users want with what software can do for better and more user-focused software.
Definitions
- Knowledge graphs: Organized information shown as nodes (points) connected by links (lines).
- User stories: Simple descriptions of what users need or want from a product or service.
- Large Language Models (LLMs): Advanced computer programs that understand and generate human-like text.
- Framework: A reusable software foundation that provides building blocks for developing applications.
- Automating: Using machines or computers to do tasks without human help.
Introduction
In software development, user stories are an essential tool for capturing and communicating requirements. They are natural language descriptions that outline a specific feature or functionality from the perspective of the end-user. However, while user stories provide valuable insights into user expectations, they often lack structured information, and this gap can hinder comprehensive system understanding.
To address this limitation, a Master's thesis titled "Automatically Generating Knowledge Graphs from User Stories using Large Language Models" presents a methodology for extracting structured data from user stories and modeling it as knowledge graphs. The approach uses Large Language Models (LLMs) to automate the knowledge graph extraction process and to support data storage, analysis, and system comprehension.
Background
User stories have become a popular method for defining software requirements in agile methodologies. They focus on the needs and desires of users rather than technical details, making them easy to understand by all stakeholders involved in software development. However, as systems become more complex with multiple functionalities and interactions between different components, it becomes challenging to capture the overall system perspective solely through user stories.
This is where knowledge graphs come into play. A knowledge graph is a structured representation of data that captures entities or concepts within a domain as nodes and the relationships between them as edges, and it can be rendered visually. By making these connections between different elements explicit, knowledge graphs provide a more comprehensive understanding of complex systems.
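To make the node-and-relationship idea concrete, the following is a minimal sketch of a knowledge graph as a data structure. The class names, the node types ("Persona", "Action"), and the relationship type ("TRIGGERS") are illustrative assumptions and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    id: str
    type: str  # hypothetical label, e.g. "Persona" or "Action"


@dataclass(frozen=True)
class Relationship:
    source: str  # id of the source node
    target: str  # id of the target node
    type: str    # hypothetical label, e.g. "TRIGGERS"


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)
    relationships: set = field(default_factory=set)

    def add(self, source: Node, rel_type: str, target: Node) -> None:
        # Sets deduplicate nodes and relationships that are added repeatedly.
        self.nodes.update({source, target})
        self.relationships.add(Relationship(source.id, target.id, rel_type))

    def neighbors(self, node_id: str) -> list:
        # Ids of all nodes reachable over one outgoing relationship.
        return sorted(r.target for r in self.relationships if r.source == node_id)


graph = KnowledgeGraph()
customer = Node("customer", "Persona")
graph.add(customer, "TRIGGERS", Node("reset password", "Action"))
graph.add(customer, "TRIGGERS", Node("view invoices", "Action"))
print(graph.neighbors("customer"))
```

Because the same "customer" node is shared by both stories, a query over the graph shows everything that persona can do, which is the kind of system-wide view that individual user stories do not give.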
Methodology
The LangChain framework serves as the basis for developing the User Story Graph Transformer module in this thesis. The module uses an LLM to automatically extract nodes (entities) and relationships from user stories and model them as knowledge graphs. According to the thesis, using an LLM allows accurate extraction from unstructured user story text.
The User Story Graph Transformer module works through a script that takes in raw user story text as input and outputs structured data in the form of nodes and edges representing entities and their relationships within the story. These nodes can then be connected to create a visual representation of the knowledge graph.
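The input and output shape of such a script can be sketched as follows. This is not the thesis's implementation: a regular expression stands in for the LLM call so that the example runs on its own, and the story template, the function name `extract_graph`, and the node and edge types are all assumptions made for illustration.

```python
import re

# Assumed story template: "As a <role>, I want to <action> so that <benefit>."
STORY_PATTERN = re.compile(
    r"^As an? (?P<role>.+?), I want to (?P<action>.+?)"
    r"(?: so that (?P<benefit>.+?))?\.?$",
    re.IGNORECASE,
)


def extract_graph(story: str) -> dict:
    """Rule-based stand-in for the LLM extraction step (illustrative only)."""
    match = STORY_PATTERN.match(story.strip())
    if match is None:
        raise ValueError(f"Story does not follow the assumed template: {story!r}")
    role, action, benefit = match.group("role", "action", "benefit")
    # Hypothetical node and edge types; the thesis may use a different schema.
    nodes = [{"id": role, "type": "Persona"}, {"id": action, "type": "Action"}]
    edges = [{"source": role, "target": action, "type": "TRIGGERS"}]
    if benefit:
        nodes.append({"id": benefit, "type": "Benefit"})
        edges.append({"source": action, "target": benefit, "type": "ACHIEVES"})
    return {"nodes": nodes, "edges": edges}


result = extract_graph(
    "As a customer, I want to reset my password so that I can regain access."
)
for edge in result["edges"]:
    print(edge["source"], f"-[{edge['type']}]->", edge["target"])
```

In the approach the text describes, the pattern matching would be replaced by a prompt to an LLM, which can handle stories that do not follow a fixed template; the structured output of nodes and edges stays the same.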
Evaluation
To assess the accuracy of the generated knowledge graphs, an evaluation script with an annotated dataset is used. This automated process allows for a quick and efficient evaluation of the extracted data. The results showed that the User Story Graph Transformer module was able to accurately extract nodes and relationships from user stories, demonstrating its effectiveness in creating structured representations of requirements.
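One common way to automate such a comparison is to score the extracted relationships against the annotated ones with precision, recall, and F1 under exact matching. The summary does not say which metrics the thesis uses, so the metric choice, the function name `score`, and the toy triples below are assumptions for illustration and not results from the thesis.

```python
def score(predicted: set, expected: set) -> dict:
    """Exact-match precision, recall, and F1 between two sets of items."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented toy data: (source, relationship type, target) triples.
expected_edges = {
    ("customer", "TRIGGERS", "reset password"),
    ("reset password", "TARGETS", "password"),
}
predicted_edges = {
    ("customer", "TRIGGERS", "reset password"),
    ("customer", "TARGETS", "password"),  # wrong source node
}

metrics = score(predicted_edges, expected_edges)
print(metrics)
```

The same function can score nodes by passing sets of node identifiers. Exact matching is strict: an extraction that differs from the annotation only in wording counts as a miss, so a real evaluation may need normalization or fuzzy matching.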
Impact
By automating the extraction of structured data from user stories, this thesis contributes to more effective and user-centric software development processes. It improves alignment between software functionalities and user expectations by providing a visual representation of how different components are connected within a system. This can lead to better decision-making during development and ultimately result in more satisfied end-users.
Acknowledgments
The thesis acknowledges key contributors to this research including Prof. Dr. Leen Lambers, Dr. Kate Revoredo, Dr. Sébastien Mosser, and Prof. Dr. Douglas Cunningham for their support throughout the academic journey.
Conclusion
In conclusion, "Automatically Generating Knowledge Graphs from User Stories using Large Language Models" presents an approach for extracting structured data from user stories with LLMs and modeling it as knowledge graphs. This method addresses the limitations of user stories in conveying a comprehensive view of the system and contributes to more effective software development processes by aligning functionalities with user expectations.
Overall, this research provides valuable insights into utilizing LLMs for knowledge graph generation from user stories and highlights its significance in enhancing software development processes through structured representations of requirements.