Can AI Serve as a Substitute for Human Subjects in Software Engineering Research?

AI-generated keywords: sociotechnical domains

AI-generated Key Points

  • Sociotechnical domains like Software Engineering face challenges with qualitative data collection methods in terms of scale, labor intensity, and participant recruitment.
  • The proposed solution is to leverage artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, for qualitative data collection in software engineering research.
  • AI-generated synthetic text can replicate human responses and behaviors, enabling automation of data collection across various methodologies like persona-based prompting for interviews, multi-persona dialogue for focus groups, and mega-persona responses for surveys.
  • AI models offer scalable and efficient means of data generation while providing insights into human attitudes, experiences, and performance.

Authors: Marco A. Gerosa, Bianca Trinkenreich, Igor Steinmacher, Anita Sarma

License: CC BY-SA 4.0

Abstract: Research within sociotechnical domains, such as Software Engineering, fundamentally requires a thorough consideration of the human perspective. However, traditional qualitative data collection methods suffer from challenges related to scale, labor intensity, and the increasing difficulty of participant recruitment. This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI), especially large language models (LLMs) like ChatGPT. We explore the potential of AI-generated synthetic text as an alternative source of qualitative data, by discussing how LLMs can replicate human responses and behaviors in research settings. We examine the application of AI in automating data collection across various methodologies, including persona-based prompting for interviews, multi-persona dialogue for focus groups, and mega-persona responses for surveys. Additionally, we discuss the prospective development of new foundation models aimed at emulating human behavior in observational studies and user evaluations. By simulating human interaction and feedback, these AI models could offer scalable and efficient means of data generation, while providing insights into human attitudes, experiences, and performance. We discuss several open problems and research opportunities to implement this vision and conclude that while AI could augment aspects of data gathering in software engineering research, it cannot replace the nuanced, empathetic understanding inherent in human subjects in some cases, and an integrated approach where both AI and human-generated data coexist will likely yield the most effective outcomes.
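As a rough illustration of what persona-based prompting for interviews and multi-persona dialogue for focus groups might look like in practice, the sketch below only assembles prompt text; it does not call any model. The `Persona` class, its fields, the function names, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical description of a simulated study participant."""
    role: str
    years_experience: int
    context: str


def build_interview_prompt(persona: Persona, question: str) -> str:
    """Compose a persona-conditioned prompt for one interview question."""
    return (
        f"You are a {persona.role} with {persona.years_experience} years "
        f"of experience. {persona.context}\n"
        "Answer the interviewer's question in the first person, "
        "drawing only on this background.\n"
        f"Interviewer: {question}"
    )


def build_focus_group_prompts(personas: list[Persona], question: str) -> list[str]:
    """Build one prompt per persona so each simulated participant answers in turn."""
    return [build_interview_prompt(p, question) for p in personas]


if __name__ == "__main__":
    participants = [
        Persona("backend developer", 8, "You maintain a mid-sized Python library."),
        Persona("newcomer contributor", 1, "You recently opened your first pull request."),
    ]
    for prompt in build_focus_group_prompts(participants, "What makes onboarding hard?"):
        print(prompt)
        print("---")
```

Each resulting string would be sent to an LLM of the researcher's choice; how the responses are validated against real participants is one of the open problems the abstract points to.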

Submitted to arXiv on 18 Nov. 2023

Results of the summarizing process for the arXiv paper: 2311.11081v1

In sociotechnical domains like Software Engineering, qualitative data collection methods face challenges in terms of scale, labor intensity, and participant recruitment. To address these issues, this vision paper proposes leveraging artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, for qualitative data collection in software engineering research. By utilizing AI-generated synthetic text that replicates human responses and behaviors, researchers can automate data collection across various methodologies like persona-based prompting for interviews, multi-persona dialogue for focus groups, and mega-persona responses for surveys. The paper discusses how AI models could offer scalable and efficient means of data generation while providing insights into human attitudes, experiences, and performance.
Created on 09 Sep. 2024
