Tracing and Visualizing Human-ML/AI Collaborative Processes through Artifacts of Data Work

AI-generated keywords: AutoML

AI-generated Key Points

  • The field of data work has become complex, requiring a range of skills that are often inaccessible to non-technical experts.
  • Automated Machine Learning (AutoML) technology automates certain aspects of data work, such as model selection and data preparation.
  • AutoML still requires human intervention to be functional, resulting in a complex and collaborative process that can be difficult to trace.
  • Researchers have constructed a taxonomy of data work artifacts that captures both the human and machine-generated processes involved in AutoML.
  • The taxonomy was developed through an extensive literature review spanning multiple fields.
  • The resulting taxonomy is concise, robust, comprehensive, extensible and explanatory.
  • It comprises multiple interrelated phases that leverage statistical and computational techniques for data preparation, analysis, deployment and communication.
  • To operationalize the taxonomy further, the researchers developed AutoMLTrace, a visual interactive sketch showing both the context and temporality of human-ML/AI collaboration in data work.
  • They demonstrated its utility via a usage scenario with an enterprise software development team.

Authors: Jennifer Rogers and Anamaria Crisan

CHI 2023 Best Paper Honorable Mention
License: CC BY 4.0

Abstract: Automated Machine Learning (AutoML) technology can lower barriers in data work yet still requires human intervention to be functional. However, the complex and collaborative process resulting from humans and machines trading off work makes it difficult to trace what was done, by whom (or what), and when. In this research, we construct a taxonomy of data work artifacts that captures AutoML and human processes. We present a rigorous methodology for its creation and discuss its transferability to the visual design process. We operationalize the taxonomy through the development of AutoMLTrace, a visual interactive sketch showing both the context and temporality of human-ML/AI collaboration in data work. Finally, we demonstrate the utility of our approach via a usage scenario with an enterprise software development team. Collectively, our research process and findings explore challenges and fruitful avenues for developing data visualization tools that interrogate the sociotechnical relationships in automated data work.

Submitted to arXiv on 05 Apr. 2023

Results of the summarizing process for the arXiv paper: 2304.02699v1

The field of data work has become increasingly complex, requiring a range of skills that are often inaccessible to non-technical experts. Automated Machine Learning (AutoML) technology has emerged as a solution to this problem by automating certain aspects of data work, such as model selection and data preparation. However, AutoML still requires human intervention to be functional, resulting in a complex and collaborative process that can be difficult to trace.

To address this issue, the researchers constructed a taxonomy of data work artifacts that captures both the human and machine-generated processes involved in AutoML. The taxonomy was developed through an extensive literature review spanning Machine Learning, Human-Computer Interaction, Computer Supported Collaborative Work, Information Visualization, and Visual Analytics. The research team gathered an initial set of 13 papers before identifying a systematic set of published research and pre-prints on AutoML. They then conducted eight iterations to develop the taxonomy: reading the literature sources, extracting artifacts that met the definition of the meta characteristic, classifying those items, and grouping them according to an evolving set of artifact properties.

The resulting taxonomy is concise, robust, comprehensive, extensible and explanatory. It comprises multiple interrelated phases that leverage statistical and computational techniques for data preparation, analysis, deployment and communication. The taxonomy also includes artifact identification and classification methods for capturing artifacts with their properties and dependencies.

To operationalize the taxonomy further, the researchers developed AutoMLTrace, a visual interactive sketch showing both the context and temporality of human-ML/AI collaboration in data work. They demonstrated its utility via a usage scenario with an enterprise software development team. Finally, the authors reflect on their approach: the taxonomy was developed through a broad literature review, but its utility was assessed primarily through a collaboration with one team. They acknowledge that more work is required to assess its generality but remain optimistic about its potential for developing data visualization tools that interrogate sociotechnical relationships in automated data work.
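
As a rough illustration of what "capturing artifacts with their properties and dependencies" and tracing "what was done, by whom (or what), and when" might look like in code, the following is a minimal sketch. It is not the authors' taxonomy or the AutoMLTrace implementation: the `Artifact` class, its fields, the `timeline` and `upstream` helpers, and the example artifact names are all hypothetical and chosen only for illustration. Only the four phase names come from the summary above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record of one data work artifact (illustrative only)."""
    name: str
    phase: str            # "preparation", "analysis", "deployment", or "communication"
    producer: str         # assumed labels: "human" or "automl"
    created_at: datetime  # when the artifact was produced
    depends_on: tuple = ()  # names of the artifacts this one was derived from


def timeline(artifacts):
    """Return the artifacts ordered by creation time (the 'when')."""
    return sorted(artifacts, key=lambda a: a.created_at)


def upstream(artifacts, name):
    """Return the names of all artifacts that `name` depends on, directly or not."""
    by_name = {a.name: a for a in artifacts}
    seen = []
    stack = list(by_name[name].depends_on)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(by_name[current].depends_on)
    return seen


# Hypothetical trace in which a human and an AutoML system trade off work.
trace = [
    Artifact("report", "communication", "human",
             datetime(2023, 4, 4), ("model",)),
    Artifact("raw_data", "preparation", "human",
             datetime(2023, 4, 1)),
    Artifact("model", "analysis", "automl",
             datetime(2023, 4, 3), ("cleaned_data",)),
    Artifact("cleaned_data", "preparation", "automl",
             datetime(2023, 4, 2), ("raw_data",)),
]

for artifact in timeline(trace):
    print(artifact.created_at.date(), artifact.producer, artifact.name)

print(upstream(trace, "report"))
```

Under these assumptions, the timeline answers the "when" and "by whom" questions, while the dependency walk shows which earlier human or machine outputs a given artifact rests on.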
Created on 08 Apr. 2023
