TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter

AI-generated keywords: Generative Artificial Intelligence, TWIGMA, Metadata, Variability, Themes

AI-generated Key Points

  • Recent progress in generative artificial intelligence (gen-AI) has revolutionized the creation of photo-realistic and artistically-inspiring images
  • TWIGMA is an extensive dataset comprising over 800,000 gen-AI images collected from January 2021 to March 2023 on Twitter, along with associated metadata
  • Gen-AI images possess distinctive characteristics and exhibit lower variability compared to natural images and human artwork
  • There is an inverse correlation between the similarity of a gen-AI image to natural images and the number of likes it receives
  • The similarity measure can be utilized to identify human images that served as inspiration for gen-AI creations
  • Over time, users have increasingly shared artistically sophisticated content such as intricate human portraits while showing decreased interest in simple subjects like natural scenes and animals
  • Figure 1 showcases the curation process of TWIGMA, the growth in tweets featuring generative AI images over time, and presents a wordcloud highlighting prevalent keywords extracted from the tweets
  • Section 2 reviews relevant work on image datasets and evaluation metrics for model novelty and variation
  • Section 3 outlines the data collection and analysis process
  • Empirical results addressing the research questions are presented in Section 4

Authors: Yiqun Chen, James Zou

License: CC BY 4.0

Abstract: Recent progress in generative artificial intelligence (gen-AI) has enabled the generation of photo-realistic and artistically-inspiring photos at a single click, catering to millions of users online. To explore how people use gen-AI models such as DALLE and StableDiffusion, it is critical to understand the themes, contents, and variations present in the AI-generated photos. In this work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a comprehensive dataset encompassing over 800,000 gen-AI images collected from Jan 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text, creation date, number of likes). Through a comparative analysis of TWIGMA with natural images and human artwork, we find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability when compared to their non-gen-AI counterparts. Additionally, we find that the similarity between a gen-AI image and natural images (i) is inversely correlated with the number of likes; and (ii) can be used to identify human images that served as inspiration for the gen-AI creations. Finally, we observe a longitudinal shift in the themes of AI-generated images on Twitter, with users increasingly sharing artistically sophisticated content such as intricate human portraits, whereas their interest in simple subjects such as natural scenes and animals has decreased. Our analyses and findings underscore the significance of TWIGMA as a unique data resource for studying AI-generated images.
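
To make the "lower variability" claim concrete, the following is a minimal sketch of one common way to quantify how spread out a set of images is: the mean pairwise cosine distance between image embeddings. This is an assumption for illustration; the abstract does not say which variability metric or embedding model the authors use, and the vectors below are made-up toy values rather than TWIGMA data.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings):
    """Average cosine distance (1 - similarity) over all unordered pairs.

    A larger value means the set is more varied in embedding space.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(1.0 - cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical 2-D "embeddings": one tightly clustered set, one spread out.
tight_cluster = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05]]
spread_cluster = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

tight_score = mean_pairwise_distance(tight_cluster)
spread_score = mean_pairwise_distance(spread_cluster)
print(tight_score < spread_score)
```

Under this sketch, a collection with lower variability would score closer to zero than a more diverse collection embedded with the same model.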

Submitted to arXiv on 14 Jun. 2023

Results of the summarizing process for the arXiv paper: 2306.08310v1

Recent progress in generative artificial intelligence (gen-AI) has revolutionized the creation of photo-realistic and artistically-inspiring images, making them easily accessible to millions of users online. To gain insights into how people utilize gen-AI models like DALLE and StableDiffusion, it is crucial to understand the themes, contents, and variations present in these AI-generated photos. In this paper, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), an extensive dataset comprising over 800,000 gen-AI images collected from January 2021 to March 2023 on Twitter, along with associated metadata such as tweet text, creation date, and number of likes.

To analyze TWIGMA comprehensively, we compare it with natural images and human artwork. Our findings reveal that gen-AI images possess distinctive characteristics and exhibit lower variability on average compared to their non-gen-AI counterparts. Interestingly, we observe an inverse correlation between the similarity of a gen-AI image to natural images and the number of likes it receives. Moreover, we demonstrate that the similarity measure can be utilized to identify human images that served as inspiration for gen-AI creations. Furthermore, our analysis reveals a longitudinal shift in the themes of AI-generated images shared on Twitter. Over time, users have increasingly shared artistically sophisticated content such as intricate human portraits while showing decreased interest in simple subjects like natural scenes and animals.

The paper also provides additional context through Figure 1. It showcases the curation process of TWIGMA resulting in approximately 800,000 images posted from January 2021 to March 2023. The figure also depicts the steady growth in tweets featuring generative AI images over time and presents a wordcloud highlighting prevalent keywords extracted from the tweets.

In terms of organization, the rest of the paper is structured as follows: Section 2 reviews relevant work on image datasets and evaluation metrics for model novelty and variation. Section 3 outlines the data collection and analysis process. Empirical results addressing the research questions are presented in Section 4.
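
As a rough illustration of the inverse relationship between similarity to natural images and likes described in the summary, the sketch below computes a Spearman rank correlation by hand. The choice of Spearman is an assumption (the summary does not name the statistic used), and the `similarity` and `likes` values are invented toy numbers, not figures from TWIGMA.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-image values: similarity to natural images and like counts.
similarity = [0.91, 0.75, 0.60, 0.42, 0.30]
likes = [3, 10, 8, 40, 120]

rho = spearman(similarity, likes)
print(rho < 0)  # a negative rho indicates an inverse relationship
```

A rank-based statistic is used here because like counts on social media tend to be heavily skewed, so ranks are less sensitive to a few viral posts than raw values would be.
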
Created on 29 Jun. 2023
