TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter
AI-generated Key Points
- Recent progress in generative artificial intelligence (gen-AI) has revolutionized the creation of photo-realistic and artistically-inspiring images
- TWIGMA is an extensive dataset comprising over 800,000 gen-AI images collected from January 2021 to March 2023 on Twitter, along with associated metadata
- Gen-AI images possess distinctive characteristics and exhibit lower variability compared to natural images and human artwork
- There is an inverse correlation between the similarity of a gen-AI image to natural images and the number of likes it receives
- The similarity measure can be utilized to identify human images that served as inspiration for gen-AI creations
- Over time, users have increasingly shared artistically sophisticated content such as intricate human portraits while showing decreased interest in simple subjects like natural scenes and animals
- Figure 1 shows the curation process of TWIGMA, the growth over time in tweets featuring gen-AI images, and a word cloud of prevalent keywords extracted from the tweets
- Section 2 reviews relevant work on image datasets and evaluation metrics for model novelty and variation
- Section 3 outlines the data collection and analysis process
- Section 4 presents empirical results addressing the research questions
Authors: Yiqun Chen, James Zou
Abstract: Recent progress in generative artificial intelligence (gen-AI) has enabled the generation of photo-realistic and artistically-inspiring photos at a single click, catering to millions of users online. To explore how people use gen-AI models such as DALLE and StableDiffusion, it is critical to understand the themes, contents, and variations present in the AI-generated photos. In this work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a comprehensive dataset encompassing over 800,000 gen-AI images collected from Jan 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text, creation date, number of likes). Through a comparative analysis of TWIGMA with natural images and human artwork, we find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability when compared to their non-gen-AI counterparts. Additionally, we find that the similarity between a gen-AI image and natural images (i) is inversely correlated with the number of likes; and (ii) can be used to identify human images that served as inspiration for the gen-AI creations. Finally, we observe a longitudinal shift in the themes of AI-generated images on Twitter, with users increasingly sharing artistically sophisticated content such as intricate human portraits, whereas their interest in simple subjects such as natural scenes and animals has decreased. Our analyses and findings underscore the significance of TWIGMA as a unique data resource for studying AI-generated images.
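The abstract reports that a gen-AI image's similarity to natural images is inversely correlated with its number of likes. The following is a minimal sketch of how such a check could be set up, not the authors' pipeline: it assumes each image is already represented by an embedding vector, uses cosine similarity to the closest reference embedding as the similarity score, and uses Spearman rank correlation as the measure of association. All of these choices, the function names, and the toy numbers are illustrative assumptions and do not come from the paper or from TWIGMA.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_similarity(query, references):
    """Similarity of one embedding to its closest reference embedding."""
    return max(cosine_similarity(query, ref) for ref in references)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Toy, made-up 2-D "embeddings" and like counts, for illustration only.
natural_refs = [[1.0, 0.0], [0.0, 1.0]]
gen_ai_images = [[1.0, 0.0], [1.0, 1.0], [-1.0, -1.0]]
likes = [2, 10, 50]

similarities = [max_similarity(img, natural_refs) for img in gen_ai_images]
rho = spearman(similarities, likes)
print(f"Spearman correlation between similarity and likes: {rho:.3f}")
```

In this toy setup the most natural-looking image has the fewest likes, so the correlation is negative by construction. On real data the embeddings would come from an image encoder, and the sign and strength of the correlation would have to be measured rather than assumed.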