PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits

AI-generated keywords: Large Language Models, Personality Traits, Linguistic Inquiry and Word Count, Human Evaluators, Ethical Considerations

AI-generated Key Points

  • Study evaluates whether content generated by Large Language Models (LLMs) aligns with assigned personality traits
  • LLM personas based on the Big Five model complete a personality test and a story writing task
  • Some LLMs do not follow instructions to avoid mentioning assigned personality traits in stories
  • Linguistic Inquiry and Word Count (LIWC) analysis conducted on GPT-3.5 and GPT-4 personas' stories
  • Human evaluators rate stories and infer authors' personalities under two conditions: aware or unaware of AI authorship
  • Most GPT-3.5 persona stories contain explicit references to assigned traits, so the final human evaluation focuses on GPT-4 persona stories
  • Researchers aim to identify linguistic patterns corresponding to personality traits through LIWC analysis
  • LIWC features are compared with human-written samples from the Essays dataset to assess how well LLM personas portray personalities
  • Study suggests extensions for evaluating LLM personas in real-life scenarios, considering ethical implications of AI authorship awareness

Authors: Hang Jiang, Xiajie Zhang, Xubo Cao, Cynthia Breazeal, Deb Roy, Jad Kabbara

First version in 05/2023. Accepted at NAACL Findings 2024
License: CC BY-NC-SA 4.0

Abstract: Despite the many use cases for large language models (LLMs) in creating personalized chatbots, there has been limited research on evaluating the extent to which the behaviors of personalized LLMs accurately and consistently reflect specific personality traits. We consider studying the behavior of LLM-based agents which we refer to as LLM personas and present a case study with GPT-3.5 and GPT-4 to investigate whether LLMs can generate content that aligns with their assigned personality profiles. To this end, we simulate distinct LLM personas based on the Big Five personality model, have them complete the 44-item Big Five Inventory (BFI) personality test and a story writing task, and then assess their essays with automatic and human evaluations. Results show that LLM personas' self-reported BFI scores are consistent with their designated personality types, with large effect sizes observed across five traits. Additionally, LLM personas' writings have emerging representative linguistic patterns for personality traits when compared with a human writing corpus. Furthermore, human evaluation shows that humans can perceive some personality traits with an accuracy of up to 80%. Interestingly, the accuracy drops significantly when the annotators were informed of AI authorship.
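The "large effect sizes" mentioned in the abstract are commonly quantified with Cohen's d. The sketch below is only an illustration of that statistic, not the authors' analysis code: the function name `cohens_d` and the sample scores are hypothetical, and the page does not state which effect-size measure the paper uses.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (illustrative helper)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Made-up BFI extraversion scores (1-5 scale) for personas assigned
# "extroverted" vs. "introverted"; these are NOT values from the paper.
high_extraversion = [4.5, 4.8, 4.2, 4.6]
low_extraversion = [1.8, 2.1, 1.5, 2.0]

d = cohens_d(high_extraversion, low_extraversion)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, |d| of about 0.8 or more is read as a large effect.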

Submitted to arXiv on 04 May 2023

Results of the summarizing process for the arXiv paper: 2305.02547v5

This study evaluates whether Large Language Models (LLMs) can generate content that aligns with specific personality traits. LLM personas are created based on the Big Five personality model and complete a personality test and a story writing task. However, some LLMs do not follow the instruction to avoid explicitly mentioning their assigned personality traits in their stories.

To assess the generated content, Linguistic Inquiry and Word Count (LIWC) analysis is conducted on stories from GPT-3.5 and GPT-4 personas. Additionally, human evaluators are recruited to rate the stories and infer the authors' personalities. The study design includes two conditions for human evaluators: being aware or unaware that the stories were written by an LLM. This design tests how awareness of AI authorship affects narrative evaluation and the accuracy of personality predictions.

The results reveal that most stories produced by GPT-3.5 personas contain explicit references to assigned personality traits, so the final human evaluation focuses on stories generated by GPT-4 personas. Through LIWC analysis, the researchers aim to identify linguistic patterns that correspond to particular personality traits. These features are then compared with human-written samples from the Essays dataset to understand whether LLM personas can convincingly portray assigned personalities to human observers.

In conclusion, the study suggests extending the evaluation of LLM personas to more realistic scenarios, such as multi-round dialogues and action planning, while taking into account the ethical implications of AI authorship awareness. By evaluating how accurately LLM personas reflect specific personality traits, this research contributes to understanding the capabilities and limitations of large language models in creating personalized content.
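The screening for explicit trait references described in this summary could be approximated with a simple keyword check. This is a minimal sketch under stated assumptions: the `TRAIT_TERMS` word lists and the function `explicit_trait_mentions` are hypothetical, and the summary does not say how the authors actually detected such references.

```python
import re

# Hypothetical keyword lists; the paper's actual criteria are not given here.
TRAIT_TERMS = {
    "extraversion": ["extraversion", "extraverted", "extroverted", "introverted"],
    "agreeableness": ["agreeableness", "agreeable", "antagonistic"],
    "conscientiousness": ["conscientiousness", "conscientious", "unconscientious"],
    "neuroticism": ["neuroticism", "neurotic", "emotionally stable"],
    "openness": ["openness", "open to experience", "closed to experience"],
}


def explicit_trait_mentions(story, terms=TRAIT_TERMS):
    """Return the traits whose keywords appear as whole words in the story."""
    lowered = story.lower()
    found = []
    for trait, words in terms.items():
        # \b anchors keep "conscientious" from matching inside a longer word.
        pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
        if re.search(pattern, lowered):
            found.append(trait)
    return found


flagged = explicit_trait_mentions("As an extroverted person, I loved the party.")
clean = explicit_trait_mentions("We walked to the lake at dawn.")
print(flagged, clean)
```

A story with a non-empty result would be set aside as naming its assigned trait outright. A keyword match is a coarse filter, so borderline cases would still need manual review.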
Created on 30 Jul. 2024
