Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Authors: Maarten Sap, Ronan LeBras, Daniel Fried, Yejin Choi
Abstract: Social intelligence and Theory of Mind (ToM), i.e., the ability to reason about the different mental states, intents, and reactions of all people involved, allow humans to effectively navigate and understand everyday social interactions. As NLP systems are used in increasingly complex social situations, their ability to grasp social dynamics becomes crucial. In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. We show that one of today's largest language models (GPT-3; Brown et al., 2020) lacks this kind of social intelligence out-of-the-box, using two tasks: SocialIQa (Sap et al., 2019), which measures models' ability to understand intents and reactions of participants of social interactions, and ToMi (Le et al., 2019), which measures whether models can infer mental states and realities of participants of situations. Our results show that models struggle substantially at these Theory of Mind tasks, with well-below-human accuracies of 55% and 60% on SocialIQa and ToMi, respectively. To conclude, we draw on theories from pragmatics to contextualize this shortcoming of large language models, by examining the limitations stemming from their data, neural architecture, and training paradigms. Challenging the prevalent narrative that only scale is needed, we posit that person-centric NLP approaches might be more effective towards neural Theory of Mind.