OpenAI
March 21, 2025
Editor’s Note: This blog was written jointly with OpenAI and the MIT Media Lab. It also appears here.
People use AI chatbots like ChatGPT in many ways—asking questions, sparking creativity, solving problems, and even for personal interactions. These types of tools can enhance daily life, but as they become more widely used, an important question emerges that faces any new technology: How do interactions with AI chatbots affect people’s social and emotional well-being?
ChatGPT isn’t designed to replace or mimic human relationships, but people may choose to use it that way given its conversational style and expanding capabilities. Understanding the different ways people engage with models can help guide platform development to facilitate safe, healthy interactions. To explore this, we (researchers at the MIT Media Lab and OpenAI) conducted a series of studies to understand how AI use that involves emotional engagement—what we call affective use—can impact users’ well-being.
Our findings show that both model and user behaviors can influence social and emotional outcomes. Effects of AI vary based on how people choose to use the model and their personal circumstances. This research provides a starting point for further studies that can increase transparency, and encourage responsible usage and development of AI platforms across the industry.
Read the full OpenAI report and the MIT Media Lab and OpenAI randomized controlled trial below.
We want to understand how people use models like ChatGPT, and how these models in turn may affect them. To begin to answer these research questions, we carried out two parallel studies [1] with different approaches: an observational study to analyze real-world on-platform usage patterns, and a controlled interventional study to understand the impacts on users.
Study 1: The team at OpenAI conducted a large-scale, automated analysis of nearly 40 million ChatGPT interactions, with no human involvement in order to ensure user privacy [2]. The study combined this analysis with targeted user surveys, correlating users’ self-reported sentiment towards ChatGPT with attributes of their conversations to better understand real-world affective use patterns.
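To make that correlation step concrete, here is a minimal sketch in Python using pandas; the column names, the 1–5 sentiment scale, and the classifier-derived attribute are hypothetical stand-ins rather than data or code from the study.

```python
import pandas as pd

# Hypothetical per-user table: automated classifier outputs aggregated over each
# user's conversations (e.g., the fraction of messages flagged for affective cues)
# joined with that user's self-reported sentiment from a targeted survey.
data = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "affective_cue_rate": [0.02, 0.15, 0.40, 0.08, 0.22],  # classifier-derived attribute
    "daily_messages": [5, 30, 55, 12, 41],                  # usage attribute
    "self_reported_sentiment": [3, 4, 5, 3, 4],             # hypothetical 1-5 survey scale
})

# Correlate each conversation attribute with self-reported sentiment.
attributes = ["affective_cue_rate", "daily_messages"]
print(data[attributes].corrwith(data["self_reported_sentiment"]))
```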
Study 2: In addition, the team from the MIT Media Lab conducted a Randomized Controlled Trial (RCT) with nearly 1,000 participants using ChatGPT over four weeks. This IRB-approved, pre-registered controlled study was designed to identify causal insights into how specific platform features (such as model personality and modality) and types of usage might affect users’ self-reported psychosocial states, focusing on loneliness, social interactions with real people, emotional dependence on the AI chatbot, and problematic use of AI.
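For readers less familiar with this kind of design, the sketch below shows, with entirely invented conditions and simulated scores, how participants might be randomized across modality and personality conditions and how pre-to-post changes in a self-reported measure could be compared between groups; it is not the study’s actual protocol or analysis code.

```python
import random
import statistics

random.seed(0)

# Hypothetical conditions crossing modality with model personality.
conditions = [
    ("text", "neutral"), ("text", "engaging"),
    ("voice", "neutral"), ("voice", "engaging"),
]

# Randomly assign 1,000 simulated participants to conditions.
participants = [{"id": i, "condition": random.choice(conditions)} for i in range(1000)]

# Simulate pre- and post-study scores on a self-reported measure (e.g., loneliness
# on a 1-4 scale) purely for illustration; a real trial would use validated surveys.
for p in participants:
    p["pre"] = random.uniform(1, 4)
    p["post"] = p["pre"] + random.gauss(0, 0.3)

# Compare the mean pre-to-post change across conditions.
for condition in conditions:
    changes = [p["post"] - p["pre"] for p in participants if p["condition"] == condition]
    print(condition, round(statistics.mean(changes), 3))
```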
In developing these two studies, we sought to explore themes around how people are using models like ChatGPT for social and emotional engagement, and how this affects their self-reported well-being. Our findings include:
These studies represent a critical first step in understanding the impact of advanced AI models on human experience and well-being. We advise against generalizing the results because doing so may obscure the nuanced findings that highlight the non-uniform, complex interactions between people and AI systems. We hope that our findings will encourage researchers in both industry and academia to apply the methodologies presented here to other domains of human-AI interaction.
Our studies have several important limitations to keep in mind when interpreting the findings. The findings have not yet been peer-reviewed by the scientific community and should be interpreted cautiously. Moreover, the studies were conducted on the ChatGPT platform and based on ChatGPT usage, and users of other AI chatbot platforms may have different experiences and outcomes. Although we found meaningful relationships between variables, not all findings demonstrate clear cause-and-effect, so additional research on how and why AI usage affects users is needed to guide policy and product decisions. Our studies relied in part on user surveys, and self-reported data might not accurately capture users’ true feelings or experiences. Additionally, observing meaningful changes in behavior and well-being may require longer periods of study. We used automated classifiers to reason about affective cues in our analysis; however, these classifiers are imperfect and may miss important nuances. Finally, our research focused exclusively on English conversations with U.S. participants, highlighting the need for further studies across diverse languages and cultures to fully understand emotional interactions with AI.
[1] Both studies excluded users who reported being under the age of 18.
[2] To protect user privacy, we designed our conversation analysis pipeline to run entirely via automated classifiers. This allowed us to analyze user conversations without humans in the loop, preserving the privacy of our users. The classifications for the study were run in an automated, standalone process that returned only the classification metadata and did not retain any conversation content.
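As a rough illustration of that design, the sketch below shows a metadata-only classification step; the placeholder classifier and data structures are assumptions made for this example, and the relevant property is that only labels, never conversation text, are returned or stored.

```python
from dataclasses import dataclass

@dataclass
class ClassificationMetadata:
    conversation_id: str
    labels: dict  # e.g., {"affective_cue": True}

def classify(conversation_text: str) -> dict:
    # Placeholder for an automated classifier; a real pipeline would call a
    # trained model here. No human reads conversation_text at any point.
    return {"affective_cue": "feel" in conversation_text.lower()}

def run_pipeline(conversations: dict) -> list:
    # Return only classification metadata; the conversation content is not retained.
    return [
        ClassificationMetadata(conversation_id=cid, labels=classify(text))
        for cid, text in conversations.items()
    ]

print(run_pipeline({"c1": "I feel better after our chat", "c2": "What is 2 + 2?"}))
```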
[3] We define “heavy” users as the top 1,000 users of Advanced Voice Mode (AVM) on any given day in our study period, based on the number of messages sent.
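Read concretely, that definition could be computed as in the short sketch below; the input format and helper function are invented for illustration and are not taken from the study.

```python
from collections import Counter

def heavy_users_for_day(avm_messages, k=1000):
    """Return the top-k user IDs by number of AVM messages sent on a given day.

    `avm_messages` is a hypothetical list of (user_id, message_id) pairs for that day.
    """
    counts = Counter(user_id for user_id, _ in avm_messages)
    return [user_id for user_id, _ in counts.most_common(k)]

# Toy example with three users and k=2.
sample = [("u1", 1), ("u1", 2), ("u2", 3), ("u3", 4), ("u3", 5), ("u3", 6)]
print(heavy_users_for_day(sample, k=2))  # ['u3', 'u1']
```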