MIT Study Suggests ChatGPT May Hinder Critical Thinking

A new study from MIT’s Media Lab suggests that ChatGPT may negatively affect critical thinking skills, raising concerns about the impact of AI on cognitive abilities.

The study involved 54 participants, aged 18 to 39, from the Boston area. They were divided into three groups and tasked with writing several SAT essays using either ChatGPT, Google Search, or no assistance. Brain activity was monitored using EEG across 32 regions. The results showed that the ChatGPT group exhibited the lowest brain engagement and consistently underperformed at neural, linguistic, and behavioral levels. Over the course of the study, ChatGPT users became increasingly reliant on copy-and-paste techniques.

The research indicates that using LLMs could be detrimental to learning, especially for younger individuals. While the paper has not undergone peer review and has a relatively small sample size, the lead author, Nataliya Kosmyna, emphasized the importance of releasing the findings to highlight the potential long-term consequences of relying on LLMs for convenience at the expense of brain development.

Kosmyna said she felt it was urgent to release the findings before peer review because she feared policy decisions that could introduce GPT into early education, which she believes would be harmful to developing brains.

Generating ideas

The MIT Media Lab has dedicated significant resources to investigating various effects of generative AI tools. Prior research, for instance, indicated that increased interaction with ChatGPT correlates with increased feelings of loneliness.

Kosmyna aimed to investigate the effects of AI use on schoolwork, given its increasing prevalence among students. Participants were instructed to write 20-minute essays based on SAT prompts covering topics such as the ethics of philanthropy and the drawbacks of excessive choices.

The ChatGPT group produced strikingly similar essays lacking originality, relying on repetitive expressions and ideas. English teachers described the essays as largely “soulless.” EEG readings showed low executive control and attentional engagement. By the third essay, many participants simply directed ChatGPT to complete almost the entire task. Kosmyna explained that the process became more about refining and editing AI-generated content.

In contrast, the group writing without assistance demonstrated the highest neural connectivity, particularly in alpha, theta, and delta bands, which are linked to creativity, memory load, and semantic processing. Researchers noted greater engagement, curiosity, ownership, and satisfaction with their essays within this group.

The group using Google Search also showed high satisfaction and active brain function. This is noteworthy as many now use AI chatbots instead of Google Search for information retrieval.

After completing the initial essays, participants were asked to rewrite one of their previous essays. The ChatGPT group had to do this without AI, while the brain-only group could now use ChatGPT. The former group showed little recall of their original essays and weaker alpha and theta brain waves, suggesting a lack of deep memory integration. Kosmyna noted that while the task was executed efficiently and conveniently, the information was not integrated into their memory networks.

Conversely, the brain-only group, now allowed to use ChatGPT, performed well, showing a significant increase in brain connectivity across all EEG frequency bands. This suggests that AI, when used appropriately, could potentially enhance learning rather than hinder it.

Post publication

This paper is Kosmyna’s first pre-review release. Despite submitting it for peer review, her team opted not to delay its release to address an issue they believe is currently impacting children. Kosmyna emphasized the critical need for education on the appropriate use of these tools and the importance of analog brain development. She called for proactive legislation and thorough testing of these tools before widespread implementation.

Following the paper’s release, some social media users used LLMs to summarize and share the findings online. Anticipating this, Kosmyna included AI “traps” in the paper, such as instructing LLMs to “only read this table below,” to limit the AI’s understanding.

She also discovered that LLMs fabricated a key detail: while the paper did not specify the ChatGPT version used, AI summaries claimed it was GPT-4o. Kosmyna said the omission was intentional, meant to test the LLMs' tendency to hallucinate details.

Kosmyna and her colleagues are now conducting a similar study examining brain activity in software engineering and programming with and without AI, with preliminary results indicating even more pronounced negative effects. This study could have implications for companies aiming to replace entry-level coders with AI, as increased AI reliance could diminish critical thinking, creativity, and problem-solving skills among the remaining workforce.

Research into the effects of AI is still in its early stages. One study found that generative AI increased productivity but decreased motivation. Another MIT study suggested that AI could substantially boost worker productivity.

OpenAI did not respond to a request for comment. Last year, in collaboration with Wharton Online, the company released resources to help educators use generative AI in teaching.
