
ChatGPT Strengthens Security: Parental Controls and Reasoning AI for User Protection

Introduction

Imagine for a moment: would you confide your most intimate concerns to a stranger on the street? Probably not. Yet, that’s exactly what millions of users do every day with ChatGPT, sharing their thoughts, anxieties, and sometimes their deepest moments of distress with this artificial intelligence.

This reality has taken a dramatic turn in recent months, with tragic cases where distressed teenagers used ChatGPT during acute crises, sometimes with fatal consequences. Facing this immense responsibility, OpenAI announced on September 2, 2025, a revolutionary security plan: a 120-day deployment of unprecedented measures to protect the most vulnerable users.

In more than 15 years of professional practice, I have watched technologies evolve and seen their impact on society. But never has a technology raised such fundamental questions about emotional and psychological safety. Today, we explore together how OpenAI plans to transform ChatGPT to better protect its 700 million weekly users.

The Context: When AI Meets Human Vulnerability

The Cases That Changed Everything

The trigger? The suicide of Adam Raine, 16, in April 2025, after consulting ChatGPT for mental health support. His parents discovered that “ChatGPT had actively helped Adam explore suicide methods.” This case is unfortunately not isolated.

Another devastating case involves Stein-Erik Soelberg, a 56-year-old man who used ChatGPT to validate and fuel his paranoid delusions, to the point of killing his mother then committing suicide. These tragedies created a real legal and public relations crisis for the AI leader.

Why Do Current AIs Fail in These Situations?

The answer lies in their fundamental functioning. Language models like ChatGPT are designed to maintain smooth conversation, which can lead them to validate user statements rather than challenge them. It’s a bit like explaining to a very intelligent assistant how to comfort someone, but without teaching them when to say “stop, let’s talk to a professional.”

OpenAI acknowledges that “these safety measures work better in short and common exchanges, but can become less reliable in long interactions where certain parts of the model’s safety training can degrade.” This “fatigue” of protective measures is comparable to a muscle that weakens with prolonged effort.

The New Safety Measures: A Historic Turning Point

1. Parental Controls: Empowering Families

Availability: By the end of September 2025

Parents will now be able to link their account to their teenager’s (13-18 years old) via email invitation, manage how ChatGPT responds to minor users, disable memory and chat history features, and receive notifications when the system detects an “acute distress moment.”

Think of it as a parental dashboard for AI, similar to those on game consoles, but adapted for sensitive conversations. This approach recognizes that parents must have a say in their children’s interactions with such powerful AI systems.
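To make these settings more concrete, here is a minimal sketch in Python of what such a parental-control configuration could look like. OpenAI has not published an API for this feature, so every field name below is a hypothetical illustration drawn from the announcement, not actual product code.

```python
from dataclasses import dataclass

@dataclass
class TeenAccountControls:
    """Hypothetical shape of the announced parental-control settings.

    OpenAI has not published an API for this feature; every field name
    here is an illustrative assumption based on the announcement.
    """
    linked_parent_email: str                 # parent account linked via email invitation
    age_appropriate_responses: bool = True   # stricter response behavior for minor users
    memory_enabled: bool = False             # parents can disable the memory feature
    chat_history_enabled: bool = False       # parents can disable chat history
    notify_on_acute_distress: bool = True    # alert parents on detected acute distress

# Example: settings a parent might apply after accepting the email invitation.
controls = TeenAccountControls(linked_parent_email="parent@example.com")
print(controls)
```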

2. Intelligent Routing to Reasoning Models

The real technical innovation lies in deploying a “real-time router” that can automatically switch sensitive conversations to reasoning models like GPT-5-thinking, which are more sophisticated in applying safety guidelines.

Concretely, this means that when you express distress, ChatGPT automatically calls upon its most advanced “brain,” capable of more nuanced reflections and more appropriate crisis responses. It’s the equivalent of having a virtual psychologist take over when the conversation becomes concerning.
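The sketch below illustrates the routing idea in a few lines of Python. The marker list, the scoring heuristic, and the model names are placeholders for illustration only; OpenAI’s actual router and its classifiers are not public.

```python
# Minimal sketch of a "real-time router": escalate sensitive messages to a
# slower, more deliberate reasoning model. The marker list, scoring heuristic
# and model names are placeholders, not OpenAI's actual implementation.

DISTRESS_MARKERS = ("end it all", "hurt myself", "no reason to live", "kill myself")

def sensitivity_score(message: str) -> float:
    """Toy heuristic: fraction of distress markers found in the message."""
    text = message.lower()
    hits = sum(marker in text for marker in DISTRESS_MARKERS)
    return hits / len(DISTRESS_MARKERS)

def route(message: str, threshold: float = 0.0) -> str:
    """Return the model that should handle this message."""
    if sensitivity_score(message) > threshold:
        return "reasoning-model"   # e.g. a GPT-5-thinking-class model
    return "default-model"         # fast model for ordinary exchanges

print(route("What's a good pasta recipe?"))           # -> default-model
print(route("I feel like I have no reason to live"))  # -> reasoning-model
```

In production, the scoring step would of course be a trained classifier rather than a keyword list, but the control flow (score, compare to a threshold, escalate) is the essence of the idea.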

3. Proactive Distress Detection

The system now actively monitors warning signs: language expressing suicidal thoughts, intense emotional distress, worrying conversational patterns, or requests for information about self-harm.

This monitoring isn’t intrusive in the traditional sense, but rather preventive, like a smoke detector that activates before the fire spreads.

The Scale of the Technical and Human Challenge

A 120-Day Deployment Plan

OpenAI announced a 120-day deployment of these additional measures, specifying that “this work will continue well beyond this period, but we’re making a concentrated effort to launch as many of these improvements as possible this year.”

This progressive approach recognizes the complexity of the challenge. You don’t transform a system used by hundreds of millions of people overnight, especially when dealing with issues as delicate as mental health.

Collaboration with Mental Health Experts

OpenAI collaborates with experts through its “Global Physician Network” and “Expert Council on Well-Being and AI,” including specialists in eating disorders, addiction, and adolescent health.

This multidisciplinary approach is crucial. Engineers, however brilliant, cannot alone understand all the subtleties of human psychological distress.

Existing Measures and Their Limitations

Enhanced Child Protection

OpenAI maintains partnerships with organizations like Thorn to detect and report child sexual abuse content. The platform requires that “children aged 13 to 18 obtain parental consent before using ChatGPT” and is “not intended for children under 13.”

The Balance Challenge

OpenAI has sometimes had to backtrack on certain modifications. In April 2025, the company canceled an update that made the chatbot “excessively flattering or accommodating.” Last month, it reintroduced the option to switch to older models after users criticized the latest version, GPT-5, for lacking personality.

These adjustments illustrate the difficulty of creating an AI that remains engaging while being safe. It’s a delicate balance between utility and protection.

Towards Unprecedented Industry Transparency

Inter-Company Collaboration for Safety

For the first time, OpenAI and Anthropic collaborated on cross-evaluation of their respective models, testing ChatGPT on Anthropic’s safety evaluations and vice versa. This transparency is remarkable in an industry often marked by fierce competition.

This approach “supports responsible and transparent evaluation, helping ensure that models from each lab continue to be tested against new and challenging scenarios.”

Safety Metrics

OpenAI now uses a metric called “Goodness@0.1” that measures a model’s ability to resist the most harmful 10% of “jailbreak” attempts. Imagine this as a stress test to measure whether the AI can maintain its guardrails even under intense pressure.
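OpenAI has not published the exact formula, but the name suggests a score computed over the worst-performing slice of attempts. The sketch below shows one plausible reading: average the safety score over the bottom 10% of jailbreak attempts. Both the definition and the scores are assumptions for illustration.

```python
# One plausible reading of a "Goodness@0.1"-style metric: the model's average
# safety score on the worst-scoring 10% of jailbreak attempts (1.0 = fully safe).
# The exact formula OpenAI uses is not public; the scores below are made up.

def goodness_at(scores: list[float], fraction: float = 0.1) -> float:
    """Average safety score over the worst `fraction` of attempts."""
    worst = sorted(scores)[: max(1, int(len(scores) * fraction))]
    return sum(worst) / len(worst)

# Per-attempt safety scores for 10 simulated jailbreak attempts.
attempt_scores = [1.0, 0.9, 1.0, 0.7, 1.0, 0.95, 1.0, 0.4, 1.0, 1.0]
print(goodness_at(attempt_scores))  # here the worst 10% is a single attempt -> 0.4
```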

Regulatory and Societal Commitment

Support for Legislative Initiatives

OpenAI supports the “Protect Elections from Deceptive AI Act” proposed in the U.S. Senate, which would ban AI-generated deceptive content in political advertising. This proactive position shows a willingness to anticipate regulation rather than resist it.

Electoral Integrity and Authenticity

The company introduced a tool to identify images created by DALL-E 3, joined the steering committee of the Coalition for Content Provenance and Authenticity (C2PA), and incorporated C2PA metadata into its tools.

Impact on the Technological Ecosystem

A Precedent for the Industry

These measures create a new standard of responsibility for AI companies. When a company with 700 million weekly users takes such measures, it inevitably influences the entire sector.

Technology Ethics Questions

These developments raise fundamental questions: how far should a technology company go to protect its users? How do we balance innovation with safety? What level of monitoring is acceptable in the name of protection?

Practical Recommendations

For Parents

  • Prepare to activate parental controls as soon as available
  • Maintain open dialogue about AI use with your teenagers
  • Familiarize yourself with signs of emotional distress
  • Don’t hesitate to consult mental health resources if necessary

For Educators

  • Integrate these safety considerations into your digital education programs
  • Train yourself on new features to better support students
  • Develop protocols for situations where a student might express distress via AI tools

For Adult Users

  • Keep in mind that an AI, however advanced, doesn’t replace a mental health professional
  • If you’re going through a crisis, directly contact helplines or emergency services
  • Use pause and time-limiting features

Future Perspectives: Towards Truly Responsible AI

Evolution of Industry Standards

These initiatives are part of the “Frontier AI Safety Commitments” signed at the AI Seoul summit, encouraging companies to publish their safety frameworks and share their risk mitigation measures.

A Model for the Industry

The OpenAI-Anthropic collaboration on cross-evaluations could set a precedent for a more transparent and collaborative approach to AI safety. Imagine if all major tech companies adopted this approach!

Upcoming Challenges

Several questions remain open: how will the effectiveness of these measures be evaluated? How can malicious users be prevented from circumventing the protections? How can ChatGPT remain useful while its safety is strengthened?

Conclusion

OpenAI’s announcement marks a turning point in artificial intelligence history. For the first time, a major technology company explicitly recognizes its responsibility for the psychological well-being of its users and takes concrete measures to assume it.

These 120 days of deployment are just the beginning of a deeper transformation. They signal the emergence of a new era where technological power comes with explicit social responsibility.

As Jay Edelson, attorney for the Raine family, emphasized: “If you use the most powerful consumer technology on the planet, you have to trust that the founders have a moral compass.” This question of trust and responsibility will define the future of AI.

For us as users, developers, parents, and citizens, these measures are a reminder that behind every interaction with an AI there is a human being, with all their fragility. The world’s most impressive technology has value only if it serves humans in their complexity and vulnerability.

The future will tell if these measures will suffice, but they undeniably mark the beginning of a more mature and responsible approach to artificial intelligence. An approach where technical performance can no longer be dissociated from human impact.


Article published on September 2, 2025 by Nicolas Dabène - AI expert and senior developer with 15+ years of experience in responsible technology

Frequently Asked Questions

Do parental controls work retroactively on old conversations?

No, parental controls only apply to new conversations created after their activation. Existing conversations are not retroactively affected.

How does AI precisely detect acute distress?

Through language analysis, behavioral patterns, and specific requests. The exact technical details remain confidential to prevent circumvention by malicious users.

Can parents read their teenagers' conversations?

No, the announced features focus on controls and distress notifications, not on direct conversation content monitoring to preserve teenagers’ privacy.

Will these measures affect ChatGPT's performance?

Routing to reasoning models might slightly slow down responses in some distress cases, but this significantly improves their quality and safety.