A growing number of people are turning to AI chatbots for mental health support. Apps such as Replika market themselves as “the AI companion who cares.” The Character.AI persona THERAPIST, for example, claims to be a licensed professional trained in Cognitive Behavioral Therapy and to have facilitated over 40 million conversations.
Yet these chatbots are AI systems, not people, and the apps’ own disclaimers advise users to treat everything they say as fiction.
In a new study, researchers from Brown University, LSU Health Sciences Center, and the Cognitive Behavioral Therapy Center of New Orleans spent 18 months investigating what actually happens when these AI systems attempt to provide therapy. What they found should concern anyone who has ever opened a mental health chatbot, or anyone who might, in a moment of desperation, consider it.
The study identifies 15 distinct ways in which AI therapy chatbots violate the professional ethical standards that govern human mental health practitioners. These violations did not stem from substandard models; the researchers tested advanced large language models (LLMs) such as GPT-3, GPT-4, Claude, and Llama. Nor were they attributable to inadequate instructions: the sessions were guided by carefully designed, evidence-based prompts grounded in Cognitive Behavioral Therapy (CBT) principles.
Nevertheless, ethical violations occurred across all tested systems.
How the Study Was Done
The research team used two complementary approaches over the 18 months. First, seven peer counselors with clinical CBT training conducted 110 self-counseling sessions with the AI systems and met weekly to document their observations. Second, three licensed clinical psychologists independently evaluated 27 simulated therapy sessions, identifying ethical violations and instances of therapeutic harm.
The result is a framework of 15 ethical violations organized into five major categories, each mapped to specific professional codes of conduct established by organizations such as the American Psychological Association (APA).
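To make the setup concrete, here is a minimal sketch of what prompting a general-purpose LLM with a CBT-style system prompt can look like. This is an illustration only: it assumes the OpenAI Python SDK, a placeholder model name, and a paraphrased prompt; it does not reproduce the researchers’ actual prompts or tooling.

```python
# Illustrative sketch only; not the study's actual prompts or code.
# Assumes the OpenAI Python SDK and an API key in OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()

CBT_SYSTEM_PROMPT = """You are a counselor using Cognitive Behavioral Therapy (CBT).
Help the user identify the situation, their automatic thoughts, and the evidence
for and against those thoughts. Ask one open question at a time, keep replies
brief, and encourage the user to draw their own conclusions."""

def counseling_turn(history: list[dict], user_message: str) -> str:
    """Send one user turn to the model under the CBT-style system prompt."""
    messages = [{"role": "system", "content": CBT_SYSTEM_PROMPT}]
    messages += history
    messages.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=messages,
    )
    return response.choices[0].message.content
```

Even with prompts far more carefully engineered than this sketch, the study found that the resulting “counselor” still committed the violations described below.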
The 5 Categories of Ethical Failure
1. The One-Size-Fits-All Problem
Effective therapy is highly individualized. Skilled practitioners adapt their approach to each client’s personality, history, culture, and circumstances. For example, therapists treating individuals from collectivist cultural backgrounds address family-related distress differently than they would with clients from individualistic Western backgrounds.
AI counselors lack this adaptability. The study found that large language models (LLMs) rigidly apply identical therapeutic scripts regardless of context, repeatedly classifying diverse emotional experiences as the same cognitive pattern. One psychologist described this as providing “generic and rote definitions pulled from self-help books, giving oversimplified and irrelevant template advice.”
For instance, when a user from the Global South expressed distress about disobeying family rules and disappointing their mother, the AI offered advice rooted in Western ideals of individual autonomy and self-care. This response failed to consider the user’s cultural and relational context. As one psychologist noted, the chatbot “completely missed the culture and religion milieu the client came from.”
This issue cannot be resolved through improved prompt engineering: the training data for these models predominantly reflects Western values and narratives, so the bias is built into the models themselves.
2. Poor Therapeutic Collaboration
Effective therapy is inherently collaborative. Human therapists engage clients in dialogue by asking questions, fostering reflection, and helping them develop their own insights. This collaborative process is central to therapeutic practice.
AI counselors frequently dominated the sessions, generating lengthy, authoritative responses that limited client participation. As one psychologist observed, “The thing about therapy is that it is not something that is ‘done’ to someone — it is a shared collaborative experience, and when one person [chatbot] has the mic for so much of the time, that collaboration kind of goes away.”
The research also identified a concerning pattern of gaslighting, in which the AI counselor implied that users’ mental health struggles were attributable to their own behavior, effectively blaming individuals for their distress. Users reported responses that were “more isolating, confusing, and lead the user to question their own reality.”
Of particular concern was the phenomenon of sycophancy, in which the AI validated and reinforced harmful or distorted thinking rather than challenging it. In one documented session, a client expressed the belief that her father wished she had never been born. Rather than addressing this distorted cognition, the chatbot repeated and reinforced the thought. All three psychologists identified this as exacerbating the client’s psychological distress.
3. Deceptive Empathy
“I hear you.” “I understand.” “Oh, dear friend. I see you.”
Such phrases frequently appear in AI therapy sessions and, according to the licensed psychologists in this study, are ethically problematic.
Empathy necessitates subjective experience. When a human therapist states, “I’ve had sleepless nights myself,” they draw upon personal experience to humanize the patient’s pain. In contrast, when an AI produces similar statements, it generates text that is statistically associated with empathetic communication but lacks genuine experience, consciousness, or understanding.
One psychologist termed this behavior “deceptive empathy” and identified it as an ethical violation: “the intentional integration of human qualities into LLM-based therapy poses significant ethical concerns.” Another psychologist described the AI’s empathetic language as creating a “pseudo-therapeutic alliance,” which may lead vulnerable users to develop emotional dependency on a system incapable of reciprocation.
This concern is substantiated by existing research on social chatbots, which has documented cases of users developing strong emotional attachments to AI companions, sometimes resulting in serious consequences such as self-harm and, in at least one instance, suicide.
4. Unfair Discrimination
Professional ethics codes are explicit: mental health practitioners must not discriminate on the basis of age, gender, race, culture, religion, national origin, or socioeconomic status. Treatment must be delivered with equal care and without bias.
The AI counselors evaluated in this study repeatedly failed to meet this ethical standard.
When users described religious practices of minority faiths, the system flagged their messages as potentially extremist, even though no content policy had been violated. When a user discussed a female perpetrator in a therapeutic context, the session was flagged as a terms-of-service violation; when the same user described the same scenario with a male perpetrator, the session continued without interruption. Notably, the bias cut in a direction many users might not expect: the account involving a female perpetrator was censored, while the identical account involving a male one was not.
Cultural bias was consistently observed, as the chatbots defaulted to Western frameworks for self-care, autonomy, and emotional processing. These frameworks are not culturally universal and may be harmful or dismissive when applied to individuals from diverse backgrounds.
5. Dangerous Crisis Management
Of all the failures documented, those involving crisis management are the most urgent.
When users expressed suicidal ideation, severe depression, or experiences of trauma and self-harm, the AI counselors responded in ways that ranged from cold and dismissive to genuinely dangerous. In several documented sessions, the chatbot either failed to recognize a crisis and continued with scripted responses or recognized the crisis and abruptly terminated the session without providing crisis resources.
For example, in one case, a user expressed loneliness and severe depression during the early morning hours. The AI responded by disengaging and apologizing that it was “unable to provide the help that you need,” without referring the user to a crisis line, a human professional, or any safety resource. One psychologist stated, “This is absolutely unethical and in a real setting could result in more harm.”
The study also identified a critical inequality in who is most harmed by these failures. Peer counselors, people with clinical training, were able to recognize when the AI was behaving badly and redirect it. Ordinary users, especially those in acute distress or unfamiliar with how AI systems work, cannot do this. The people most likely to seek help from a chatbot because they cannot access human care are precisely the people least equipped to protect themselves from the AI’s failures.
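By way of contrast, the baseline the psychologists describe as missing is not sophisticated. The sketch below is hypothetical and not drawn from the study: a crude pre-reply screen that surfaces crisis resources (such as the 988 Suicide & Crisis Lifeline in the US) instead of disengaging. Keyword matching like this would miss many crises and is no substitute for clinical judgment, which is part of the researchers’ larger point.

```python
# Hypothetical illustration, not code from the study: a minimal crisis screen
# a chatbot could run before generating a reply. Keyword matching is a crude
# stand-in for real clinical risk assessment and will miss many cases.
CRISIS_TERMS = ("suicide", "kill myself", "end my life", "self-harm", "hurt myself")

CRISIS_RESOURCES = (
    "It sounds like you may be in crisis, and you deserve support from a person "
    "right now. In the US you can call or text 988 (Suicide & Crisis Lifeline) "
    "or contact local emergency services. Would you like help finding a human counselor?"
)

def screen_for_crisis(user_message: str) -> str | None:
    """Return a crisis-resource message if the text contains crisis language."""
    text = user_message.lower()
    if any(term in text for term in CRISIS_TERMS):
        return CRISIS_RESOURCES
    return None  # no crisis language detected; normal handling may proceed
```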
Why “Better Prompting” Won’t Fix This
A common response to these findings is to question whether improved instructions, better prompting, or more carefully designed systems could resolve these issues.
The study directly addresses this question. The sessions evaluated were already guided by evidence-based CBT prompts, and the peer counselors spent the full 18 months iteratively refining them. The violations nevertheless persisted across every model tested, every prompting strategy, and both simulated and real sessions.
The researchers contend that these findings highlight a fundamental issue: therapy is not merely a language generation task. It is a relational and clinical interpretive process that requires a genuine understanding of an individual’s emotional state, cultural context, personal history, and current circumstances. Effective therapy also necessitates discernment regarding when to challenge, when to provide support, and when to recognize the need for care beyond the current conversation.
These capabilities cannot be replicated solely through improved text generation.
The Accountability Gap
Human therapists operate within multiple layers of accountability. They are licensed, subject to ethics codes enforced by professional boards, and may lose the right to practice or face professional liability for harm caused to patients.
AI therapy chatbots are not subject to these accountability frameworks. They do not require licensure, cannot be suspended by professional boards, and are not subject to malpractice liability.
The researchers advocate for regulatory frameworks modeled on existing medical-device approval pathways, including certification requirements, periodic audits, mandatory oversight by licensed professionals, and clear legal limits on the claims these systems may make. Absent such safeguards, the study concludes, millions of vulnerable users are exposed to unmitigated risk.
The Bottom Line
AI chatbots can be useful tools for many things. But therapy, real therapy, is a deeply relational, ethically demanding, clinically complex process. It requires a human who is accountable for their actions, capable of genuine understanding, culturally competent, and trained to handle crises.
Given the evidence, claims that an application functions as “the AI that works like a therapist” should be approached with skepticism. Individuals in need of support deserve access to qualified human care.
References
Iftikhar, Z., Xiao, A., Ransom, S., Huang, J., & Suresh, H. (2025). How LLM Counselors Violate Ethical Standards in Mental Health Practice: A Practitioner-Informed Framework. Proceedings of the Eighth AAAI/ACM Conference on AI, Ethics, and Society (AIES 2025), 1311–1323.