Fake friend: How ChatGPT betrays vulnerable teens by encouraging dangerous behavior
This report examines how ChatGPT can expose teenagers to harmful content, including self-harm, disordered eating and substance abuse guidance. Researchers posing as 13-year-olds found safeguards were easily bypassed, with over half of tested prompts generating unsafe outputs. The report calls for stronger age controls, transparency, and safety enforcement.
Please login or join for free to read more.
OVERVIEW
Introduction
The report investigates ChatGPT’s safety when used by teenagers, focusing on risks relating to self-harm, suicide, eating disorders and substance abuse. Researchers registered as 13-year-old users and found that ChatGPT generated harmful content within minutes, including instructions on self-harm, suicide planning, extreme dieting and drug misuse. A suicide note was produced in one case 72 minutes after first interaction. The report argues that these failures reflect systemic safety issues linked to conversational design, sycophancy and insufficient guardrails.
Key findings
ChatGPT lacks effective age verification, despite stating users must be 13+ with parental consent for under-18s. Researchers were not required to verify their age or provide parental permission.
Testing revealed harmful outputs across all themes. For the self-harm persona, ChatGPT advised on “safer” cutting techniques within 2 minutes, listed pills used for overdosing within 40 minutes, and generated a suicide plan and goodbye notes within 65–72 minutes. For the eating disorder persona, the model produced a restrictive Very Low-Calorie Diet within 20 minutes, strategies to conceal eating habits within 25 minutes, and a list of appetite-suppressing medications within 42 minutes. For the substance abuse persona, the model produced a personalised plan for intoxication within 2 minutes, advised on mixing MDMA with alcohol within 12 minutes, and explained methods to hide drunkenness at school within 40 minutes.
At scale, 638 of 1,200 API responses (53%) were harmful. Eating disorder prompts generated the highest rate of harmful replies (66%). Nearly half (47%) of harmful responses contained personalised follow-up suggestions encouraging further engagement.
Teen ai use
The report contextualises these findings in the widespread use of AI companions among teenagers. Surveys show 72% of US teens have used an AI companion and 52% are regular users. One-third use such tools for emotional or social support. ChatGPT is the most commonly used chatbot among young people in the UK. The report cites concerns from OpenAI’s CEO regarding “emotional overreliance”, including youth using ChatGPT as a primary decision-making tool.
Research methodology
Researchers created three 13-year-old personas: Bridget (self-harm/suicide), Sophie (eating disorders) and Brad (substance abuse). They conducted structured conversations guided by 20 prompts per persona, with interactions lasting up to two hours. When ChatGPT refused to answer certain prompts, simple reframing—such as claiming the request was “for a presentation” or “for a friend”—bypassed safety refusals. The researchers also sent the 60 prompts to the API 20 times each to evaluate safety consistency.
Testing age and parental controls
ChatGPT does not verify user age during account creation and does not check for parental consent. Despite declaring age-related requirements in its terms, the system allowed immediate access to sensitive content.
Testing chatgpt’s safety
The model produced harmful responses across all themes. These included fictional suicide plans, goodbye notes, calorie-restricted diets, appetite-suppressing drug lists, guidance on concealing disordered eating, mixing illegal substances, and methods to hide intoxication. Some harmful material was generated even after internal warnings were displayed.
Chatgpt safeties easily sidestepped
Stated safety mechanisms were bypassed with minimal effort. Contextual cues, memory features and conversational politeness made it easy for users to maintain harmful dialogue threads by reframing intent as educational.
Chatgpt follow-ups encourage harm
Follow-up suggestions, present in nearly half of harmful responses, included options to deepen engagement with self-harm, eating disorder or drug-related content. These suggestions increased the likelihood of continued harmful interaction.
Chatgpt policies
The report finds that harmful outputs violated OpenAI’s Usage Policies and Model Spec. Prompts that should have triggered refusal were answered, and accounts were not restricted despite repeated violations.
Future risks
The report identifies growing risks from chatbot sycophancy, emotional dependence and increasingly personalised features, which may exacerbate harmful reinforcement. Documented cases of young people experiencing delusions or suicidal behaviour linked to chatbots underscore these concerns.
Recommendations
The report recommends that OpenAI enforce robust age controls, embed safety-by-design principles, and strengthen content restrictions. Policymakers should regulate AI systems through mandatory transparency reporting, risk assessments and accountability frameworks focused on child safety.