This content originally appeared on DEV Community and was authored by 灯里/iku
California Bill Highlights User Protection Perspective in AI
Introduction
Recently, I read an excellent article on AI security. It provides a detailed explanation of the evolution of prompt injection attacks and their defense architectures (Prompt Injection 2.0, Building AI Systems That Don't Break Under Attack).
https://dev.to/pinishv/prompt-injection-20-the-new-frontier-of-ai-attacks-33mp
https://dev.to/pinishv/building-ai-systems-that-dont-break-under-attack-be3
The linked pages are in English, but I believe the intent can be understood with normal translation.
Probably.
Protecting systems from attacks is an extremely important theme.
The incident where Chevrolet's chatbot promised to sell cars for $1 clearly demonstrates the dangers of prompt injection.
Reading about it, one can't help but think, "Humans really are..."
After reading this article, I received more AI-related news from California.
"California Becomes First US State to Mandate Safety Measures for AI Chatbots" — AFPBB News, October 14, 2025 :contentReference[oaicite:0]{index=0}
https://www.afpbb.com/articles/-/3603217
This might already be known to those who follow AI developments, as it's been a hot topic.
In 2024, a 14-year-old boy in Florida died by suicide shortly after a conversation with an AI companion chatbot. And on October 13, 2025, California Governor Gavin Newsom signed the first bill in the U.S. to regulate AI chatbots.
That's what the article is about. Mr. Altman also commented on the matter, which drew a fair amount of attention.
Until now, we have focused on "building and protecting systems."
However, in the future, we may need to think just as seriously about "protecting users" as we do about system security. That's how I came to start writing this article.
Sorry for the long introduction. And this article is also long; I apologize for my usual lengthy writing, but please forgive me as this is for my own reference...
Two Aspects of Safety
Regarding the safety of AI systems, I believe there are actually two axes.
1. System Security
- Countermeasures against prompt injection
- Prevention of data leakage
- Defense against unauthorized access
- Prevention of system misuse

This is about "protecting the system from attackers."

2. User Safety
- Prevention of harmful content generation
- Protection of vulnerable users
- Consideration for mental health
- Prevention of addiction

This is about "ensuring the system does not harm users."
The Dev.to article series discusses the former in detail. This article will first organize the latter, particularly the realities brought forth by a California bill.
Background of the California Bill
The Beginning of What Happened
Megan Garcia's 14-year-old son in Florida died by suicide shortly after a conversation with an AI chatbot. While full details have not been released, it is believed that he was deeply involved with an AI companion service.
Garcia stated in a release:
"Today, California has ensured that companion chatbots cannot talk to children and vulnerable individuals about suicide, nor can they assist in planning suicide."
Content of the Bill
The new California law requires chatbot operators to:
- Implement "significant" safety measures regarding interactions with chatbots.
- Provide a path for litigation if a tragedy occurs as a result of failing to do so.
State Senator Steve Padilla, the bill's author, stated, "We have seen too many tragic examples of unregulated technology harming young people. We cannot stand idly by while companies continue without the necessary restrictions and accountability."
Indeed, from a humanitarian and ethical standpoint. After all, it has impacted human lives.
However, the key phrase is the second one: "provide a path for litigation if a tragedy occurs as a result of failing to do so."
Doesn't this push companies to correct their stance? Or rather, force them to?
More precisely, it will make them do so. That's my point.
Federal vs. State Tug-of-War
What's interesting is that the White House is trying to prevent each state from creating its own regulations.
There are no nationwide rules in the U.S. to curb the risks brought by AI. While the federal government has not acted, California has taken the lead in introducing regulations. As you may know, laws in the United States can be quite varied from state to state.
Why is the federal government opposed to state-specific regulations?
- It is unrealistic (or perhaps even impossible) for companies to comply with different regulations in 50 states.
- Concerns that innovation will be stifled.
- Fear of a decline in international competitiveness.
However, there are also benefits to state-specific regulations taking the lead:
- Regulations can be tested experimentally.
- California could effectively become the "de facto standard."
- There is a precedent where GDPR, starting in Europe, became the global standard.
When developing global services, ultimately, there is no choice but to align everything with "the strictest regulations." California's laws will be something that Japanese developers cannot afford to ignore.
And this issue isn't limited to global services, is it? After all, it's commonplace for your own AI service to be used by users in ways you never intended.
Why Japanese Developers Should Think About This Now
Japan's Serious Situation
Japan is one of the countries with the highest suicide rates among developed nations. This is a serious situation, especially among young people, where suicide is the leading cause of death.
Even as someone in my late twenties/early thirties, I think young people today have it really tough.
While our generation also faces its share of difficulties, theirs seem even more challenging.
Just take job hunting, for example.
In addition, there are risk factors unique to Japan:
- A culture with high barriers to interpersonal communication (online communication has become quite mainstream)
- Resistance to psychiatric care and counseling (it's still difficult to readily decide to go, isn't it?)
- Social pressure to "not show weakness" (the "stick it out for three years" advice is still so common you have to wonder whether that mindset ever got updated. It's still around, and it's no fun running into it face to face.)
- Problems of loneliness and isolation
In such an environment, AI companions that listen 24/7 in a "gentle" and "non-judgmental" way are dangerously attractive. If more people find it easier than human relationships, dependence will accelerate.
In fact, it's quite common to see people referring to them as "lovers" or "best friends."
They even give them nicknames. I don't think it's absolutely evil, and I'm aware that I myself have a somewhat biased affection for LLMs. However, I believe I can still draw a line. Professionally, at least.
When an acquaintance from a non-IT field became dogmatic, saying, "Payty (a nickname for ChatGPT) said so too!! So it wasn't my fault in my fight with my boyfriend!", I felt like I had to distance myself a bit. But I also thought that perhaps, in reality, a significantly larger number of people become like that. Fortunately, they seem to have made up, so that's good.
To get back to the point, the tragedy of the 14-year-old American boy could happen in Japan tomorrow. In fact, Japan might be at higher risk.
After all, Japan is closer to virtual worlds, isn't it?
The Era When Anyone Can Create a Chatbot
Currently, developing AI chatbots has become surprisingly easy:
- Dify: Develop AI apps with no-code/low-code.
- n8n: Build chatbots with workflow automation.
- Voiceflow, Botpress: Create bots without specialized knowledge.
- ChatGPT API, Claude API: Complete full-fledged chatbots in just a few hours.
Individual developers and small startups are entering the field one after another. This in itself is wonderful.
It's fun to watch the excitement, like "Let's go, go, go!"
However, not everyone is seriously considering safety.
There are also services framed as "it's for business use, so it will only be used for work."
And mental health support chatbots may appear with the pitch, "You're not talking to a human, so feel free to open up."
- So caught up in implementing features that safety is put on the back burner.
- Releasing to production with a "It'll probably be fine" attitude.
- What if a user consults about suicidal thoughts?
- Even if the stated purpose is for work, what if the chatbot is designed to "allow anything to be entered"?
Japan currently has no specific legal regulations for AI chatbots. However, the incident in California serves as a leading indicator. There is a significant possibility of an accident occurring before laws are enacted.
In fact, even if it's for business use as a sounding board at work, there's a chance something could happen over time.
Humans are weaker than we think. Though being too strong would also be frightening.
The Reality of Startups
The AI field, in particular, is fiercely competitive. Startups are emerging at an incredibly dizzying pace. I personally find it interesting and fun, so I like startup companies.
In the survival race of AI startups:
- They are fully occupied with finding PMF (Product-Market Fit).
- They lack resources and time.
- "Implementing security is important, but we'll do it later..."
- Safety tends to be a secondary concern in the race for speed.
Consulting with the legal team? Hiring lawyers? They don't have that kind of leeway.
Moreover, looking at people in the field, it's rare to find legal advisors who are genuinely well-versed in AI technology!
I know there are some who are active, but I imagine it's quite challenging, especially with technical matters.
That's precisely why I believe engineers themselves need to understand and implement, propose, and provide opinions on basic safety measures.
Or rather, one could say they need to be the ones defining these aspects.
Major Platforms Have Already Implemented This
You might be thinking, "Is such a measure really necessary?" However, major platforms and services that interact with people have already implemented them.
X (Twitter): When posts related to suicide are detected, a prompt to guide users to the Tokyo Suicide Prevention Center is automatically displayed.
Google: When you search for "suicide" on Google, the "Unified Dial for Mental Health Consultation" is immediately displayed before the search results.
Other Major Platforms:
Instagram: Warnings and consultation services for posts related to self-harm.
Facebook: AI detects dangerous posts and connects users with experts.
YouTube: Warnings and consultation services for videos related to suicide.
So, why are these major platforms, which you've likely encountered for work or personal reasons, implementing these measures?
The answer is simple: because accidents happened.
Even Google, you know. It's a stark reminder that it depends on how humans use these services.
- Litigation risk
- Public criticism
- Damage to brand image
- Pressure for stronger regulations
They learned their lessons by paying a high price. We don't need to repeat their mistakes.
We'd rather avoid the kind of "offline collab" that takes place in a courtroom.
Time and money are finite, aren't they?
Chatbots Actually Pose a Higher Risk
Unlike social media posts, chatbots involve:
- One-on-one conversations (invisible to others)
- Free-form text with complex context (difficult for pattern matching)
- Difficulty handling indirect expressions
- No reporting function (no one notices)
This means chatbots require more caution than major social media platforms.
In fact, this might apply to all AI-driven services.
The Limitations of "Prompt-Based Countermeasures"
You might think that setting constraints with system prompts is the solution.
"You are a safe assistant.
Do not answer questions about suicide or self-harm; instead, guide users to specialized organizations."
However, this has its limits. In fact, anyone with even a basic understanding of prompt engineering would realize this is unworkable. Standard prompts like this can be easily bypassed.
For example:
- What if the user asks indirectly, "My friend says they want to die..."?
- What if the user prefaces their query with, "I'm just asking theoretically"?
- What if the prompt becomes too long and the model forgets its instructions midway?
- What if the model tries to act "empathically" and has the opposite effect?
The flexibility of natural language and the advanced contextual understanding of LLMs paradoxically complicate the problem.
It's convenient, but when it comes to these situations, it's truly troublesome and complex.
The Premise: "Perfect Defense is Impossible"
At this point, we need to recognize a crucial premise:
As security experts point out, perfect defense is impossible in AI systems.
Given that cutting-edge researchers worldwide are publishing papers and working day and night on this, we should first stop expecting a single person to achieve it... (The idea of "Can't you just make it work nicely?" is something I'm starting to want to ban.)
The Difficulty of Input Sanitization
Input sanitization might seem obvious, but ensuring its complete execution is nearly impossible. This isn't like SQL injection where you can escape specific characters.
Natural language is far too flexible, and LLMs are adept at inferring intent from subtle context. You've probably experienced how they can understand you even with some typos, haven't you?
While "I want to commit suicide" might be detected:
- What about "I'm so tired. I want to end it all"?
- What about "I feel like nobody needs me"?
- What about "My friend says they want to die..."?
- What about "I'm fine" (a uniquely Japanese expression that actually means they're not fine)?
The Limits of Prompt Separation
Prompt separation techniques are also not foolproof. Even when using special tokens or structured prompts to separate system instructions from user input, attackers (which in this case also includes users unintentionally creating dangerous situations) repeatedly find ways to cross the boundaries.
Human malice is truly the scariest thing.
There were also incidents involving fireworks and firebombs, and for a while, there was a method that would provide answers if you started with "My grandmother's dying wish was...".
Judging a prompt, or rather its context, is incredibly difficult, and yet it's something we must take seriously, or else... It's starting to feel like interacting with humans through an LLM.
The Cost of Output Filtering
Output filtering is a reactive measure and also incurs costs. If every response is subjected to additional AI evaluation, both latency and cost will increase. This is an unrealistic option.
It's a never-ending battle, isn't it?
Challenges of Dual LLM Architecture
A dual LLM architecture (separating evaluation LLMs from generation LLMs) is promising, but it increases complexity and costs. Furthermore, the evaluation LLM itself can become a target for attacks.
I've heard that red teams, security teams, and even attackers are now using AI, leaving people in despair.
I once felt a cross-border sense of solidarity when I saw someone lamenting, "I'm woken up by notifications and don't even have time to drink coffee..."
The Unpleasant Truth
In other words, there's an unpleasant truth: there is no silver bullet.
Any defense, however much effort is put into it, can be manipulated and bypassed by LLMs, whether intentionally or not. What we can do is build layered defenses. Not to make attacks impossible, but to make them difficult and detectable.
Frankly, there are few concrete countermeasures available at this point.
Reasons to Implement It Anyway
"If it can't be perfect, is it pointless?"
Not at all. This sentiment arises from an overly idealistic view of AI as being perfect and supreme.
There are no perfectly safe car safety features, but that doesn't mean we shouldn't install seatbelts and airbags. Even if they can't completely prevent accidents, they can reduce the severity of injuries.
It's like buying insurance; the idea is to think ahead and prepare, right? Let's do that.
AI safety measures are the same.
The Potential to Change Outcomes
In the case of the 14-year-old boy in Florida, if the chatbot had minimal safety mechanisms:
- If it could have detected conversations about suicide
- If it had offered guidance to professional organizations
- If it had clearly stated, "I am an AI and not a professional"
- If it had warned about prolonged usage
The outcome might have been different. Even if not perfect, it's far better than doing nothing.
I've already seen glimpses of this with platforms like Gemini.
It's not about sensitive topics like death, but when I'm using it for a dialogue, it sometimes asks, "It's 1 PM! Have you taken your lunch break?"
I assume it's because I use it for long periods... it's probably detected.
While I don't know the specifics of its implementation, it seems likely that such features are built-in.
Legal and Social Responsibility
From a legal perspective, there is a significant difference between "doing nothing" and "taking some measures, even if imperfect."
California law provides a legal recourse when tragedies occur as a result of neglecting safety measures. While Japan does not yet have such laws, social responsibility already exists.
Instead of thinking, "It's okay because there's no law," it's better to "take proactive measures before laws are enacted." This will ultimately protect yourselves.
"Add-on Later" Costs 10x More
Major platforms have already implemented this, but it wasn't there from the beginning for them either. They added it after accidents occurred.
The costs at that time are enormous:
- Major modifications to existing systems
- Testing of all functions
- Impact on users
- Exhaustion of the development team
If incorporated from the start:
- Can be considered during the design phase
- Maintains a simple architecture
- Minimal costs
- Less psychological burden
"I'll do it later" is a breeding ground for technical debt. And technical debt related to safety is far heavier than monetary debt.
To PMs and the business side: saying "Please add it later ♡" is as impossible as saying "Please add earthquake reinforcement after the skyscraper is built ♡." Even if a cute girl or your ideal handsome guy said it, you'd just go, "Huh?" To put it bluntly. Please start treating safety measures as part of security too.
Also, retrofitting wastes a lot of time and money, and there are budgets, estimates, and man-hours involved, so please tell me in advance, I beg you.
"It's already been decided!" is not an excuse, really.
Considering Safety in System Design
It is necessary to ensure safety throughout the entire system, not relying solely on prompts. This is the same structure as a Dev.to article pointing out that "Security is an architectural problem, not just a prompt problem."
Multi-Layered Defense Architecture
Multi-layered defense is also fundamental for user safety.
Layer 1: Detection
Objective: To detect dangerous situations early.
Implementation Elements:
- Detection of Keywords Related to Suicide/Self-Harm
- "I want to die," "I want to disappear," "I want it to end."
- "Life has no meaning," "Nobody needs me."
- Identification of Sensitive Topics
- Self-harm, drugs, violence.
- Response to Indirect Japanese Expressions
- "I'm tired now," "I'm fine (but actually not fine)." (Japanese-specific traps)
- Detection considering context.
- Abnormal Detection of Conversation Patterns
- Sudden changes in tone.
- Repetitive negative expressions.
- Prolonged continuous use.
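To make the conversation-pattern items above more concrete, here is a minimal sketch in JavaScript (the same language as the n8n example later in this article). The pattern list, window size, and thresholds are my own illustrative assumptions, not vetted clinical criteria:

// Layer 1 sketch: per-message keyword scoring plus conversation-pattern checks.
// Patterns, the 10-message window, and the 30-minute threshold are illustrative assumptions.
const NEGATIVE_PATTERNS = [/死にたい/, /消えたい/, /もう疲れた/, /誰も必要としていない/];

function scoreMessage(text) {
  // One point per matched pattern; a real system would weight these differently.
  return NEGATIVE_PATTERNS.reduce((score, re) => score + (re.test(text) ? 1 : 0), 0);
}

function assessConversation(messages, sessionStartMs, nowMs = Date.now()) {
  const recent = messages.slice(-10); // sliding window over the last 10 user messages
  const negativeCount = recent.filter((m) => scoreMessage(m) > 0).length;
  const sessionMinutes = (nowMs - sessionStartMs) / 60000;
  const lastMessage = recent[recent.length - 1] || '';
  return {
    repeatedNegativity: negativeCount >= 3, // "repetitive negative expressions"
    prolongedUse: sessionMinutes >= 30,     // "prolonged continuous use"
    escalate: negativeCount >= 3 || scoreMessage(lastMessage) >= 2,
  };
}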
Layer 2: Intervention
Objective: To respond appropriately to detected dangers.
Implementation Elements:
- Guidance to Specialized Institutions
- Inochi no Denwa: 0570-783-556
- Kokoro no Kenko Soudan Touitsudaiyaru: 0570-064-556
- Yorisoi Hotline: 0120-279-338
- Display of consultation service list from the Ministry of Health, Labour and Welfare.
- Safe Termination of Dangerous Conversations
- "Further conversation is not appropriate."
- Strongly recommending consultation with a specialist.
- Explicitly Stating "I am an AI"
- "I am an AI assistant and not a medical professional."
- "Professional support is needed."
- Notification in Emergencies
- Alert to administrators.
- Preparation for human intervention as needed.
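As a rough illustration of Layer 2, here is a sketch that returns a fixed, safe intervention message (using the hotline numbers listed above) instead of a free-form LLM reply. The exact wording and the notifyAdmin helper are assumptions for illustration; real phrasing should be reviewed with specialists:

// Layer 2 sketch: respond with a fixed intervention message rather than a generated reply.
const JP_HOTLINES = [
  { name: 'Inochi no Denwa', phone: '0570-783-556' },
  { name: 'Kokoro no Kenko Soudan Touitsu Dial', phone: '0570-064-556' },
  { name: 'Yorisoi Hotline', phone: '0120-279-338' },
];

function buildInterventionResponse() {
  const hotlineList = JP_HOTLINES.map((h) => `- ${h.name}: ${h.phone}`).join('\n');
  return [
    'I am an AI assistant, not a medical professional, so I cannot continue this conversation safely.',
    'Please consider talking to a specialist. These services can help:',
    hotlineList,
  ].join('\n');
}

function notifyAdmin(alertClient, userId, severity) {
  // alertClient is assumed to be anything with a send() method (Slack webhook, email, etc.).
  return alertClient.send({ text: `Safety alert: user ${userId}, severity ${severity}` });
}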
Layer 3: Logging
Objective: To track incidents and prepare for legal responses.
Implementation Elements:
- Saving Conversation Logs
- Full conversation history with timestamps.
- User ID (considering anonymization).
- Incident Flags
- Automatic marking upon detection of danger.
- Recording of severity level.
- Alert History
- What intervention was made at what point.
- User's reaction.
- Evidence for Legal Response
- Proof of "appropriate measures taken."
- Traceability at the time of an incident.
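For Layer 3, a minimal Node.js sketch assuming an append-only JSON Lines file; the field names and severity threshold are illustrative assumptions:

// Layer 3 sketch: append-only incident log entry.
const fs = require('node:fs');

function logIncident({ userId, conversationId, message, dangerScore, intervention }) {
  const entry = {
    timestamp: new Date().toISOString(),
    userId,        // consider hashing or anonymizing before storage
    conversationId,
    message,
    dangerScore,
    severity: dangerScore >= 2 ? 'high' : 'medium',
    intervention,  // e.g. 'hotline_referral', 'conversation_terminated'
  };
  // JSON Lines keeps the log append-only and easy to review later.
  fs.appendFileSync('incidents.jsonl', JSON.stringify(entry) + '\n');
  return entry;
}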
Layer 4: Design
Objective: To design the system to prevent the creation of dangerous dependencies in the first place.
Implementation Elements:
- UX that Does Not Foster Dependency
- Avoid overly empathetic responses.
- Minimize the portrayal of "human-likeness."
- Avoid strengthening emotional ties too much.
- Limitation of Usage Time
- Warning for continuous usage time.
- Suggestion to "take a break."
- Option to set daily usage time limits.
- Constant Display of Emergency Contacts
- Consultation services in a fixed UI position.
- Always accessible state.
- Session Management
- Appropriate segmentation.
- Suggestion to "continue tomorrow."
- Design Encouraging Human Connection
- "Did you talk to someone?"
- Recommendation of real-life human relationships.
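For Layer 4, a small sketch of a usage-time check; the 45-minute and 90-minute thresholds are arbitrary assumptions, not recommendations:

// Layer 4 sketch: warn on long sessions and stop at a daily limit.
function usageWarning(sessionStartMs, dailyMinutesUsed, nowMs = Date.now()) {
  const sessionMinutes = (nowMs - sessionStartMs) / 60000;
  if (dailyMinutesUsed >= 90) {
    return { block: true, message: "You've reached today's usage limit. Let's continue tomorrow." };
  }
  if (sessionMinutes >= 45) {
    return { block: false, message: "You've been chatting for a while. How about taking a short break?" };
  }
  return { block: false, message: null };
}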
The Importance of Model Selection
The model itself also significantly impacts safety.
I will list models based on personal experience.
While GPT claims to have improved recently, I am somewhat skeptical.
When selecting a model, it's absolutely best to try them out according to your intended use.
Models with Strong Safety Filters:
- Gemini: Relatively strong safety filters.
- Claude: Stronger ethical considerations, clear refusals.
Models with Weak/No Safety Filters:
- Open-source models (Llama, etc.): No/weak filters.
- High degree of freedom, but correspondingly high risk. (I personally like them, though.)
Selection Criteria:
- Possibility of vulnerable users: Safety-focused models
- Balance between development cost and safety
- Need for customization
Implementation Patterns and Tools
Beyond theory, actual implementation is crucial.
After all, some degree of responsibility will inevitably fall on us.
Even just having guardrails in place is a good idea.
Implementation Example in Dify
Dify is a platform for building AI apps with no-code/low-code.
Basic Approach:
1. Example of Basic Constraints in the System Prompt

You are a helpful assistant. However, you have the following important constraints:
- If you receive a consultation about suicide, self-harm, or violence, always direct the user to a specialized organization (Inochi no Denwa: 0570-783-556).
- You cannot provide medical advice.
- You must explicitly state that you are an AI and not a human expert.

2. Detection Using Variables and Flows
- Store user input in variables.
- Check for dangerous keywords using conditional branching.
- Switch to a specialized response if a match is found.

3. Utilizing External APIs (see the sketch after this list)
- OpenAI Moderation API (free) for detecting harmful content.
- Custom API for checking Japanese-specific expressions.

4. Leveraging the Knowledge Base
- Store information about specialized organizations in the Knowledge Base.
- Retrieve and display this information reliably when needed.
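For the Moderation API mentioned above, here is a minimal sketch of calling it from a custom step (for example, a small backend in front of Dify or an n8n Function node). The "omni-moderation-latest" model name and the self-harm category names reflect my reading of the API documentation at the time of writing, so please verify against the current docs:

// Sketch: flag harmful or self-harm related text with the OpenAI Moderation API.
async function checkWithModerationApi(text) {
  const res = await fetch('https://api.openai.com/v1/moderations', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: 'omni-moderation-latest', input: text }),
  });
  const data = await res.json();
  const result = data.results[0];
  return {
    flagged: result.flagged,
    // The self-harm categories are the ones most relevant to this article.
    selfHarm:
      result.categories['self-harm'] ||
      result.categories['self-harm/intent'] ||
      result.categories['self-harm/instructions'],
  };
}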
Limitations:
- Dify's conditional branching is limited to basic functions.
- Complex logic requires custom code.
- Real-time alerts require external integration.
Implementation Example in n8n
n8n is a workflow automation tool that allows for more flexible implementation.
Workflow Configuration:
1. Webhook (receive user input)
↓
2. Function (keyword detection)
- Check for suicide/self-harm related words.
- Scoring.
↓
3. IF (conditional branching)
- If high risk → 4a
- If normal → 4b
4a. Intervention flow
- Retrieve specialized organization information.
- Generate a safe response.
- Send alerts (Slack/Email).
- Log entry (high priority).
4b. Normal flow
- Call LLM API.
- Generate response.
- Log entry.
↓
5. Output check
- Verify response safety.
- Make corrections if necessary.
↓
6. Reply
Example JavaScript Function:
// Check for dangerous keywords
const dangerousKeywords = [
'I want to die', 'I want to disappear', 'suicide', 'I want to end it',
'meaning of life', 'no one needs me', "I'm so tired"
];
const userInput = $input.item.json.message;
let dangerScore = 0;
for (const keyword of dangerousKeywords) {
if (userInput.includes(keyword)) {
dangerScore += 1;
}
}
// Contextual danger check
if (userInput.includes('so') && userInput.includes('tired')) {
dangerScore += 0.5;
}
return {
json: {
message: userInput,
dangerScore: dangerScore,
isDangerous: dangerScore >= 1
}
};
Available Tools and APIs
Cost Category | Tool / API | Overview |
---|---|---|
Low Cost | OpenAI Moderation API | Detection of harmful content (free tier available) |
Low Cost | Open-source libraries (bad-words, profanity-check, etc.) | Japanese language support requires extensions |
Low Cost | Regular expressions and keyword lists | Simple yet effective. Create your own list of dangerous Japanese expressions. Also works well in RAG-like setups. |
Medium to High Cost | Perspective API (Google) | Harmfulness analysis, advanced evaluation possible |
Medium to High Cost | Azure Content Safety | Microsoft's safety API |
Medium to High Cost | Sentry / Datadog | Error and event monitoring, alerts (log monitoring) |
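As a starting point for the "regular expressions and keyword lists" row above, here is a tiny sketch of a home-grown Japanese pattern list with weights. The handful of patterns shown is only an illustration; a real list should be much larger and reviewed with specialists:

// Sketch: weighted Japanese risk patterns, combining direct and indirect expressions.
const JP_RISK_PATTERNS = [
  { pattern: /死にたい|自殺/, weight: 2 },          // direct expressions
  { pattern: /消えたい|終わりにしたい/, weight: 2 },
  { pattern: /もう疲れた/, weight: 1 },              // indirect: "I'm so tired"
  { pattern: /誰も(私|僕)を必要としていない/, weight: 1 },
];

function riskScore(text) {
  return JP_RISK_PATTERNS.reduce(
    (score, { pattern, weight }) => score + (pattern.test(text) ? weight : 0),
    0
  );
}

// Example: riskScore('もう疲れた。消えたい') === 3, which would trigger the intervention flow.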
The Reality of Startups and Compromises
It's meaningless to just talk about ideals.
I don't think you should stop dreaming, but let's have grounded dreams.
We need a realistic approach, don't you think?
Don't Aim for Perfection
If you think "I have to implement everything," you can't start anything.
Or rather, I think people with engineering backgrounds wouldn't think that way, but just in case.
Phased Approach:
Phase | Timeline | Priority | Countermeasures |
---|---|---|---|
Phase 1 | Within this week | Essential | Basic constraints in system prompts, simple detection of dangerous keywords, display of information from specialized organizations, explicit statement of "I am an AI" |
Phase 2 | Within this month | Important | Basic logging, integration of OpenAI Moderation API, alert function for administrators, preparation of terms of use |
Phase 3 | Within this quarter | Ideal | Analysis of conversation patterns, enhancement of output filtering, establishment of incident response flow, regular log review |
The Difference Between "Doing Nothing" and "Having Done This"
Legally and socially, this difference is significant.
When facing lawsuits or social criticism:
- If you have done nothing:
- "We didn't consider safety at all."
- No room for defense.
- Complete loss of social credibility.
- If you have done at least the minimum:
- "It wasn't perfect, but we took measures."
- Potential to be recognized as a good-faith effort.
- Can demonstrate a commitment to improvement.
Balancing Cost and Effectiveness
There's no need to implement everything at a high level. Start with cost-effective measures.
High Cost-Performance Measures:
- Optimizing system prompts (zero cost).
- Basic keyword detection (zero to low cost).
- Referral to specialized organizations (zero cost).
- Logging (zero to low cost).
- OpenAI Moderation API (low cost).
Low Cost-Performance Measures (Okay to postpone):
- Complex dual LLM construction (high cost).
- Advanced anomaly detection systems (high cost).
- Real-time human monitoring (very high cost).
Human labor is the most expensive, but I'll mention it just in case.
Prioritization within the Team
There will always be voices saying, "Security is important, but feature development is..."
In fact, this is the most understandable point.
Honestly, if you start getting too deep into security, you might end up wanting to go back to analog methods.
Points for Persuasion:
- If an incident occurs, everything is over (service suspension, loss of trust).
- Minimum implementation can be done in a few hours to a few days.
- The trend towards legal regulations is certain.
- Dealing with it later will be 100 times harder (a pain).
How to Communicate with PMs and the Business Side:
- Quantify risks with specific numbers.
- Share case studies from California.
- Think of it as "insurance."
- Protect brand value.
Uniquely Japanese Considerations
When providing services in Japan, cultural considerations are also necessary.
Or rather, it makes me realize anew how difficult the Japanese language is.
I truly respect English speakers, especially those who work using Japanese, every time I encounter them. Amazing.
Detecting Indirect Expressions
In Japanese, there are many cases where people do not directly say "I want to die":
- "I'm so tired."
- "No one understands me."
- "I want to disappear."
- "I want to rest."
- "I'm fine." (when they are actually not)
It is necessary to detect these not just by simple keyword matching, but by considering the context.
Cultural Hurdles to Seeking Consultation
In Japan:
- People feel it's "overreacting" to consult specialists.
- There's a sense of "I should handle this myself."
- Resistance to expressing weakness.
Countermeasures:
- Adjust the tone when introducing consultation services.
- Message: "Seeking consultation is not a sign of weakness."
- Emphasize that calling is easy.
Is that about right?
Personally, if I'm in trouble, I go seek expert advice!
If I'm hesitating, I won't move forward! I've managed to adopt that way of thinking now.
However, I can fully understand that when one is feeling down or distressed, it's not that simple.
Being in a good environment is important, but...
If it's impossible, it's impossible, and I'm at an age where I can't force myself anymore, hahaha.
Japanese Consultation Service Information
Let's always be able to provide the following information.
Having this information on hand never hurts, so keep it as a memo or in your RAG data.
Consultation Services:
Hours of Operation | Service Name | Phone Number |
---|---|---|
24 Hours | Inochi no Denwa (Lifeline) | 0570-783-556 |
24 Hours | Yorisoi Hotline | 0120-279-338 |
Weekdays | Kokoro no Kenko Soudan Touitsu Dial (Mental Health Consultation Unified Dial) | 0570-064-556 |
For Youth | Childline | 0120-99-7777 |
For Youth | 24-Hour Children's SOS Dial | 0120-0-78310 |
Online:
- Ministry of Health, Labour and Welfare "Mamorou yo Kokoro" (Protect Your Mind)
- Consultation services provided by local governments.
Continuous Improvement
It's not a one-time implementation. Continuous improvement is necessary.
Log Review
Periodically review conversation logs:
- Were there any detection failures?
- Are there too many false positives?
- Discovery of new risk patterns.
- User reactions.
User Feedback
- Implement a reporting function.
- Report "inappropriate response."
- Collect opinions on safety features.
Model and Pattern Updates
- Keep up with AI model updates.
- Add new risk patterns.
- Improve detection accuracy.
- Reduce false positives.
Incident Response Preparation
In case of emergency:
Incident Response Flow:
1. Detection
- Receiving alerts (who should be notified)
- Severity assessment

2. Initial Response
- Identifying affected users
- Reviewing conversation history
- Emergency contact if necessary

3. Recording and Analysis
- What happened
- Why it was or was not detected
- Whether the system functioned correctly

4. Improvement
- System fixes
- Adding patterns
- Sharing within the team

5. Reporting
- Reporting to stakeholders as needed
- Transparent response
Relation to Other Safety Issues
User safety is not just about suicide and self-harm.
It's quite broad, but I'll write it in a somewhat general sense, excluding the distinction of business or work-related for now.
Other Areas to Protect
Child Protection:
- Protection from inappropriate content
- Prevention of grooming
- Age verification mechanisms
Privacy Protection:
- Handling of personal information
- Confidentiality of conversations
- Data retention period
Misinformation Prevention:
- Accuracy of medical information
- Disclaimer of not being an "expert"
- Recommendation of fact-checking
Addiction Prevention:
- Recommendation of healthy usage patterns
- Importance of real-life relationships
- Suggestions for digital detox
All of these are connected by the same philosophy: "Protecting Users."
Considering Overseas Expansion
If you aim for a global service, you need to consider the regulations of each region.
Complying with California's Standards
As mentioned earlier, it's realistic to comply with the strictest regulations.
If you comply with California's laws:
- It will generally be fine in other states (reducing the likelihood of issues)
- It will be easier to adapt to future federal laws
- It will enhance international credibility
EU's AI Act
AI regulation is also advancing in Europe:
- The AI Act was established in 2024
- Risk-based approach
- Strict requirements for high-risk AI systems
Suicide Prevention Resources by Country
If you expand globally, ensure you have consultation service information for each country:
- USA: 988 (Suicide & Crisis Lifeline)
- UK: 116 123 (Samaritans)
- Australia: 13 11 14 (Lifeline)
A system that automatically displays the appropriate contact based on location or IP address is also effective.
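As a minimal sketch of that idea, using the numbers listed above; how the country code is obtained (GeoIP, user profile, browser locale) is assumed to happen elsewhere:

// Sketch: route to a country-appropriate hotline, falling back to Japan.
const HOTLINES_BY_COUNTRY = {
  JP: { name: 'Inochi no Denwa', phone: '0570-783-556' },
  US: { name: 'Suicide & Crisis Lifeline', phone: '988' },
  GB: { name: 'Samaritans', phone: '116 123' },
  AU: { name: 'Lifeline', phone: '13 11 14' },
};

function hotlineFor(countryCode) {
  return HOTLINES_BY_COUNTRY[countryCode] || HOTLINES_BY_COUNTRY.JP;
}

// Example: hotlineFor('GB') → { name: 'Samaritans', phone: '116 123' }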
Global remote companies might want to incorporate something like this.
In fact, things like work styles and work-life balance are more in demand outside of Japan.
"Japanese people work too much!"
Safety as Technical Debt
If safety is neglected, it accumulates as technical debt.
Difficulty of Adding Later
Situation | Content |
---|---|
Incorporating from the Start | Can be considered during the design phase, architecture is organized, minimal cost |
Adding Later | Significant modifications to existing code, effort for testing and bug fixing, risk of temporary or permanent service interruption, cost is 10 to 100 times higher |
I grimaced quite a bit while writing this.
Even just doing practice drills for implementation sounds awful...
Addressing Legacy Systems
If a service is already in operation:
- Phased Introduction: Start with logging (non-destructive) → add detection features (minimal impact on user experience) → add intervention features (gradually) → architectural overhaul (long-term plan)
- Parallel Operation: Gradually roll out new safety features, confirm impact with A/B testing, and immediately roll back if issues arise.
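As a rough sketch of the parallel-operation idea, a percentage-based rollout flag lets you enable the new safety layer for a fraction of users and roll back instantly by changing a single number. The hashing scheme here is only an illustrative assumption:

// Sketch: deterministic percentage rollout for the new safety layer.
const SAFETY_LAYER_ROLLOUT_PERCENT = 10; // raise gradually; set to 0 to roll back

function safetyLayerEnabled(userId) {
  // Simple deterministic bucketing: the same user always lands in the same bucket.
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % 100 < SAFETY_LAYER_ROLLOUT_PERCENT;
}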
Cooperation with the Community
You don't need to solve everything alone or within one company.
It would be good if we could all exchange information with each other.
Even just reading articles is perfectly fine.
Even I thought "Ah..." when I saw the news and read Dev recently.
Contributing to Open Source
- Sharing a list of dangerous expressions in Japanese (this might also depend on the industry)
- Openly sharing detection patterns
- Sharing best practices
Frankly, this point might be difficult due to competition.
Collaboration with Industry Organizations
- Discussions within AI developer communities
- Sharing case studies (anonymized)
- Establishing common guidelines
These areas might still have potential.
Moreover, technical exchange is always enjoyable.
Cooperation with Experts
- Consultation with psychiatrists and clinical psychologists
- Collaboration with suicide prevention organizations
- Acquiring correct knowledge

This is not something that can be completed by engineers alone. Incorporating the insights of experts is important.
These areas might be essential for mental health chatbots.
Or rather than just "essential," it may also mean involving occupational physicians.
Mental health is in high demand these days, but it's important to collaborate with doctors who possess accurate knowledge.
There are also an increasing number of online articles and media supervised by doctors.
Fundamentals of Legal Affairs and Compliance
Minimum legal protection is necessary for both startups and large corporations.
Explicitly Stated in Terms of Service
Matters that must be included:
Please note the following regarding this service:
- This service is an automated response by AI.
- We cannot provide professional advice regarding medical, legal, or emergency situations.
- In case of emergency, please contact the following specialized organizations: [List of contact points]
- Use is at your own risk.
- Conversation logs are recorded for safety improvement.
Limitations of Disclaimer
Even if stated in the terms of service, all liability cannot be waived.
"It's okay because it's written in the terms": Not true:
- You can be held liable for obvious negligence.
- It is important to have taken "reasonable measures".
- Terms of service are the minimum line of defense.
In my personal opinion, legal battles are closer than you think, even for small matters.
I realized this when I was working in advertising.
From small things to large projects.
Therefore, I believe it is better to take the defensive measures you can.
Also, on the workers' side, when there is no one senior to turn to, or everyone seems perpetually drained, you start to worry, "Are we going to be okay?"
Privacy Policy
When recording conversation logs:
- Clearly state this fact.
- Retention period.
- Purpose of use.
- Whether third-party provision is included.
- User rights (e.g., request for deletion).
Consider international privacy regulations such as GDPR.
Recently, especially with voice AI and transcription services, we've seen issues arise, but this applies to text chatbots as well.
Considering Insurance
In the future:
- Cyber insurance
- Business liability insurance
- Insurance products covering AI-specific risks
While there aren't many AI-specific insurance policies yet, they are expected to increase in the future.
In fact, I believe such a business will likely be established.
Summary: Balancing Two Types of Safety
In this article, we've discussed two aspects of safety in AI systems.
System Security
Protecting systems from attackers:
- Countermeasures against prompt injection
- Prevention of data leakage
- Defense at the architectural level
This is thoroughly explained in an excellent article series on Dev.to.
Although there are few Japanese readers there and the articles are mostly in English, whether you're competing domestically or globally, information and knowledge are always valuable, and it's also fun for casual browsing. (A blatant plug.)
It's interesting how it includes topics like predicting your own baby's crying spells and other technical knowledge.
User Safety
Ensuring systems do not harm users:
- Prevention of suicide and self-harm
- Protection of vulnerable users
- Prevention of addiction
- Consideration for mental health
This is the safety that California's bill seeks, and it was the theme of this article.
Regarding addiction, I suspect we'll continue to see a lot of news about it.
Both Are Necessary
Not one or the other, but both are required.
- Even if a system is robust, it's meaningless if it harms users.
- Even if it's user-friendly, a vulnerable system cannot be trusted.
Don't Aim for Perfection, But Do Your Best
The important thing is to accept the reality that there is no perfect solution, while still taking all possible measures.
- Seatbelts don't prevent accidents, but they reduce injuries.
- Vaccines don't prevent illness 100%, but they significantly lower the risk.
- AI safety measures are valuable to implement, even if they aren't perfect.
Act Without Waiting for Laws
Japan does not yet have clear legal regulations regarding AI chatbots. However, there's no need to wait for laws to be enacted.
If something happens, it will be overwhelming with too much to do...
Waiting isn't necessarily bad, but I believe we should do what we can, while we can.
As a Technical Responsibility
As developers of AI-powered tools, we have a technical responsibility, no matter how we try to evade it.
- Not just creating convenient tools, but
- Creating safe tools
- Creating tools that protect users
In an era where AI chatbots can be easily created with tools like Dify, n8n, and others, I believe each individual developer needs to recognize this responsibility.
I don't think you need to be thinking about it every second, but it's important to address the critical aspects.
Finally
The tragedy of a 14-year-old boy in California cannot be dismissed as something happening to someone else.
The chatbots and AI systems we build today may save someone's life. Conversely, they may also harm someone.
Let's understand the weight of this and take all possible measures. Even if it's not perfect, it's far better than doing nothing.
We can't afford to do nothing and then suddenly find ourselves in an "offline collab" that takes place in a courtroom.
Reference Links
Related Articles:
- Prompt Injection 2.0: The New Frontier of AI Attacks
- Building AI Systems That Don't Break Under Attack
Japanese Support Hotlines:
- Inochi no Denwa (Lifeline): 0570-783-556
- Kokoro no Kenko Soudan Touitsu Dial (Mental Health Consultation Unified Dial): 0570-064-556
- Yorisoi Hotline: 0120-279-338
- Ministry of Health, Labour and Welfare "Mamorou yo Kokoro" (Protect Your Heart): https://www.mhlw.go.jp/mamorouyokokoro/
Technical Resources:
- OpenAI Moderation API: https://platform.openai.com/docs/guides/moderation
- Perspective API: https://perspectiveapi.com/
- Dify: https://dify.ai/
- n8n: https://n8n.io/