🧠❤️ Gemma-3-1B: Empathetic Mental Health Assistant

A fine-tuned, lightweight conversational model designed for supportive and empathetic responses in the mental health domain.


📌 Overview

This model is a fine-tuned version of Unsloth's Gemma-3-1B-IT, specifically adapted to provide helpful, empathetic, and non-judgmental responses for mental health–related queries.

Optimized for low-resource environments, it leverages:

  • Unsloth 4-bit LoRA fine-tuning for efficiency
  • GGUF format for a small memory footprint
  • Fast inference on desktop, mobile, and edge devices

The adapter model is available at MeWan2808/scaai_gemma_1b.


✨ Key Features

✅ Empathetic Responses: tailored for supportive mental health conversations
✅ Lightweight: 1B parameters, quantized to 4-bit for minimal resource use
✅ Cross-Platform: compatible with llama.cpp, Ollama, React Native, and the Hugging Face Inference API
✅ Low Memory Footprint: runs on devices with under 8 GB of RAM
✅ Domain-Specific: fine-tuned on the Riyazmk/mentalhealth dataset (first 1000 rows)


⚠️ Ethical Notice & Limitations

🚨 This AI is NOT a substitute for professional mental health care.
It is not designed for:

  • Crisis intervention
  • Diagnosing mental health conditions
  • Prescribing treatments or medications

📢 If you're in a crisis, please reach out to professionals:

  • 📞 988 Suicide and Crisis Lifeline (US)
  • 📞 Samaritans (UK): 116 123
  • 📞 AASRA (India): +91-9820466726

πŸ› οΈ Technical Details

  • Base Model: unsloth/gemma-3-1b-it
  • Fine-Tuning Method: LoRA with Unsloth (4-bit QLoRA)
  • Dataset: Riyazmk/mentalhealth (first 1000 rows)
  • Language: English (en)
  • Quantization: 4-bit (q4_0) in GGUF format
  • Adapter Model: MeWan2808/scaai_gemma_1b

Example Training Setup

from unsloth import FastLanguageModel

# from_pretrained returns both the model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-1b-it",
    load_in_4bit=True,
)
# Attach LoRA adapters before training
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
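
For context, a minimal end-to-end training sketch with TRL's SFTTrainer, continuing from the setup above. The dataset_text_field name and hyperparameters are illustrative assumptions, not the exact recipe used for this release:

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# First 1000 rows of the fine-tuning dataset, as noted above
dataset = load_dataset("Riyazmk/mentalhealth", split="train[:1000]")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # hypothetical column name; check the dataset schema
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()

# Export a 4-bit GGUF like the one shipped here, using Unsloth's helper
# (the method name is passed through to llama.cpp's quantizer)
model.save_pretrained_gguf("gemma_1b_gguf", tokenizer, quantization_method="q4_0")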

Inference pipeline: User input → Tokenizer (converts text to tokens) → Gemma-3-1B model (LoRA fine-tuned, 4-bit) → Response generation → Empathetic response to the user

🚀 How to Use

1️⃣ Python with Transformers and PEFT

Install dependencies:

pip install transformers peft torch

Run inference:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model = "unsloth/gemma-3-1b-it"
adapter_model = "MeWan2808/scaai_gemma_1b"

# Load base tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Apply adapter
model = PeftModel.from_pretrained(model, adapter_model)

# Question
question = "I've been feeling anxious lately. What are some healthy coping strategies?"

# Encode
inputs = tokenizer(question, return_tensors="pt").to(model.device)

# Generate safely
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
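
Because the base model is instruction-tuned, you will generally get better-formatted replies by wrapping the question in the model's chat template rather than passing raw text. A small optional tweak to the snippet above:

# Optional: format the prompt with the tokenizer's chat template
messages = [{"role": "user", "content": question}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))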

2️⃣ Python with Unsloth

Install dependencies:

pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes

Run inference:

from unsloth import FastLanguageModel

# Note: this loads the adapter repo directly; if the base model is not
# resolved automatically, load the base model and apply the adapter with
# PEFT as shown in section 1.
model, tokenizer = FastLanguageModel.from_pretrained(
    "MeWan2808/scaai_gemma_1b",
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

prompt = "I'm feeling very anxious lately. Can you give me some coping tips?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
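
For interactive use, you can stream tokens as they are generated instead of waiting for the full reply. A small optional addition using the transformers TextStreamer:

from transformers import TextStreamer

# Prints tokens to stdout as they are produced; skip_prompt hides the echoed input
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(**inputs, max_new_tokens=200, streamer=streamer)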

3️⃣ llama.cpp (GGUF)

llama-cli -m ./gemma_1b_q4_0.gguf -p "How can I manage stress before an exam?"
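
The same GGUF file can also be loaded from Python via the llama-cpp-python bindings (a separate package from the llama.cpp CLI). A minimal sketch:

# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="./gemma_1b_q4_0.gguf", n_ctx=2048)
result = llm(
    "How can I manage stress before an exam?",
    max_tokens=200,
    temperature=0.7,
)
print(result["choices"][0]["text"])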

4️⃣ Ollama

ollama create gemma-mental-health -f ./Modelfile
ollama run gemma-mental-health
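
The Modelfile referenced above can be minimal. For example (the SYSTEM prompt and temperature are illustrative assumptions, not settings shipped with the model):

FROM ./gemma_1b_q4_0.gguf
PARAMETER temperature 0.7
SYSTEM You are an empathetic, non-judgmental assistant for supportive mental health conversations. You do not diagnose conditions or prescribe treatment.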

5️⃣ React Native / Mobile

  • Bundle gemma_1b_q4_0.gguf in app assets
  • Use llama.cpp mobile bindings or llama-rn for local inference
  • Process user input offline for privacy and performance

📊 Example Interaction

Prompt:

I've been feeling down lately and don't know why.

Response:

I'm really sorry you're going through this; it's tough when those feelings creep in without a clear cause. Try jotting down your thoughts in a journal to spot any patterns, or treat yourself to small moments of care, like a relaxing walk or deep breathing. You're not alone, and your feelings matter.


📂 Available Files

  • gemma_1b_q4_0.gguf: quantized model for local inference
  • config.json, tokenizer.json: model and tokenizer configurations
  • README.md: this documentation

📅 Changelog

  • v1.0: Initial release with LoRA fine-tuning and 4-bit quantization

🙌 Acknowledgements

  • Google for the Gemma-3-1B-IT base model (used via unsloth/gemma-3-1b-it)
  • UnslothAI for efficient fine-tuning tools
  • Riyazmk/mentalhealth dataset contributors

📜 License

Licensed under Apache 2.0: free to use, modify, and distribute with proper attribution.
