# 🧠❤️ Gemma-3-1B – Empathetic Mental Health Assistant
A fine-tuned, lightweight conversational model designed for supportive and empathetic responses in the mental health domain.
## 📌 Overview

This model is a fine-tuned version of Unsloth's Gemma-3-1B-IT, specifically adapted to provide helpful, empathetic, and non-judgmental responses for mental health-related queries.
Optimized for low-resource environments, it leverages:
- Unsloth 4-bit LoRA fine-tuning for efficiency
- GGUF format for a small memory footprint
- Fast inference on desktop, mobile, and edge devices
The adapter model is available at [MeWan2808/scaai_gemma_1b](https://huggingface.co/MeWan2808/scaai_gemma_1b).
## ✨ Key Features

- ✅ **Empathetic Responses** – Tailored for supportive mental health conversations
- ✅ **Lightweight** – 1B parameters, quantized to 4-bit for minimal resource use
- ✅ **Cross-Platform** – Compatible with `llama.cpp`, `ollama`, React Native, and the Hugging Face Inference API
- ✅ **Low Memory Footprint** – Runs on devices with less than 8 GB of RAM (see the back-of-envelope estimate below)
- ✅ **Domain-Specific** – Fine-tuned on the first 1,000 rows of the `Riyazmk/mentalhealth` dataset
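
As a rough sanity check on that memory claim, here is a back-of-envelope sketch (illustrative numbers only; real usage also depends on context length, KV cache, and runtime overhead):

```python
# Back-of-envelope weight memory for a 4-bit quantized 1B-parameter model
params = 1_000_000_000
bits_per_weight = 4.5  # q4_0 stores 4-bit weights plus a per-block fp16 scale (~0.5 extra bits)
weights_gb = params * bits_per_weight / 8 / 1024**3
print(f"~{weights_gb:.2f} GB of weights")  # ≈ 0.52 GB, well under an 8 GB budget
```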
## ⚠️ Ethical Notice & Limitations

🚨 **This AI is NOT a substitute for professional mental health care.**

It is not designed for:
- Crisis intervention
- Diagnosing mental health conditions
- Prescribing treatments or medications

📢 If you're in a crisis, please reach out to professionals:
- 📞 988 Suicide and Crisis Lifeline (US)
- 📞 Samaritans (UK) – 116 123
- 📞 AASRA (India) – +91-9820466726
## 🛠️ Technical Details

- **Base Model:** `unsloth/gemma-3-1b-it`
- **Fine-Tuning Method:** LoRA with Unsloth (4-bit QLoRA)
- **Dataset:** `Riyazmk/mentalhealth` (first 1000 rows)
- **Language:** English (`en`)
- **Quantization:** 4-bit (`q4_0`) in GGUF format
- **Adapter Model:** `MeWan2808/scaai_gemma_1b`
### Example Training Command

```python
from unsloth import FastLanguageModel

# from_pretrained returns a (model, tokenizer) tuple
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-1b-it",
    load_in_4bit=True,
)
# Attach LoRA adapters (values are illustrative, not the exact training config)
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```
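
From there, a standard TRL supervised fine-tuning loop completes the picture. A minimal sketch, assuming the dataset exposes a plain `text` column (hyperparameters are illustrative, not the recorded training configuration):

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# First 1000 rows, matching the dataset slice described above
dataset = load_dataset("Riyazmk/mentalhealth", split="train[:1000]")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: adjust to the dataset's actual text column
    max_seq_length=1024,
    args=TrainingArguments(per_device_train_batch_size=2, num_train_epochs=1, output_dir="outputs"),
)
trainer.train()

# Export to GGUF for llama.cpp / Ollama (method name assumed to match the q4_0 artifact)
model.save_pretrained_gguf("gemma_1b", tokenizer, quantization_method="q4_0")
```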
The inference pipeline, end to end:

```mermaid
graph LR
    A[User Input] --> B[Tokenizer<br>Converts to Tokens]
    B --> C[Gemma-3-1B Model<br>LoRA Fine-Tuned, 4-bit]
    C --> D[Response Generation<br>"Try deep breathing..."]
    D --> E[User Output<br>Empathetic Response]
```

---
## 🚀 How to Use
### **1️⃣ Python with Transformers and PEFT**
Install dependencies:
```bash
pip install transformers peft torch
```
Run inference:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model = "unsloth/gemma-3-1b-it"
adapter_model = "MeWan2808/scaai_gemma_1b"

# Load base tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Apply the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_model)

# Question
question = "I've been feeling anxious lately. What are some healthy coping strategies?"

# Encode
inputs = tokenizer(question, return_tensors="pt").to(model.device)

# Generate safely
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
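
Gemma-IT checkpoints are chat-tuned, so wrapping the question in the tokenizer's chat template usually produces better-formed answers. A hedged variant of the generation call above:

```python
# Optional: format the question with the model's chat template before generating
messages = [{"role": "user", "content": question}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```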
### **2️⃣ Python with Unsloth**

Install dependencies:

```bash
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
```
Run inference:
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "MeWan2808/scaai_gemma_1b",  # adapter repo; Unsloth resolves the 4-bit base model automatically
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to fast inference mode

prompt = "I'm feeling very anxious lately. Can you give me some coping tips?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### **3️⃣ llama.cpp (GGUF)**

```bash
llama-cli -m ./gemma_1b_q4_0.gguf -p "How can I manage stress before an exam?"
```
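
The same GGUF file can also be driven from Python via the llama-cpp-python bindings (an addition for convenience; the package is not part of the original instructions):

```python
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="./gemma_1b_q4_0.gguf", n_ctx=2048)
out = llm(
    "How can I manage stress before an exam?",
    max_tokens=200,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```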
### **4️⃣ Ollama**

```bash
ollama create gemma-mental-health -f ./Modelfile
ollama run gemma-mental-health
```
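
The `Modelfile` referenced above is not included in the file listing. A minimal sketch, assuming the GGUF file sits in the same directory (the system prompt is illustrative, not shipped with the model):

```
FROM ./gemma_1b_q4_0.gguf
PARAMETER temperature 0.7
SYSTEM "You are a supportive, empathetic mental health assistant. You are not a substitute for professional care."
```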
### **5️⃣ React Native / Mobile**

- Bundle `gemma_1b_q4_0.gguf` in app assets
- Use `llama.cpp` mobile bindings or `llama-rn` for local inference
- Process user input offline for privacy and performance
## 💬 Example Interaction

**Prompt:**

> I've been feeling down lately and don't know why.

**Response:**

> I'm really sorry you're going through this – it's tough when those feelings creep in without a clear cause. Try jotting down your thoughts in a journal to spot any patterns, or treat yourself to small moments of care, like a relaxing walk or deep breathing. You're not alone, and your feelings matter.
## 📂 Available Files

- `gemma_1b_q4_0.gguf` – Quantized model for local inference (download snippet below)
- `config.json`, `tokenizer.json` – Model and tokenizer configurations
- `README.md` – This documentation
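
A hedged snippet for fetching the GGUF artifact programmatically (assuming it is hosted in the same repository as the adapter):

```python
from huggingface_hub import hf_hub_download

# Assumption: the GGUF file lives in the MeWan2808/scaai_gemma_1b repository
gguf_path = hf_hub_download(
    repo_id="MeWan2808/scaai_gemma_1b",
    filename="gemma_1b_q4_0.gguf",
)
print(gguf_path)
```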
## 📝 Changelog

- v1.0 – Initial release with LoRA fine-tuning and 4-bit quantization
## 🙏 Acknowledgements

- Google for the Gemma 3 model family
- Unsloth for the `unsloth/gemma-3-1b-it` base model and efficient fine-tuning tools
- Contributors to the `Riyazmk/mentalhealth` dataset
## 📜 License

Licensed under Apache 2.0 – free to use, modify, and distribute with proper attribution.