monostate-model-c55fca35
This model is a fine-tuned version of unsloth/gemma-3-270m-it.
Model Description
This model was fine-tuned on the Monostate training platform using LoRA (Low-Rank Adaptation) for parameter-efficient training.
Training Details
Training Data
- Dataset size: 162 samples
- Training type: Supervised Fine-Tuning (SFT)
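The data schema is not published with this card; a typical SFT record for a run like this might look like the following (purely illustrative field names and content, not taken from the actual dataset):

```python
# Illustrative only: the real dataset schema and contents are not part of this card.
example_record = {
    "prompt": "Summarize the following text: ...",
    "completion": "The text describes ...",
}
```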
Training Procedure
Training Hyperparameters
- Training regime: Mixed precision (fp16)
- Optimizer: AdamW
- LoRA rank: 128
- LoRA alpha: 128
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
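For reference, the adapter settings listed above correspond roughly to the following peft `LoraConfig` (a minimal sketch; dropout, bias, and task type are assumptions, as they are not reported in this card, and the actual Monostate training code is not shown here):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,                 # LoRA rank, from the hyperparameters above
    lora_alpha=128,        # LoRA alpha, from the hyperparameters above
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,     # assumption: not reported in this card
    bias="none",           # assumption
    task_type="CAUSAL_LM",
)
```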
Training Results
- Final loss: 1.0929
- Training time: 0.57 minutes (≈34 seconds)
- Generated on: 2025-09-15T11:07:20.866246
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("andrewmonostate/monostate-model-c55fca35")
tokenizer = AutoTokenizer.from_pretrained("andrewmonostate/monostate-model-c55fca35")

# Generate text
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
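Because the base model is the instruction-tuned gemma-3-270m-it, chat-formatted prompts may work better than raw text. A minimal sketch using the tokenizer's chat template (assuming the fine-tune preserves the Gemma chat format):

```python
# Build a chat-formatted prompt; assumes the tokenizer ships a Gemma chat template.
messages = [{"role": "user", "content": "Your prompt here"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```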
Framework Versions
- Transformers: 4.40+
- PyTorch: 2.0+
- Datasets: 2.0+
- Tokenizers: 0.19+
License
This model is licensed under the Apache 2.0 License.
Citation
If you use this model, please cite:
```bibtex
@misc{andrewmonostate_monostate_model_c55fca35,
  title={monostate-model-c55fca35},
  author={Monostate},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/andrewmonostate/monostate-model-c55fca35}
}
```
Training Platform
This model was trained using Monostate, an AI training and deployment platform.