monostate-model-c55fca35
This model is a fine-tuned version of unsloth/gemma-3-270m-it.
Model Description
This model was fine-tuned on the Monostate training platform using LoRA (Low-Rank Adaptation) for parameter-efficient training.
Training Details
Training Data
- Dataset size: 162 samples
- Training type: Supervised Fine-Tuning (SFT)
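The data schema is not published with this card; a typical SFT record for a run like this might look like the following (purely illustrative field names and content, not taken from the actual dataset):

```python
# Illustrative only: the real dataset schema and contents are not part of this card.
example_record = {
    "prompt": "Summarize the following text: ...",
    "completion": "The text describes ...",
}
```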
Training Procedure
Training Hyperparameters
- Training regime: Mixed precision (fp16)
- Optimizer: AdamW
- LoRA rank: 128
- LoRA alpha: 128
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
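For reference, the adapter settings listed above correspond roughly to the following peft `LoraConfig` (a minimal sketch; dropout, bias, and task type are assumptions, as they are not reported in this card, and the actual Monostate training code is not shown here):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,                 # LoRA rank, from the hyperparameters above
    lora_alpha=128,        # LoRA alpha, from the hyperparameters above
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,     # assumption: not reported in this card
    bias="none",           # assumption
    task_type="CAUSAL_LM",
)
```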
Training Results
- Final loss: 1.0929
- Training time: 0.57 minutes (≈34 seconds)
- Generated on: 2025-09-15T11:07:20.866246
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("andrewmonostate/monostate-model-c55fca35")
tokenizer = AutoTokenizer.from_pretrained("andrewmonostate/monostate-model-c55fca35")

# Generate text
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
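Because the base model is the instruction-tuned gemma-3-270m-it, chat-formatted prompts may work better than raw text. A minimal sketch using the tokenizer's chat template (assuming the fine-tune preserves the Gemma chat format):

```python
# Build a chat-formatted prompt; assumes the tokenizer ships a Gemma chat template.
messages = [{"role": "user", "content": "Your prompt here"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```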
Framework Versions
- Transformers: 4.40+
- PyTorch: 2.0+
- Datasets: 2.0+
- Tokenizers: 0.19+
License
This model is licensed under the Apache 2.0 License.
Citation
If you use this model, please cite:
```bibtex
@misc{andrewmonostate_monostate_model_c55fca35,
  title={monostate-model-c55fca35},
  author={Monostate},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/andrewmonostate/monostate-model-c55fca35}
}
```
Training Platform
This model was trained using Monostate, an AI training and deployment platform.