Gemma-3-270M Dhivehi — Classification Model

Compact Dhivehi (ދިވެހި) classification model based on google/gemma-3-270m-it, designed to perform various text classification tasks in Dhivehi including sentiment analysis, topic classification, intent recognition, and opinion mining.

Note: This model is specifically tuned for classification tasks and provides structured JSON outputs for easy integration into downstream applications.

Model details

  • Base: google/gemma-3-270m-it
  • Language: Dhivehi
  • Style: Analytical, structured, consistent
  • Output format: JSON-structured classification results
  • Supported tasks:
    • Sentiment analysis (positive, negative, neutral)
    • Topic classification (news, politics, sports, etc.)
    • Intent recognition (question, statement, request, etc.)
    • Opinion mining (support, oppose, neutral)

Intended use

  • Dhivehi text classification for content moderation
  • Sentiment analysis of Dhivehi social media content
  • Topic categorization for news and content organization
  • Intent classification for chatbot and customer service applications
  • Opinion analysis for research and analytics

Not intended for: open-domain factual lookup, legal/medical advice, or long multi-turn conversations.

How to use

Model Setup

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import json

model_path = "alakxender/gemma-3-270m-dhivehi-text-classifier"

model = AutoModelForCausalLM.from_pretrained(
    model_path, 
    torch_dtype="auto", 
    device_map="auto", 
    attn_implementation="eager"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

Classification Function Definitions

Here are the function definitions for each classification task:

def analyze_sentiment(dhivehi_text: str, model, tokenizer, pipe, max_new_tokens: int = 128, temperature: float = 0.1, top_p: float = 0.2, top_k: int = 5, do_sample: bool = False):
    """
    Analyze sentiment of Dhivehi text
    
    Args:
        dhivehi_text: Input Dhivehi text
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with sentiment analysis result
    """
    instruction = f"Analyze the sentiment of this Dhivehi text: {dhivehi_text}"
    return _generate_classification_response(instruction, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)

def classify_topic(dhivehi_text: str, model, tokenizer, pipe, max_new_tokens: int = 128, temperature: float = 0.1, top_p: float = 0.2, top_k: int = 5, do_sample: bool = False):
    """
    Classify topic of Dhivehi text
    
    Args:
        dhivehi_text: Input Dhivehi text
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with topic classification result
    """
    instruction = f"Determine the topic of this Dhivehi text: {dhivehi_text}"
    return _generate_classification_response(instruction, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)

def identify_intent(dhivehi_text: str, model, tokenizer, pipe, max_new_tokens: int = 128, temperature: float = 0.1, top_p: float = 0.2, top_k: int = 5, do_sample: bool = False):
    """
    Identify intent in Dhivehi text
    
    Args:
        dhivehi_text: Input Dhivehi text
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with intent identification result
    """
    instruction = f"Identify the intent behind this Dhivehi text: {dhivehi_text}"
    return _generate_classification_response(instruction, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)

def classify_opinion(dhivehi_text: str, model, tokenizer, pipe, max_new_tokens: int = 128, temperature: float = 0.1, top_p: float = 0.2, top_k: int = 5, do_sample: bool = False):
    """
    Classify opinion in Dhivehi text
    
    Args:
        dhivehi_text: Input Dhivehi text
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with opinion classification result
    """
    instruction = f"Classify the opinion expressed in this Dhivehi text: {dhivehi_text}"
    return _generate_classification_response(instruction, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)

def analyze_all(dhivehi_text: str, model, tokenizer, pipe, max_new_tokens: int = 128, temperature: float = 0.1, top_p: float = 0.2, top_k: int = 5, do_sample: bool = False):
    """
    Run all analysis tasks on the input text
    
    Args:
        dhivehi_text: Input Dhivehi text
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with all analysis results
    """
    if pipe is None or model is None or tokenizer is None:
        return {"error": "Please load a classification model first!"}
    
    try:
        opinion_result = classify_opinion(dhivehi_text, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)
        sentiment_result = analyze_sentiment(dhivehi_text, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)
        intent_result = identify_intent(dhivehi_text, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)
        topic_result = classify_topic(dhivehi_text, model, tokenizer, pipe, max_new_tokens, temperature, top_p, top_k, do_sample)
        
        # Extract the actual values from nested results
        results = {
            "opinion": opinion_result.get("opinion", "unknown") if isinstance(opinion_result, dict) else opinion_result,
            "sentiment": sentiment_result.get("sentiment", "unknown") if isinstance(sentiment_result, dict) else sentiment_result,
            "intent": intent_result.get("intent", "unknown") if isinstance(intent_result, dict) else intent_result,
            "topic": topic_result.get("topic", "unknown") if isinstance(topic_result, dict) else topic_result
        }
        
        return results
    except Exception as e:
        return {"error": f"Error in complete analysis: {str(e)}"}

def _generate_classification_response(instruction: str, model, tokenizer, pipe, max_new_tokens: int, temperature: float, top_p: float, top_k: int, do_sample: bool):
    """
    Helper function to generate classification response
    
    Args:
        instruction: The instruction for the model
        model: Loaded model
        tokenizer: Loaded tokenizer
        pipe: Loaded pipeline
        max_new_tokens: Maximum tokens to generate
        temperature: Sampling temperature
        top_p: Top-p sampling parameter
        top_k: Top-k sampling parameter
        do_sample: Whether to use sampling
        
    Returns:
        Dictionary with classification result
    """
    if pipe is None or model is None or tokenizer is None:
        return {"error": "Please load a classification model first!"}
    
    try:
        messages = [{"role": "user", "content": instruction}]
        prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        
        # Generation parameters
        if do_sample:
            gen_kwargs = {
                "max_new_tokens": max_new_tokens,
                "temperature": temperature,
                "top_p": top_p,
                "top_k": top_k,
                "do_sample": True,
                "disable_compile": True,
                "pad_token_id": tokenizer.eos_token_id
            }
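        else:
            # Pass do_sample=False explicitly so decoding stays greedy even if
            # the model's generation config enables sampling by default.
            gen_kwargs = {
                "max_new_tokens": max_new_tokens,
                "do_sample": False,
                "disable_compile": True,
                "pad_token_id": tokenizer.eos_token_id
            }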
        
        outputs = pipe(prompt, **gen_kwargs)
        
        # Extract only the generated part
        response = outputs[0]['generated_text'][len(prompt):].strip()
        
        # Try to parse JSON response
        try:
            import json
            result = json.loads(response)
            return result
        except json.JSONDecodeError:
            return {"raw_response": response, "error": "Failed to parse JSON"}
            
    except Exception as e:
        return {"error": f"Error in classification: {str(e)}"}
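
The helper above falls back to a raw_response entry whenever json.loads cannot parse the whole generation. If the model ever surrounds the JSON object with extra text, a more forgiving parser may help. The following is a minimal sketch and not part of the original model card: extract_first_json_object is a hypothetical helper name, and the assumption that stray text can surround the JSON has not been verified against this model.

```python
import json
from typing import Any, Dict, Optional


def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object found in `text`, or None if there is none.

    Hypothetical helper: tries json.JSONDecoder.raw_decode at every '{' and
    ignores whatever follows the first object that parses successfully.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # This '{' did not start a valid object; try the next one.
        start = text.find("{", start + 1)
    return None
```

One possible use is to call it on the raw_response string before treating a result as unparseable.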

Usage Example

# Load your model, tokenizer, and pipeline first
# model, tokenizer, pipe = load_your_model()

# Example text for classification
# (roughly: "The Maldives is an island nation in the Indian Ocean.")
dhivehi_text = "ދިވެހިރާއްޖެއަކީ އިންޑިޔާ ކަނޑުގައި އޮންނަ ޖަޒީރާ ޤައުމެކެވެ."

# Run individual classifications
sentiment_result = analyze_sentiment(dhivehi_text, model, tokenizer, pipe)
topic_result = classify_topic(dhivehi_text, model, tokenizer, pipe)
intent_result = identify_intent(dhivehi_text, model, tokenizer, pipe)
opinion_result = classify_opinion(dhivehi_text, model, tokenizer, pipe)

# Or run all analyses at once
all_results = analyze_all(dhivehi_text, model, tokenizer, pipe)

print(f"Sentiment: {sentiment_result}")
print(f"Topic: {topic_result}")
print(f"Intent: {intent_result}")
print(f"Opinion: {opinion_result}")
print(f"Complete Analysis: {all_results}")

""" 
# Response: 
Sentiment: {'sentiment': 'Positive'}
Topic: {'topic': 'Politics'}
Intent: {'intent': 'Identification'}
Opinion: {'opinion': 'Neutral'}
Complete Analysis: {'opinion': 'Neutral', 'sentiment': 'Neutral', 'intent': 'Statement', 'topic': 'International'}
"""

Sample Classification Categories

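Note: These lists are samples, not exhaustive. The model may return labels that are not listed here, such as the 'International' topic and 'Identification' intent in the usage example output above.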

Sentiment Analysis

  • positive: Expresses positive emotions or favorable views
  • negative: Expresses negative emotions or unfavorable views
  • neutral: Expresses neutral or balanced views

Topic Classification

  • news: News articles and current events
  • politics: Political discussions and government
  • sports: Sports news and athletic activities
  • entertainment: Movies, music, and cultural events
  • technology: Tech news and innovations
  • health: Health and medical information
  • education: Educational content and academic topics
  • business: Business and economic news

Intent Recognition

  • question: Asking for information or clarification
  • statement: Making a factual statement
  • request: Asking for action or assistance
  • opinion: Expressing personal views
  • complaint: Expressing dissatisfaction
  • praise: Expressing appreciation or approval

Opinion Mining

  • support: Expressing agreement or support
  • oppose: Expressing disagreement or opposition
  • neutral: Expressing balanced or neutral stance
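
The usage example output shows capitalized labels ('Positive', 'Politics'), while the lists above use lowercase, and some returned labels are not listed at all. A small normalization step can make downstream comparisons more predictable. This is a sketch under stated assumptions: KNOWN_LABELS and normalize_label are hypothetical names, and the label sets are copied from the sample lists above rather than from the training data.

```python
from typing import Dict, Set, Tuple

# Sample label sets copied from the lists above; the model may emit others.
KNOWN_LABELS: Dict[str, Set[str]] = {
    "sentiment": {"positive", "negative", "neutral"},
    "topic": {
        "news", "politics", "sports", "entertainment",
        "technology", "health", "education", "business",
    },
    "intent": {"question", "statement", "request", "opinion", "complaint", "praise"},
    "opinion": {"support", "oppose", "neutral"},
}


def normalize_label(task: str, label: str) -> Tuple[str, bool]:
    """Trim and lowercase a label; report whether it is in the sample list for `task`."""
    cleaned = label.strip().lower()
    return cleaned, cleaned in KNOWN_LABELS.get(task, set())
```

Labels flagged as unlisted are not necessarily wrong; they may simply be categories outside the sample lists.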

Generation Tips

  • Use do_sample=False for consistent, deterministic classification results
  • Keep max_new_tokens modest (64-128) for focused classification outputs
  • The model outputs structured JSON for easy parsing
  • Dhivehi is RTL; ensure your environment renders RTL text correctly
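
Since Dhivehi is written in Thaana, a right-to-left script, it can help to check that an input really contains Thaana characters and to isolate it when printing inside left-to-right logs. The sketch below assumes the Unicode Thaana block (U+0780 to U+07BF) and the Unicode directional isolate characters; thaana_ratio and isolate_rtl are hypothetical helper names, not part of the model's API.

```python
THAANA_START, THAANA_END = 0x0780, 0x07BF  # Unicode Thaana block
RLI, PDI = "\u2067", "\u2069"  # right-to-left isolate, pop directional isolate


def thaana_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that fall in the Thaana block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    thaana = sum(THAANA_START <= ord(c) <= THAANA_END for c in chars)
    return thaana / len(chars)


def isolate_rtl(text: str) -> str:
    """Wrap text in RLI/PDI so it is laid out as an RTL run inside LTR output."""
    return f"{RLI}{text}{PDI}"
```

A low ratio can indicate non-Dhivehi input or text damaged by an encoding mismatch; any threshold is an application-level choice.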

Prompting Patterns

  • Sentiment Analysis

    • instruction: "Analyze the sentiment of this Dhivehi text: [text]"
    • expected output: {"sentiment": "positive/negative/neutral"}
  • Topic Classification

    • instruction: "Determine the topic of this Dhivehi text: [text]"
    • expected output: {"topic": "news/politics/sports/etc"}
  • Intent Recognition

    • instruction: "Identify the intent behind this Dhivehi text: [text]"
    • expected output: {"intent": "question/statement/request/etc"}
  • Opinion Mining

    • instruction: "Classify the opinion expressed in this Dhivehi text: [text]"
    • expected output: {"opinion": "support/oppose/neutral"}
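
The four patterns above can be kept in one table so that the instruction text and the expected JSON key stay in sync. This is an illustrative sketch: PROMPT_PATTERNS, build_instruction, and has_expected_key are hypothetical names, while the template strings and keys are copied from the patterns listed above.

```python
from typing import Any, Dict

# Instruction templates and expected JSON keys, copied from the patterns above.
PROMPT_PATTERNS: Dict[str, Dict[str, str]] = {
    "sentiment": {
        "template": "Analyze the sentiment of this Dhivehi text: {text}",
        "key": "sentiment",
    },
    "topic": {
        "template": "Determine the topic of this Dhivehi text: {text}",
        "key": "topic",
    },
    "intent": {
        "template": "Identify the intent behind this Dhivehi text: {text}",
        "key": "intent",
    },
    "opinion": {
        "template": "Classify the opinion expressed in this Dhivehi text: {text}",
        "key": "opinion",
    },
}


def build_instruction(task: str, text: str) -> str:
    """Fill the instruction template for `task` with the input text."""
    if task not in PROMPT_PATTERNS:
        raise ValueError(f"Unknown task: {task!r}")
    return PROMPT_PATTERNS[task]["template"].format(text=text)


def has_expected_key(task: str, result: Any) -> bool:
    """Check that a parsed result is a dict containing the key expected for `task`."""
    return isinstance(result, dict) and PROMPT_PATTERNS[task]["key"] in result
```

The string returned by build_instruction can be used as the content of the user message, as in the function definitions earlier.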

Limitations

  • Classification accuracy depends on the quality and relevance of the input text
  • May struggle with ambiguous or mixed-sentiment text
  • Domain expertise is limited to patterns seen in the training data
  • Long texts may exceed context limits; provide focused excerpts for best results
  • The model is specifically trained for classification tasks in Dhivehi
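
For the long-text limitation, one way to provide a focused excerpt is to keep only the leading sentences that fit within a budget. The following is a rough sketch and not the author's method: focused_excerpt is a hypothetical helper, the budget value is up to the caller, and splitting on '.', '!', '?' and the Arabic question mark is an assumption about how sentences end in the input.

```python
import re
from typing import Callable, List

# Split after sentence-ending punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?\u061f])\s+")


def focused_excerpt(text: str, budget: int, measure: Callable[[str], int] = len) -> str:
    """Keep leading whole sentences while measure(excerpt) stays within budget."""
    stripped = text.strip()
    kept: List[str] = []
    for sentence in _SENTENCE_END.split(stripped):
        candidate = " ".join(kept + [sentence])
        if measure(candidate) > budget:
            break
        kept.append(sentence)
    if kept:
        return " ".join(kept)
    # Even the first sentence is over budget: fall back to a character-based
    # cut, which is only a rough approximation when `measure` counts tokens.
    return stripped[:budget]
```

By default the budget is counted in characters. To count tokens instead, a callable such as lambda s: len(tokenizer(s)["input_ids"]) could be passed as measure, assuming the tokenizer loaded in the setup section.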

Model size: 268M params (Safetensors, tensor type BF16)