πŸ“° T5-small Fine-tuned on News Summarization (Custom 25K Dataset)

This is a t5-small model fine-tuned for abstractive news summarization on a custom dataset of 25,000 news articles paired with human-written summaries.


πŸ”§ Model Details

  • Base Model: t5-small (by Google), ~60.5M parameters
  • Layers Trained: only 2 encoder and 2 decoder blocks were fine-tuned; all other layers were kept frozen (see the sketch after this list)
  • Task: abstractive summarization of news articles
  • Dataset Size: 25K article-summary pairs
  • Precision: mixed precision (fp16)
  • Training Epochs: 10
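
One way to reproduce this partial-layer setup is sketched below. The card does not state which blocks were trained, so unfreezing the last two of each stack is an illustrative assumption, not the author's exact recipe:

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze two encoder and two decoder blocks (illustrative choice of the
# last two; the card does not specify which layers were trained).
for block in list(model.encoder.block[-2:]) + list(model.decoder.block[-2:]):
    for param in block.parameters():
        param.requires_grad = True

# Keeping the LM head trainable is a common choice when most layers are frozen.
for param in model.lm_head.parameters():
    param.requires_grad = True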

πŸ§ͺ Training Configuration

Using Hugging Face's Seq2SeqTrainingArguments:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="steps",
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=10,
    predict_with_generate=False,
    generation_max_length=128,
    generation_num_beams=4,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    logging_first_step=True,
    fp16=True,                       # mixed-precision training
    gradient_accumulation_steps=3,   # effective train batch of 24 per device
    dataloader_num_workers=4,
    report_to=None,
    eval_accumulation_steps=3,
    label_smoothing_factor=0.1,
)
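
These arguments can be wired into a Seq2SeqTrainer roughly as follows. This is a sketch: the dataset variables are placeholders for the tokenized 25K split, not names from the original training script.

from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,   # placeholder: tokenized article-summary pairs
    eval_dataset=tokenized_val,      # placeholder: held-out validation split
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()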

πŸ“ˆ Training Progress

(Figure: training and validation loss over the 10 epochs.)


πŸ“Š ROUGE Scores Over Epochs

(Figure: ROUGE metrics across epochs.)


Evaluation Metrics

Metric        Value (approx.)
ROUGE-1       ~32.2
ROUGE-2       ~11.0
ROUGE-L       ~18.2
ROUGE-Lsum    ~28.7
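
Scores like these can be computed with the evaluate library on decoded model outputs. A minimal sketch, where predictions and references are placeholder lists of generated and reference summary strings:

import evaluate

rouge = evaluate.load("rouge")

# predictions / references: placeholder lists of decoded and reference summaries.
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({name: round(value * 100, 1) for name, value in scores.items()})
# Yields rouge1, rouge2, rougeL, and rougeLsum as percentages.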

How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Harsh-Gupta/t5-small-news-summarizer")
model = AutoModelForSeq2SeqLM.from_pretrained("Harsh-Gupta/t5-small-news-summarizer")

text = "Your news article goes here."
# T5 expects the task prefix "summarize: " before the article text.
inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

Notes

Because only a small subset of layers was fine-tuned, the model remains lightweight and well suited to lower-compute environments.

It is a good fit for real-time or edge applications that summarize news, reports, and other factual text.

The generation length (generation_max_length during training, max_length at inference) can be tuned for long articles; an illustrative example follows.
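
For instance, longer articles may benefit from a higher length cap. The values below are illustrative rather than tuned settings from the card, reusing model and inputs from the snippet above:

summary_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=200,           # raise the cap for long articles (illustrative)
    min_length=40,            # avoid overly short summaries (illustrative)
    num_beams=4,
    length_penalty=1.2,       # mildly favour longer outputs (illustrative)
    no_repeat_ngram_size=3,   # reduce repeated phrases
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))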


πŸ“ Intended Use

πŸ“° News summarization

🧾 Document compression

🧠 Low-resource abstractive summarization tasks
