RoBERTa – BABE – HA-FT

This repository provides a RoBERTa-base model fine-tuned on the BABE (Bias Annotations By Experts) dataset for sentence-level lexical/loaded-language bias detection in English news text. BABE was introduced in the paper Neural Media Bias Detection Using Distant Supervision With BABE – Bias Annotations By Experts.

Labels

  • 0 → neutral / non-lexical-bias
  • 1 → lexical-bias
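
To see how the checkpoint itself names these classes, you can inspect its config. Whether the authors saved a custom id2label mapping is not guaranteed; without one, transformers falls back to the generic LABEL_0/LABEL_1 names:

    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained("mediabiasgroup/roberta-babe-ft")
    # Prints e.g. {0: "neutral", 1: "lexical_bias"} if the mapping was saved,
    # otherwise the default {0: "LABEL_0", 1: "LABEL_1"}.
    print(cfg.id2label)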

Intended use & limitations

  • Intended use: research and benchmarking of lexical bias at the sentence level on news-like English text.
  • Out-of-scope: detection of informational/selection bias, stance, political leaning, or factuality; production deployments without human oversight.

How to use

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    m = "mediabiasgroup/roberta-babe-ft"
    tok = AutoTokenizer.from_pretrained(m)
    model = AutoModelForSequenceClassification.from_pretrained(m)
    model.eval()  # disable dropout for deterministic inference

    text = "Democrats shamelessly rammed the bill through Congress."
    with torch.no_grad():  # no gradients needed at inference time
        logits = model(**tok(text, return_tensors="pt")).logits
    probs = logits.softmax(-1).squeeze(0).tolist()
    print({"neutral": probs[0], "lexical_bias": probs[1]})
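
For batch scoring, the same checkpoint also works through the high-level pipeline API (a generic transformers entry point, not anything specific to this model). The label strings in the output depend on the checkpoint's id2label mapping; the LABEL_0/LABEL_1 names in the comment are the library defaults:

    from transformers import pipeline

    clf = pipeline("text-classification", model="mediabiasgroup/roberta-babe-ft")
    sentences = [
        "Democrats shamelessly rammed the bill through Congress.",
        "The Senate passed the bill on a 52-48 vote.",
    ]
    # Each result is a dict like {"label": "LABEL_1", "score": 0.97}, where
    # LABEL_0/LABEL_1 correspond to neutral/lexical-bias unless the config
    # defines custom names.
    for s, r in zip(sentences, clf(sentences)):
        print(s, "->", r["label"], round(r["score"], 3))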

Training data & setup

  • Data: BABE (expert-annotated, sentence-level lexical bias).
  • Backbone: roberta-base (~125M parameters) with a standard sequence-classification head.
  • Training: a single fine-tuning run with standard hyperparameters; the exact configuration is not published (see the sketch below).
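
Because the exact configuration is not published, the following is only a minimal sketch of a comparable fine-tuning run: the dataset identifier mediabiasgroup/BABE, the text/label column names, and every hyperparameter below are assumptions, not the authors' recorded setup.

    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    ds = load_dataset("mediabiasgroup/BABE")  # assumed dataset id
    tok = AutoTokenizer.from_pretrained("roberta-base")

    def encode(batch):
        # "text" is an assumed column name; Trainer reads labels from "label"/"labels".
        return tok(batch["text"], truncation=True, max_length=128)

    ds = ds.map(encode, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
    args = TrainingArguments(
        output_dir="roberta-babe-ft",
        learning_rate=2e-5,              # typical for RoBERTa-base, not confirmed
        per_device_train_batch_size=32,  # assumed
        num_train_epochs=3,              # assumed
        weight_decay=0.01,
    )
    Trainer(model=model, args=args, train_dataset=ds["train"], tokenizer=tok).train()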

Safety, bias & ethics

Media-bias perception is subjective and context-dependent. This model may over-flag emotionally charged wording. Keep a human in the loop and avoid punitive or outlet-level decisions without careful validation.
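
One practical pattern is to act only on high-confidence predictions and route everything else to a human reviewer. The sketch below does exactly that; the 0.9 threshold is an illustrative assumption, not a validated operating point.

    REVIEW_THRESHOLD = 0.9  # illustrative; calibrate on held-out data

    def triage(probs):
        """Route a sentence given [p_neutral, p_lexical_bias]."""
        if max(probs) < REVIEW_THRESHOLD:
            return "human_review"  # defer uncertain cases to a person
        return "lexical_bias" if probs[1] >= probs[0] else "neutral"

    print(triage([0.55, 0.45]))  # -> human_review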

Citation

If you use this model or the dataset, please cite:

    @article{spinde2022neural,
      title   = {Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts},
      author  = {Spinde, Timo and Plank, Manuel and Krieger, Jan-David and Ruas, Terry and Gipp, Bela and Aizawa, Akiko},
      journal = {arXiv preprint arXiv:2209.14557},
      year    = {2022},
      url     = {https://arxiv.org/abs/2209.14557}
    }