BABE
This repository provides a RoBERTa-base model fine-tuned on the BABE (Bias Annotations By Experts) dataset for sentence-level lexical/loaded-language bias detection in English news text. BABE was introduced in the paper "Neural Media Bias Detection Using Distant Supervision With BABE - Bias Annotations By Experts".
Labels

0 → neutral / non-lexical-bias
1 → lexical-bias

Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-babe-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)

# Score one sentence; softmax over the two logits gives class probabilities.
text = "Democrats shamelessly rammed the bill through Congress."
probs = model(**tok(text, return_tensors="pt")).logits.softmax(-1).tolist()[0]
print({"neutral": probs[0], "lexical_bias": probs[1]})
```
Architecture and limitations

The model is roberta-base with a standard sequence-classification head. Media-bias perception is subjective and context-dependent, and this model may over-flag emotionally charged wording. Keep a human in the loop, and avoid punitive or outlet-level decisions without careful validation.
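One lightweight way to keep a human in the loop is to act only on high-confidence predictions and route everything else to review. This is a sketch, not a procedure from the paper; the 0.9 threshold is an arbitrary illustration and should be calibrated on held-out, human-annotated data:

```python
def triage(prob_bias: float, threshold: float = 0.9) -> str:
    """Route a sentence based on the model's lexical-bias probability.

    The threshold is hypothetical; tune it on validation data before use.
    """
    if prob_bias >= threshold:
        return "flag-for-editor"   # high confidence: surface to a human editor
    if prob_bias <= 1 - threshold:
        return "likely-neutral"    # high confidence the wording is neutral
    return "human-review"          # uncertain: defer to a person

print(triage(0.97))  # flag-for-editor
print(triage(0.55))  # human-review
```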
If you use this model or the dataset, please cite:
```bibtex
@article{spinde2022neural,
  title   = {Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts},
  author  = {Spinde, Timo and Plank, Manuel and Krieger, Jan-David and Ruas, Terry and Gipp, Bela and Aizawa, Akiko},
  journal = {arXiv preprint arXiv:2209.14557},
  year    = {2022},
  url     = {https://arxiv.org/abs/2209.14557}
}
```
Base model
FacebookAI/roberta-base
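To confirm the head shape and label mapping, the checkpoint's config can be inspected directly. A quick check, assuming the repo's config.json populates id2label (it may instead report generic LABEL_0/LABEL_1 names):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mediabiasgroup/roberta-babe-ft")
print(cfg.num_labels)  # expected: 2 (neutral vs. lexical bias)
print(cfg.id2label)    # may be generic names if not set in the repo
```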