Qwen3-8B-UnBias-Plus-SFT-Instruct

A fine-tuned version of Qwen3-8B for news media bias detection and neutral rewriting, developed by the Vector Institute as part of the UnBias-Plus project.

Given a news article, the model identifies biased language segments, classifies their bias type and severity, provides neutral replacements, and returns a fully rewritten unbiased version — all in a single structured JSON response.

This is the Instruct variant — trained without chain-of-thought thinking blocks (enable_thinking=False). It produces clean structured JSON directly, making it suitable for production inference via vLLM or other OpenAI-compatible backends.

Trained on: train_3 — UnBias-Plus-3000 (3,000 expert-annotated news articles)

Difference from Qwen3-8B-UnBias-Plus-SFT-Instruct-Legacy

| | Legacy (this model's predecessor) | SFT-Instruct (this model) |
|---|---|---|
| Training data | train_1 (original VLDBench) | train_3 (UnBias-Plus-3000) |
| Output schema | includes binary_label | bias_found bool only |
| Severity scale | 0, 2, 3, 4 | 0, 1, 2, 3, 4 |
| System prompt | Basic bias detection | Includes REWRITE RULES + SUCCESS CRITERIA + LANGUAGE HANDLING |
| Segment localization (recall @ words) | 0.714 | 0.900 |
| Segment replacement quality | 3.58 / 5 | 3.82 / 5 |
| Latency (median) | 27.4s | 22.2s |

Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-8B |
| Fine-tuning method | Supervised Fine-Tuning (SFT) with LoRA |
| Training precision | bf16 (16-bit, no quantization during training) |
| LoRA rank | 16 |
| Training framework | Unsloth + TRL |
| Context length | 8192 tokens |
| Thinking mode | Disabled (enable_thinking=False) |
| Output format | Structured JSON |
| Training dataset | UnBias-Plus-3000 (train_3) |

Usage

Using with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch, json

model_id = "vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

SYSTEM_PROMPT = """You are an expert linguist and bias detection specialist.
Your task is to carefully read a news article, detect ALL biased language,
and return a structured JSON response. Return ONLY valid JSON, no extra text."""

article = "Your news article here..."

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Analyze the following article for bias and return the result in the required JSON format.\n\nARTICLE:\n{article}"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
    return_dict=True,
    truncation=True,
    max_length=8192,
)

with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs["input_ids"].to(model.device),
        attention_mask=inputs["attention_mask"].to(model.device),
        max_new_tokens=4096,
        do_sample=False,
        temperature=None,
        top_p=None,
        pad_token_id=tokenizer.eos_token_id,
    )

new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
result = json.loads(response)
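
The final `json.loads(response)` call above raises if the reply contains anything besides the JSON object. The reported parse rate is 100% on the evaluation set, so this may rarely matter, but a defensive parser is cheap. The sketch below is an illustration only: `parse_model_json` is a hypothetical helper name, not part of the model's tooling, and it assumes the reply contains at most some stray text or a Markdown fence around a single JSON object.

```python
import json


def parse_model_json(response: str) -> dict:
    """Return the first JSON object found in the model reply.

    Hypothetical helper: tolerates leading or trailing text (for example a
    Markdown code fence) around the object, and raises ValueError otherwise.
    """
    text = response.strip()
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model response")
```

With this in place, `result = parse_model_json(response)` could stand in for the final `json.loads` call.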

Using with vLLM

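vllm serve vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct \
  --max-model-len 8192

Then query the server through its OpenAI-compatible API. The messages list is the same one built in the transformers example above.

from openai import OpenAI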
import json

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct",
    messages=messages,
    max_tokens=4096,
    temperature=0,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
result = json.loads(completion.choices[0].message.content)
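
Both the transformers and the vLLM snippets send the same two-message conversation. A small builder keeps the prompt text in one place. This is a minimal sketch: `build_messages` is a hypothetical helper name, and the prompt strings are copied from the transformers example above rather than from the training configuration.

```python
# Prompt text copied from the transformers example in this card.
SYSTEM_PROMPT = (
    "You are an expert linguist and bias detection specialist.\n"
    "Your task is to carefully read a news article, detect ALL biased language,\n"
    "and return a structured JSON response. Return ONLY valid JSON, no extra text."
)

USER_TEMPLATE = (
    "Analyze the following article for bias and return the result "
    "in the required JSON format.\n\nARTICLE:\n{article}"
)


def build_messages(article: str) -> list:
    """Build the chat messages for one article (hypothetical helper)."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_TEMPLATE.format(article=article)},
    ]
```

The result can be passed as `messages=build_messages(article)` to either `tokenizer.apply_chat_template` or `client.chat.completions.create`.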

Output Schema

{
  "severity": 0 | 1 | 2 | 3 | 4,
  "bias_found": true | false,
  "biased_segments": [
    {
      "original": "exact substring from input article",
      "replacement": "neutral alternative phrase",
      "severity": "high" | "medium" | "low",
      "bias_type": "loaded language | dehumanizing framing | false generalizations | framing bias | euphemism/dysphemism | politically charged terminology | sensationalism",
      "reasoning": "1-2 sentence explanation"
    }
  ],
  "unbiased_text": "Full rewritten neutral article"
}
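
Before using a response downstream, it can help to check it against the schema above, including the requirement that each `original` is an exact substring of the input article. The following is a sketch under stated assumptions: `validate_result` is a hypothetical helper, and it only checks field types and value ranges shown in the schema. It does not check `bias_type` against the listed labels, since the card does not say whether a segment carries one label or several.

```python
SEGMENT_SEVERITIES = ("high", "medium", "low")
SEGMENT_TEXT_FIELDS = ("original", "replacement", "bias_type", "reasoning")


def validate_result(result: dict, article: str) -> list:
    """Return a list of problems; an empty list means the result looks well-formed."""
    problems = []
    severity = result.get("severity")
    # `type(...) is int` rejects booleans, which Python treats as integers.
    if type(severity) is not int or severity not in range(5):
        problems.append("severity must be an integer from 0 to 4")
    if not isinstance(result.get("bias_found"), bool):
        problems.append("bias_found must be true or false")
    if not isinstance(result.get("unbiased_text"), str):
        problems.append("unbiased_text must be a string")

    segments = result.get("biased_segments")
    if not isinstance(segments, list):
        problems.append("biased_segments must be a list")
        return problems

    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            problems.append(f"segment {i} must be an object")
            continue
        for field in SEGMENT_TEXT_FIELDS:
            if not isinstance(seg.get(field), str):
                problems.append(f"segment {i}: {field} must be a string")
        if seg.get("severity") not in SEGMENT_SEVERITIES:
            problems.append(f"segment {i}: severity must be high, medium, or low")
        original = seg.get("original")
        # The schema requires an exact substring of the input article.
        if isinstance(original, str) and original not in article:
            problems.append(f"segment {i}: original is not an exact substring of the article")
    return problems
```

A non-empty return value is a reasonable trigger for routing the article to human review rather than applying the replacements automatically.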

Severity Scale

| Value | Meaning |
|---|---|
| 0 | Neutral (no bias detected) |
| 1 | Minor isolated loaded words |
| 2 | Recurring biased framing |
| 3 | Strong persuasive tone |
| 4 | Inflammatory rhetoric |

Bias Types Detected

  • Loaded language — words with strong emotional connotations
  • Dehumanizing framing — language that strips dignity from groups
  • False generalizations — sweeping statements ("they always", "all of them")
  • Framing bias — selective wording that implies a viewpoint
  • Euphemism/dysphemism — softening or hardening language to manipulate perception
  • Politically charged terminology — labels used to provoke rather than describe
  • Sensationalism — exaggerated language to evoke emotional responses

Evaluation Results

Evaluated on BABE-350, a balanced set of 175 biased and 175 unbiased examples, with GPT-4o-mini as the judge.

| Metric | Score |
|---|---|
| Parse rate | 100% |
| Bias reduction | 53.1% |
| Contextual relevance | 4.07 / 5 |
| Global rewrite quality | 3.20 / 5 |
| ROUGE-L vs human reference | 0.635 |
| Recall at biased words | 0.900 |
| Segment replacement quality | 3.82 / 5 |
| Hallucination rate | 8.5% |
| Latency (median) | 22.2s |

See the full UnBias-Plus Leaderboard for model comparisons.

Hardware Requirements

| Setup | Configuration |
|---|---|
| Recommended (server) | torch_dtype=torch.bfloat16, device_map="auto" (~16GB VRAM) |
| Lightweight (laptop) | load_in_4bit=True (~5GB VRAM) |

Model Variants

| Model | Params | Context | Best for |
|---|---|---|---|
| Qwen3.5-4B-UnBias-Plus-SFT-Instruct | 4B | 4096 | Speed, low VRAM |
| Qwen3-8B-UnBias-Plus-SFT-Instruct (this) | 8B | 8192 | Higher recall, longer articles |
| Qwen3-8B-UnBias-Plus-SFT-Instruct-Legacy | 8B | 8192 | Best unbiased handling, deployed on GCP |

Limitations

  • Trained primarily on English-language news articles
  • Unbiased article handling is weaker than the Legacy model (correct identification median 2.0 vs 5.0)
  • Occasional rewrite expansion on long articles (length ratio outliers up to 19x)
  • Best performance on articles under 5,000 characters
  • Outputs should be reviewed by a human before use in production
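
Two of the limitations above lend themselves to a cheap automated check before human review: input length and rewrite expansion. The sketch below is one possible guard, not part of the released model. `review_flags` is a hypothetical helper; the 5,000-character limit comes from the list above, while the default `max_ratio` of 2.0 is an arbitrary illustrative threshold that should be tuned on real data.

```python
def review_flags(article: str, unbiased_text: str,
                 max_chars: int = 5000, max_ratio: float = 2.0) -> list:
    """Return flags for outputs that deserve closer human review.

    max_chars follows the "under 5,000 characters" guidance in this card.
    max_ratio is an assumed threshold for rewrite expansion, chosen only
    for illustration.
    """
    flags = []
    if len(article) > max_chars:
        flags.append("long_input")
    # Length ratio of the rewrite to the input; guard against empty input.
    ratio = len(unbiased_text) / max(len(article), 1)
    if ratio > max_ratio:
        flags.append("rewrite_expansion")
    return flags
```

For example, `review_flags(article, result["unbiased_text"])` returns an empty list for a short article whose rewrite is of similar length, and a list containing `"rewrite_expansion"` when the rewrite is much longer than the input.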

Citation

@misc{unbias-plus-qwen3-8b-instruct,
  title        = {Qwen3-8B-UnBias-Plus-SFT-Instruct},
  author       = {Vector Institute AIXpert Team},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct}},
  note         = {Part of the AIXpert project: https://aixpert-project.eu/}
}
