Qwen3-8B-UnBias-Plus-SFT-Instruct

A fine-tuned version of Qwen3-8B for news media bias detection and neutral rewriting, developed by the Vector Institute as part of the UnBias-Plus project.

Given a news article, the model identifies biased language segments, classifies their bias type and severity, provides neutral replacements, and returns a fully rewritten unbiased version — all in a single structured JSON response.

This is the Instruct variant, trained without chain-of-thought thinking blocks (`enable_thinking=False`). It emits structured JSON directly, which makes it suitable for production inference via vLLM or other OpenAI-compatible backends.

Trained on: train_3 — UnBias-Plus-3000 (3,000 expert-annotated news articles)

Differences from Qwen3-8B-UnBias-Plus-SFT-Instruct-Legacy

Aspect	Legacy (predecessor)	SFT-Instruct (this model)
Training data	train_1 (original VLDBench)	train_3 (UnBias-Plus-3000)
Output schema	includes `binary_label`	`bias_found` bool only
Severity scale	0, 2, 3, 4	0, 1, 2, 3, 4
System prompt	Basic bias detection	Includes REWRITE RULES + SUCCESS CRITERIA + LANGUAGE HANDLING
Segment localization (recall at biased words)	0.714	0.900
Segment replacement quality	3.58 / 5	3.82 / 5
Latency (median)	27.4s	22.2s

Model Details

Property	Value
Base model	Qwen/Qwen3-8B
Fine-tuning method	Supervised Fine-Tuning (SFT) with LoRA
Training precision	bf16 (16-bit, no quantization during training)
LoRA rank	16
Training framework	Unsloth + TRL
Context length	8192 tokens
Thinking mode	Disabled (`enable_thinking=False`)
Output format	Structured JSON
Training dataset	UnBias-Plus-3000 (train_3)

Usage

Using with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch, json

model_id = "vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

SYSTEM_PROMPT = """You are an expert linguist and bias detection specialist.
Your task is to carefully read a news article, detect ALL biased language,
and return a structured JSON response. Return ONLY valid JSON, no extra text."""

article = "Your news article here..."

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Analyze the following article for bias and return the result in the required JSON format.\n\nARTICLE:\n{article}"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
    return_dict=True,
    truncation=True,
    max_length=8192,
)

with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs["input_ids"].to(model.device),
        attention_mask=inputs["attention_mask"].to(model.device),
        max_new_tokens=4096,
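        do_sample=False,    # greedy decoding, so the JSON output is deterministic
        temperature=None,   # unset sampling defaults so they do not conflict with greedy decoding
        top_p=None,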
        pad_token_id=tokenizer.eos_token_id,
    )

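# Keep only the newly generated tokens (everything after the prompt).
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
result = json.loads(response)  # raises json.JSONDecodeError if the reply is not valid JSON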
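
The final `json.loads` call assumes the reply is nothing but a JSON object. The reported parse rate is 100%, but a defensive wrapper can still be useful in a pipeline. The sketch below is one possible approach, not part of the released code; the helper name `parse_model_json` is hypothetical.

```python
import json


def parse_model_json(response: str) -> dict:
    """Parse a model reply, tolerating stray text around the JSON object."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in the reply.
        start = response.find("{")
        end = response.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in model response")
        parsed = json.loads(response[start : end + 1])
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed
```

With this helper, `result = parse_model_json(response)` could stand in for the bare `json.loads(response)` call.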

Using with vLLM

vllm serve vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct \
  --max-model-len 8192

from openai import OpenAI
import json

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct",
    messages=messages,  # the same system + user message list built in the transformers example
    max_tokens=4096,
    temperature=0,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
result = json.loads(completion.choices[0].message.content)
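
The client call reuses the `messages` list from the transformers example. If the two snippets live in separate scripts, a small builder keeps the prompt identical in both. This is a sketch: `build_messages` is a hypothetical helper, and the system prompt is the abbreviated one shown earlier, which may differ from the full training prompt.

```python
SYSTEM_PROMPT = """You are an expert linguist and bias detection specialist.
Your task is to carefully read a news article, detect ALL biased language,
and return a structured JSON response. Return ONLY valid JSON, no extra text."""


def build_messages(article: str, system_prompt: str = SYSTEM_PROMPT) -> list:
    """Build the system + user chat messages for one article."""
    user_content = (
        "Analyze the following article for bias and return the result "
        "in the required JSON format.\n\nARTICLE:\n" + article
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]
```

Either backend can then be called with `messages=build_messages(article)`.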

Output Schema

{
  "severity": 0 | 1 | 2 | 3 | 4,
  "bias_found": true | false,
  "biased_segments": [
    {
      "original": "exact substring from input article",
      "replacement": "neutral alternative phrase",
      "severity": "high" | "medium" | "low",
      "bias_type": "loaded language | dehumanizing framing | false generalizations | framing bias | euphemism/dysphemism | politically charged terminology | sensationalism",
      "reasoning": "1-2 sentence explanation"
    }
  ],
  "unbiased_text": "Full rewritten neutral article"
}
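
Because the schema requires each `original` to be an exact substring of the input, a parsed result can be checked before it is used downstream. The validator below is a minimal sketch based only on the schema shown above; the function name and the error messages are illustrative.

```python
SEGMENT_SEVERITIES = {"high", "medium", "low"}


def validate_result(result: dict, article: str) -> list:
    """Return a list of schema problems; an empty list means no problem was found."""
    problems = []
    severity = result.get("severity")
    # type(...) is int rejects booleans, which Python would otherwise treat as 0 or 1.
    if type(severity) is not int or not 0 <= severity <= 4:
        problems.append(f"severity must be an integer 0-4, got {severity!r}")
    if not isinstance(result.get("bias_found"), bool):
        problems.append("bias_found must be true or false")
    if not isinstance(result.get("unbiased_text"), str):
        problems.append("unbiased_text must be a string")
    segments = result.get("biased_segments")
    if not isinstance(segments, list):
        problems.append("biased_segments must be a list")
        segments = []
    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            problems.append(f"segment {i} must be an object")
            continue
        original = seg.get("original")
        if not isinstance(original, str) or original not in article:
            problems.append(f"segment {i}: original is not a substring of the article")
        if seg.get("severity") not in SEGMENT_SEVERITIES:
            problems.append(f"segment {i}: severity must be high, medium, or low")
        if not isinstance(seg.get("replacement"), str):
            problems.append(f"segment {i}: replacement must be a string")
    return problems
```

Note that the top-level `severity` is an integer while the per-segment `severity` is a string label, so the two are checked differently.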

Severity Scale

Value	Meaning
0	Neutral — no bias detected
1	Minor isolated loaded words
2	Recurring biased framing
3	Strong persuasive tone
4	Inflammatory rhetoric
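
In application code, the top-level scale can be mirrored as an enumeration so that thresholds read by name rather than by number. This is a sketch; the member names and the review threshold are assumptions for illustration, not part of the model's output.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Top-level severity values from the table above (names are illustrative)."""
    NEUTRAL = 0
    MINOR = 1
    RECURRING = 2
    STRONG = 3
    INFLAMMATORY = 4


def needs_editor_review(severity: int, threshold: Severity = Severity.RECURRING) -> bool:
    """Flag articles at or above an assumed threshold; raises ValueError if out of range."""
    return Severity(severity) >= threshold
```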

Bias Types Detected

Loaded language — words with strong emotional connotations
Dehumanizing framing — language that strips dignity from groups
False generalizations — sweeping statements ("they always", "all of them")
Framing bias — selective wording that implies a viewpoint
Euphemism/dysphemism — softening or hardening language to manipulate perception
Politically charged terminology — labels used to provoke rather than describe
Sensationalism — exaggerated language to evoke emotional responses
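
To summarize which of these categories appear in an article, the `bias_type` fields can be tallied. The sketch below assumes each segment carries a single label from the list above and groups anything unrecognized under `"other"`; the helper name is hypothetical.

```python
from collections import Counter

KNOWN_BIAS_TYPES = {
    "loaded language",
    "dehumanizing framing",
    "false generalizations",
    "framing bias",
    "euphemism/dysphemism",
    "politically charged terminology",
    "sensationalism",
}


def count_bias_types(result: dict) -> Counter:
    """Count segments per bias type, normalizing case and surrounding whitespace."""
    counts = Counter()
    for seg in result.get("biased_segments", []):
        label = str(seg.get("bias_type", "")).strip().lower()
        counts[label if label in KNOWN_BIAS_TYPES else "other"] += 1
    return counts
```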

Evaluation Results

Evaluated on BABE-350, a balanced set of 175 biased and 175 unbiased examples, with GPT-4o-mini as the judge.

Metric	Score
Parse rate	100%
Bias reduction %	53.1%
Contextual relevance	4.07 / 5
Global rewrite quality	3.20 / 5
ROUGE-L vs human reference	0.635
Recall at biased words	0.900
Segment replacement quality	3.82 / 5
Hallucination rate	8.5%
Latency (median)	22.2s

See the full UnBias-Plus Leaderboard for model comparisons.

Hardware Requirements

Setup	Configuration
Recommended (server)	`torch_dtype=torch.bfloat16, device_map="auto"` (~16GB VRAM)
Lightweight (laptop)	`load_in_4bit=True` (~5GB VRAM)
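
The VRAM figures are roughly what the weights alone occupy. The arithmetic below is a back-of-the-envelope sketch that assumes a nominal 8 billion parameters and ignores activations, the KV cache, and quantization overhead, which probably accounts for the gap between the 4 GB estimate and the ~5GB quoted for 4-bit.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NOMINAL_PARAMS = 8e9  # assumption: nominal size, not the exact parameter count

bf16_gb = weight_memory_gb(NOMINAL_PARAMS, 16)  # 16 bits per weight
int4_gb = weight_memory_gb(NOMINAL_PARAMS, 4)   # 4 bits per weight
print(f"bf16 weights: {bf16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
```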

Model Variants

Model	Params	Context	Best for
Qwen3.5-4B-UnBias-Plus-SFT-Instruct	4B	4096	Speed, low VRAM
Qwen3-8B-UnBias-Plus-SFT-Instruct (this)	8B	8192	Higher recall, longer articles
Qwen3-8B-UnBias-Plus-SFT-Instruct-Legacy	8B	8192	Best handling of unbiased articles; deployed on GCP

Limitations

Trained primarily on English-language news articles
Handling of unbiased articles is weaker than in the Legacy model (median correct-identification score: 2.0 vs 5.0)
Occasional rewrite expansion on long articles (length ratio outliers up to 19x)
Best performance on articles under 5,000 characters
Outputs should be reviewed by a human before use in production
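
Two of these limitations can be screened for automatically before a human review. The sketch below flags inputs over the 5,000-character guideline and rewrites that are much longer than the source; the 2.0 length-ratio threshold and the function name are assumptions chosen for illustration.

```python
def rewrite_warnings(
    article: str,
    unbiased_text: str,
    max_chars: int = 5000,          # guideline from the limitations above
    max_length_ratio: float = 2.0,  # assumed threshold for "rewrite expansion"
) -> list:
    """Return human-readable warnings for one article/rewrite pair."""
    warnings = []
    if len(article) > max_chars:
        warnings.append(f"article has {len(article)} characters (over {max_chars})")
    if article:
        ratio = len(unbiased_text) / len(article)
        if ratio > max_length_ratio:
            warnings.append(f"rewrite length ratio is {ratio:.1f}x the original")
    return warnings
```

An empty list means neither check fired; it does not replace the human review recommended above.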

Citation

@misc{unbias-plus-qwen3-8b-instruct,
  title        = {Qwen3-8B-UnBias-Plus-SFT-Instruct},
  author       = {Vector Institute AIXpert Team},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/vector-institute/Qwen3-8B-UnBias-Plus-SFT-Instruct}},
  note         = {Part of the AIXpert project: https://aixpert-project.eu/}
}
