CodeX Qwen2.5 0.5B Unsloth SFT - 1M Rows

This repository is prepared for a cloud-only Unsloth LoRA supervised fine-tuning run of Qwen/Qwen2.5-0.5B-Instruct on the first 1,000,000 rows of Modotte/CodeX-2M-Thinking.

Status: training setup/provisioning. Final metrics, checkpoints, adapters, inference samples, and report will be uploaded here during/after training.

Base Model

  • Base: Qwen/Qwen2.5-0.5B-Instruct
  • Architecture: Qwen2ForCausalLM
  • Parameters: ~494M
  • Context length: 32,768 tokens
  • License: Apache-2.0

Dataset

  • Dataset: Modotte/CodeX-2M-Thinking
  • Planned subset: first 1,000,000 rows
  • Columns used:
    • input β†’ user/problem prompt
    • output β†’ assistant solution/reasoning/code response
  • Intended task: coding instruction following, algorithmic reasoning, solution explanation, and code generation.

Planned Training Method

  • Framework: Unsloth + TRL SFTTrainer + PEFT LoRA
  • Quantized base loading: 4-bit
  • LoRA rank: 32
  • LoRA alpha: 64
  • LoRA dropout: 0
  • Target modules:
    • q_proj, k_proj, v_proj, o_proj
    • gate_proj, up_proj, down_proj
  • Max sequence length for first run: 8,192 tokens
  • Epochs: 1
  • Checkpointing: approximately every 2 hours
  • Durable storage: Hugging Face Hub checkpoints and metrics

Planned Artifacts

During training, this repo will receive:

  • checkpoints/checkpoint-* β€” resumable trainer checkpoints
  • metrics/metrics.jsonl β€” train/eval logs
  • metrics/status.json β€” latest job status
  • final_lora_adapter/ β€” final PEFT LoRA adapter
  • final_merged_16bit/ β€” merged model if cloud disk/runtime allows
  • reports/inference_samples.json β€” post-training inference outputs
  • reports/TRAINING_REPORT.md β€” final training report

Intended Use

This model is intended for experimentation with small code-reasoning SFT models, especially:

  • Python coding assistance
  • algorithm explanation
  • competitive-programming style solution drafting
  • reasoning-heavy coding responses

Limitations

  • This is a sub-500M parameter model, so it should not be expected to match larger coding models.
  • The dataset is synthetic and may transfer dataset-specific response style.
  • Long outputs may still be limited by the training sequence length and generation settings.
  • The model may produce incorrect code; generated solutions require testing and review.

Safety and Evaluation

Post-training evaluation will include sample inference prompts and loss tracking. This repository should not be treated as production-ready until final metrics and qualitative outputs are reviewed.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for razor5050/codex-qwen2-5-0-5b-unsloth-codex1m

Adapter
(596)
this model

Dataset used to train razor5050/codex-qwen2-5-0-5b-unsloth-codex1m