---
license: apache-2.0
tags:
- code
- qwen3
- finetuned
- python
- competitive-programming
base_model: Qwen/Qwen3-4B
datasets:
- microsoft/rStar-Coder
language:
- en
pipeline_tag: text-generation
model-index:
- name: qwen3-4b-code-finetuned
  results:
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: HumanEval
      type: openai_humaneval
    metrics:
    - name: pass@1
      type: pass@1
      value: 68.9
      verified: false
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: HumanEval+
      type: evalplus/humanevalplus
    metrics:
    - name: pass@1
      type: pass@1
      value: 64.0
      verified: false
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: MBPP
      type: mbpp
    metrics:
    - name: pass@1
      type: pass@1
      value: 58.2
      verified: false
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: MBPP+
      type: evalplus/mbppplus
    metrics:
    - name: pass@1
      type: pass@1
      value: 50.8
      verified: false
---

# Qwen3-4B Code Fine-Tuned

Qwen3-4B fine-tuned on 10K execution-verified reasoning traces from rStar-Coder (1 epoch of SFT). **Optimized for algorithmic/competitive programming tasks.**

## 📊 Performance (EvalPlus Framework)

| Benchmark | Base tests | Plus tests | Change vs. base model |
|-----------|------------|------------|-----------------------|
| **HumanEval** | **68.9%** | **64.0%** | **+6.9 pts** ✅ |
| **MBPP** | **58.2%** | **50.8%** | **-8.8 pts** ⚠️ |

*Evaluated using [EvalPlus](https://github.com/evalplus/evalplus) with greedy decoding. All scores are pass@1. "Base tests" and "Plus tests" are the original and EvalPlus-extended test suites; the change against the base model is measured in percentage points on the original test suite.*

### Performance Trade-off

- ✅ **Improved on complex algorithmic tasks** (HumanEval: 62% → 68.9%)
- ⚠️ **Regression on simple practical tasks** (MBPP: 67% → 58.2%)

**Why?** The model was trained on competition-style problems (LeetCode, Codeforces), which emphasize algorithmic reasoning over simple utility functions.

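
The changes against the base model are differences in percentage points, not relative improvements. As a minimal sketch, they can be reproduced from the scores quoted in this card. The base-model figures (62% and 67%) are the rounded values given above, so the results are approximate, and the names `scores` and `delta_points` are illustrative only:

```python
# pass@1 scores in percent, copied from this card.
# The base-model values are rounded, so the deltas are approximate.
scores = {
    "HumanEval": {"base": 62.0, "finetuned": 68.9},
    "MBPP": {"base": 67.0, "finetuned": 58.2},
}


def delta_points(base: float, finetuned: float) -> float:
    """Return the change in percentage points, rounded to one decimal."""
    return round(finetuned - base, 1)


deltas = {name: delta_points(**s) for name, s in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} points")
```

A positive value means the fine-tuned model scores higher than the base model on that benchmark.
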

**Use this model if:** You need help with algorithms, data structures, or competitive programming.

**Use the base model if:** You need simple utility functions or basic string/list operations.

## 🚀 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prometheus04/qwen3-4b-code-finetuned"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Complete a function
messages = [
    {"role": "system", "content": "You are a programming expert."},
    {"role": "user", "content": "def fibonacci(n):\n    "},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Greedy decoding: disable sampling rather than passing temperature=0.0,
# which is not a valid sampling temperature.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📝 Training Details

- **Base Model**: Qwen/Qwen3-4B (4B parameters)
- **Dataset**: microsoft/rStar-Coder synthetic_sft (10K samples)
  - Competition problems from LeetCode, Codeforces, etc.
  - Execution-verified solutions with reasoning traces
- **Method**: LoRA fine-tuning
  - Rank: 32
  - Alpha: 64
  - Target modules: All linear layers (q,k,v,o,gate,up,down)
  - rsLoRA: Enabled
- **Training**:
  - Epochs: 1
  - Batch size: 2 × 8 grad accum = 16 effective
  - Learning rate: 2e-4 (cosine schedule)
  - Optimizer: AdamW 8-bit
  - Max seq length: 4096

## 💡 Key Features

- ✅ Trained on execution-verified competition solutions
- ✅ Curriculum learning (easy → hard)
- ✅ Decontaminated from HumanEval/MBPP
- ✅ Efficient LoRA (1.62% trainable params)
- ✅ Production-ready merged weights

## 📈 Comparison

| Model | HumanEval | MBPP | Specialization |
|-------|-----------|------|----------------|
| Qwen3-4B Base | 62% | 67% | General |
| **This Model** | **68.9%** | 58.2% | **Algorithms** |
| GPT-3.5-turbo | ~75% | ~70% | General |

## 🎯 Strengths

- Binary search, dynamic programming, graph algorithms
- Recursion, backtracking, tree traversal
- Complex data structure manipulation
- Competitive programming patterns

## ⚠️ Limitations

- **Not recommended for simple utility functions** (use base model instead)
- Trained on Python-only data
- May overthink simple problems
- Best for algorithmic/competitive programming tasks
- Optimal for functions <4K tokens

## 🔧 Recommended Use Cases

- ✅ LeetCode/HackerRank style problems
- ✅ Algorithm implementation
- ✅ Data structure coding
- ✅ Competitive programming practice
- ✅ Technical interview preparation
- ❌ Simple string manipulation
- ❌ Basic list operations
- ❌ Trivial utility functions

## 📄 Citation

```bibtex
@misc{qwen3-4b-code-finetuned,
  author = {prometheus04},
  title = {Qwen3-4B Code Fine-Tuned on rStar-Coder},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/prometheus04/qwen3-4b-code-finetuned}},
}
```

## 📜 License

Apache 2.0 (inherited from base model)
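
## Appendix: LoRA hyperparameter arithmetic

As a rough illustration of the settings listed under Training Details, the sketch below works out what they imply. It relies on the standard definitions: plain LoRA scales the adapter update by `alpha / rank`, while rank-stabilized LoRA (rsLoRA) uses `alpha / sqrt(rank)`. The variable names are illustrative and are not taken from the actual training script, which is not part of this card.

```python
import math

# Values copied from the Training Details section of this card.
rank = 32
alpha = 64
per_device_batch_size = 2
grad_accum_steps = 8

# Plain LoRA would scale the low-rank update by alpha / rank.
standard_scale = alpha / rank

# rsLoRA (enabled for this model) scales by alpha / sqrt(rank) instead.
rslora_scale = alpha / math.sqrt(rank)

# Gradient accumulation multiplies the number of samples per optimizer step.
effective_batch_size = per_device_batch_size * grad_accum_steps

print(f"standard LoRA scale: {standard_scale:.2f}")
print(f"rsLoRA scale:        {rslora_scale:.2f}")
print(f"effective batch:     {effective_batch_size}")
```

With rank 32 the rsLoRA scale is noticeably larger than the plain LoRA scale, which is worth keeping in mind when reusing the learning rate with a different rank or with rsLoRA disabled.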