LLM Training
• Rethinking Data Selection at Scale: Random Selection is Almost All You Need (arXiv:2410.09335)
• From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning (arXiv:2410.06456)
• Emergent properties with repeated examples (arXiv:2410.07041)
• Personalized Visual Instruction Tuning (arXiv:2410.07113)
• LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints (arXiv:2410.06458)
• What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective (arXiv:2410.23743)
• Constraint Back-translation Improves Complex Instruction Following of Large Language Models (arXiv:2410.24175)
• GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages (arXiv:2410.23825)
• Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse (arXiv:2410.21333)
• Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning (arXiv:2410.19290)
• AutoTrain: No-code training for state-of-the-art models (arXiv:2410.15735)
• Stronger Models are NOT Stronger Teachers for Instruction Tuning (arXiv:2411.07133)
• IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization (arXiv:2411.06208)
• LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation (arXiv:2411.04997)
• Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding (arXiv:2411.04282)
• Large Language Models Can Self-Improve in Long-context Reasoning (arXiv:2411.08147)
• Cut Your Losses in Large-Vocabulary Language Models (arXiv:2411.09009)
• Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering (arXiv:2411.11504)
• RedPajama: an Open Dataset for Training Large Language Models (arXiv:2411.12372)
• SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration (arXiv:2411.10958)
• Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization (arXiv:2411.10442)
• VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection (arXiv:2411.14794)
• TÜLU 3: Pushing Frontiers in Open Language Model Post-Training (arXiv:2411.15124)
• VisionZip: Longer is Better but Not Necessary in Vision Language Models (arXiv:2412.04467)
• Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models (arXiv:2412.02980)
• MALT: Improving Reasoning with Multi-Agent LLM Training (arXiv:2412.01928)
• Free Process Rewards without Process Labels (arXiv:2412.01981)
• On Domain-Specific Post-Training for Multimodal Large Language Models (arXiv:2411.19930)
• Reverse Thinking Makes LLMs Stronger Reasoners (arXiv:2411.19865)
• Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling (arXiv:2412.05271)
• MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale (arXiv:2412.05237)
• Fully Open Source Moxin-7B Technical Report (arXiv:2412.06845)
• InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions (arXiv:2412.09596)
• Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions (arXiv:2412.08737)
• Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition (arXiv:2412.09501)
• VisionArena: 230K Real World User-VLM Conversations with Preference Labels (arXiv:2412.08687)
• Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (arXiv:2412.13663)
• No More Adam: Learning Rate Scaling at Initialization is All You Need (arXiv:2412.11768)
• RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response (arXiv:2412.14922)
• An Empirical Study of Autoregressive Pre-training from Videos (arXiv:2501.05453)
• REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models (arXiv:2501.03262)
• Optimizing Large Language Model Training Using FP4 Quantization (arXiv:2501.17116)
• GuardReasoner: Towards Reasoning-based LLM Safeguards (arXiv:2501.18492)
• Scaling Pre-training to One Hundred Billion Data for Vision Language Models (arXiv:2502.07617)
• How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? (arXiv:2502.14502)
• Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam (arXiv:2502.17055)
• An Embarrassingly Simple Defense Against LLM Abliteration Attacks (arXiv:2505.19056)
• Orthogonal Finetuning Made Scalable (arXiv:2506.19847)
• Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs (arXiv:2506.19290)
• Scaling RL to Long Videos (arXiv:2507.07966)
• Step-Audio 2 Technical Report (arXiv:2507.16632)
• Towards a Unified View of Large Language Model Post-Training (arXiv:2509.04419)
• Set Block Decoding is a Language Model Inference Accelerator (arXiv:2509.04185)
• GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare (arXiv:2510.08872)