RLHFLow Reward Models - an RLHFlow Collection

RLHFLow Reward Models

Updated Aug 21, 2024

Reward models trained with the RLHFlow codebase (https://github.com/RLHFlow/RLHF-Reward-Modeling/)

RLHFlow/ArmoRM-Llama3-8B-v0.1

Text Classification • Updated Sep 23, 2024 • 15.1k • 183

RLHFlow/pair-preference-model-LLaMA3-8B

Text Generation • 8B • Updated Oct 14, 2024 • 216 • 38

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • 8B • Updated Oct 14, 2024 • 224 • 60

Note: Bradley-Terry reward model trained with the RLHFlow codebase.

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 71

Note: Tech report that covers the Pairwise Preference Model.

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

Paper • 2406.12845 • Published Jun 18, 2024 • 1

Note: Tech report for ArmoRM.
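For readers unfamiliar with the term used in the note on sfairXC/FsfairX-LLaMA3-RM-v0.1: a Bradley-Terry reward model gives each response a scalar score and models the probability that one response is preferred over another as the logistic function of the score difference. The snippet below is a minimal sketch of that formulation only. The function names and the example scores are hypothetical, and this is not the RLHFlow training code.

```python
import math


def bt_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the observed preference under Bradley-Terry.

    Equals -log(sigmoid(reward_chosen - reward_rejected)), computed in a
    numerically stable softplus form.
    """
    diff = reward_chosen - reward_rejected
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def bt_preference_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Probability that the first response is preferred over the second."""
    return math.exp(-bt_loss(reward_chosen, reward_rejected))


# Illustrative scores only (not produced by any model in this collection).
prob = bt_preference_prob(2.0, 0.5)
loss = bt_loss(2.0, 0.5)
print(f"P(chosen preferred) = {prob:.3f}, loss = {loss:.3f}")
```

Equal scores give a preference probability of 0.5, and the loss shrinks as the chosen response's score rises above the rejected one's.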