
Paipile

https://jiezhu23.github.io/

AI & ML interests

None yet

Organizations

None yet

Collections 1

RFT
  • Group Sequence Policy Optimization

    Paper • 2507.18071 • Published Jul 24, 2025 • 319
  • LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization

    Paper • 2507.15758 • Published Jul 21, 2025 • 35
  • Hierarchical Budget Policy Optimization for Adaptive Reasoning

    Paper • 2507.15844 • Published Jul 21, 2025 • 17
  • Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

    Paper • 2507.16814 • Published Jul 22, 2025 • 21

Papers 2

arxiv:2601.06993
arxiv:2508.00053

Models 0

None public yet

Datasets 0

None public yet