arXiv:2603.01973

CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production

Published on Mar 2 · Submitted by Yixin Nie on Mar 3

Meta Llama

AI-generated summary

CharacterFlywheel is an iterative optimization process that enhances large language models for social chat applications through multiple generations of refinement, achieving significant improvements in user engagement and instruction following while maintaining production stability.

Abstract

This report presents CharacterFlywheel, an iterative flywheel process for improving large language models (LLMs) in production social chat applications across Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, we refined models across 15 generations using data from both internal and external real-user traffic. Through continuous deployments from July 2024 to April 2025, we conducted controlled 7-day A/B tests showing consistent engagement improvements: 7 of 8 newly deployed models demonstrated positive lift over the baseline, with the strongest performers achieving up to 8.8% improvement in engagement breadth and 19.4% in engagement depth. We also observed substantial gains in steerability, with instruction following increasing from 59.2% to 84.8% and instruction violations decreasing from 26.6% to 5.8%. We detail the CharacterFlywheel process which integrates data curation, reward modeling to estimate and interpolate the landscape of engagement metrics, supervised fine-tuning (SFT), reinforcement learning (RL), and both offline and online evaluation to ensure reliable progress at each optimization step. We also discuss our methods for overfitting prevention and navigating production dynamics at scale. These contributions advance the scientific rigor and understanding of LLMs in social applications serving millions of users.
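To make the reported steerability figures easier to compare, the short Python sketch below computes the change in percentage points and the relative change for the two instruction metrics quoted in the abstract. The helper names `absolute_change` and `relative_change` are illustrative and do not come from the paper; only the four input percentages are taken from the text above, and the paper may summarize these changes differently.

```python
def absolute_change(before: float, after: float) -> float:
    """Return the change between two percentages, in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Return the change relative to the starting value, as a percentage."""
    return (after - before) / before * 100.0


# Percentages quoted in the abstract: (first value, last value).
reported = {
    "instruction following": (59.2, 84.8),
    "instruction violations": (26.6, 5.8),
}

for name, (before, after) in reported.items():
    points = absolute_change(before, after)
    relative = relative_change(before, after)
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

Under these definitions, instruction following rises by about 25.6 percentage points and instruction violations fall by about 20.8 percentage points. The relative figures depend on the choice of the first value as the baseline, which is an assumption of this sketch.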


Community

ynie

Paper submitter about 14 hours ago

We present CharacterFlywheel, an iterative process for optimizing LLMs for real human engagement and character steerability while maintaining rigorous safety protocols. It was tested across Instagram, WhatsApp, and Messenger with millions of users, who can create, share, and chat with their own AI characters.
