---
license: other
library_name: transformers
tags:
- reasoning
- context-learning
- synthetic-data
- transformers
---

# Interplay-LM Context Pretrain Models

This repository is organized by context-mixture setting. Each top-level directory corresponds to one pretraining setting used in the context experiments.

Within each setting:

- `base/` stores the final pretraining checkpoint used to initialize RL.
- `rl/` stores the final RL checkpoints for each experiment variant.

Only inference-relevant Hugging Face files are included.

## Included settings

- `idzoo_0.9zoo_0.1teacher`
- `idzoo_0.99zoo_0.01teacher`
- `idzoo_0.999zoo_0.001teacher`
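
The layout described above implies a simple subfolder convention for locating checkpoints. A minimal sketch of that convention (the helper name is illustrative and not part of this repository):

```python
def checkpoint_subfolder(setting: str, stage: str, variant: str = "") -> str:
    """Hypothetical helper: build the repo subfolder for one checkpoint.

    stage is "base" (the final pretraining checkpoint) or "rl"
    (a specific RL experiment variant, which requires a variant name).
    """
    if stage == "base":
        return f"{setting}/base"
    if stage == "rl" and variant:
        return f"{setting}/rl/{variant}"
    raise ValueError("stage must be 'base', or 'rl' with a variant name")

# Example: the RL checkpoint used in the Load snippet below.
print(checkpoint_subfolder(
    "idzoo_0.99zoo_0.01teacher",
    "rl",
    "contextzoo_0.99zoo_0.01teacher_process_strict",
))
# → idzoo_0.99zoo_0.01teacher/rl/contextzoo_0.99zoo_0.01teacher_process_strict
```

The resulting string is what gets passed as `subfolder=` to `from_pretrained`.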

## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/context_pretrain"
subdir = "idzoo_0.99zoo_0.01teacher/rl/contextzoo_0.99zoo_0.01teacher_process_strict"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```

## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```