muyo/Olmo-3-1025-7B-tokenizer-bos
Tokenizer derived from allenai/Olmo-3-1025-7B with a single
modification: the added token at id 100256 (originally <|extra_id_0|>)
has been renamed to <|beginoftext|> and registered as the BOS token. The original
tokenizer used <|endoftext|> for both BOS and EOS; this version separates
them so that diffusion / non-autoregressive training pipelines can mark sequence
starts without ambiguity.
- bos_token = <|beginoftext|> (id 100256)
- eos_token = <|endoftext|> (id 100257)
- Vocab size unchanged; model embeddings stay compatible because only one token's surface text changes.
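To illustrate why a distinct BOS id helps, here is a minimal pure-Python sketch that splits a packed id stream back into documents. It is not part of this repository: `split_packed` is a hypothetical helper, the content ids 1, 2, 3 are made up, and only the two special-token ids come from the values listed above.

```python
# Special-token ids as stated in this card.
BOS_ID = 100256  # <|beginoftext|>
EOS_ID = 100257  # <|endoftext|>


def split_packed(ids, bos_id=BOS_ID, eos_id=EOS_ID):
    """Hypothetical helper: recover documents from a packed id stream.

    Assumes every document is wrapped as BOS ... EOS. Ids that appear
    outside such a pair are ignored.
    """
    docs = []
    current = None  # None means "not inside a document"
    for tok in ids:
        if tok == bos_id:
            current = []  # a new document starts here
        elif tok == eos_id:
            if current is not None:
                docs.append(current)  # close the open document
            current = None
        elif current is not None:
            current.append(tok)
    return docs


# Three packed documents; the middle one is empty. Content ids are made up.
packed = [BOS_ID, 1, 2, EOS_ID, BOS_ID, EOS_ID, BOS_ID, 3, EOS_ID]
docs = split_packed(packed)
print(docs)
```

With a single shared delimiter id, a stream such as `E 1 2 E E 3 E` can be read either as two wrapped documents or as four terminator-separated ones (two of them empty), so the boundaries cannot be recovered without extra bookkeeping.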
Usage
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("muyo/Olmo-3-1025-7B-tokenizer-bos")
assert tok.bos_token_id == 100256
assert tok.eos_token_id != tok.bos_token_id
ids = tok("hello world", add_special_tokens=True)["input_ids"]
# First id is 100256 (<|beginoftext|>); prepending is handled by the tokenizer.json
# TemplateProcessing post-processor.