Jinsei Shiraishi

OsakanaTeishoku · Osakana7777777

AI & ML interests

Large Language Models, Computer Vision, AI/ML application to medical settings

Recent Activity

upvoted an article 5 days ago

Mixture of Experts Explained

upvoted an article 8 days ago

TRL v1.0: Post-Training Library Built to Move with the Field

liked a Space 13 days ago

ACE-Step/Ace-Step-v1.5

upvoted an article 5 days ago

Article

Mixture of Experts Explained

+4

Dec 11, 2023 • 1.12k

upvoted an article 8 days ago

Article

TRL v1.0: Post-Training Library Built to Move with the Field

+2

18 days ago • 49

upvoted an article 15 days ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

+5

16 days ago • 861

upvoted a paper about 2 months ago

On the Optimal Reasoning Length for RL-Trained Language Models

Paper • 2602.09591 • Published Feb 10 • 6

upvoted an article about 2 months ago

Article

NVIDIA Nemotron 2 Nano 9B Japanese: A State-of-the-Art Small Language Model Supporting Japan's Sovereign AI

Feb 17 • 24

upvoted an article 2 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

+2

Dec 1, 2025 • 309

upvoted 2 articles 7 months ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

+5

Sep 11, 2025 • 186

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

Apr 30, 2025 • 86

upvoted an article 9 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

Jul 8, 2025 • 769

upvoted 2 collections 10 months ago

OpenMathReasoning

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 3 days ago • 47

Any-to-Any Models, Datasets, Spaces

19 items • Updated Feb 9 • 31

upvoted an article 10 months ago

Article

Vision Language Models Explained

Apr 11, 2024 • 529

upvoted 2 collections about 1 year ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 713

Asagi-VLM

Asagi is a Japanese Vision & Language model, trained on a large-scale synthetic dataset. • 4 items • Updated Nov 27, 2025 • 7