Papers
ViewSAM: Learning View-aware Cross-modal Semantics for Weakly Supervised Cross-view Referring Multi-Object Tracking
Making Dialogue Grounding Data Rich: A Three-Tier Data Synthesis Framework for Generalized Referring Expression Comprehension
Organization Card
Hey 👋! Welcome to our team's corner on Hugging Face! We're a bunch of enthusiastic folks who are totally into the exciting world of Multimodal Large Language Models.
Our research explores innovative ways to enhance interactions between language and images, video, and audio, aiming to advance the capabilities of AI in understanding and generating multimodal content.
We're a curious bunch, always on the lookout for cool ways to make AI systems understand and generate human-like language.
Official models and datasets for the paper μ²Tokenizer (https://arxiv.org/abs/2507.00316)
models 12
AlpachinoNLP/u2Qwen3-4B-Thinking
Image-to-Text • Updated • 24
AlpachinoNLP/QTSplus-LLaVA-Video-7B-Qwen2
Image-Text-to-Text • Updated • 20
AlpachinoNLP/QTSplus-Qwen2.5-VL-7B
Image-Text-to-Text • Updated
AlpachinoNLP/QTSplus-InternVL2.5-8B
Image-Text-to-Text • Updated • 14
AlpachinoNLP/u2Qwen3-4B-Instruct
Image-to-Text • Updated • 23 • 1
AlpachinoNLP/u2Qwen3-1.7B-Instruct
Image-to-Text • Updated • 20 • 1
AlpachinoNLP/LongCLIP-ViT-B-32
Zero-Shot Image Classification • 0.2B • Updated • 43.8k • 1
AlpachinoNLP/QTSplus-3B
Image-Text-to-Text • Updated • 33 • 1
AlpachinoNLP/QTSplus-7B
Image-Text-to-Text • Updated • 16 • 1
AlpachinoNLP/QTSplus-3B-FT
Image-Text-to-Text • Updated • 18 • 1
datasets 7
AlpachinoNLP/GlazyBench
Updated • 2.54k
AlpachinoNLP/CT-RATE-Thinking
Viewer • Updated • 1.41M • 130 • 1
AlpachinoNLP/CC_SBU_High_Quality_Single_Choice
Viewer • Updated • 31.7M • 3
AlpachinoNLP/CC_SBU_High_Quality_Caption
Viewer • Updated • 6.46M • 7
AlpachinoNLP/QTSplus-Dataset
Preview • Updated • 444 • 1
AlpachinoNLP/CT-RATE-Chinese
Viewer • Updated • 50.2k • 145
AlpachinoNLP/CT-RATE-Mini
Updated • 69 • 1