UCSC-VLAA

university

https://ucsc-vlaa.github.io/index.html

cihangxie

AI & ML interests

None defined yet.

Recent Activity

xk-huang authored a paper 5 days ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

xk-huang authored a paper 5 days ago

Scaling Zero-Shot Reference-to-Video Generation

xk-huang authored a paper 5 days ago

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Papers

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs

xk-huang

authored 4 papers 5 days ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published Dec 1, 2025 • 73

submitted a paper to Daily Papers 6 days ago

VecGlypher: Unified Vector Glyph Generation with Language Models

Paper • 2602.21461 • Published 7 days ago • 11

Letian2003

updated a collection about 1 month ago

OpenVision 3

Collection

A Family of Unified Visual Encoder with Unified Visual Representation. • 4 items • Updated Jan 27 • 2

cihangxie

updated a collection about 1 month ago

OpenVision 3

Collection

A Family of Unified Visual Encoder with Unified Visual Representation. • 4 items • Updated Jan 27 • 2

cihangxie

submitted a paper to Daily Papers about 1 month ago

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21

Letian2003

updated a model about 1 month ago

UCSC-VLAA/openvision3-vit-base-patch2-32

Updated Jan 22 • 23

Letian2003

published a model about 1 month ago

UCSC-VLAA/openvision3-vit-base-patch2-32

Updated Jan 22 • 23

xk-huang

authored 2 papers 3 months ago

MedVLThinker: Simple Baselines for Multimodal Medical Reasoning

Paper • 2508.02669 • Published Aug 4, 2025

MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs

Paper • 2510.25867 • Published Oct 29, 2025 • 7

xk-huang

updated a collection 4 months ago

MedVLSynther

Collection

0 items • Updated Oct 31, 2025

Xianhang

authored 5 papers 5 months ago

Unleashing the Power of Visual Prompting At the Pixel Level

Paper • 2212.10556 • Published Dec 20, 2022

CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions

Paper • 2411.16828 • Published Nov 25, 2024 • 1

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

Autoregressive Pretraining with Mamba in Vision

Paper • 2406.07537 • Published Jun 11, 2024

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 34

Team members 15
