oguzhanercan's Collections: Representation Learning
End-to-End Vision Tokenizer Tuning
Paper
• 2505.10562
• Published
• 22
Global and Local Entailment Learning for Natural World Imagery
Paper
• 2506.21476
• Published
• 1
Paper
• 2508.10104
• Published
• 298
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic
Paper
• 2509.01363
• Published
• 59
AToken: A Unified Tokenizer for Vision
Paper
• 2509.14476
• Published
• 36
Lost in Embeddings: Information Loss in Vision-Language Models
Paper
• 2509.11986
• Published
• 29
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
Paper
• 2509.15591
• Published
• 45
SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model
Paper
• 2510.12709
• Published
• 13
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
Paper
• 2510.08492
• Published
• 10
GRACE: Generative Representation Learning via Contrastive Policy Optimization
Paper
• 2510.04506
• Published
• 12
Latent Diffusion Model without Variational Autoencoder
Paper
• 2510.15301
• Published
• 49
Model Merging with Functional Dual Anchors
Paper
• 2510.21223
• Published
• 13
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
Paper
• 2510.20479
• Published
• 12
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
Paper
• 2510.25602
• Published
• 78
Defeating the Training-Inference Mismatch via FP16
Paper
• 2510.26788
• Published
• 31
UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings
Paper
• 2511.00405
• Published
• 6
Φeat: Physically-Grounded Feature Representation
Paper
• 2511.11270
• Published
• 11
Qwen3-VL Technical Report
Paper
• 2511.21631
• Published
• 158
Next-Embedding Prediction Makes Strong Vision Learners
Paper
• 2512.16922
• Published
• 87
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking
Paper
• 2601.04720
• Published
• 56
Nested Learning: The Illusion of Deep Learning Architectures
Paper
• 2512.24695
• Published
• 44
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
Paper
• 2602.08683
• Published
• 49
Rethinking Generative Recommender Tokenizer: Recsys-Native Encoding and Semantic Quantization Beyond LLMs
Paper
• 2602.02338
• Published
• 40