srisree's Collections
Multimodal

updated 13 days ago

  • GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

    Paper • 2311.07562 • Published Nov 13, 2023 • 15

  • VACE: All-in-One Video Creation and Editing

    Paper • 2503.07598 • Published Mar 10, 2025 • 56

  • EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

    Paper • 2502.09509 • Published Feb 13, 2025 • 8