video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions.
AI & ML interests
https://www.ee.tsinghua.edu.cn/en/
Organization Card
Department of Electronic Engineering, Tsinghua University
models 18
tsinghua-ee/video-SALMONN2_plus_3B_full
5B • Updated • 15
tsinghua-ee/video-SALMONN2_plus_7B_full
9B • Updated • 29
tsinghua-ee/video_SALMONN2plus_3B_audioAlign
5B • Updated • 10
tsinghua-ee/D-ORCA-8B-0210
10B • Updated • 23 • 1
tsinghua-ee/WAVE-7B
Updated • 66 • 1
tsinghua-ee/video_SALMONN2_7B_audioAlign
Updated • 21
tsinghua-ee/video_SALMONN2plus_72B_audioAlign
Updated • 4
tsinghua-ee/video_SALMONN2plus_7B_audioAlign
9B • Updated • 514
tsinghua-ee/SALMONN
Automatic Speech Recognition • Updated • 51
tsinghua-ee/video-SALMONN-2_plus_72B
Updated • 6 • 2
datasets 8
tsinghua-ee/ELViM
Viewer • Updated • 211 • 10
tsinghua-ee/SACRED-Bench
Viewer • Updated • 2.48k • 58
tsinghua-ee/F-16-NBA
Preview • Updated • 48
tsinghua-ee/AVUTBenchmark
Viewer • Updated • 3.28k • 5.65k • 1
tsinghua-ee/video-SALMONN_2_testset
Preview • Updated • 150
tsinghua-ee/QualiSpeech
Viewer • Updated • 14.6k • 576 • 21
tsinghua-ee/RivaBench
Viewer • Updated • 542 • 473 • 2
tsinghua-ee/SAVEBench
Preview • Updated • 67 • 3