Towards the Aha Moment of Vision-Language Models
Multi-modal Multilingual Instruction
university
Collections: 1
Models: 7
MMInstruction/Giraffe • 12 • 2
MMInstruction/Qwen-VL-ArXivCap • Text Generation • 16 • 4
MMInstruction/Qwen-VL-ArXivQA • Text Generation • 24 • 4
MMInstruction/Silkie • Text Generation • 30 • 12
MMInstruction/YingVLM • 18 • 1
MMInstruction/YingVLM-zh • 6
MMInstruction/YingVLM-Video • 8
Datasets: 14
MMInstruction/Clevr_CoGenT_TrainA_R1 • Viewer • 37.8k • 656 • 22
MMInstruction/SuperClevr_Val • Viewer • 5k • 24
MMInstruction/Clevr_CoGenT_TrainA_70K • Viewer • 70k • 50
MMInstruction/Clevr_CoGenT_ValB • Viewer • 5k • 75 • 1
MMInstruction/Clevr_CoGenT_ValA • Viewer • 5k • 62
MMInstruction/Clevr_CoAgent_TrainA_R1 • Viewer • 2.5k • 32
MMInstruction/VL-RewardBench • Viewer • 1.25k • 531 • 5
MMInstruction/RedTeamingVLM • 2.42k • 14
MMInstruction/VLFeedback • Viewer • 80.3k • 566 • 45
MMInstruction/ArxivCap • Viewer • 573k • 3.72k • 50