Zhang Yuanhan

ZhangYuanhan

https://zhangyuanhan-ai.github.io/

AI & ML interests

None yet

ZhangYuanhan's activity

updated a dataset about 10 hours ago

lmms-lab/haha

Viewer • Updated about 10 hours ago • 1.23k

published a dataset about 16 hours ago

lmms-lab/haha

Viewer • Updated about 10 hours ago • 1.23k

updated a collection about 22 hours ago

VideoMMMU

3 items • Updated about 16 hours ago

upvoted a paper about 22 hours ago

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published 4 days ago • 57

upvoted a paper 9 days ago

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Paper • 2501.14818 • Published 22 days ago • 4

upvoted a paper 16 days ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published 19 days ago • 23

authored a paper 19 days ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published 19 days ago • 23

upvoted a paper 30 days ago

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9 • 39

updated a collection 30 days ago

Vision Language General

Vision Language General • 5 items • Updated 30 days ago

updated a collection about 1 month ago

Vision Language General

Vision Language General • 5 items • Updated 30 days ago

upvoted a paper about 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

upvoted a paper 3 months ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 80

New activity in lmms-lab/LLaVA-Video-178K 3 months ago

Query about how many frames are used to generate each caption?

#7 opened 3 months ago by

upvoted a paper 3 months ago

HourVideo: 1-Hour Video-Language Understanding

Paper • 2411.04998 • Published Nov 7, 2024 • 1

updated 2 models 4 months ago

lmms-lab/LLaVA-Video-72B-Qwen2

Text Generation • Updated Oct 25, 2024 • 1.53k • 17

lmms-lab/LLaVA-Video-7B-Qwen2

Video-Text-to-Text • Updated Oct 25, 2024 • 70.6k • 71

New activity in lmms-lab/LLaVA-Video-178K 4 months ago

Missing videos

#4 opened 4 months ago by

New activity in lmms-lab/LLaVA-Video-7B-Qwen2 4 months ago

Difference between 7B-DPO and 7B-Qwen2

#7 opened 4 months ago by

updated a collection 4 months ago

Vision Language General

Vision Language General • 5 items • Updated 30 days ago