Collections
Collections including paper arxiv:2402.10294
-
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Paper • 2402.10294 • Published • 25
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter
Paper • 2402.10896 • Published • 15
NLLB-CLIP -- train performant multilingual image retrieval model on a budget
Paper • 2309.01859 • Published • 3
-
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 12
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 21
Instruction-tuned Language Models are Better Knowledge Learners
Paper • 2402.12847 • Published • 26
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper • 2402.13064 • Published • 48
-
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Paper • 2402.10294 • Published • 25
Valley: Video Assistant with Large Language model Enhanced abilitY
Paper • 2306.07207 • Published • 2
Video Editing via Factorized Diffusion Distillation
Paper • 2403.09334 • Published • 22
-
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Paper • 2402.10294 • Published • 25
Genie: Generative Interactive Environments
Paper • 2402.15391 • Published • 71
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper • 2412.10360 • Published • 139
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 90
-
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
Paper • 2402.03162 • Published • 19
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
Paper • 2402.03040 • Published • 18
Magic-Me: Identity-Specific Video Customized Diffusion
Paper • 2402.09368 • Published • 28
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Paper • 2402.10294 • Published • 25
-
Visual Instruction Tuning
Paper • 2304.08485 • Published • 13
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 48
Improved Baselines with Visual Instruction Tuning
Paper • 2310.03744 • Published • 37
Aligning Large Multimodal Models with Factually Augmented RLHF
Paper • 2309.14525 • Published • 30
-
MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
Paper • 2401.04468 • Published • 49
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Paper • 2401.09047 • Published • 14
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
Paper • 2402.00769 • Published • 22
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Paper • 2402.03161 • Published • 15
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 146
ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 30
Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 22
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69