Submitted by akhaliq · 32 · mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding · 11 authors · 8
Submitted by akhaliq · 25 · LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression · 13 authors · 7
Submitted by akhaliq · 17 · Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models · 8 authors · 1
Submitted by akhaliq · 15 · Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers · 13 authors · 1
Submitted by akhaliq · 11 · GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation · 8 authors · 2
Submitted by akhaliq · 10 · Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs · 7 authors · 1
Submitted by akhaliq · 10 · ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance · 6 authors · 2
Submitted by akhaliq · 8 · FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation · 4 authors · 1
Submitted by akhaliq · 8 · FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis · 7 authors · 1
Submitted by akhaliq · 6 · TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation · 9 authors · 1