Video-LLaVA: Learning United Visual Representation by Alignment Before Projection · 6 authors · Submitted by akhaliq
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning · 10 authors · Submitted by akhaliq
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 · 11 authors · Submitted by akhaliq
UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework · 9 authors · Submitted by akhaliq