Dual Caption Preference Optimization for Diffusion Models Paper • 2502.06023 • Published 2 days ago • 7
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Paper • 2502.05179 • Published 4 days ago • 19
VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published 4 days ago • 58
Fast Video Generation with Sliding Tile Attention Paper • 2502.04507 • Published 5 days ago • 43
Goku: Flow Based Video Generative Foundation Models Paper • 2502.04896 • Published 5 days ago • 60
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting Paper • 2502.05176 • Published 4 days ago • 24
cognitivecomputations/Dolphin3.0-R1-Mistral-24B Text Generation • Updated 4 days ago • 1.16k • 93
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution Paper • 2501.10045 • Published 26 days ago • 9
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published 21 days ago • 81
GSTAR: Gaussian Surface Tracking and Reconstruction Paper • 2501.10283 • Published 25 days ago • 5
mlx-community/Llama-3.2-11B-Vision-Instruct-abliterated Image-Text-to-Text • Updated Dec 16, 2024 • 348 • 6