Kyle Tuft

Chilangosta

AI & ML interests

None yet

Recent Activity

liked a Space 1 day ago

OpenGVLab/InternVL

liked a model 1 day ago

OpenGVLab/InternVL2_5-78B-MPO

Organizations

None yet

Chilangosta's activity

upvoted a paper about 9 hours ago

Dual Caption Preference Optimization for Diffusion Models

Paper • 2502.06023 • Published 2 days ago • 7

upvoted 5 papers 1 day ago

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

Paper • 2502.05179 • Published 4 days ago • 19

AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting

Paper • 2502.05176 • Published 4 days ago • 24

upvoted 3 papers 9 days ago

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution

Paper • 2501.10045 • Published 26 days ago • 9

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published 21 days ago • 81

GSTAR: Gaussian Surface Tracking and Reconstruction

Paper • 2501.10283 • Published 25 days ago • 5

upvoted a paper 17 days ago

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Paper • 2501.13928 • Published 19 days ago • 16

upvoted an article 22 days ago

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

Article • 22 days ago • 60

upvoted an article 26 days ago

Timm ❤️ Transformers: Use any timm model with transformers

Article • 27 days ago • 39

upvoted a paper 26 days ago

Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Paper • 2501.09012 • Published 27 days ago • 10

upvoted a paper 28 days ago

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10 • 43

upvoted 3 papers about 1 month ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 92

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

Paper • 2501.04561 • Published Jan 8 • 16

The Superposition of Diffusion Models Using the Itô Density Estimator

Paper • 2412.17762 • Published Dec 23, 2024 • 12

upvoted 3 papers 4 months ago

Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published Oct 14, 2024 • 49

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Paper • 2410.08261 • Published Oct 10, 2024 • 50

Intriguing Properties of Large Language and Vision Models

Paper • 2410.04751 • Published Oct 7, 2024 • 16