LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published 9 days ago • 16
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published 1 day ago • 10
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Paper • 2501.16764 • Published 15 days ago • 21
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt Paper • 2501.13554 • Published 19 days ago • 9
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published 19 days ago • 34
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model Paper • 2501.12368 • Published 21 days ago • 39
Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions Paper • 2501.10020 • Published 25 days ago • 22
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation Paper • 2501.09503 • Published 26 days ago • 13
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published 28 days ago • 56
Instance-guided Cartoon Editing with a Large-scale Dataset Paper • 2312.01943 • Published Dec 4, 2023 • 4
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation Paper • 2501.03059 • Published Jan 6 • 20
TransPixar: Advancing Text-to-Video Generation with Transparency Paper • 2501.03006 • Published Jan 6 • 23
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides Paper • 2501.03936 • Published Jan 7 • 19
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published Jan 7 • 42
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Paper • 2412.18597 • Published Dec 24, 2024 • 19