Submitted by akhaliq · 52 · Uni-SMART: Universal Science Multimodal Analysis and Research Transformer · 17 authors · 4
Submitted by akhaliq · 33 · VideoAgent: Long-form Video Understanding with Large Language Model as Agent · 4 authors · 2
Submitted by akhaliq · 31 · Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations · 19 authors · 2
Submitted by akhaliq · 21 · Recurrent Drafter for Fast Speculative Decoding in Large Language Models · 5 authors · 1
Submitted by akhaliq · 11 · FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model · 4 authors · 2
Submitted by akhaliq · 10 · EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba · 3 authors · 1
Submitted by akhaliq · 8 · Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding · 7 authors · 1
Submitted by akhaliq · 7 · Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting · 4 authors · 1
Submitted by akhaliq · 3 · NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices · 3 authors · 1