Submitted by jymcc 62 HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale · 12 authors 9
Submitted by unilm 22 Direct Preference Knowledge Distillation for Large Language Models · 6 authors 1
Submitted by variante 18 LLaRA: Supercharging Robot Learning Data for Vision-Language Policy · 11 authors 1
Submitted by thewhole 12 GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality · 10 authors 3
Submitted by wondervictor 9 EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model · 9 authors 3
Submitted by JueZhang 9 AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation · 10 authors 1
Submitted by davanstrien 8 Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity · 15 authors 1
Submitted by MatouK98 7 Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning · 9 authors 1