Submitted by akhaliq 76 Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution · 19 authors 4
Submitted by FeYuan 44 A Controlled Study on Long Context Extension and Generalization in LLMs · 9 authors 2
Submitted by akhaliq 37 To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning · 10 authors 3
Submitted by gentaiscool 20 Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey · 7 authors 2
Submitted by akhaliq 12 Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models · 19 authors 4
Submitted by westbrook 10 SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer · 6 authors 2
Submitted by Chenxinglili 7 Towards Diverse and Efficient Audio Captioning via Diffusion Models · 7 authors 3
Submitted by IAMJB 4 Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning · 5 authors 1
Submitted by IAMJB 2 CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark · 5 authors 2
Submitted by IAMJB 2 fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction · 6 authors 1
Submitted by IAMJB 1 Measuring Human and AI Values based on Generative Psychometrics with Large Language Models · 6 authors 2