AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published 8 days ago • 33
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 13
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Paper • 2409.00509 • Published Aug 31, 2024 • 38
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published May 31, 2024 • 64
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU Paper • 2403.06504 • Published Mar 11, 2024 • 53