Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 593
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 7 days ago • 153
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! 20 days ago • 124
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model May 14, 2024 • 238
Post 1766 New open Vision Language Model by @Google: PaliGemma. Comes in 3B, pretrained, mix and fine-tuned models in 224, 448 and 896 resolution. Combination of Gemma 2B LLM and SigLIP image encoder. Supported in transformers. PaliGemma can do: image segmentation and detection; detailed document understanding and reasoning; visual question answering, captioning and any other VLM task! Read our blog: hf.co/blog/paligemma. Try the demo: hf.co/spaces/google/paligemma. Check out the Spaces and the models all in the collection: google/paligemma-release-6643a9ffbf57de2ae0448dda. Collection of fine-tuned PaliGemma models: google/paligemma-ft-models-6643b03efb769dad650d2dda • 13 replies
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 Image-Text-to-Text • Updated 9 days ago • 1.3k • 186
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23, 2024 • 33
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17, 2024 • 56
Article Fine-tuning a large language model on Kaggle Notebooks (or even on your own computer) for solving real-world tasks By lmassaron • Feb 21, 2024 • 15
Article Design choices for Vision Language Models in 2024 By gigant • Apr 16, 2024 • 27