Article From Llasa to Llasagna: Finetuning LLaSA to generate Italian speech and other languages • By Steveeeeeeen and 1 other • about 6 hours ago • 13
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis • Paper • 2410.23320 • Published Oct 30, 2024 • 8
Article Transformers.js v3: WebGPU support, new models & tasks, and more… • Oct 22, 2024 • 67
Llama3-8B-1.58 Collection A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14, 2024 • 11
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy • Sep 18, 2024 • 217
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 25 days ago • 161
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3, 2024 • 103
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) • By wolfram • Apr 24, 2024 • 61
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Dec 13, 2024 • 329
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace. • 68 items • Updated Feb 13, 2024 • 14
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Dec 13, 2024 • 51
Switch-Transformers release Collection This release includes various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Dec 13, 2024 • 17