Article: Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 27 days ago • 65
Article: Train 400x faster Static Embedding Models with Sentence Transformers 28 days ago • 142
Post: A while ago I started experimenting with compiling the Python interpreter to WASM, to build a secure, fast, and lightweight sandbox for code execution, ideal for running LLM-generated Python code.
- Send code simply as a POST request
- 1-2ms startup times
Hack away: https://github.com/ErikKaum/runner
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks 📝 Evaluate multilingual models using FineTasks
Article: Releasing Outlines-core 0.1.0: structured generation in Rust and Python Oct 22, 2024 • 44
Post: This week in Inference Endpoints - thx @erikkaum for the update! 👀 https://huggingface.co/blog/erikkaum/endpoints-changelog