Jofthomas (Joffrey THOMAS)

reacted to burtenshaw's post with 🤗🚀🔥 27 days ago

Post

43475

We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents

28 replies

·

reacted to m-ric's post with 🔥 5 months ago

Post

1509

Transformers v4.45.0 released: includes a lightning-fast method to build tools! ⚡️

During user research with colleagues @MoritzLaurer and @Jofthomas , we discovered that the class definition currently in used to define a Tool in
transformers.agents is a bit tedious to use, because it goes in great detail.

➡️ So I’ve made an easier way to build tools: just make a function with type hints + a docstring, and add a @tool decorator in front.

✅ Voilà, you’re good to go!

Read all about it in the new doc here: https://huggingface.co/docs/transformers/main/en/agents#create-a-new-tool

And don’t hesitate to give feedback, I’m all ears! 🤗

replied to their post 5 months ago

if you liked this space, you can vote for this project on the gemini api contest now : https://ai.google.dev/competition/projects/everchanging-quest

reacted to m-ric's post with 🔥 6 months ago

Post

2130

🎮 𝗔 𝗻𝗲𝘂𝗿𝗮𝗹 𝗻𝗲𝘁𝘄𝗼𝗿𝗸 𝘀𝗶𝗺𝘂𝗹𝗮𝘁𝗲𝘀 𝗗𝗢𝗢𝗠: 𝗚𝗼𝗼𝗴𝗹𝗲 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿𝘀 𝗼𝗽𝗲𝗻 𝘁𝗵𝗲 𝘄𝗮𝘆 𝗳𝗼𝗿 𝗰𝗼𝗺𝗽𝗹𝗲𝘁𝗲𝗹𝘆-𝗔𝗜-𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗲𝗱 𝗴𝗮𝗺𝗲𝘀!

Imagine if games were completely live-generated by an AI model : the NPCs and their dialogues, the storyline, and even the game environment. The player’s in-game actions would have a real, lasting impact on the game story.

In a very exciting paper, Google researchers just gave us the first credible glimpse of this future.

➡️ They created GameNGen, the first neural model that can simulate a complex 3D game in real-time. They use it to simulate the classic game DOOM running at over 20 frames per second on a single TPU, with image quality comparable to lossy JPEG compression. And it feels just like the true game!

Here's how they did it:
1. They trained an RL agent to play DOOM and recorded its gameplay sessions.
2. They then used these recordings to train a diffusion model to predict the next frame, based on past frames and player actions.
3. During inference, they use only 4 denoising steps (instead of the usual dozens) to generate each frame quickly.

𝗞𝗲𝘆 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀:
🎮🤔 Human players can barely tell the difference between short clips (3 seconds) of the real game or the simulation
🧠 The model maintains game state (health, ammo, etc.) over long periods despite having only 3 seconds of effective context length
🔄 They use "noise augmentation" during training to prevent quality degradation in long play sessions
🚀 The game runs on one TPU at 20 FPS with 4 denoising steps, or 50 FPS with model distillation (with some quality loss)

The researchers did not open source the code, but I feel like we’ve just seen a part of the future being written!

Their paper (exploding the upvote counter) 👉 Diffusion Models Are Real-Time Game Engines (2408.14837)
In a similar vein, play @Jofthomas 's 'Everchanging Quest' 🎮 Jofthomas/Everchanging-Quest

posted an update 6 months ago

Post

4005

Everchanging Quest is out !

It is an LLM controlled Rogue-Like in which the LLM gets a markdown representation of the map, and should generate a JSON with the objective to fulfill on the map as well as the necessary objects and their placements.

Come test it on the space :
Jofthomas/Everchanging-Quest

2 replies

·

replied to anakin87's post 8 months ago

thanks @anakin87 , Awesome Notebook and just what I needed !

reacted to anakin87's post with 🔥👀👍 8 months ago

Post

2161

⚙️ Prompt Optimization with Haystack and DSPy

Experimental notebook: 🧪📓 https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/prompt_optimization_with_dspy.ipynb

When building applications with LLMs, writing effective prompts is a long process of trial and error. 🔄
Often, if you switch models, you also have to change the prompt. 😩
What if you could automate this process?

💡 That's where DSPy comes in - a framework designed to algorithmically optimize prompts for Language Models.
By applying classical machine learning concepts (training and evaluation data, metrics, optimization), DSPy generates better prompts for a given model and task.

Recently, I explored combining DSPy with the robustness of Haystack Pipelines.

Here's how it works:
▶️ Start from a Haystack RAG pipeline with a basic prompt
🎯 Define a goal (in this case, get correct and concise answers)
📊 Create a DSPy program, define data and metrics
✨ Optimize and evaluate -> improved prompt
🚀 Build a refined Haystack RAG pipeline using the optimized prompt

1 reply

·

reacted to bwang0911's post with ❤️🚀 8 months ago

Post

2642

we are very proud to introduce jinaai/jina-clip-v1, aka "jina-embeddings-multimodal".

The OpenAI CLIP openai/clip-vit-base-patch32 have nice performance to align text and image modality, that user can perform cross-modal text image retrieval or image classification on top of it. However, due to the training data and recipe, it can not:

1. model longer sequence of text inputs (77 token constraint).
2. align text representations (CLIP Text Tower is weak for text search).

In our latest publication, Jina CLIP: Your CLIP Model Is Also Your Text Retriever (2405.20204) , we proposed a multi-task, multi-objective learning scheme. The produced CLIP model shows:

1. Stronger cross-modal performance against OpenAI sets, 2% and 6% improvement on cross-modal retrieval recall@5.
2. Text tower of the JinaCLIP is a strong text encoder, reach the same performance as jinaai/jina-embeddings-v2-base-en, 165% improvement on MTEB[BEIR] recall@5.
3. Image tower of the JinaCLIP also shows strong performance in image-image search (CBIR), 12% recall improvement on Cifar100 test set.

If you are working on MuRAG (multimodal-retrieval argumented generation), try it out!

reacted to lunarflu's post with ❤️ 9 months ago

Post

1972

cooking up something....anyone interested in a daily activity tracker for HF?

12 replies

·

reacted to radames's post with 🔥 9 months ago

Post

2027

AI-town now runs on Hugging Face Spaces with our API for LLMs and embeddings, including the open-source Convex backend, all in one container. Easy to duplicate and config on your own

Demo: radames/ai-town
Instructions: https://github.com/radames/ai-town-huggingface

9 replies

·

reacted to sequelbox's post with 🔥 9 months ago

Post

1738

Llama 70b Instruct + function calling.
Fireplace-70b out now:

ValiantLabs/Llama3-70B-Fireplace

reacted to andrewyng's post with 👍🤯❤️ 11 months ago

Post

DeepLearning.AI just announced a new short course: Open Source Models with Hugging Face 🤗, taught by Hugging Face's own Maria Khalusova, Marc Sun and Younes Belkada!

As many of you already know, Hugging Face has been a game changer by letting developers quickly grab any of hundreds of thousands of already-trained open source models to assemble into new applications. This course teaches you best practices for building this way, including how to search and choose among models.

You'll learn to use the Transformers library and walk through multiple models for text, audio, and image processing, including zero-shot image segmentation, zero-shot audio classification, and speech recognition. You'll also learn to use multimodal models for visual question answering, image search, and image captioning. Finally, you’ll learn how to demo what you build locally, on the cloud, or via an API using Gradio and Hugging Face Spaces.

Thank you very much to Hugging Face's wonderful team for working with us on this.

You can sign up for the course here: https://www.deeplearning.ai/short-courses/open-source-models-hugging-face/

1 reply

·

reacted to philschmid's post with ❤️ about 1 year ago

Post

What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share “How to Fine-Tune LLMs in 2024 with Hugging Face” using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
💡Define and understand use cases for fine-tuning
🧑🏻‍💻 Setup of the development environment
🧮 Create and prepare dataset (OpenAI format)
🏋️‍♀️ Fine-tune LLM using TRL and the SFTTrainer
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉 https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. 🔜

4 replies

·

Joffrey THOMAS

AI & ML interests

Recent Activity

Organizations

Jofthomas's activity