"Model mistralai/Mistral-Nemo-Instruct-2407 time out" in Inference APIs
I have the same timeout issue:
HTTPError Traceback (most recent call last)
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py:304, in hf_raise_for_status(response, endpoint_name)
303 try:
--> 304 response.raise_for_status()
305 except HTTPError as e:
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)
HTTPError: 503 Server Error: Service Unavailable for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions
The above exception was the direct cause of the following exception:
HfHubHTTPError Traceback (most recent call last)
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/inference/_client.py:273, in InferenceClient.post(self, json, data, model, task, stream)
272 try:
--> 273 hf_raise_for_status(response)
274 return response.iter_lines() if stream else response.content
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py:371, in hf_raise_for_status(response, endpoint_name)
    369 # Convert HTTPError into a HfHubHTTPError to display request information
    370 # as well (request id and/or server error message)
--> 371 raise HfHubHTTPError(str(e), response=response) from e
...
288 ) from error
289 # ...or wait 1s and retry
290 logger.info(f"Waiting for model to be loaded on the server: {error}")
InferenceTimeoutError: Model not loaded on the server: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions. Please retry with a higher timeout (current: 120).
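Since the 503 usually means the model is still loading (or overloaded), one workaround besides raising the client timeout is to retry with exponential backoff. Below is a minimal sketch; the `call_with_retry` helper and its parameters are hypothetical, not part of huggingface_hub, and in practice you would pass the library's own exception types (e.g. `HfHubHTTPError`, `InferenceTimeoutError`) as `retry_on`:

```python
import time


def call_with_retry(fn, retries=5, base_delay=1.0, backoff=2.0,
                    retry_on=(Exception,)):
    """Call fn(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper for transient 503 / "model loading" errors;
    raises the last exception if all attempts fail.
    """
    delay = base_delay
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            time.sleep(delay)
            delay *= backoff


# Usage sketch (assumes huggingface_hub is installed and a valid token;
# the timeout value just needs to exceed the 120s in the error above):
# from huggingface_hub import InferenceClient
# client = InferenceClient(
#     model="mistralai/Mistral-Nemo-Instruct-2407",
#     timeout=300,
# )
# out = call_with_retry(
#     lambda: client.chat_completion(
#         messages=[{"role": "user", "content": "Hello"}],
#         max_tokens=64,
#     )
# )
```

This does not fix a model that never finishes loading server-side, but it rides out the transient "model not loaded yet" window instead of failing on the first 503.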
All fixed.
Sorry, we had to make a few tiny corrections to the configuration file (the model is the same, but the configuration expresses things differently).
Cheers.
Thanks!
Getting the following error even though I am not using the mistralai/Mistral-Nemo-Instruct-2407 model:
"HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407 (Request ID: YOov-g) Model too busy, unable to get response in less than 300 second(s)"