Output quality of the local version is a lot worse than the online demo
The demo ( https://huggingface.co/spaces/fishaudio/fish-speech-1 ) points to the model files in https://huggingface.co/fishaudio/fish-speech-1.5/tree/main . However, I get very different results from the local version than from the demo. Is the demo using a model that's not available on Hugging Face?
No, they are the exact same model.
I cloned the demo app to compare, and it gives me better results than the GitHub repository, so that seemed to solve the problem. However, I'm still getting different output with the same settings. In fact, I now get 3 different outputs for the same (deterministic) seed from the online demo, the locally hosted GitHub version, and the locally hosted demo. Shouldn't I be getting identical outputs?
I'm far from an expert, but I know that "deterministic" is a strong word when discussing NNs. Stable Diffusion results are barely reproducible across different hardware, due to optimizations like Flash Attention/XFormers. If your results are way off, you should probably start by checking your installation and configs.
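One way to act on the "check the installation first" advice is to diff the package pins of the two environments. Below is a rough sketch, assuming you have saved the output of `pip freeze` from each setup as text. The helper names `parse_freeze` and `diff_envs` are made up for illustration, and the sample versions in the usage example are placeholders, apart from 1.14.24, which is the version mentioned in this thread.

```python
from typing import Dict, Optional, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse `pip freeze` style text into {normalized_name: version}.

    Only simple `name==version` lines are handled; comments, blank lines,
    and other requirement forms (URLs, editable installs) are skipped.
    """
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, ver = line.partition("==")
        # Treat underscores and hyphens as equivalent in package names.
        pins[name.strip().lower().replace("_", "-")] = ver.strip()
    return pins


def diff_envs(a: str, b: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return packages whose pinned version differs between two freezes.

    A value of None means the package is missing from that environment.
    """
    pa, pb = parse_freeze(a), parse_freeze(b)
    return {
        name: (pa.get(name), pb.get(name))
        for name in sorted(pa.keys() | pb.keys())
        if pa.get(name) != pb.get(name)
    }


if __name__ == "__main__":
    # Placeholder freezes, not taken from any real environment.
    demo_env = "torch==2.4.1\nvector-quantize-pytorch==1.14.24\n"
    local_env = "torch==2.4.1\nvector_quantize_pytorch==1.17.0\n"
    for pkg, (demo_ver, local_ver) in diff_envs(demo_env, local_env).items():
        print(f"{pkg}: demo={demo_ver} local={local_ver}")
```

This only surfaces version mismatches. It says nothing about hardware or kernel-level differences, which can still change the output for the same seed.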
Hello.
I found that the sound quality is much worse with the newest version of "vector-quantize-pytorch".
Use only vector-quantize-pytorch version 1.14.24. With this version, the sound matches the demo.
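To confirm which version is actually installed in the environment you run inference from, something like the following should work. It is a minimal sketch that uses only the standard library; the function name `check_pin` is hypothetical, and the expected version is simply the one reported in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported in this thread as matching the demo's sound quality.
EXPECTED_VERSION = "1.14.24"


def check_pin(package: str = "vector-quantize-pytorch",
              expected: str = EXPECTED_VERSION) -> str:
    """Return a short message comparing the installed version to the expected one."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if installed == expected:
        return f"{package}=={installed} matches the expected version"
    return f"{package}=={installed} differs from the expected {expected}"


if __name__ == "__main__":
    print(check_pin())
```

If the versions differ, `pip install vector-quantize-pytorch==1.14.24` installs the pinned version into the active environment.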
Yes, we have pinned the library's version in our GitHub repo since last year.