Output quality of the local version is a lot worse than the online demo

#17

by cherry-pizza - opened Jan 1

Jan 1

The demo ( https://huggingface.co/spaces/fishaudio/fish-speech-1 ) points to the model files at https://huggingface.co/fishaudio/fish-speech-1.5/tree/main; however, I get very different results from the local version than from the demo. Is the demo using a model that's not available on Hugging Face?

lengyue233

Fish Audio org Jan 2

No, they are the exact same model.

cherry-pizza

Jan 2

I cloned the demo app instead to compare, and it's giving me better results than the GitHub repository, so that seemed to solve the problem. However, I'm still getting different output using the same settings. In fact, I'm now getting 3 different outputs for the same (deterministic) seed from the online demo, the locally hosted GitHub version, and the locally hosted demo. Shouldn't I be getting identical outputs?

princer0072

1 day ago

•

edited 1 day ago

I'm far from an expert, but I know that "deterministic" is a strong word when discussing neural networks. Stable Diffusion results are barely reproducible across different hardware, due to optimizations like Flash Attention/XFormers. If your results are way off, you should probably start by looking into the installation and configs.
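To illustrate one general reason the same seed can give slightly different numbers on different setups: floating-point addition is not associative, so a kernel that sums values in a different order can produce a different result. This is only a minimal sketch of that effect in plain Python, not a diagnosis of what fish-speech does internally.

```python
# Floating-point addition is not associative: grouping changes the result.
# Optimized kernels may reorder reductions, which is one way bit-identical
# output across hardware or library versions can be lost.

left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums are mathematically equal but differ in the last bits.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

Differences of this kind are tiny per operation, so they would explain small variations between runs rather than a clearly worse output.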

Kasdeja23

about 8 hours ago

Hello.
I found that with the newest version of "vector-quantize-pytorch" the sound quality is much worse.
Use only vector-quantize-pytorch version 1.14.24. With this version the sound will match the demo.
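For anyone who wants to verify this locally, here is a small sketch that reports which version is installed in the current environment. It uses only the standard library's importlib.metadata; the helper name check_pin is made up for illustration, and 1.14.24 is simply the version reported in this thread. Installing that version would typically be done with `pip install vector-quantize-pytorch==1.14.24`.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported in this thread to match the demo's sound quality.
EXPECTED = "1.14.24"


def check_pin(package: str = "vector-quantize-pytorch",
              expected: str = EXPECTED) -> str:
    """Return a short status string for the installed package version.

    check_pin is a hypothetical helper, not part of fish-speech.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if installed == expected:
        return f"{package}=={installed} matches the pinned version"
    return f"{package}=={installed} differs from pinned {expected}"


if __name__ == "__main__":
    print(check_pin())
```

Run it inside the same virtual environment that launches the local web UI, since a different environment may have a different version installed.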

PoTaTo721

Fish Audio org about 7 hours ago

Yes, we have pinned this library's version in our GitHub repo since last year.

PoTaTo721 changed discussion status to closed about 7 hours ago
