skips the thinking process

#5
by muzizon - opened

I am facing an issue with the DeepSeek R1 AWQ model deployed using vLLM. In stream mode, the model consistently skips the thinking process and outputs only "\n\n" instead of generating meaningful responses.

Has anyone else encountered this behavior? Any suggestions on how to resolve this?

Cognitive Computations org

Which vLLM version are you using, what's your startup command, and what are the GPUs that you're using?

Thanks for your help! 😊
vLLM Version: 0.7.2
Startup Command: python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 12345 --max-model-len 32768 --trust-remote-code --tensor-parallel-size 8 --quantization moe_wna16 --gpu-memory-utilization 0.97 --kv-cache-dtype fp8_e5m2 --calculate-kv-scales --served-model-name deepseek-reasoner --model cognitivecomputations/DeepSeek-R1-AWQ --enable-reasoning --reasoning-parser deepseek_r1
GPU Configuration: 8 * A800

Cognitive Computations org

--enable-reasoning --reasoning-parser deepseek_r1 makes the streaming output format slightly different: the thinking tokens are streamed in a separate reasoning_content field of each delta instead of content, so a client that only reads content sees nothing but "\n\n" until the final answer begins. If you don't want to add special support for this, simply remove these 2 flags and it will work.
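If you'd rather keep the flags, here is a minimal client-side sketch that reads both fields from the stream. It assumes an OpenAI-compatible Python client pointed at the startup command above; the port and served model name are taken from that command, and the prompt is purely illustrative:

```python
# Sketch: consuming a vLLM stream started with --enable-reasoning
# --reasoning-parser deepseek_r1. The thinking tokens arrive in
# delta.reasoning_content and the visible answer in delta.content,
# so both fields must be read.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12345/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="deepseek-reasoner",  # matches --served-model-name above
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    # reasoning_content is a vLLM extension, not part of the stock
    # OpenAI schema, so fall back to None if the field is absent.
    thinking = getattr(delta, "reasoning_content", None)
    if thinking:
        print(thinking, end="", flush=True)  # the "thinking" stream
    if delta.content:
        print(delta.content, end="", flush=True)  # the final answer
```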

Thanks, I'll try it out.

Does A100 support "--kv-cache-dtype fp8_e5m2"?

Cognitive Computations org

@traphix Yes. Ampere GPUs like the A100/A800 have no native FP8 hardware, so the fp8_e5m2 KV cache is stored compressed and converted in software; it works, but it would be slower than H100, which supports FP8 natively.
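For context, native FP8 support arrives with Ada (compute capability 8.9) and Hopper (9.0), while A100/A800 are Ampere (8.0). A quick way to check your own GPU, sketched here under that assumption (PyTorch is already required by vLLM):

```python
# Sketch: infer native FP8 support from compute capability.
# Assumption: cc >= 8.9 (Ada/Hopper) implies hardware FP8; Ampere
# A100/A800 report 8.0 and use software conversion for fp8_e5m2.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"compute capability: {major}.{minor}")
if (major, minor) >= (8, 9):
    print("native FP8: fp8 KV cache runs at full speed")
else:
    print("no native FP8: fp8_e5m2 KV cache works, but via conversion")
```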
