Host of the model

#138
by henrycwf - opened

What host did you guys use to run this model? On-prem or in the cloud? What OS and hardware configuration? Thanks.

You can easily run the smaller 1.5B model on a Mac or Windows machine with 16 GB of memory.
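As a rough sanity check of that claim, the weights of a 1.5B-parameter model fit comfortably in 16 GB. A back-of-envelope estimate (assuming fp16 weights at 2 bytes per parameter; activations and KV cache add overhead on top):

```python
# Back-of-envelope memory estimate for a 1.5B-parameter model.
# Assumption: fp16/bf16 weights, 2 bytes per parameter.
params = 1.5e9
bytes_per_param = 2
weights_gb = params * bytes_per_param / 1024**3
print(f"weights: {weights_gb:.1f} GiB")  # ~2.8 GiB, well under 16 GiB
```

Even with runtime overhead, that leaves plenty of headroom on a 16 GB machine; a 4-bit quantized version needs even less.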


For the 671B version, you need Hopper cards such as the H20, H800, or H100, and at least one full node.
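The same arithmetic shows why a full multi-GPU node is the floor for the 671B model. A sketch, assuming native FP8 weights at 1 byte per parameter and an 8-GPU node (KV cache and activations come on top):

```python
# Why the 671B model needs at least one 8-GPU Hopper node.
# Assumption: FP8 weights, 1 byte per parameter; KV cache not included.
params = 671e9
weights_gb = params * 1 / 1024**3   # total weight memory in GiB
per_gpu = weights_gb / 8            # spread across one 8-GPU node
print(f"weights: {weights_gb:.0f} GiB, per GPU: {per_gpu:.0f} GiB")
```

At roughly 78 GiB of weights per GPU, 8x H100-80GB barely fits the parameters alone, which is why cards with more memory (H20 96 GB, H200) or a second node are commonly used for serving.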

Can SGLang or vLLM run on Windows with an Intel Core Ultra i7 and no NVIDIA GPU?

I always get the errors below when installing sglang and vllm with the following commands:

pip install "sglang[all]>=0.4.2.post4" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer/

ERROR: Could not find a version that satisfies the requirement sgl-kernel>=0.0.3.post3; extra == "srt" (from sglang[srt]) (from versions: 0.0.1)
ERROR: No matching distribution found for sgl-kernel>=0.0.3.post3; extra == "srt"
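The fact that pip only finds version 0.0.1 suggests there is no compatible `sgl-kernel` wheel for this platform; as far as I know, the prebuilt kernels (and the flashinfer wheels in that index) target Linux with CUDA, so a native Windows install fails. A tiny guard (a hypothetical helper, not part of sglang) makes the check explicit:

```python
import sys

def native_wheels_likely(platform: str = sys.platform) -> bool:
    """Heuristic: sgl-kernel/flashinfer publish Linux-only wheels (assumption)."""
    return platform.startswith("linux")

if not native_wheels_likely():
    print("No prebuilt wheels for this OS; try WSL2 (Ubuntu) or a Linux host.")
```

On Windows, running the same pip command inside WSL2 with an Ubuntu distribution is the usual workaround.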

pip install vllm

  copying build\lib\vllm\model_executor\layers\quantization\utils\configs\N=1536,K=1536,device_name=NVIDIA_H100_80GB_HBM3,dtype=fp8_w8a8,block_shape=[128,128].json -> build\bdist.win-amd64\wheel\.\vllm\model_executor\layers\quantization\utils\configs
  error: could not create 'build\bdist.win-amd64\wheel\.\vllm\model_executor\layers\quantization\utils\configs\N=1536,K=1536,device_name=NVIDIA_H100_80GB_HBM3,dtype=fp8_w8a8,block_shape=[128,128].json': No such file or directory
  [end of output]
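One plausible cause of the vLLM build failure (an assumption, not confirmed by the log) is Windows' default 260-character `MAX_PATH` limit: the relative path in the error is already around 170 characters, so once it is prefixed with an absolute build directory it can overflow the limit and the file creation fails:

```python
# Length of the failing relative path from the vLLM build log above.
rel = (r"build\bdist.win-amd64\wheel\.\vllm\model_executor\layers"
       r"\quantization\utils\configs"
       r"\N=1536,K=1536,device_name=NVIDIA_H100_80GB_HBM3,"
       r"dtype=fp8_w8a8,block_shape=[128,128].json")
print(len(rel))  # ~170 chars before any absolute prefix is added
```

If that is the cause, enabling Windows long-path support (the `LongPathsEnabled` registry setting) may help, but building inside WSL2 or on a Linux host sidesteps the problem entirely.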
