Space: yusufs/vllm-inference

vllm-inference / run-sailor.sh

Commit History

feat(sail/Sailor-4B-Chat): try increase gpu-memory-utilization to 0.9 before changing the token length

4a9e328

yusufs committed on Nov 29, 2024

feat(llama3.2): using Llama-3.2-3B-Instruct 0cb88a4f764b7a12671c53f0838cd831a0843b95

8b37c20

yusufs committed on Nov 29, 2024

feat(add-model): always download model during build, it will be cached in the consecutive builds

8679a35

yusufs committed on Nov 27, 2024