Commit f3ef13f
Parent: 96a5535
add vllm lastest image doc.

Files changed:
- README.md: +3 −7
- README_CN.md: +4 −8
README.md CHANGED

@@ -227,9 +227,7 @@ We provide a pre-built Docker image containing vLLM 0.8.5 with full support for
 - To get started:
 
 ```
-docker pull
-or
-docker pull hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm
+docker pull hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1
 ```
 
 - Download Model file:
@@ -247,8 +245,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
@@ -265,8 +262,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
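The `docker run` command in the diff above launches vLLM's OpenAI-compatible API server on the host network. As a smoke test for the deployed container, here is a minimal client sketch using only the Python standard library; the port 8000 (vLLM's default) and the `model` name are assumptions not present in the diff, so adjust them to match your actual server flags.

```python
import json
import urllib.request

# Assumed endpoint: with --net=host and no --port flag shown in the diff,
# vLLM's api_server listens on its default port 8000.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt: str, model: str = "hunyuan-a13b") -> dict:
    """Build the JSON body for a /v1/chat/completions call.

    The model name here is a placeholder; query GET /v1/models on the
    running server to see the name it actually registered.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def chat(prompt: str, model: str = "hunyuan-a13b") -> str:
    """Send one chat request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

For example, `chat("Hello")` should return a generated reply once the container from the diff is up and serving.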
README_CN.md CHANGED

@@ -180,14 +180,12 @@ print(response)
 
 ### Docker 镜像
 
-
+我们提供了一个基于官方 vLLM 0.8.5 版本的 Docker 镜像方便快速部署和测试。**注意：该镜像要求使用 CUDA 12.4 版本。**
 
 - 快速开始方式如下:
 
 ```
-docker pull
-或
-docker pull hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm
+docker pull hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1
 ```
 
 - 下载模型文件:
@@ -203,8 +201,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \
@@ -222,8 +219,7 @@ docker run --rm --ipc=host \
 --net=host \
 --gpus=all \
 -it \
-
---entrypoint python hunyuaninfer/hunyuan-a13b:hunyuan-moe-A13B-vllm \
+--entrypoint python3 hunyuaninfer/hunyuan-infer-vllm-cuda12.4:v1 \
 -m vllm.entrypoints.openai.api_server \
 --host 0.0.0.0 \
 --tensor-parallel-size 4 \