fail

#30
by ratboy1 - opened
🥲 Failed to load the model

Failed to load model

error loading model: missing tensor 'blk.0.ffn_down_exps.weight'

I'm facing the same issue on a GH200 using the CUDA Deep Learning image: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda-dl-base

The same error here:

$ ollama run mixtral:8x7b-ins-v0.1-q8 "hi"
Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'

$ ollama show mixtral:8x7b-ins-v0.1-q8
  Model
    architecture        llama
    parameters          46.7B
    context length      32768
    embedding length    4096
    quantization        Q8_0
  Parameters
    stop    "[INST]"
    stop    "[/INST]"

$ ollama --version
ollama version is 0.6.2

Same error here:

llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'

Model mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf (blk.0.ffn_down_exps.weight)

Maybe a corrupt download? I'll try downloading it again then. 3080 Ti here; will update if it works.

Git LFS Details
SHA256: 9193684683657e90707087bd1ed19fd0b277ab66358d19edeadc26d6fdec4f53
Pointer size: 136 Bytes
Size of remote file: 26.4 GB

C:\Users\User>certutil -hashfile "C:\AI\text-generation-webui-main\user_data\models\mixtral.gguf" SHA256
SHA256 hash of C:\AI\text-generation-webui-main\user_data\models\mixtral.gguf:
9193684683657e90707087bd1ed19fd0b277ab66358d19edeadc26d6fdec4f53
CertUtil: -hashfile command completed successfully.

My hash is the same, damn, so what is wrong with this?
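For anyone who wants to verify the download from a script or on Linux instead of certutil, here's a minimal Python sketch using only the standard library (the path below is just the example path from above):

import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # Read the file in 1 MiB chunks so a 26 GB GGUF never has to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Example path only; point it at wherever your mixtral.gguf actually lives.
print(sha256_of(r"C:\AI\text-generation-webui-main\user_data\models\mixtral.gguf"))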

This quant is more than a year old, and there was a breaking change in how llama.cpp handles MoE models about 5 months ago; see this post for more details:
https://huggingface.co/posts/bartowski/894091265291588
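If you want to check whether your local file is affected, here's a minimal sketch using the gguf Python package from llama.cpp's gguf-py (pip install gguf); the filename is just an example. As I understand the change, old MoE quants stored each expert as a separate tensor (blk.0.ffn_down.0.weight, blk.0.ffn_down.1.weight, ...), while current llama.cpp builds expect a single stacked blk.0.ffn_down_exps.weight per layer, which is exactly the tensor the loader says is missing.

# Requires the gguf package from llama.cpp's gguf-py (pip install gguf).
from gguf import GGUFReader

# Example filename only; use your local path.
reader = GGUFReader("mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf")

# Print the layer-0 FFN tensor names so you can see which layout the file uses.
for tensor in reader.tensors:
    if tensor.name.startswith("blk.0.ffn_"):
        print(tensor.name, list(tensor.shape))

If the listing shows only numbered per-expert tensors and nothing ending in _exps, the file is in the old layout and you'll need a GGUF re-quantized with a recent llama.cpp.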
