# ibm-granite-docling-258M-GGUF
This is the GGUF version of the ibm-granite/granite-docling-258M model.
## Model Information
- Model Name: granite-docling-258M
- Base Model: ibm-granite/granite-docling-258M
- License: Apache-2.0
- Pipeline Tag: image-text-to-text
- Language: English
- Model Size: 258M
- Model Format: GGUF
## Description
Granite Docling is a family of instruction-tuned models designed for document understanding tasks. These models are fine-tuned on a diverse set of tasks including document classification, information extraction, and question answering. The models are optimized for performance on document-centric tasks and can handle a variety of document formats and layouts.
## Usage
You need a build of llama.cpp with Granite Docling support to run the model.
Run with Docker:

```shell
docker run -p 8080:8080 ghcr.io/danchev/llama.cpp:docling \
  --server \
  -hf danchev/ibm-granite-docling-258M-GGUF \
  --host 0.0.0.0 \
  --port 8080
```
Build from source:

```shell
git clone git@github.com:gabe-l-hart/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release -j $(nproc)
```
Once you have llama.cpp set up, you can use the following command to run the model:

```shell
./build/bin/llama-server -hf danchev/ibm-granite-docling-258M-GGUF
```
## Example Request
You can then send requests to the server using `curl`. Here is an example request:

```shell
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ibm-granite/granite-docling-258M",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe this image in one sentence."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
            }
          }
        ]
      }
    ]
  }'
```
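The same request can be issued programmatically against the server's OpenAI-compatible endpoint. The sketch below uses only the Python standard library; the server URL and the helper names (`build_payload`, `describe_image`) are illustrative assumptions, not part of the model card.

```python
import json
import urllib.request

# Assumption: llama-server is listening locally on the port used above.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_payload(prompt: str, image_url: str) -> dict:
    """Assemble a chat-completions body mixing a text part and an image URL,
    mirroring the curl example."""
    return {
        "model": "ibm-granite/granite-docling-258M",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def describe_image(prompt: str, image_url: str) -> str:
    """POST the payload and return the assistant's reply text."""
    data = json.dumps(build_payload(prompt, image_url)).encode("utf-8")
    req = urllib.request.Request(
        SERVER_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `describe_image("Describe this image in one sentence.", "<image URL>")` returns the model's description as a string.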