title: Ocr Llm Test
emoji: π
colorFrom: pink
colorTo: gray
sdk: gradio
sdk_version: 5.16.0
app_file: app.py
pinned: false
short_description: Technical Assessment
OCR LLM Classifier
This project provides a simple interface for Optical Character Recognition (OCR) and spam classification using deep learning models. It supports three OCR methods (PaddleOCR, EasyOCR, and KerasOCR) and uses a DistilBERT model for classifying the extracted text as "Spam" or "Not Spam."
Features
- Extract text from images using OCR.
- Classify extracted text as either "Spam" or "Not Spam."
- Save the extracted text and classification results to local JSON and CSV files.
How It Works
OCR: The app uses one of the three OCR methods to extract text from the uploaded image:
- PaddleOCR
- EasyOCR
- KerasOCR
Classification: The extracted text is passed to a pre-trained DistilBERT model that classifies the text as either "Spam" or "Not Spam."
Save Results: The extracted text and classification results are saved locally in both JSON and CSV formats, allowing easy retrieval and review.
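The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the app's actual code: `run_ocr` and `classify_text` are hypothetical stand-ins for the real OCR engine and the DistilBERT classifier (here a canned string and a keyword rule), and the output file names `ocr_results.json` and `ocr_results.csv` are invented for the example.

```python
import csv
import json
from pathlib import Path

OCR_METHODS = ("PaddleOCR", "EasyOCR", "KerasOCR")


def run_ocr(image_path: str, method: str = "PaddleOCR") -> str:
    """Hypothetical stand-in for the OCR step; returns canned text."""
    if method not in OCR_METHODS:
        raise ValueError(f"Unknown OCR method: {method}")
    return "Congratulations, you won a free prize!"


def classify_text(text: str) -> str:
    """Hypothetical stand-in for the DistilBERT classifier (keyword rule only)."""
    return "Spam" if "free prize" in text.lower() else "Not Spam"


def save_results(text: str, label: str, out_dir: str = ".") -> None:
    """Append one record to a JSON file and a CSV file (assumed file names)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    record = {"text": text, "label": label}

    # JSON: load the existing list (if any), append, and rewrite the file.
    json_path = out / "ocr_results.json"
    records = []
    if json_path.exists():
        records = json.loads(json_path.read_text(encoding="utf-8"))
    records.append(record)
    json_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    # CSV: write the header only when the file is first created.
    csv_path = out / "ocr_results.csv"
    is_new = not csv_path.exists()
    with csv_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["text", "label"])
        if is_new:
            writer.writeheader()
        writer.writerow(record)


def process_image(image_path: str, method: str = "PaddleOCR", out_dir: str = "."):
    """Run the OCR, classification, and save steps in order."""
    text = run_ocr(image_path, method)
    label = classify_text(text)
    save_results(text, label, out_dir)
    return text, label
```

Calling `process_image("photo.png", "EasyOCR")` would return the text and label and append one record to each results file. Swapping the two stand-in functions for real model calls leaves the save step unchanged.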
Installation
To get started with this project, follow these steps:
1. Clone the Repository
git clone https://github.com/yourusername/ocr-llm-test.git
cd ocr-llm-test
2. Install Dependencies
You can install the required dependencies using pip:
pip install -r requirements.txt
3. Run the App
To run the Gradio interface locally, execute:
python app.py
Once the app is running, it will be accessible through your web browser at http://localhost:7860.
API Documentation
1. API Endpoint
The main endpoint for this API is /predict.
2. API Call Example
Install the Python Client
If you don't already have it installed, run the following command:
pip install gradio_client
Make an API Call
from gradio_client import Client, handle_file

# Connect to the hosted Space.
client = Client("winamnd/ocr-llm-test")

# Send an image to the /predict endpoint using the chosen OCR method.
result = client.predict(
    method="PaddleOCR",
    img=handle_file("https://raw.githubusercontent.com/gradio-app/gradio/main/test/test_files/bus.png"),
    api_name="/predict",
)
print(result)
3. Parameters
Parameter | Type | Description |
---|---|---|
method | Literal['PaddleOCR', 'EasyOCR', 'KerasOCR'] | The OCR method to use for text extraction. Default is "PaddleOCR". |
img | dict | The image input, which can be provided as a URL, a path, or a base64-encoded image. |
Image Input Details
- path: Path to a local file.
- url: Publicly available URL for the image.
- size: The size of the image (in bytes).
- orig_name: Original filename.
- mime_type: MIME type of the image.
- is_stream: Always set to False.
- meta: Metadata.
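To make the fields above concrete, here is a rough sketch of what such a dict might look like for a local file. In normal use `handle_file` from `gradio_client` prepares the image input for you, so the helper `build_image_payload` below is purely hypothetical, and the `meta` value shown is an assumption rather than something stated in this document.

```python
import mimetypes
import os


def build_image_payload(path: str) -> dict:
    """Build a dict with the documented fields for a local image (illustrative only)."""
    # Guess the MIME type from the file extension; may be None if unknown.
    mime_type, _ = mimetypes.guess_type(path)
    return {
        "path": path,                          # path to the local file
        "url": None,                           # unused when a local path is given
        "size": os.path.getsize(path),         # size in bytes
        "orig_name": os.path.basename(path),   # original filename
        "mime_type": mime_type,                # e.g. "image/png"
        "is_stream": False,                    # always False, per the list above
        "meta": {"_type": "gradio.FileData"},  # assumed marker value
    }
```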
4. Returns
The API returns a tuple with two elements:
- Extracted Text (str): The text extracted from the image.
- Spam Classification (str): The classification result ("Spam" or "Not Spam").
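As a small illustration of consuming this return value, the sketch below unpacks the two elements and converts the label into a boolean. The helper name `unpack_prediction` is hypothetical, and the strictness of the checks is a design choice for the example, not behavior of the API itself.

```python
from typing import Sequence, Tuple


def unpack_prediction(result: Sequence[str]) -> Tuple[str, bool]:
    """Split the two-element result into (extracted_text, is_spam)."""
    if len(result) != 2:
        raise ValueError(f"Expected 2 elements, got {len(result)}")
    extracted_text, classification = result
    # Only the two labels named in this document are accepted.
    if classification not in ("Spam", "Not Spam"):
        raise ValueError(f"Unexpected classification: {classification!r}")
    return extracted_text, classification == "Spam"
```

With the `result` from the API call example, `text, is_spam = unpack_prediction(result)` would give the extracted text and a flag suitable for filtering.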