{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "id": "0", "metadata": { "id": "0" }, "source": [ "# Exercises: Putting the Building Blocks into Practice\n", "\n", "Welcome to the hands-on portion of the workshop! In these exercises, you will apply the concepts we've learned to solve a few practical problems.\n", "\n", "**Your goals will be to:**\n", "1. **Extend Function Calling**: Add a new tool for the LLM to use.\n", "2. **Modify Structured Output**: Change a Pydantic schema to extract additional structured information from an image.\n", "3. **Bonus! Use Grammar Mode**: Force the LLM to respond in a highly specific, token-efficient format.\n", "\n", "Look out for the lines marked \"TODO\" in each cell; those are where you will write your code. Let's get started!" ] }, { "cell_type": "code", "execution_count": null, "id": "e966e0b4", "metadata": { "id": "e966e0b4" }, "outputs": [], "source": [ "#\n", "# SETUP CELL #1: PLEASE RUN THIS BEFORE CONTINUING WITH THE EXERCISES.\n", "# RESTART THE RUNTIME AFTER RUNNING THIS CELL IF PROMPTED TO DO SO.\n", "#\n", "!pip install pydantic requests Pillow python-dotenv" ] }, { "cell_type": "code", "execution_count": null, "id": "eac6208b", "metadata": { "id": "eac6208b" }, "outputs": [], "source": [ "#\n", "# SETUP CELL #2: PLEASE RUN THIS BEFORE CONTINUING WITH THE EXERCISES\n", "#\n", "import os\n", "import io\n", "import base64\n", "from dotenv import load_dotenv\n", "import requests\n", "import json\n", "load_dotenv()\n", "\n", "MODEL_ID = \"accounts/fireworks/models/llama4-scout-instruct-basic\"\n", "\n", "# This pattern is for Google Colab.\n", "# If running locally, set the FIREWORKS_API_KEY environment variable.\n", "try:\n", " from google.colab import userdata\n", " FIREWORKS_API_KEY = userdata.get('FIREWORKS_API_KEY')\n", "except ImportError:\n", " FIREWORKS_API_KEY = os.getenv(\"FIREWORKS_API_KEY\")\n", "\n", 
"# Make sure to set your FIREWORKS_API_KEY\n", "if not FIREWORKS_API_KEY:\n", " print(\"⚠️ Warning: FIREWORKS_API_KEY not set. The following cells will not run without it.\")\n", "\n", "# Helper function to prepare images for VLMs.\n", "# It is defined here to be available for later exercises.\n", "def pil_to_base64_dict(pil_image):\n", " \"\"\"Convert PIL image to the format expected by VLMs\"\"\"\n", " if pil_image is None:\n", " return None\n", "\n", " buffered = io.BytesIO()\n", " if pil_image.mode != \"RGB\":\n", " pil_image = pil_image.convert(\"RGB\")\n", "\n", " pil_image.save(buffered, format=\"JPEG\")\n", " img_base64 = base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n", "\n", " return {\"image\": pil_image, \"path\": \"uploaded_image.jpg\", \"base64\": img_base64}\n", "\n", "# Helper function to make api calls with requests\n", "def make_api_call(payload, tools=None, model_id=None, base_url=None):\n", " \"\"\"Make API call with requests\"\"\"\n", " # Use defaults if not provided\n", " final_model_id = model_id or MODEL_ID\n", " final_base_url = base_url or \"https://api.fireworks.ai/inference/v1\"\n", "\n", " # Add model to payload\n", " payload[\"model\"] = final_model_id\n", "\n", " # Add tools if provided\n", " if tools:\n", " payload[\"tools\"] = tools\n", " payload[\"tool_choice\"] = \"auto\"\n", "\n", " headers = {\n", " \"Authorization\": f\"Bearer {FIREWORKS_API_KEY}\",\n", " \"Content-Type\": \"application/json\"\n", " }\n", "\n", " response = requests.post(\n", " f\"{final_base_url}/chat/completions\",\n", " headers=headers,\n", " json=payload\n", " )\n", "\n", " if response.status_code == 200:\n", " return response.json()\n", " else:\n", " raise Exception(f\"API Error: {response.status_code} - {response.text}\")\n", "\n", "print(\"✅ Setup complete. 
Helper functions and API key are ready.\")" ] }, { "cell_type": "markdown", "id": "09bc4200", "metadata": { "id": "09bc4200" }, "source": [ "## Exercise 1: Extending Function Calling\n", "\n", "[Function calling](https://docs.fireworks.ai/guides/function-calling) allows an LLM to use external tools. Your first task is to give the LLM a new tool.\n", "\n", "**Goal**: Define a new function called `count_letter` that counts the occurrences of a specific letter in a word. You will then define its schema and make it available to the LLM.\n", "\n", "**Your Steps:**\n", "1. Define the Python function `count_letter`.\n", "2. Add it to the `available_functions` dictionary.\n", "3. Define its schema and add it to the `tools` list.\n", "4. Write a prompt to test your new function." ] }, { "cell_type": "code", "execution_count": null, "id": "99c48d84", "metadata": { "id": "99c48d84" }, "outputs": [], "source": [ "###\n", "### EXERCISE 1: WRITE YOUR CODE IN THIS CELL\n", "###\n", "import json\n", "\n", "# --- Step 1: Define the Python function and the available functions mapping ---\n", "\n", "# Base function from the previous notebook\n", "def get_weather(location: str) -> str:\n", " \"\"\"Get current weather for a location\"\"\"\n", " weather_data = {\"New York\": \"Sunny, 72°F\", \"London\": \"Cloudy, 15°C\", \"Tokyo\": \"Rainy, 20°C\"}\n", " return weather_data.get(location, \"Weather data not available\")\n", "\n", "# ---TODO Block start---- #\n", "# Define a new function `count_letter` that takes a `word` and a `letter`\n", "# and returns the number of times the letter appears in the word.\n", "def count_letter(): # TODO: Add your function header here\n", " # TODO: Add your function body here\n", " pass\n", "# ---TODO Block end---- #\n", "\n", "available_functions = {\n", " \"get_weather\": get_weather,\n", " # TODO: Add your new function to this dictionary\n", "}\n", "\n", "\n", "# --- Step 2: Define the function schemas for the LLM ---\n", "\n", "# Base tool schema from 
the previous notebook\n", "tools = [\n", " {\n", " \"type\": \"function\",\n", " \"function\": {\n", " \"name\": \"get_weather\",\n", " \"description\": \"Get current weather for a location\",\n", " \"parameters\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"location\": {\n", " \"type\": \"string\",\n", " \"description\": \"The city name\"\n", " }\n", " },\n", " \"required\": [\"location\"]\n", " }\n", " }\n", " },\n", " # TODO: Add the JSON schema for your `count_letter` function here.\n", " # It should have two parameters, \"word\" and \"letter\"; both are required strings.\n", "]\n", "\n", "\n", "# --- Step 3: Build your input to the LLM ---\n", "\n", "# Initialize the messages list\n", "messages = [\n", " {\n", " \"role\": \"system\",\n", " \"content\": \"You are a helpful assistant. You have access to a couple of tools; use them when needed.\"\n", " },\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"\" # TODO: Add your user prompt here\n", " }\n", "]\n", "\n", "# Create payload\n", "payload = {\n", " \"messages\": messages,\n", " \"tools\": tools,\n", " \"model\": \"accounts/fireworks/models/llama4-maverick-instruct-basic\"\n", "}\n", "\n", "# Get response from LLM\n", "response = make_api_call(payload=payload)\n", "\n", "# Check if the model wants to call a tool/function\n", "# (.get avoids a KeyError when the response has no tool_calls field)\n", "if response[\"choices\"][0][\"message\"].get(\"tool_calls\"):\n", " tool_call = response[\"choices\"][0][\"message\"][\"tool_calls\"][0]\n", " function_name = tool_call[\"function\"][\"name\"]\n", " function_args = json.loads(tool_call[\"function\"][\"arguments\"])\n", "\n", " print(f\"LLM wants to call: {function_name}\")\n", " print(f\"With arguments: {function_args}\")\n", "\n", " # Execute the function\n", " function_response = available_functions[function_name](**function_args)\n", " print(f\"Function result: {function_response}\")\n", "\n", " # Add the assistant's tool call to the conversation\n", " messages.append({\n", " \"role\": \"assistant\",\n", " 
\"content\": \"\",\n", " \"tool_calls\": response[\"choices\"][0][\"message\"][\"tool_calls\"]\n", " })\n", "\n", " # Add the function result to the conversation\n", " messages.append({\n", " \"role\": \"tool\",\n", " \"content\": json.dumps(function_response) if isinstance(function_response, dict) else str(function_response)\n", " })\n", "\n", " # Create the final payload\n", " final_payload = {\n", " \"messages\": messages,\n", " \"tools\": tools,\n", " \"model\": \"accounts/fireworks/models/llama4-maverick-instruct-basic\"\n", " }\n", "\n", " # Get final response from LLM\n", " final_response = make_api_call(payload=payload)\n", "\n", " print(f'Final response: {final_response[\"choices\"][0][\"message\"][\"content\"]}')" ] }, { "cell_type": "markdown", "id": "4d198002", "metadata": { "id": "4d198002" }, "source": [ "## Exercise 2: Modifying Structured Outputs (JSON Mode)\n", "\n", "Structured output is critical for building reliable applications. Here, you'll modify an existing schema to extract more information from an image.\n", "\n", "**Goal**: Update the `IncidentAnalysis` Pydantic model to also extract the `make` and `model` of the vehicle in the image.\n", "\n", "**Your Steps:**\n", "1. Add the `make` and `model` fields to the `IncidentAnalysis` Pydantic class.\n", "2. Run the VLM call using [JSON mode](https://docs.fireworks.ai/structured-responses/structured-response-formatting) to see the new structured output." 
] }, { "cell_type": "code", "execution_count": null, "id": "1dc5d727", "metadata": { "id": "1dc5d727" }, "outputs": [], "source": [ "###\n", "### EXERCISE 2: WRITE YOUR CODE IN THIS CELL\n", "###\n", "import requests\n", "import io\n", "from PIL import Image\n", "from pydantic import BaseModel, Field\n", "from typing import Literal\n", "\n", "# --- Step 1: Download a sample image ---\n", "url = \"https://raw.githubusercontent.com/RobertoBarrosoLuque/scout-claims/main/images/back_rhs_damage.png\"\n", "response = requests.get(url)\n", "image = Image.open(io.BytesIO(response.content))\n", "print(\"Image downloaded.\")\n", "\n", "\n", "# --- Step 2: Define the output schema ---\n", "# ---TODO Block start---- #\n", "# Add two new string fields to this Pydantic model:\n", "# - `make`: To store the make of the car (e.g., \"Ford\")\n", "# - `model`: To store the model of the car (e.g., \"Mustang\")\n", "class IncidentAnalysis(BaseModel):\n", " description: str = Field(description=\"A description of the damage to the vehicle.\")\n", " location: Literal[\"front-left\", \"front-right\", \"back-left\", \"back-right\", \"front\", \"side\"]\n", " severity: Literal[\"minor\", \"moderate\", \"major\"]\n", " license_plate: str | None = Field(description=\"The license plate of the vehicle, if visible.\")\n", "# ---TODO Block end---- #\n", "\n", "# --- Step 3: Call the VLM with the new schema ---\n", "# The 'pil_to_base64_dict' function was defined in the setup cell\n", "image_for_llm = pil_to_base64_dict(image)\n", "\n", "# Create payload\n", "prompt = \"Describe the car damage in this image and extract all useful information.\" # TODO: modify the prompt to include the new fields\n", "messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:image/jpeg;base64,{image_for_llm['base64']}\"}},\n", " {\"type\": \"text\", \"text\": prompt},\n", " ],\n", " }\n", "]\n", "response_format={\n", " \"type\": 
\"json_object\",\n", " \"schema\": IncidentAnalysis.model_json_schema(),\n", "}\n", "\n", "payload = {\n", " \"messages\": messages,\n", " \"response_format\": response_format,\n", " \"model\": \"accounts/fireworks/models/llama4-maverick-instruct-basic\"\n", "}\n", "\n", "# Get response from LLM\n", "response = make_api_call(payload=payload)\n", "\n", "\n", "result = json.loads(response[\"choices\"][0][\"message\"][\"content\"])\n", "print(json.dumps(result, indent=2))" ] }, { "cell_type": "markdown", "id": "8e5a2e3d", "metadata": { "id": "8e5a2e3d" }, "source": [ "## Bonus Exercise: Constrained Output with Grammar Mode\n", "\n", "Sometimes you need the model to respond in a very specific, non-JSON format. This is where [Grammar Mode](https://docs.fireworks.ai/structured-responses/structured-output-grammar-based) excels. It forces the model's output to conform to a strict pattern you define, which can also save output tokens vs. JSON mode and offer even more granular control.\n", "\n", "**Goal**: Use grammar mode to force the model to output *only* the make and model of the car as a single lowercase string (e.g., \"ford mustang\").\n", "\n", "**Your Steps:**\n", "1. Define a GBNF grammar string.\n", "2. Call the model using `response_format={\"type\": \"grammar\", \"grammar\": ...}`." ] }, { "cell_type": "code", "execution_count": null, "id": "1ea8cec3", "metadata": { "id": "1ea8cec3" }, "outputs": [], "source": [ "###\n", "### BONUS EXERCISE: WRITE YOUR CODE IN THIS CELL\n", "###\n", "\n", "# The 'image' variable and 'pil_to_base64_dict' helper function from previous\n", "# cells are used here. Make sure those cells have been run.\n", "# This assumes the image from Exercise 2 is still loaded.\n", "image_for_llm = pil_to_base64_dict(image)\n", "\n", "\n", "# --- Step 1: Define the GBNF grammar ---\n", "# Define a grammar that forces the output to be:\n", "# 1. A 'make' (one or more lowercase letters).\n", "# 2. Followed by a single space.\n", "# 3. 
Followed by a 'model' (one or more lowercase letters).\n", "car_grammar = r'''\n", "# TODO: define a grammar that forces the output to satisfy the format specified above (example output: \"ford mustang\")\n", "'''\n", "\n", "# --- Step 2: Define the prompt ---\n", "# Update the prompt to ask the model to identify the make and model and to respond only in the format specified above\n", "prompt = \"\" # TODO: write your prompt here\n", "\n", "\n", "# --- Step 3: Call the VLM with grammar mode ---\n", "messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:image/jpeg;base64,{image_for_llm['base64']}\"}},\n", " {\"type\": \"text\", \"text\": prompt},\n", " ],\n", " }\n", "]\n", "response_format={\n", " # TODO: define the response format to use the grammar defined above\n", "}\n", "\n", "# Define payload\n", "payload = {\n", " \"messages\": messages,\n", " \"response_format\": response_format,\n", " \"model\": \"accounts/fireworks/models/llama4-maverick-instruct-basic\"\n", "}\n", "\n", "# Get response from LLM\n", "response = make_api_call(payload=payload)\n", "\n", "print(f'Constrained output from model: {response[\"choices\"][0][\"message\"][\"content\"]}')" ] } ], "metadata": { "colab": { "provenance": [], "include_colab_link": true }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "3.11.13" } }, "nbformat": 4, "nbformat_minor": 5 }