"Open

# Exercises: Putting the Building Blocks into Practice

Welcome to the hands-on portion of the workshop! In these exercises, you will apply the concepts we've learned to solve a few practical problems.

**Your goals will be to:**
1. **Extend Function Calling**: Add a new tool for the LLM to use.
2. **Modify Structured Output**: Change a Pydantic schema to extract additional structured information from an image.
3. **Bonus! Use Grammar Mode**: Force the LLM to respond in a highly specific, token-efficient format.

Look out for the lines marked "TODO" in each cell; those are where you will write your code. Let's get started!

In [None]:
#
# SETUP CELL #1: PLEASE RUN THIS BEFORE CONTINUING WITH THE EXERCISES.
# RESTART THE RUNTIME AFTER RUNNING THIS CELL IF PROMPTED TO DO SO.
#
!pip install pydantic requests Pillow python-dotenv

In [None]:
#
# SETUP CELL #2: PLEASE RUN THIS BEFORE CONTINUING WITH THE EXERCISES
#
import os
import io
import base64
from dotenv import load_dotenv
import requests
import json
load_dotenv()

MODEL_ID = "accounts/fireworks/models/llama4-scout-instruct-basic"

# This pattern is for Google Colab.
# If running locally, set the FIREWORKS_API_KEY environment variable.
try:
    from google.colab import userdata
    FIREWORKS_API_KEY = userdata.get('FIREWORKS_API_KEY')
except ImportError:
    FIREWORKS_API_KEY = os.getenv("FIREWORKS_API_KEY")

# Make sure to set your FIREWORKS_API_KEY
if not FIREWORKS_API_KEY:
    print("⚠️ Warning: FIREWORKS_API_KEY not set. The following cells will not run without it.")

# Helper function to prepare images for VLMs.
# It is defined here to be available for later exercises.
def pil_to_base64_dict(pil_image):
    """Convert PIL image to the format expected by VLMs"""
    if pil_image is None:
        return None

    buffered = io.BytesIO()
    if pil_image.mode != "RGB":
        pil_image = pil_image.convert("RGB")

    pil_image.save(buffered, format="JPEG")
    img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")

    return {"image": pil_image, "path": "uploaded_image.jpg", "base64": img_base64}

# Helper function to make api calls with requests
def make_api_call(payload, tools=None, model_id=None, base_url=None):
    """Make API call with requests"""
    # Model precedence: explicit model_id, then a "model" already set in the
    # payload, then the MODEL_ID default
    final_model_id = model_id or payload.get("model") or MODEL_ID
    final_base_url = base_url or "https://api.fireworks.ai/inference/v1"

    # Add model to payload
    payload["model"] = final_model_id

    # Add tools if provided
    if tools:
        payload["tools"] = tools
        payload["tool_choice"] = "auto"

    headers = {
        "Authorization": f"Bearer {FIREWORKS_API_KEY}",
        "Content-Type": "application/json"
    }

    response = requests.post(
        f"{final_base_url}/chat/completions",
        headers=headers,
        json=payload
    )

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

print("✅ Setup complete. Helper functions and API key are ready.")

## Exercise 1: Extending Function Calling

[Function calling](https://docs.fireworks.ai/guides/function-calling) allows an LLM to use external tools. Your first task is to give the LLM a new tool.

**Goal**: Define a new function called `count_letter` that counts the occurrences of a specific letter in a word. You will then define its schema and make it available to the LLM.

**Your Steps:**
1. Define the Python function `count_letter`.
2. Add it to the `available_functions` dictionary.
3. Define its schema and add it to the `tools` list.
4. Write a prompt to test your new function.
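
The same pattern can be sketched with a different tool before you write your own. This is a minimal sketch, not the exercise solution: `reverse_word`, `example_functions`, `example_tool`, and the simulated tool call are hypothetical names invented for illustration. The simulated call only assumes the shape this notebook reads from the response (a function name plus a JSON string of arguments).

```python
import json

# Hypothetical tool used only for illustration (not part of the exercise).
def reverse_word(word: str) -> str:
    """Return the word with its characters in reverse order."""
    return word[::-1]

# Map tool names to Python callables so a tool call can be dispatched by name.
example_functions = {"reverse_word": reverse_word}

# JSON schema describing the tool to the LLM: one required string parameter.
example_tool = {
    "type": "function",
    "function": {
        "name": "reverse_word",
        "description": "Reverse the characters of a word",
        "parameters": {
            "type": "object",
            "properties": {
                "word": {"type": "string", "description": "The word to reverse"}
            },
            "required": ["word"],
        },
    },
}

# Simulated tool call (assumed shape): the arguments arrive as a JSON string.
simulated_tool_call = {
    "function": {"name": "reverse_word", "arguments": json.dumps({"word": "hello"})}
}

# Dispatch: look up the function by name and unpack the decoded arguments.
name = simulated_tool_call["function"]["name"]
args = json.loads(simulated_tool_call["function"]["arguments"])
result = example_functions[name](**args)
print(result)
```

The parameter names in the schema must match the function's keyword arguments, because the decoded arguments are passed with `**args`.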

In [None]:
###
### EXERCISE 1: WRITE YOUR CODE IN THIS CELL
###
import json

# --- Step 1: Define the Python function and the available functions mapping ---

# Base function from the previous notebook
def get_weather(location: str) -> str:
    """Get current weather for a location"""
    weather_data = {"New York": "Sunny, 72°F", "London": "Cloudy, 15°C", "Tokyo": "Rainy, 20°C"}
    return weather_data.get(location, "Weather data not available")

# ---TODO Block start---- #
# Define a new function `count_letter` that takes a `word` and a `letter`
# and returns the number of times the letter appears in the word.
def count_letter():  # TODO: Add your function header here
    # TODO: Add your function body here
    pass
# ---TODO Block end---- #

available_functions = {
    "get_weather": get_weather,
    # TODO: Add your new function to this dictionary
}


# --- Step 2: Define the function schemas for the LLM ---

# Base tool schema from the previous notebook
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name"
                    }
                },
                "required": ["location"]
            }
        }
    },
    # TODO: Add the JSON schema for your `count_letter` function here.
    # It should have two parameters: "word" and "letter", both are required strings.
]


# --- Step 3: Build your input to the LLM ---

# Initialize the messages list
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant. You have access to a couple of tools, use them when needed."
    },
    {
        "role": "user",
        "content": ""  # TODO: Add your user prompt here
    }
]

# Create payload
payload = {
    "messages": messages,
    "tools": tools,
    "model": "accounts/fireworks/models/llama4-maverick-instruct-basic"
}

# Get response from LLM
response = make_api_call(payload=payload)

# Check if the model wants to call a tool/function
# (.get avoids a KeyError when the reply has no "tool_calls" key)
message = response["choices"][0]["message"]
if message.get("tool_calls"):
    tool_call = message["tool_calls"][0]
    function_name = tool_call["function"]["name"]
    function_args = json.loads(tool_call["function"]["arguments"])

    print(f"LLM wants to call: {function_name}")
    print(f"With arguments: {function_args}")

    # Execute the function
    function_response = available_functions[function_name](**function_args)
    print(f"Function result: {function_response}")

    # Add the assistant's tool call to the conversation
    messages.append({
        "role": "assistant",
        "content": "",
        "tool_calls": message["tool_calls"]
    })

    # Add the function result to the conversation, linked to the call it answers
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(function_response) if isinstance(function_response, dict) else str(function_response)
    })

    # Create the final payload
    final_payload = {
        "messages": messages,
        "tools": tools,
        "model": "accounts/fireworks/models/llama4-maverick-instruct-basic"
    }

    # Get final response from LLM (send the final payload, not the original one)
    final_response = make_api_call(payload=final_payload)

    print(f'Final response: {final_response["choices"][0]["message"]["content"]}')
else:
    # No tool call: the model answered directly
    print(f'Response: {message.get("content")}')

## Exercise 2: Modifying Structured Outputs (JSON Mode)

Structured output is critical for building reliable applications. Here, you'll modify an existing schema to extract more information from an image.

**Goal**: Update the `IncidentAnalysis` Pydantic model to also extract the `make` and `model` of the vehicle in the image.

**Your Steps:**
1. Add the `make` and `model` fields to the `IncidentAnalysis` Pydantic class.
2. Run the VLM call using [JSON mode](https://docs.fireworks.ai/structured-responses/structured-response-formatting) to see the new structured output.
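
Adding a field to a Pydantic model changes the JSON schema sent to the model. The sketch below shows this on a hypothetical `PhotoSummary` model (not the exercise's `IncidentAnalysis`), assuming Pydantic v2. The `sample_reply` string is hand-written sample data, not real model output.

```python
from typing import Literal, Optional
from pydantic import BaseModel, Field

# Hypothetical schema used only for illustration.
class PhotoSummary(BaseModel):
    caption: str = Field(description="A one-sentence caption for the photo.")
    lighting: Literal["day", "night"]
    # A newly added field: optional, so it may be null when it cannot be determined.
    dominant_color: Optional[str] = Field(
        default=None, description="The dominant color in the photo, if clear."
    )

# This dictionary is what would be passed as response_format["schema"].
schema = PhotoSummary.model_json_schema()
print(sorted(schema["properties"]))

# Validate a JSON string shaped like a model reply (hand-written sample data).
sample_reply = '{"caption": "A car parked on a street.", "lighting": "day", "dominant_color": "red"}'
parsed = PhotoSummary.model_validate_json(sample_reply)
print(parsed.dominant_color)
```

Field descriptions end up in the generated schema, so a clear description is one way to tell the model what each new field should contain.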

In [None]:
###
### EXERCISE 2: WRITE YOUR CODE IN THIS CELL
###
import requests
import io
from PIL import Image
from pydantic import BaseModel, Field
from typing import Literal

# --- Step 1: Download a sample image ---
url = "https://raw.githubusercontent.com/RobertoBarrosoLuque/scout-claims/main/images/back_rhs_damage.png"
response = requests.get(url)
response.raise_for_status()  # Fail early if the download did not succeed
image = Image.open(io.BytesIO(response.content))
print("Image downloaded.")


# --- Step 2: Define the output schema ---
# ---TODO Block start---- #
# Add two new string fields to this Pydantic model:
# - `make`: To store the make of the car (e.g., "Ford")
# - `model`: To store the model of the car (e.g., "Mustang")
class IncidentAnalysis(BaseModel):
    description: str = Field(description="A description of the damage to the vehicle.")
    location: Literal["front-left", "front-right", "back-left", "back-right", "front", "side"]
    severity: Literal["minor", "moderate", "major"]
    license_plate: str | None = Field(description="The license plate of the vehicle, if visible.")
# ---TODO Block end---- #

# --- Step 3: Call the VLM with the new schema ---
# The 'pil_to_base64_dict' function was defined in the setup cell
image_for_llm = pil_to_base64_dict(image)

# Create payload
prompt = "Describe the car damage in this image and extract all useful information."  # TODO: modify the prompt to include the new fields
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_for_llm['base64']}"}},
            {"type": "text", "text": prompt},
        ],
    }
]
response_format = {
    "type": "json_object",
    "schema": IncidentAnalysis.model_json_schema(),
}

payload = {
    "messages": messages,
    "response_format": response_format,
    "model": "accounts/fireworks/models/llama4-maverick-instruct-basic"
}

# Get response from LLM
response = make_api_call(payload=payload)


result = json.loads(response["choices"][0]["message"]["content"])
print(json.dumps(result, indent=2))

## Bonus Exercise: Constrained Output with Grammar Mode

Sometimes you need the model to respond in a very specific, non-JSON format. This is where [Grammar Mode](https://docs.fireworks.ai/structured-responses/structured-output-grammar-based) excels. It forces the model's output to conform to a strict pattern you define, which can use fewer output tokens than JSON mode and gives you finer-grained control over the output.

**Goal**: Use grammar mode to force the model to output *only* the make and model of the car as a single lowercase string (e.g., "ford mustang").

**Your Steps:**
1. Define a GBNF grammar string.
2. Call the model using `response_format={"type": "grammar", "grammar": ...}`.
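
To show the shape of a GBNF grammar without giving away the exercise, here is a minimal sketch for a different format: a severity word, a dash, and a single digit (for example `major-7`). The grammar, the variable names, and the mirroring regular expression are all hypothetical. The regular expression is only a local stand-in for checking strings and is not part of grammar mode itself.

```python
import re

# Hypothetical grammar for illustration: <severity>-<digit>, e.g. "major-7".
# `root` is the start rule; quoted strings are literals; `|` separates alternatives.
example_grammar = r'''
root ::= severity "-" score
severity ::= "minor" | "moderate" | "major"
score ::= [0-9]
'''

# Shape of the response_format named in the steps above.
example_response_format = {"type": "grammar", "grammar": example_grammar}

# A regular expression mirroring the grammar, for checking strings locally.
example_pattern = re.compile(r"(minor|moderate|major)-[0-9]")

def matches_example_grammar(text: str) -> bool:
    """Return True if the whole string fits the example format."""
    return example_pattern.fullmatch(text) is not None

print(matches_example_grammar("major-7"))
print(matches_example_grammar("Major 7"))
```

For the exercise itself, character classes with a repetition operator (such as `[a-z]+`) are the building block you are most likely to need.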

In [None]:
###
### BONUS EXERCISE: WRITE YOUR CODE IN THIS CELL
###

# The 'image' variable and 'pil_to_base64_dict' helper function from previous
# cells are used here. Make sure those cells have been run.
# This assumes the image from Exercise 2 is still loaded.
image_for_llm = pil_to_base64_dict(image)


# --- Step 1: Define the GBNF grammar ---
# Define a grammar that forces the output to be:
# 1. A 'make' (one or more lowercase letters).
# 2. Followed by a single space.
# 3. Followed by a 'model' (one or more lowercase letters).
car_grammar = r'''
# TODO: define a grammar that forces the output to satisfy the format specified above (example output: "ford mustang")
'''

# --- Step 2: Define the prompt ---
# Update the prompt to ask the model to identify the make and model and to respond only in the format specified above
prompt = "" # TODO: write your prompt here


# --- Step 3: Call the VLM with grammar mode ---
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_for_llm['base64']}"}},
            {"type": "text", "text": prompt},
        ],
    }
]
response_format = {
    # TODO: define the response format to use the grammar defined above
}

# Define payload
payload = {
    "messages": messages,
    "response_format": response_format,
    "model": "accounts/fireworks/models/llama4-maverick-instruct-basic"
}

# Get response from LLM
response = make_api_call(payload=payload)

print(f'Constrained output from model: {response["choices"][0]["message"]["content"]}')