Spaces: marimo-team/marimo-learn

Azmi-84 committed on May 3

Commit

e5fc993

1 Parent(s): dd84d5b

Enhance and redesign DuckDB introductory notebook

This commit addresses and resolves the suggestions provided in the review, including:

- Ensuring the notebook follows the best practices outlined in the contribution guidelines.
- Removing irrelevant markdown blocks and using marimo features.

Additionally, the notebook has been completely redesigned with:
- Improved structure and flow for better readability and learning experience.
- Enhanced examples and interactive content for database connections, table creation, and data manipulation.
- Better integration of visuals using Plotly and Marimo for basic interactive analysis.
- Updated dependency management using an inline `# /// script` metadata block with pinned versions for reproducibility.

The notebook now provides a polished and user-friendly guide to DuckDB, ensuring a high-quality learning experience for users.

Files changed (1)

duckdb/01_getting_started.py +1531 -121

duckdb/01_getting_started.py CHANGED

@@ -1,232 +1,1638 @@
 import marimo
-__generated_with = "0.11.30"
-app = marimo.App(width="medium")
 @app.cell(hide_code=True)
-def _introduction(mo):
     mo.md(
-        """
-        # DuckDB: An Embeddable Analytical Database System
-        ### What is DuckDB?
-        [DuckDB](https://duckdb.org/) is a high-performance, in-process analytical database management system (DBMS) designed for speed and simplicity. It's particularly well-suited for analytical query workloads, offering a robust SQL interface and efficient data processing capabilities. This document highlights key features and aspects of DuckDB relevant for a course on database systems or data analysis.
-        ### [Key Features](https://duckdb.org/why_duckdb):
-        - In-Process: Easy integration, zero external dependencies.
-        - Portable: Works on various OS and architectures.
-        - Columnar Storage: Efficient for analytical queries.
-        - Vectorized Execution: Speeds up data processing.
-        - ACID Transactions: Ensures data integrity.
-        - Multi-Language APIs: Python, R, Java, etc.
-        ### [Use Cases](https://github.com/davidgasquez/awesome-duckdb?tab=readme-ov-file):
-        - Data analysis and exploration
-        - Embedded analytics in applications
-        - ETL (Extract, Transform, Load) processes
-        - Data science and machine learning workflows
-        - Rapid prototyping of data analysis pipelines.
-        ### [Installation](https://duckdb.org/docs/installation/?version=stable&environment=python):
-        - The DuckDB Python API can be installed using pip:
-        ```
-        pip install duckdb
-        ```
-        - It is also possible to install DuckDB using conda:
-        ```
-        conda install python-duckdb -c conda-forge.
-        ```
-        **Python version:** DuckDB requires Python 3.7 or newer.
-        """
     )
     return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
         r"""
-        # [1. DuckDB Basic Connection](https://duckdb.org/docs/stable/connect/overview.html)
-        DuckDB can run entirely in your computer's RAM, known as in-memory mode, which you can enable by using `:memory:` as the database name or by not providing a database file. It's crucial to understand that this means all data is temporary and will be completely erased when the program closes, as it isn't saved to disk.
-        """
     )
     return
 @app.cell
-def _database_connection(duckdb):
-    # Create a connection to an in-memory database
-    database_connection = duckdb.connect(database=":memory:")
-    print(
-        f"DuckDB version: {database_connection.execute('SELECT version()').fetchone()[0]}"
-    )
-    return (database_connection,)
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""# [2. Creating Tables](https://duckdb.org/docs/stable/sql/statements/create_table.html)"""
-    )
     return
 @app.cell
-def _create_users_table(database_connection):
-    database_connection.execute(
-        """
-    CREATE TABLE users (
         id INTEGER,
-        name VARCHAR,
-        age INTEGER,
-        registration_date DATE
     )
     """
     )
     return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [3. Instering data into table](https://duckdb.org/docs/stable/sql/statements/insert)"""
     )
     return
 @app.cell
-def _insert_user_data(database_connection):
-    database_connection.execute(
-        """
-    INSERT INTO users VALUES
-    (1, 'Alice', 25, '2021-01-01'),
-    (2, 'Bob', 30, '2021-02-01'),
-    (3, 'Charlie', 35, '2021-03-01')
-    """
     )
     return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [4. Basic Queries](https://duckdb.org/docs/stable/sql/query_syntax/select)"""
     )
     return
 @app.cell
-def _basic_queries(database_connection):
-    # Select all data
-    user_results = database_connection.execute("SELECT * FROM users").fetchall()
-    for user_row in user_results:
-        print(user_row)
-    return user_results, user_row
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [5. Working with Polars](https://duckdb.org/docs/stable/guides/python/polars.html)"""
     )
     return
 @app.cell
-def _polars_dataframe(database_connection, pl):
-    # Create a Polars DataFrame
-    polars_dataframe = pl.DataFrame(
-        {
-            "id": [1, 2, 3],
-            "name": ["Alice", "Bob", "Charlie"],
-            "age": [25, 30, 35],
-            "registration_date": ["2021-01-01", "2021-02-01", "2021-03-01"],
-        }
     )
-    # Register the Polars DataFrame as a DuckDB table
-    database_connection.register("users_polars", polars_dataframe)
-    # Query the Polars DataFrame using DuckDB
-    polars_results = database_connection.execute(
-        "SELECT * FROM users_polars"
-    ).fetchall()
-    print("New Table:")
-    for polars_row in polars_results:
-        print(polars_row)
-    return polars_dataframe, polars_results, polars_row
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [6. Join Operations](https://duckdb.org/docs/stable/guides/performance/join_operations.html)"""
     )
     return
 @app.cell
-def _join_operations(database_connection):
-    join_results = database_connection.execute(
-        """
-    SELECT u.id, u.name, u.age, nu.registration_date
-    FROM users u
-    JOIN users_polars nu ON u.age < nu.age
     """
     )
-    print("Join Result:")
-    for join_row in join_results.fetchall():
-        print(join_row)
-    return join_results, join_row
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [7. Aggregate Functions](https://duckdb.org/docs/stable/sql/functions/aggregates.html)"""
     )
     return
 @app.cell
-def _aggregate_functions(database_connection):
-    aggregate_results = database_connection.execute(
-        """
-    SELECT AVG(age) as avg_age, MAX(age) as max_age, MIN(age) as min_age
-    FROM (SELECT * FROM users UNION ALL SELECT * FROM users_polars) AS all_users
     """
-    ).fetchall()
-    print(
-        f"Average Age: {aggregate_results[0][0]:.1f}, "
-        f"Max Age: {aggregate_results[0][1]}, "
-        f"Min Age: {aggregate_results[0][2]}"
     )
-    return (aggregate_results,)
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
-        r"""# [8. Converting Results to Polars DataFrames](https://duckdb.org/docs/stable/guides/python/polars.html)"""
     )
     return
 @app.cell
-def _convert_to_polars(database_connection):
-    # -- 8. Converting Results to Polars DataFrames --
-    # Convert the result to a Polars DataFrame
-    polars_result_df = database_connection.execute("SELECT * FROM users").df()
-    print("Result as Polars DataFrame:")
-    print(polars_result_df)
-    return (polars_result_df,)
 @app.cell(hide_code=True)
@@ -234,8 +1640,12 @@ def _():
     import marimo as mo
     import duckdb
     import polars as pl
-    return duckdb, mo, pl
 if __name__ == "__main__":

+# /// script
+# requires-python = ">=3.11"
+# dependencies = [
+#     "marimo",
+#     "duckdb==1.2.2",
+#     "polars==1.27.0",
+#     "numpy==2.2.4",
+#     "pyarrow==19.0.1",
+#     "pandas==2.2.3",
+#     "sqlglot==26.12.1",
+#     "plotly==5.23.1",
+# ]
+# ///
 import marimo
+__generated_with = "0.13.4"
+app = marimo.App(width="medium")
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        rf"""
+    <p align="center">
+      <img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSxHAqB0W_61zuIGVMiU6sEeQyTaw-9xwiprw&s" alt="DuckDB Image"/>
+    </p>
+    """
+    )
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        rf"""
+    # 🦆 **DuckDB**: An Embeddable Analytical Database System
+    ## What is DuckDB?
+    [DuckDB](https://duckdb.org/) is a _high-performance_, in-process, embeddable SQL OLAP (Online Analytical Processing) Database Management System (DBMS) designed for simplicity and speed. It's essentially a full-featured database that runs directly within your application's process, without needing a separate server. This makes it well suited to complex analytical workloads, with a robust SQL interface and efficient processing, and a good fit for learning database and data analysis concepts. It can be a lighter alternative to server-based systems like PostgreSQL or MySQL when you don't need a full-blown server.
+    ---
+    ## ⚡ Key Features
+    | Feature | Description |
+    |:---------|:-------------|
+    | **In-Process Architecture** | Runs directly within your application's memory space - no separate server needed, simplifying deployment |
+    | **Columnar Storage** | Data stored in columns instead of rows, dramatically improving performance for analytical queries |
+    | **Vectorized Execution** | Processes data in batches (vectors) of values rather than one row at a time, significantly speeding up data processing |
+    | **ACID Transactions** | Ensures data integrity and reliability across operations |
+    | **Multi-Language Support** | Provides APIs for `Python`, `R`, `Java`, `C++`, and more |
+    | **Zero External Dependencies** | No external dependencies to install, making setup and deployment straightforward |
+    | **High Portability** | Works across various operating systems (Windows, macOS, Linux) and hardware architectures |
+    ---
+    ## [Use Cases](https://github.com/davidgasquez/awesome-duckdb?tab=readme-ov-file):
+    - **Data Analysis and Exploration:**  DuckDB is ideal for quickly querying and analyzing datasets, especially for initial exploratory analysis.
+    - **Embedded Analytics in Applications:**  You can integrate DuckDB directly into your applications to provide analytical capabilities without the need for a separate database server.
+    - **ETL (Extract, Transform, Load) Processes:** DuckDB can be used to perform initial data transformation and cleaning steps as part of an ETL pipeline.
+    - **Data Science and Machine Learning Workflows:**  It's a lightweight alternative to larger databases for prototyping data analysis and machine learning models.
+    - **Rapid Prototyping of Data Analysis Pipelines:** Quickly test and iterate on data analysis ideas without the complexity of setting up a full-blown database environment.
+    - **Small to Medium Datasets:** DuckDB shines when working with datasets that don't require the massive scalability of a traditional database server.
+    ---
+    ### [Installation](https://duckdb.org/docs/installation/?version=stable&environment=python):
+    - Python installation:
+    ```
+    pip install duckdb
+    ```
+    ```
+    conda install python-duckdb -c conda-forge
+    ```
+    <!-- >**_Note_:** DuckDB requires Python 3.7 or newer. You also need to have Python and `pip` or `conda` installed on your system. -->
+    /// attention | Note
+    DuckDB requires Python 3.7 or newer. You also need to have Python and `pip` or `conda` installed on your system.
+    ///
+    """
+    )
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    # [1. DuckDB Connections: In-Memory vs. File-based](https://duckdb.org/docs/stable/connect/overview.html)
+    DuckDB is a lightweight, _relational database management system (RDBMS)_ designed for analytical workloads. Unlike traditional client-server databases, it operates _in-process_ (embedded within your application) and supports both _in-memory_ (temporary) and _file-based_ (persistent) storage.
+    ---
+    | Feature | In-Memory Connection | File-Based Connection |
+    |:---------|:---------------------|:----------------------|
+    | Persistence | Temporary (lost when session ends) | Stored on disk (persists between sessions) |
+    | Use Cases | Quick analysis, ephemeral data, testing | Long-term storage, data that needs to be accessed later |
+    | Performance | Faster for most operations | Slightly slower but provides persistence |
+    | Creation | duckdb.connect(':memory:') | duckdb.connect('filename.db') |
+    | Multiple Connection Access | Each `:memory:` connection gets its own private database | Several connections in one process can share the file; multiple processes can open it only in read-only mode |
+    """
+    )
+    return
+@app.cell
+def _(os):
+    # Remove previous database if it exists
+    if os.path.exists("example.db"):
+        os.remove("example.db")
+    if not os.path.exists("data"):
+        os.makedirs("data")
+    return
+@app.cell
+def _(mo):
+    _df = mo.sql(
+        f"""
+        -- Print the DuckDB version
+        SELECT version() AS version_info
+        """
+    )
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        """
+    ## Creating DuckDB Connections
+    Let's create both types of DuckDB connections and explore their characteristics.
+    1. **In-memory connection**: Data exists only during the current session
+    2. **File-based connection**: Data persists between sessions
+    We'll then demonstrate the key differences between these connection types.
+    """
+    )
+    return
+@app.cell
+def _(duckdb):
+    # Create an in-memory DuckDB connection
+    memory_db = duckdb.connect(":memory:")
+    # Create a file-based DuckDB connection
+    file_db = duckdb.connect("example.db")
+    return file_db, memory_db
+@app.cell
+def _(file_db, memory_db):
+    # Test both connections
+    memory_db.execute(
+        "CREATE TABLE IF NOT EXISTS mem_test (id INTEGER, name VARCHAR)"
+    )
+    memory_db.execute("INSERT INTO mem_test VALUES (1, 'Memory Test')")
+    file_db.execute(
+        "CREATE TABLE IF NOT EXISTS file_test (id INTEGER, name VARCHAR)"
+    )
+    file_db.execute("INSERT INTO file_test VALUES (1, 'File Test')")
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    ## Testing Connection Persistence
+    Let's demonstrate how in-memory databases are ephemeral, while file-based databases persist.
+    1. First, we'll query our tables to confirm the data was properly inserted
+    2. Then, we'll simulate an application restart by creating new connections
+    3. Finally, we'll check which data persists after the "restart"
+    """
+    )
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(r"""## Current Database Contents""")
+    return
+@app.cell
+def _(mem_test, memory_db, mo):
+    _df = mo.sql(
+        f"""
+        SELECT * FROM mem_test
+        """,
+        engine=memory_db
+    )
+    return
+@app.cell
+def _(file_db, file_test, mo):
+    _df = mo.sql(
+        f"""
+        SELECT * FROM file_test
+        """,
+        engine=file_db
+    )
+    return
+@app.cell
+def _():
+    # We don't actually close the connections here since we need them for later cells
+    # Just a placeholder for the concept
+    return
+@app.cell(hide_code=True)
+def _file_query(mo):
+    mo.md(rf"""## 🔄 Simulating Application Restart...""")
+    return
+@app.cell
+def _(duckdb):
+    # Create new connections (simulating restart)
+    new_memory_db = duckdb.connect(":memory:")
+    new_file_db = duckdb.connect("example.db")
+    return new_file_db, new_memory_db
+@app.cell
+def _(new_memory_db):
+    # Try to query tables in the new memory connection
+    try:
+        new_memory_db.execute("SELECT * FROM mem_test").df()
+        memory_persistence = "✅ Data persisted in memory (unexpected)"
+        memory_data_available = True
+    except Exception as e:
+        memory_persistence = "❌ Data lost from memory (expected behavior)"
+        memory_data_available = False
+    return memory_data_available, memory_persistence
+@app.cell
+def _(new_file_db):
+    # Try to query tables in the new file connection
+    try:
+        file_data = new_file_db.execute("SELECT * FROM file_test").df()
+        file_persistence = "✅ Data persisted in file (expected behavior)"
+        file_data_available = True
+    except Exception as e:
+        file_persistence = "❌ Data lost from file (unexpected)"
+        file_data_available = False
+        file_data = None
+    return file_data, file_data_available, file_persistence
+@app.cell
+def _(
+    file_data_available,
+    file_persistence,
+    memory_data_available,
+    memory_persistence,
+    mo,
+):
+    # Create an interactive display to show persistence results
+    persistence_results = mo.ui.table(
+        {
+            "Connection Type": ["In-Memory Database", "File-Based Database"],
+            "Persistence Status": [memory_persistence, file_persistence],
+            "Data Available After Restart": [
+                memory_data_available,
+                file_data_available,
+            ],
+        }
+    )
+    mo.md("### Persistence Test Results")
+    return (persistence_results,)
+@app.cell
+def _(persistence_results):
+    persistence_results
+    return
+@app.cell
+def _(file_data, file_data_available, mo):
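+    # marimo renders only a cell's last expression, so combine both outputs into one value
+    mo.vstack(
+        [mo.md("### Persisted File-Based Data:"), mo.ui.table(file_data)]
+    ) if file_data_available else None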
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    # [2. Creating Tables in DuckDB](https://duckdb.org/docs/stable/sql/statements/create_table.html)
+    DuckDB supports standard SQL syntax for creating tables. Let's create more complex tables to demonstrate different data types and constraints.
+    ## Table Creation Options
+    DuckDB supports various table creation options, including:
+    - **Basic tables** with column definitions
+    - **Temporary tables** that exist only during the session
+    - **CREATE OR REPLACE** to recreate tables
+    - **Primary keys** and other constraints
+    - **Various data types** including INTEGER, VARCHAR, TIMESTAMP, DECIMAL, etc.
+    """
+    )
+    return
+@app.cell
+def _create_users_tables(file_db, new_memory_db):
+    # For the memory database
+    try:
+        new_memory_db.execute("DROP TABLE IF EXISTS users_memory")
+    except Exception:
+        pass
+    # For the file database
+    try:
+        file_db.execute("DROP TABLE IF EXISTS users_file")
+    except Exception:
+        pass
+    return
+@app.cell
+def _(file_db, new_memory_db):
+    # Create advanced users table in memory database with primary key
+    new_memory_db.execute("""
+    CREATE TABLE users_memory (
+        id INTEGER PRIMARY KEY,
+        name VARCHAR NOT NULL,
+        age INTEGER CHECK (age > 0),
+        email VARCHAR UNIQUE,
+        registration_date DATE DEFAULT CURRENT_DATE,
+        last_login TIMESTAMP,
+        account_balance DECIMAL(10,2) DEFAULT 0.00
+    )
+    """)
+    # Create users table in file database
+    file_db.execute("""
+    CREATE TABLE users_file (
+        id INTEGER PRIMARY KEY,
+        name VARCHAR NOT NULL,
+        age INTEGER CHECK (age > 0),
+        email VARCHAR UNIQUE,
+        registration_date DATE DEFAULT CURRENT_DATE,
+        last_login TIMESTAMP,
+        account_balance DECIMAL(10,2) DEFAULT 0.00
+    )
+    """)
+    return
+@app.cell
+def _(mo, new_memory_db):
+    # Get table schema information from the standard information_schema views
+    memory_schema = new_memory_db.execute("""
+        SELECT column_name, data_type, is_nullable
+        FROM information_schema.columns
+        WHERE table_name = 'users_memory'
+        ORDER BY ordinal_position
+    """).df()
+    # Display the schema using marimo's UI components
+    mo.md("### 🔍 Table Schema Information")
+    return (memory_schema,)
+@app.cell(hide_code=True)
+def _(memory_schema, mo):
+    mo.ui.table(memory_schema)
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    # [3. Inserting Data Into Tables](https://duckdb.org/docs/stable/sql/statements/insert)
+    DuckDB supports multiple ways to insert data:
+    1. **INSERT INTO VALUES**: Insert specific values
+    2. **INSERT INTO SELECT**: Insert data from query results
+    3. **Parameterized inserts**: Using prepared statements
+    4. **Bulk inserts**: For efficient loading of multiple rows
+    Let's demonstrate these different insertion methods:
+    """
+    )
+    return
+@app.cell
+def _insert_user_data(date):
+    today = date.today()
+    # First check if records already exist to avoid duplicate key errors
+    def safe_insert(connection, table_name, data):
+        """
+        Safely insert data into a table by checking for existing IDs first
+        """
+        # Check which IDs already exist in the table
+        existing_ids = (
+            connection.execute(f"SELECT id FROM {table_name}")
+            .fetchdf()["id"]
+            .tolist()
+        )
+        # Filter out data with IDs that already exist
+        new_data = [record for record in data if record[0] not in existing_ids]
+        if not new_data:
+            print(
+                f"No new records to insert into {table_name}. All IDs already exist."
+            )
+            return 0
+        # Prepare the placeholders for the SQL statement
+        placeholders = ", ".join(
+            ["(" + ", ".join(["?"] * len(new_data[0])) + ")"] * len(new_data)
+        )
+        # Flatten the list of tuples for parameter binding
+        flat_data = [item for sublist in new_data for item in sublist]
+        # Perform the insertion
+        if flat_data:
+            columns = "(id, name, age, email, registration_date, last_login, account_balance)"
+            connection.execute(
+                f"INSERT INTO {table_name} {columns} VALUES {placeholders}",
+                flat_data,
+            )
+            return len(new_data)
+        return 0
+    return (safe_insert,)
+@app.cell
+def _():
+    # Prepare the data
+    user_data = [
+        (
+            1,
+            "Alice",
+            25,
+            "alice@example.com",
+            "2021-01-01",
+            "2023-01-15 14:30:00",
+            1250.75,
+        ),
+        (
+            2,
+            "Bob",
+            30,
+            "bob@example.com",
+            "2021-02-01",
+            "2023-02-10 09:15:22",
+            750.50,
+        ),
+        (
+            3,
+            "Charlie",
+            35,
+            "charlie@example.com",
+            "2021-03-01",
+            "2023-03-05 17:45:10",
+            3200.25,
+        ),
+        (
+            4,
+            "David",
+            40,
+            "david@example.com",
+            "2021-04-01",
+            "2023-04-20 10:30:45",
+            1800.00,
+        ),
+        (
+            5,
+            "Emma",
+            45,
+            "emma@example.com",
+            "2021-05-01",
+            "2023-05-12 11:20:30",
+            2500.00,
+        ),
+        (
+            6,
+            "Frank",
+            50,
+            "frank@example.com",
+            "2021-06-01",
+            "2023-06-18 16:10:15",
+            900.25,
+        ),
+    ]
+    return (user_data,)
+@app.cell
+def _(mo, new_memory_db, safe_insert, user_data):
+    # Safely insert data into memory database
+    records_inserted = safe_insert(new_memory_db, "users_memory", user_data)
+    mo.md(
+        f"""
+        Inserted {records_inserted} new records into users_memory.
+        """
+    )
+    return
+@app.cell
+def _(file_db, safe_insert, user_data):
+    def _():
+        # Safely insert data into file database
+        records_inserted = safe_insert(file_db, "users_file", user_data)
+        return print(f"Inserted {records_inserted} new records into users_file")
+    _()
+    return
+@app.cell
+def _():
+    # If you need to add just one record, you can use a similar approach:
+    new_user = (
+        7,
+        "Grace",
+        28,
+        "grace@example.com",
+        "2021-07-01",
+        "2023-07-22 13:45:10",
+        1675.50,
+    )
+    return (new_user,)
+@app.cell
+def _(new_memory_db, new_user):
+    # Check if the ID exists before inserting
+    if not new_memory_db.execute(
+        "SELECT id FROM users_memory WHERE id = ?", [new_user[0]]
+    ).fetchone():
+        new_memory_db.execute(
+            """
+            INSERT INTO users_memory (id, name, age, email, registration_date, last_login, account_balance)
+            VALUES (?, ?, ?, ?, ?, ?, ?)
+            """,
+            new_user,
+        )
+        print(f"Added user {new_user[1]} to users_memory")
+    else:
+        print(f"User with ID {new_user[0]} already exists in users_memory")
+    return
+@app.cell
+def _(file_db, new_user):
+    # Do the same for the file database
+    if not file_db.execute(
+        "SELECT id FROM users_file WHERE id = ?", [new_user[0]]
+    ).fetchone():
+        file_db.execute(
+            """
+            INSERT INTO users_file (id, name, age, email, registration_date, last_login, account_balance)
+            VALUES (?, ?, ?, ?, ?, ?, ?)
+            """,
+            new_user,
+        )
+        print(f"Added user {new_user[1]} to users_file")
+    else:
+        print(f"User with ID {new_user[0]} already exists in users_file")
+    return
+@app.cell
+def _(new_memory_db):
+    # First try to update
+    cursor = new_memory_db.execute(
+        """
+        UPDATE users_memory
+        SET name = ?, age = ?, email = ?,
+            registration_date = ?, last_login = ?, account_balance = ?
+        WHERE id = ?
+        """,
+        (
+            "Henry",
+            33,
+            "henry@example.com",
+            "2021-08-01",
+            "2023-08-05 09:10:15",
+            3100.75,
+            8,  # ID should be the last parameter
+        ),
+    )
+    return (cursor,)
+@app.cell
+def _(cursor, mo, new_memory_db):
+    # If no rows were updated, perform an insert
+    if cursor.rowcount == 0:
+        new_memory_db.execute(
+            """
+            INSERT INTO users_memory
+            (id, name, age, email, registration_date, last_login, account_balance)
+            VALUES (?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                8,
+                "Henry",
+                33,
+                "henry@example.com",
+                "2021-08-01",
+                "2023-08-05 09:10:15",
+                3100.75,
+            ),
+        )
+    mo.md(
+        f"""
+        Upserted Henry into users_memory.
+        """
+    )
+    return
+@app.cell
+def _(file_db, mo):
+    # For DuckDB using ON CONFLICT, we need to specify the conflict target column
+    file_db.execute(
+        """
+        INSERT INTO users_file (id, name, age, email, registration_date, last_login, account_balance)
+        VALUES (?, ?, ?, ?, ?, ?, ?)
+        ON CONFLICT (id) DO UPDATE SET
+            name = EXCLUDED.name,
+            age = EXCLUDED.age,
+            email = EXCLUDED.email,
+            registration_date = EXCLUDED.registration_date,
+            last_login = EXCLUDED.last_login,
+            account_balance = EXCLUDED.account_balance
+        """,
+        (
+            8,
+            "Henry",
+            33,
+            "henry@example.com",
+            "2021-08-01",
+            "2023-08-05 09:10:15",
+            3100.75,
+        ),
+    )
+    mo.md(
+        f"""
+        Upserted Henry into users_file.
+        """
+    )
+    return
+@app.cell
+def _view_tables_after_insert(new_memory_db):
+    # Display memory data using DuckDB's query capabilities
+    memory_results = new_memory_db.execute("""
+        SELECT
+            id,
+            name,
+            age,
+            email,
+            registration_date,
+            last_login,
+            account_balance
+        FROM users_memory
+        ORDER BY id
+    """).df()
+    return (memory_results,)
+@app.cell
+def _(file_db):
+    # Display file data with formatting
+    file_results = file_db.execute("""
+        SELECT
+            id,
+            name,
+            age,
+            email,
+            registration_date,
+            last_login,
+            CAST(account_balance AS DECIMAL(10,2)) AS account_balance
+        FROM users_file
+        ORDER BY id
+    """).df()
+    return (file_results,)
+@app.cell
+def _(mo):
+    mo.md(
+        r"""
+    <!-- Create an interactive display with tabs using marimo components -->
+    ## 📊 Database Contents After Insertion
+    """
+    )
+    return
+@app.cell(hide_code=True)
+def _(file_results, memory_results, mo):
+    tabs = mo.ui.tabs(
+        {
+            "In-Memory Database": mo.ui.table(memory_results),
+            "File-Based Database": mo.ui.table(file_results),
+        }
+    )
+    tabs
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    # [4. Using SQL Directly in Marimo](https://duckdb.org/docs/stable/sql/query_syntax/select)
+    There are multiple ways to leverage DuckDB's SQL capabilities in marimo:
+    1. **Direct execution**: Using DuckDB connections to execute SQL
+    2. **Marimo SQL**: Using `mo.sql`, marimo's built-in SQL support (backed by DuckDB by default)
+    3. **Interactive queries**: Combining UI elements with SQL execution
+    Let's explore these approaches:
+    """
+    )
+    return
+@app.cell(hide_code=True)
+def _sql_with_marimo(mo):
+    mo.md(
+        rf"""
+    <!-- Using Marimo's SQL engine with direct SQL on memory_results DataFrame -->
+    ## 🔍 Query with Marimo SQL
+    """
+    )
+    return
 @app.cell(hide_code=True)
+def _(mo):
     mo.md(
+        rf"""
+    ## Marimo's built-in SQL support (`mo.sql`, backed by DuckDB by default) can work with DataFrames.
+    Let's use it to filter our users:
+    """
+    )
+    return
+@app.cell
+def _(mo):
+    # Create a SQL selector for users with age threshold
+    age_threshold = mo.ui.slider(25, 50, value=30, label="Minimum Age")
+    return (age_threshold,)
+@app.cell
+def _(age_threshold, memory_results, mo):
+    # Create a function to filter users based on the slider value
+    def filtered_users():
+        # Filter the DataFrame in pandas using the current slider value
+        filtered_df = memory_results[memory_results["age"] >= age_threshold.value]
+        filtered_df = filtered_df.sort_values("age")
+        return mo.ui.table(filtered_df)
+    return (filtered_users,)
+@app.cell
+def _(age_threshold, filtered_users, mo):
+    layout = mo.vstack(
+        [
+            mo.md("### Select minimum age:"),
+            age_threshold,
+            mo.md("### Users meeting age criteria:"),
+            filtered_users(),
+        ],
+        gap=1.5,
+    )
+    layout
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(r"""# [5. Working with Polars and DuckDB](https://duckdb.org/docs/stable/guides/python/polars.html)""")
+    return
+@app.cell
+def _polars_integration(pl):
+    # Create a Polars DataFrame
+    polars_df = pl.DataFrame(
+        {
+            "id": [101, 102, 103],
+            "name": ["Product A", "Product B", "Product C"],
+            "price": [29.99, 49.99, 19.99],
+            "category": ["Electronics", "Furniture", "Books"],
+        }
+    )
+    return (polars_df,)
+@app.cell
+def _(mo):
+    mo.md(
+        rf"""
+    <!-- Display the Polars DataFrame -->
+    ## Original Polars DataFrame:
+    """
     )
     return
+@app.cell
+def _(mo, polars_df):
+    mo.ui.table(polars_df)
+    return
+@app.cell
+def _(new_memory_db, polars_df):
+    # Register the Polars DataFrame as a DuckDB table in memory connection
+    new_memory_db.register("products_polars", polars_df)
+    # Query the registered table
+    polars_query_result = new_memory_db.execute(
+        "SELECT * FROM products_polars WHERE price > 25"
+    ).df()
+    return (polars_query_result,)
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
         r"""
+    <!-- Display the query result -->
+    ## DuckDB Query Result (From Polars Data):
+    """
     )
     return
 @app.cell
+def _(mo, polars_query_result):
+    mo.ui.table(polars_query_result)
+    return
+@app.cell
+def _(mo, new_memory_db):
+    # Demonstrate a more complex query
+    complex_query_result = new_memory_db.execute("""
+        SELECT
+            category,
+            COUNT(*) as product_count,
+            AVG(price) as avg_price,
+            MIN(price) as min_price,
+            MAX(price) as max_price
+        FROM products_polars
+        GROUP BY category
+        ORDER BY avg_price DESC
+    """).df()
+    mo.md("## Aggregated Product Data by Category:")
+    return (complex_query_result,)
+@app.cell
+def _(complex_query_result, mo):
+    mo.ui.table(complex_query_result)
+    return
 @app.cell(hide_code=True)
 def _(mo):
+    mo.md(r"""# [6. Advanced Queries: Joins Between Tables](https://duckdb.org/docs/stable/guides/performance/join_operations.html)""")
     return
 @app.cell
+def _join_operations(new_memory_db):
+    # Create another table to join with
+    new_memory_db.execute("""
+    CREATE TABLE IF NOT EXISTS departments (
         id INTEGER,
+        department_name VARCHAR,
+        manager_id INTEGER
     )
+    """)
+    return
+@app.cell
+def _(new_memory_db):
+    new_memory_db.execute("""
+    INSERT INTO departments VALUES
+    (101, 'Engineering', 1),
+    (102, 'Marketing', 2),
+    (103, 'Finance', NULL)
+    """)
+    return
+@app.cell
+def _(new_memory_db):
+    # Execute a join query
+    join_result = new_memory_db.execute("""
+    SELECT
+        u.id,
+        u.name,
+        u.age,
+        d.department_name
+    FROM users_memory u
+    LEFT JOIN departments d ON u.id = d.manager_id
+    ORDER BY u.id
+    """).df()
+    return (join_result,)
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        rf"""
+    <!-- Display the join result -->
+    ## Join Result (Users and Departments):
     """
     )
     return
+@app.cell
+def _(join_result, mo):
+    mo.ui.table(join_result)
+    return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        rf"""
+    <!-- Demonstrate different types of joins -->
+    ## Different Types of Joins
+    """
     )
     return
 @app.cell
+def _(new_memory_db):
+    # Inner join
+    inner_join = new_memory_db.execute("""
+    SELECT u.id, u.name, d.department_name
+    FROM users_memory u
+    INNER JOIN departments d ON u.id = d.manager_id
+    """).df()
+    # Right join
+    right_join = new_memory_db.execute("""
+    SELECT u.id, u.name, d.department_name
+    FROM users_memory u
+    RIGHT JOIN departments d ON u.id = d.manager_id
+    """).df()
+    # Full outer join
+    full_join = new_memory_db.execute("""
+    SELECT u.id, u.name, d.department_name
+    FROM users_memory u
+    FULL OUTER JOIN departments d ON u.id = d.manager_id
+    """).df()
+    return full_join, inner_join, right_join
+@app.cell
+def _(full_join, inner_join, join_result, mo, right_join):
+    join_tabs = mo.ui.tabs(
+        {
+            "Left Join": mo.ui.table(join_result),
+            "Inner Join": mo.ui.table(inner_join),
+            "Right Join": mo.ui.table(right_join),
+            "Full Outer Join": mo.ui.table(full_join),
+        }
     )
+    join_tabs
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(r"""# [7. Aggregate Functions in DuckDB](https://duckdb.org/docs/stable/sql/functions/aggregates.html)""")
     return
+@app.cell
+def _aggregate_operations(new_memory_db):
+    # Execute an aggregate query
+    agg_result = new_memory_db.execute("""
+    SELECT
+        AVG(age) as avg_age,
+        MAX(age) as max_age,
+        MIN(age) as min_age,
+        COUNT(*) as total_users,
+        SUM(account_balance) as total_balance
+    FROM users_memory
+    """).df()
+    return (agg_result,)
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        rf"""
+    <!-- Display the aggregate result -->
+    ## Aggregate Results (All Users):
+    """
     )
     return
 @app.cell
+def _(agg_result, mo):
+    mo.ui.table(agg_result)
+    return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        rf"""
+    <!-- More complex aggregate query with grouping -->
+    ## Aggregate Results (Grouped by Age Range):
+    """
     )
     return
 @app.cell
+def _(new_memory_db):
+    age_groups = new_memory_db.execute("""
+    SELECT
+        CASE
+            WHEN age < 30 THEN 'Under 30'
+            WHEN age BETWEEN 30 AND 40 THEN '30 to 40'
+            ELSE 'Over 40'
+        END as age_group,
+        COUNT(*) as count,
+        AVG(age) as avg_age,
+        AVG(account_balance) as avg_balance
+    FROM users_memory
+    GROUP BY 1
+    ORDER BY 1
+    """).df()
+    return (age_groups,)
+@app.cell
+def _(age_groups, mo):
+    mo.ui.table(age_groups)
+    return
+@app.cell
+def _(mo):
+    mo.md(
+        r"""
+    <!-- Window functions demo -->
+    ### Window Functions Example:
+    """
     )
+    return
+@app.cell
+def _(mo, new_memory_db):
+    window_result = new_memory_db.execute("""
+    SELECT
+        id,
+        name,
+        age,
+        account_balance,
+        RANK() OVER (ORDER BY account_balance DESC) as balance_rank,
+        account_balance - AVG(account_balance) OVER () as diff_from_avg,
+        account_balance / SUM(account_balance) OVER () * 100 as pct_of_total
+    FROM users_memory
+    ORDER BY balance_rank
+    """).df()
+    mo.ui.table(window_result)
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(r"""# [8. Converting DuckDB Results to Polars/Pandas](https://duckdb.org/docs/stable/guides/python/polars.html)""")
+    return
+@app.cell
+def _convert_results(new_memory_db):
+    polars_result = new_memory_db.execute(
+        """SELECT * FROM users_memory WHERE age > 25 ORDER BY age"""
+    ).pl()
+    return (polars_result,)
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        r"""
+    <!-- Display the converted results -->
+    ## Query Result as Polars DataFrame:
+    """
     )
     return
 @app.cell
+def _(mo, polars_result):
+    mo.ui.table(polars_result)
+    return
+@app.cell
+def _(new_memory_db):
+    pandas_result = new_memory_db.execute(
+        """SELECT * FROM users_memory WHERE age > 25 ORDER BY age"""
+    ).fetch_df()
+    return (pandas_result,)
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(r"""## Same Query Result as Pandas DataFrame:""")
+    return
+@app.cell
+def _(mo, pandas_result):
+    mo.ui.table(pandas_result)
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    <!-- Demonstrate the differences in handling -->
+    ## Differences in DataFrame Handling
     """
     )
+    return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        r"""
+    <!-- Polars operation -->
+    ## Polars: Filter users over 35 and calculate average balance
+    """
     )
     return
 @app.cell
+def _(mo, pl, polars_result):
+    def _():
+        polars_filtered = polars_result.filter(pl.col("age") > 35)
+        polars_avg = polars_filtered.select(
+            pl.col("account_balance").mean().alias("avg_balance")
+        )
+        layout = mo.vstack(
+            [
+                mo.md("### Filtered Polars DataFrame (Age > 35):"),
+                mo.ui.table(polars_filtered),
+                mo.md("### Average Account Balance:"),
+                mo.ui.table(polars_avg),
+            ],
+            gap=1.5,
+        )
+        return layout
+    _()
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md(
+        r"""
+    <!-- Pandas equivalent (using pandas style) -->
+    ## Pandas: Same operation in pandas style
     """
     )
+    return
+@app.cell
+def _(mo, pandas_result):
+    pandas_avg = pandas_result[pandas_result["age"] > 35]["account_balance"].mean()
+    mo.md(f"Average balance: {pandas_avg:.2f}")
+    return
+@app.cell(hide_code=True)
+def _(mo):
+    mo.md("""## 9. Data Visualization with DuckDB and Plotly""")
+    return
+@app.cell
+def _(age_groups, mo, new_memory_db, plotly_express):
+    # User distribution by age group
+    fig1 = plotly_express.bar(
+        age_groups,
+        x="age_group",
+        y="count",
+        title="User Distribution by Age Group",
+        labels={"count": "Number of Users", "age_group": "Age Group"},
+        color="age_group",
+        color_discrete_sequence=plotly_express.colors.qualitative.Plotly,
+    )
+    fig1.update_traces(
+        text=age_groups["count"],
+        textposition="outside",
+    )
+    fig1.update_layout(height=450, margin=dict(t=50, b=50))
+    # Average balance by age group
+    fig2 = plotly_express.bar(
+        age_groups,
+        x="age_group",
+        y="avg_balance",
+        title="Average Account Balance by Age Group",
+        labels={"avg_balance": "Average Balance ($)", "age_group": "Age Group"},
+        color="age_group",
+        color_discrete_sequence=plotly_express.colors.qualitative.Plotly,
+    )
+    fig2.update_traces(
+        text=[f"${val:.2f}" for val in age_groups["avg_balance"]],
+        textposition="outside",
+    )
+    fig2.update_layout(height=450, margin=dict(t=50, b=50))
+    # Age vs Account Balance scatter plot
+    scatter_data = new_memory_db.execute(
+        """
+        SELECT
+            name,
+            age,
+            account_balance
+        FROM users_memory
+        ORDER BY age
+        """
+    ).df()
+    fig3 = plotly_express.scatter(
+        scatter_data,
+        x="age",
+        y="account_balance",
+        title="Age vs. Account Balance",
+        labels={"account_balance": "Account Balance ($)", "age": "Age"},
+        color_discrete_sequence=["#FF7F0E"],
+        trendline="ols",
+        hover_data=["age", "account_balance"],
+        size_max=15,
+    )
+    fig3.update_traces(marker=dict(size=12))
+    fig3.update_layout(height=450, margin=dict(t=50, b=50))
+    # Distribution of account balances
+    balance_data = new_memory_db.execute(
+        """
+        SELECT
+            name,
+            account_balance
+        FROM users_memory
+        ORDER BY account_balance DESC
+        """
+    ).df()
+    fig4 = plotly_express.pie(
+        balance_data,
+        names="name",
+        values="account_balance",
+        title="Distribution of Account Balances",
+        labels={"account_balance": "Account Balance ($)", "name": "User"},
+        color_discrete_sequence=plotly_express.colors.qualitative.Pastel,
+    )
+    fig4.update_traces(textinfo="percent+label", textposition="inside")
+    fig4.update_layout(height=450, margin=dict(t=50, b=50))
+    category_tabs = mo.ui.tabs(
+        {
+            "Age Group Analysis": mo.vstack(
+                [
+                    mo.ui.tabs(
+                        {
+                            "User Distribution": mo.ui.plotly(fig1),
+                            "Average Balance": mo.ui.plotly(fig2),
+                        }
+                    )
+                ]
+            ),
+            "Financial Analysis": mo.vstack(
+                [
+                    mo.ui.tabs(
+                        {
+                            "Age vs Balance": mo.ui.plotly(fig3),
+                            "Balance Distribution": mo.ui.plotly(fig4),
+                        }
+                    )
+                ]
+            ),
+        },
+        lazy=True,
+    )
+    mo.vstack(
+        [
+            mo.md("### Select a visualization category:"),
+            category_tabs,
+        ],
+        gap=1.5,
+    )
+    return
 @app.cell(hide_code=True)
 def _(mo):
     mo.md(
+        r"""
+    # 9. Database Management Best Practices
+    ### Closing Connections
+    It's important to close database connections when you're done with them, especially for file-based connections:
+    ```python
+    memory_db.close()
+    file_db.close()
+    ```
+    ### Transaction Management
+    DuckDB supports transactions, which can be useful for more complex operations:
+    ```python
+    conn = duckdb.connect('mydb.db')
+    conn.begin()  # Start transaction
+    try:
+        conn.execute("INSERT INTO users VALUES (1, 'Test User')")
+        conn.execute("UPDATE balances SET amount = amount - 100 WHERE user_id = 1")
+        conn.commit()  # Commit changes
+    except Exception:
+        conn.rollback()  # Undo changes if error
+        raise
+    ```
+    ### Query Performance
+    DuckDB is optimized for analytical queries. For best performance:
+    - Use appropriate data types
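+    - Add indexes only for highly selective lookups; DuckDB already keeps min-max metadata for each column, and analytical scans rarely benefit from extra indexes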
+    - For large datasets, consider partitioning
+    - Use prepared statements for repeated queries
+    """
     )
     return
+@app.cell(hide_code=True)
+def _interactive_dashboard(mo):
+    mo.md(rf"""## 10. Interactive DuckDB Dashboard with Marimo and Plotly""")
+    return
+@app.cell
+def _(mo):
+    # Create an interactive filter for age range
+    min_age = mo.ui.slider(20, 50, value=25, label="Minimum Age")
+    max_age = mo.ui.slider(20, 50, value=50, label="Maximum Age")
+    return max_age, min_age
+@app.cell
+def _(max_age, min_age, new_memory_db):
+    # Query the users whose age falls within the range selected by the sliders
+    def get_filtered_data(min_val=min_age.value, max_val=max_age.value):
+        # Get filtered data based on slider values using parameterized query for safety
+        return new_memory_db.execute(
+            """
+            SELECT
+                id,
+                name,
+                age,
+                email,
+                account_balance,
+                registration_date
+            FROM users_memory
+            WHERE age >= ? AND age <= ?
+            ORDER BY age
+            """,
+            [min_val, max_val],
+        ).df()
+    return (get_filtered_data,)
+@app.cell
+def _(get_filtered_data):
+    def get_metrics(data=get_filtered_data()):
+        return {
+            "user count": len(data),
+            "avg_balance": data["account_balance"].mean() if len(data) > 0 else 0,
+            "total_balance": data["account_balance"].sum() if len(data) > 0 else 0,
+        }
+    return (get_metrics,)
+@app.cell
+def _(get_metrics, mo):
+    def metrics_display(metrics=get_metrics()):
+        return mo.hstack(
+            [
+                mo.vstack(
+                    [
+                        mo.md("### Selected Users"),
+                        mo.md(f"## {metrics['user count']}"),
+                    ],
+                    align="center",
+                ),
+                mo.vstack(
+                    [
+                        mo.md("### Average Balance"),
+                        mo.md(f"## ${metrics['avg_balance']:.2f}"),
+                    ],
+                    align="center",
+                ),
+                mo.vstack(
+                    [
+                        mo.md("### Total Balance"),
+                        mo.md(f"## ${metrics['total_balance']:.2f}"),
+                    ],
+                    align="center",
+                ),
+            ],
+            justify="space-between",
+            gap=1.5,
+        )
+    return (metrics_display,)
+@app.cell
+def _(get_filtered_data, max_age, min_age, mo, plotly_express):
+    def create_visualization(
+        data=get_filtered_data(), min_val=min_age.value, max_val=max_age.value
+    ):
+        if len(data) == 0:
+            return mo.md("No data available for the selected age range.")
+        # Create visualizations for filtered data
+        fig1 = plotly_express.bar(
+            data,
+            x="name",
+            y="account_balance",
+            title=f"Account Balance by User (Age {min_val} - {max_val})",
+            labels={"account_balance": "Account Balance ($)", "name": "User"},
+            color="account_balance",
+            color_continuous_scale=plotly_express.colors.sequential.Plasma,
+            text_auto=".2s",
+        )
+        fig1.update_layout(
+            height=400,
+            xaxis_tickangle=-45,
+            margin=dict(t=50, b=70, l=50, r=30),
+        )
+        fig1.update_traces(
+            textposition="outside",
+        )
+        fig2 = plotly_express.histogram(
+            data,
+            x="age",
+            nbins=min(10, len(set(data["age"]))),
+            title=f"Age Distribution (Age {min_val} - {max_val})",
+            color_discrete_sequence=["#4C78A8"],
+            opacity=0.8,
+            histnorm="probability density",
+        )
+        fig2.update_layout(
+            height=400,
+            margin=dict(t=50, b=70, l=50, r=30),
+            bargap=0.1,
+        )
+        fig3 = plotly_express.scatter(
+            data,
+            x="age",
+            y="account_balance",
+            title=f"Age vs. Account Balance (Age {min_val} - {max_val})",
+            labels={"account_balance": "Account Balance ($)", "age": "Age"},
+            color="age",
+            color_continuous_scale="Viridis",
+            size_max=25,
+            size="account_balance",
+            hover_name="name",
+        )
+        fig3.update_layout(
+            height=400,
+            margin=dict(t=50, b=70, l=50, r=30),
+        )
+        return mo.ui.tabs(
+            {
+                "Account Balance by User": mo.ui.plotly(fig1),
+                "Age Distribution": mo.ui.plotly(fig2),
+                "Age vs. Account Balance": mo.ui.plotly(fig3),
+            }
+        )
+    return (create_visualization,)
 @app.cell
+def _(
+    create_visualization,
+    get_filtered_data,
+    max_age,
+    metrics_display,
+    min_age,
+    mo,
+):
+    def dashboard(
+        min_val=min_age.value,
+        max_val=max_age.value,
+        metrics=metrics_display(),
+        data=get_filtered_data(),
+        visualization=create_visualization()
+    ):
+        return mo.vstack(
+            [
+                mo.md(f"### Interactive Dashboard (Age {min_val} - {max_val})"),
+                metrics,
+                mo.md("### Data Table"),
+                mo.ui.table(data, page_size=5),
+                mo.md("### Visualizations"),
+                visualization,
+            ],
+            gap=2
+        )
+    dashboard()
+    return
+@app.cell(hide_code=True)
+def _conclusion(mo):
+    mo.md(
+        rf"""
+    # Summary and Key Takeaways
+    In this notebook, we've explored DuckDB, a powerful embedded analytical database system. Here's what we covered:
+    1. **Connection types**: We learned the difference between in-memory databases (temporary) and file-based databases (persistent).
+    2. **Table creation**: We created tables with various data types, constraints, and primary keys.
+    3. **Data insertion**: We demonstrated different ways to insert data, including single inserts and bulk loading.
+    4. **SQL queries**: We executed various SQL queries directly and through Marimo's UI components.
+    5. **Integration with Polars**: We showed how DuckDB can work seamlessly with Polars DataFrames.
+    6. **Joins and relationships**: We performed JOIN operations between tables to combine related data.
+    7. **Aggregation**: We used aggregate functions to summarize and analyze data.
+    8. **Data conversion**: We converted DuckDB results to both Polars and Pandas DataFrames.
+    9. **Best practices**: We reviewed best practices for managing DuckDB connections and transactions.
+    10. **Visualization**: We created interactive visualizations and dashboards with Plotly and Marimo.
+    DuckDB is an excellent tool for data analysis, especially for analytical workloads. Its in-process nature makes it fast and easy to use, while its SQL compatibility makes it accessible for anyone familiar with SQL databases.
+    ### Next Steps
+    - Try loading larger datasets into DuckDB
+    - Experiment with more complex queries and window functions
+    - Use DuckDB's COPY functionality to import/export data from/to files
+    - Create more advanced interactive dashboards with Marimo and Plotly
+    """
+    )
+    return
 @app.cell(hide_code=True)
     import marimo as mo
     import duckdb
     import polars as pl
+    import os
+    from datetime import date
+    import plotly.express as plotly_express
+    import plotly.graph_objects as plotly_graph_objects
+    import numpy as np
+    return date, duckdb, mo, os, pl, plotly_express
 if __name__ == "__main__":