Consulting_Assistant / ARCHITECTURE.md

Gradio Demo Architecture

This document describes the architecture of the multi-agent Gradio application, which has two main workflows: Shop Drawing Comparison and Everything File Search.

System Architecture Diagram

graph TB
    subgraph "Gradio UI Layer"
        UI[Gradio Web Interface]
        CompUI[Shop Drawing Comparison UI]
        SearchUI[Everything File Search UI]
    end

    subgraph "Main Application - gradio_demo.py"
        HandleUser[handle_user_openhands]
        HandleSearch[handle_search_interaction]
        CreateReport[create_comparison_report]
        ExtractSchedule[extract_schedule_section]
        ChatFunc[chat_with_functions]
    end

    subgraph "Agent Executor - openhands_poc/executor.py"
        Executor[AgentExecutor]
        ClassifyCompare[classify_and_compare_with_sdk]
        SearchSDK[search_with_sdk]
    end

    subgraph "Shop Drawing Comparison - SDK Agents"
        direction TB
        ReceptionistSDK[ReceptionistAgentSDK]
        ExpertSDK[ExpertAgentSDK]
        FileProc[file_processor.py]

        ReceptionistSDK -->|delegates to| ExpertSDK
        ExpertSDK -->|extracts sections| FileProc
    end

    subgraph "Everything File Search - SDK Agents"
        direction TB
        SearchAgent[SearchAgentSDK]
        VerifierAgent[VerifierAgent]
        MCPClient[FastMCP Client]
        EverythingServer[Everything MCP Server]

        SearchAgent -->|spawns parallel| VerifierAgent
        SearchAgent -->|uses| MCPClient
        MCPClient -->|stdio transport| EverythingServer
    end

    subgraph "External Services"
        AzureOAI[Azure OpenAI API]
        EverythingSDK[Everything SDK DLL]
    end

    subgraph "File Processing"
        PDFExtract[PyMuPDF - PDF]
        DOCXExtract[python-docx - DOCX]
        OCRExtract[Tesseract OCR - PNG]
    end

    subgraph "Output Generation"
        ExcelGen[openpyxl - Excel Report]
        LogGen[Structured Logging]
    end

    UI --> CompUI
    UI --> SearchUI

    CompUI --> HandleUser
    SearchUI --> HandleSearch

    HandleUser --> Executor
    HandleSearch --> Executor

    Executor --> ClassifyCompare
    Executor --> SearchSDK

    ClassifyCompare --> ReceptionistSDK
    ReceptionistSDK --> AzureOAI
    ReceptionistSDK --> PDFExtract
    ReceptionistSDK --> DOCXExtract
    ReceptionistSDK --> OCRExtract

    ExpertSDK --> AzureOAI

    SearchSDK --> SearchAgent
    VerifierAgent --> AzureOAI

    EverythingServer --> EverythingSDK

    ExpertSDK -->|markdown table| CreateReport
    CreateReport --> ExcelGen

    ReceptionistSDK --> LogGen
    ExpertSDK --> LogGen
    SearchAgent --> LogGen

Workflow 1: Shop Drawing Comparison

sequenceDiagram
    actor User
    participant UI as Gradio UI
    participant Handler as handle_user_openhands
    participant Executor as AgentExecutor
    participant Recept as ReceptionistAgentSDK
    participant Expert as ExpertAgentSDK
    participant FileProc as file_processor
    participant Azure as Azure OpenAI
    participant Excel as create_comparison_report

    User->>UI: Upload 3 files (Schedule, Drawing, Spec)
    UI->>Handler: File paths
    Handler->>Executor: classify_and_compare_with_sdk()

    Executor->>Recept: classify() with file snippets
    Recept->>Azure: LLM call - classify equipment type
    Azure-->>Recept: classification (e.g., "boiler")
    Recept-->>Executor: classification

    Executor->>Expert: compare() with classification & files
    Expert->>FileProc: extract_relevant_sections(keyword)
    FileProc-->>Expert: filtered text (70-90% reduction)
    Expert->>Azure: LLM call - generate comparison table
    Azure-->>Expert: markdown table (6 columns)
    Expert-->>Executor: comparison_table + metadata

    Executor-->>Handler: comparison_table + metadata
    Handler->>Excel: create_comparison_report()
    Excel->>Excel: Parse markdown → Excel rows
    Excel-->>Handler: .xlsx file path
    Handler-->>UI: Excel file + metadata
    UI-->>User: Download link + logs
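
The classify-then-compare order in the sequence above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's code: `run_comparison`, the injected `classify` and `compare` callables, and the returned dictionary shape are all hypothetical. Stopping before the comparison when an error state comes back is also an assumption; the document only names the error states.

```python
from typing import Callable, Dict, List

# Error states named in the Error Handling section of this document.
ERROR_STATES = {"unsupported", "mismatched"}


def run_comparison(
    files: List[str],
    classify: Callable[[List[str]], str],
    compare: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Classify the uploaded files, then compare them if classification succeeded.

    `classify` and `compare` are hypothetical stand-ins for the Receptionist
    and Expert agents; they are injected so the flow runs without an LLM.
    """
    classification = classify(files)
    if classification in ERROR_STATES:
        # Assumed behaviour: skip the comparison when classification failed.
        return {"status": "error", "classification": classification}
    table = compare(classification, files)
    return {"status": "ok", "classification": classification, "table": table}


# Stub agents used only for demonstration.
def fake_classify(files: List[str]) -> str:
    return "boiler" if len(files) == 3 else "mismatched"


def fake_compare(keyword: str, files: List[str]) -> str:
    return f"(comparison table for {keyword})"


result = run_comparison(
    ["schedule.pdf", "drawing.pdf", "spec.docx"], fake_classify, fake_compare
)
```

Injecting the two agents as callables keeps the control flow testable without network access.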

Workflow 2: Everything File Search

sequenceDiagram
    actor User
    participant UI as Gradio Chatbot
    participant Handler as handle_search_interaction
    participant Executor as AgentExecutor
    participant SearchAgent as SearchAgentSDK
    participant MCP as FastMCP Client
    participant Everything as Everything MCP Server
    participant Verifier as VerifierAgent (Parallel)
    participant FileProc as file_processor
    participant Azure as Azure OpenAI

    User->>UI: Enter search query + verification settings
    UI->>Handler: query, history, enable_verification, target_count
    Handler->>Executor: search_with_sdk()

    alt Verification Enabled
        Executor->>SearchAgent: search_with_verification()
    else No Verification
        Executor->>SearchAgent: search()
    end

    SearchAgent->>Azure: Parse query → Everything syntax
    Azure-->>SearchAgent: search_query (e.g., "<boiler> <shop|drawing>")

    SearchAgent->>MCP: call_tool("search", query)
    MCP->>Everything: Execute search
    Everything-->>MCP: file paths (max 100)
    MCP-->>SearchAgent: search results

    alt Verification Enabled
        SearchAgent->>SearchAgent: Create VerifierAgent pool (5 workers)

        loop Wave 1-N (15 files per wave)
            par Parallel Verification
                SearchAgent->>Verifier: verify(file_1)
                SearchAgent->>Verifier: verify(file_2)
                SearchAgent->>Verifier: verify(file_N)
            end

            Verifier->>FileProc: extract snippet
            Verifier->>Azure: Check relevance
            Azure-->>Verifier: confidence + reasoning
            Verifier-->>SearchAgent: match status

            alt Target reached
                SearchAgent->>SearchAgent: Early stop
            end
        end

        SearchAgent-->>Executor: verified matches + metadata
    else No Verification
        SearchAgent-->>Executor: raw results + metadata
    end

    Executor-->>Handler: response + metadata
    Handler-->>UI: Updated chat history
    UI-->>User: Display results

Key Components

1. Shop Drawing Comparison Agents

  • ReceptionistAgentSDK (openhands_poc/agents/receptionist_sdk.py)

    • Classifies documents by equipment type
    • Analyzes a 2000-character snippet of each file instead of the full document
    • Returns: classification keyword or error state
  • ExpertAgentSDK (openhands_poc/agents/expert_sdk.py)

    • Generates comparison tables
    • Uses keyword-based extraction to cut input tokens by 70-90%
    • Returns: 6-column markdown table
  • CustomPromptAgent (openhands_poc/agents/custom_agent.py)

    • Base class for all SDK agents
    • Loads system prompts from local prompts/ folder
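
One way a base class could load a system prompt from a local folder is sketched below. The class names, the `prompt_file` attribute, and the `.md` file names are assumptions made for illustration; the document only states that prompts live in a `prompts/` folder.

```python
import tempfile
from pathlib import Path


class PromptLoadingAgent:
    """Hypothetical base class that reads its system prompt from a folder."""

    prompt_file = "default.md"  # assumed naming convention

    def __init__(self, prompts_dir: Path) -> None:
        self.system_prompt = self._load_prompt(prompts_dir)

    def _load_prompt(self, prompts_dir: Path) -> str:
        path = prompts_dir / self.prompt_file
        if not path.is_file():
            raise FileNotFoundError(f"System prompt not found: {path}")
        return path.read_text(encoding="utf-8").strip()


class ReceptionistSketch(PromptLoadingAgent):
    prompt_file = "receptionist.md"  # hypothetical file name


# Demonstration with a temporary folder standing in for prompts/.
with tempfile.TemporaryDirectory() as tmp:
    prompt_path = Path(tmp) / "receptionist.md"
    prompt_path.write_text("You classify equipment documents.\n", encoding="utf-8")
    agent = ReceptionistSketch(Path(tmp))
```

Subclasses only override `prompt_file`, so adding an agent means adding one prompt file and one short class.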

2. Everything File Search Agents

  • SearchAgentSDK (openhands_poc/agents/search_sdk.py)

    • Translates natural language to Everything syntax
    • Manages MCP connection via FastMCP
    • Supports iterative refinement (max 3 rounds)
    • Optional parallel verification
  • VerifierAgent (openhands_poc/agents/verifier_agent.py)

    • Lightweight document relevance checker
    • Parallel execution (ThreadPoolExecutor, 5 workers)
    • Early stopping when target count reached
    • Returns: confidence scores + reasoning
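
The wave-based parallel verification with early stopping can be sketched with the standard library's `ThreadPoolExecutor`. The function name, the `verify` callable, and its `(is_match, confidence)` return shape are assumptions; the wave size of 15 and the 5 workers come from this document. In this sketch the stop check runs between waves, which may differ from the real implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def verify_in_waves(
    paths: List[str],
    verify: Callable[[str], Tuple[bool, float]],
    target_count: int,
    wave_size: int = 15,
    max_workers: int = 5,
) -> List[Tuple[str, float]]:
    """Verify files wave by wave and stop once enough matches are found.

    `verify` is a hypothetical callable returning (is_match, confidence).
    """
    matches: List[Tuple[str, float]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for start in range(0, len(paths), wave_size):
            wave = paths[start:start + wave_size]
            # pool.map preserves input order, so results align with `wave`.
            for path, (is_match, confidence) in zip(wave, pool.map(verify, wave)):
                if is_match:
                    matches.append((path, confidence))
            if len(matches) >= target_count:
                break  # early stop: later waves are never submitted
    return matches[:target_count]


# Demonstration with a stub verifier that records which files it saw.
calls: List[str] = []


def fake_verify(path: str) -> Tuple[bool, float]:
    calls.append(path)
    return ("boiler" in path, 0.9)


candidates = [f"boiler_{i}.pdf" for i in range(40)]
found = verify_in_waves(candidates, fake_verify, target_count=10)
```

With 40 candidates and a target of 10, only the first wave of 15 files is verified; the remaining 25 are never sent to the verifier.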

3. File Processing

  • file_processor.py (openhands_poc/utils/file_processor.py)
    • Smart keyword-based extraction
    • Multi-format support (PDF, DOCX, PNG)
    • Context window management
    • Extraction statistics
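
Keyword-based extraction with statistics might look like the sketch below. The function name, the paragraph-level granularity, and the `context` parameter are assumptions; the real `file_processor.py` may split and score text differently.

```python
from typing import Dict, Tuple


def extract_by_keyword(
    text: str, keyword: str, context: int = 1
) -> Tuple[str, Dict[str, float]]:
    """Keep paragraphs mentioning `keyword`, plus `context` neighbours per side."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    keep = set()
    for i, para in enumerate(paragraphs):
        if keyword.lower() in para.lower():
            first = max(0, i - context)
            last = min(len(paragraphs), i + context + 1)
            keep.update(range(first, last))
    extracted = "\n\n".join(paragraphs[i] for i in sorted(keep))
    original_len = len(text)
    reduction = 100 * (1 - len(extracted) / original_len) if original_len else 0.0
    stats = {
        "original_chars": original_len,
        "extracted_chars": len(extracted),
        "reduction_pct": round(reduction, 1),
    }
    return extracted, stats


# Illustrative input; the equipment tags and values are made up.
document = "\n\n".join([
    "General notes.",
    "Pump P-1 details.",
    "Boiler B-1: 500 MBH input.",
    "Boiler venting per code.",
    "Chiller CH-1 data.",
    "Fan schedule.",
    "Lighting notes.",
])
extracted, stats = extract_by_keyword(document, "boiler", context=0)
```

The `reduction_pct` figure is measured in characters here; a token-based figure would need the model's tokenizer.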

4. Output Generation

  • create_comparison_report() (gradio_demo.py)
    • Parses the markdown table, converting literal \n sequences into line breaks
    • Handles Rich console formatting artifacts
    • Creates Excel with proper row/column layout
    • Adds blank row after header for readability
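
The markdown-to-rows step could be sketched as follows. Both function names are hypothetical, and treating a literal `\n` as a row separator is one reading of the bullet above. The writer uses openpyxl's `Workbook`, `append`, and `save`, imported lazily so the parser runs without the package installed.

```python
from typing import List


def parse_markdown_table(markdown: str) -> List[List[str]]:
    """Turn a pipe-delimited markdown table into a list of rows."""
    # Convert literal backslash-n sequences into real line breaks first.
    text = markdown.replace("\\n", "\n")
    rows: List[List[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip prose and stray console output around the table
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue  # skip the |---|---| separator row
        rows.append(cells)
    if rows:
        rows.insert(1, [""] * len(rows[0]))  # blank row after the header
    return rows


def write_rows_to_xlsx(rows: List[List[str]], path: str) -> None:
    """Write rows to a workbook; requires the third-party openpyxl package."""
    from openpyxl import Workbook  # lazy import: parsing works without it

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)
    workbook.save(path)


# Two columns for brevity; the Expert agent's table has six.
raw = "Here is the table:\\n| Parameter | Schedule |\\n|---|---|\\n| Capacity | 500 MBH |"
rows = parse_markdown_table(raw)
```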

5. MCP Integration

  • FastMCP Client with StdioTransport
    • Launches Everything MCP server as subprocess
    • Tool argument wrapping: {"base": {...}}
    • Custom environment variables
    • Async context manager pattern
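
A hedged sketch of the client side follows, using FastMCP's `Client` and `StdioTransport`. The server command, its arguments, and the `query` and `max_results` field names inside `"base"` are hypothetical; only the `{"base": {...}}` envelope, the `"search"` tool name, and the stdio transport come from this document.

```python
from typing import Any, Dict, List, Optional


def wrap_tool_args(**params: Any) -> Dict[str, Dict[str, Any]]:
    """Wrap tool parameters in the {"base": {...}} envelope described above."""
    return {"base": dict(params)}


async def search_everything(
    query: str,
    server_cmd: str,
    server_args: List[str],
    env: Optional[Dict[str, str]] = None,
) -> Any:
    """Launch the server as a subprocess over stdio and call its "search" tool."""
    # Third-party imports kept local so the helper above has no dependencies.
    from fastmcp import Client
    from fastmcp.client.transports import StdioTransport

    transport = StdioTransport(command=server_cmd, args=server_args, env=env)
    # The async context manager starts the subprocess and closes it on exit.
    async with Client(transport) as client:
        # "query" and "max_results" are hypothetical field names.
        arguments = wrap_tool_args(query=query, max_results=100)
        return await client.call_tool("search", arguments)


payload = wrap_tool_args(query="boiler", max_results=100)
```

A caller would drive the coroutine with `asyncio.run(search_everything(...))`, passing whatever command actually starts the Everything MCP server and any custom variables, such as `EVERYTHING_SDK_PATH`, in `env`.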

Configuration

Environment Variables (.env)

AZURE_OPENAI_API_KEY=<azure-openai-key>
AZURE_OPENAI_ENDPOINT=<endpoint-url>
AZURE_DEPLOYMENT_NAME=<deployment-name>
EVERYTHING_SDK_PATH=<path-to-everything-sdk-dll>
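
A small startup check for these variables might look like the following sketch. `load_settings` is a hypothetical helper; how the project actually reads `.env` (for example via a package such as python-dotenv) is not stated in this document.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_DEPLOYMENT_NAME",
    "EVERYTHING_SDK_PATH",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, failing fast if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Demonstration with placeholder values instead of the real environment.
example_env = {name: f"<{name.lower()}>" for name in REQUIRED_VARS}
settings = load_settings(example_env)
```

Failing at startup with the full list of missing names is usually easier to debug than a failure deep inside the first LLM or MCP call.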

Agent Configuration

  • LLM: Azure OpenAI with 8192 max tokens
  • Temperature: 0.7
  • Reasoning Effort: Low (minimize reasoning tokens)
  • API Version: 2025-03-01-preview

Data Flow

Shop Drawing Comparison

  1. User uploads 3 files
  2. Receptionist analyzes snippets (2000 chars each)
  3. Classification returned (e.g., "boiler")
  4. Expert extracts relevant sections by keyword
  5. Full comparison table generated
  6. Markdown converted to Excel
  7. User downloads report + views logs
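
Step 2 above, where the Receptionist sees only a 2000-character snippet of each file, can be illustrated as follows. The helper names and the prompt wording are hypothetical, and taking the snippet from the start of the extracted text is an assumption.

```python
from typing import Dict

SNIPPET_CHARS = 2000  # snippet size stated in this document


def make_snippets(documents: Dict[str, str], limit: int = SNIPPET_CHARS) -> Dict[str, str]:
    """Truncate each document's extracted text to at most `limit` characters."""
    return {name: text[:limit] for name, text in documents.items()}


def build_classification_prompt(snippets: Dict[str, str]) -> str:
    """Assemble one prompt from the snippets (wording is hypothetical)."""
    parts = ["Identify the equipment type these documents describe."]
    for name, snippet in snippets.items():
        parts.append(f"--- {name} ---\n{snippet}")
    return "\n\n".join(parts)


# Placeholder text standing in for extracted file contents.
documents = {
    "schedule": "B" * 5000,
    "drawing": "short text",
    "spec": "S" * 2500,
}
snippets = make_snippets(documents)
prompt = build_classification_prompt(snippets)
```

For three files this caps the classification input at roughly 6000 characters regardless of document size.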

Everything File Search

  1. User enters search query
  2. SearchAgent translates to Everything syntax
  3. MCP server returns file paths
  4. (Optional) Verifier agents check relevance in parallel
  5. Results filtered to target count
  6. User sees matched files with confidence scores
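
Steps 5 and 6 could be sketched as a filter-and-rank step. The `Verdict` type, the `min_confidence` threshold, and sorting by confidence are assumptions for illustration; the document only says results are filtered to the target count and shown with confidence scores.

```python
from typing import List, NamedTuple


class Verdict(NamedTuple):
    path: str
    is_match: bool
    confidence: float


def select_results(
    verdicts: List[Verdict], target_count: int, min_confidence: float = 0.5
) -> List[Verdict]:
    """Keep confident matches, best first, capped at `target_count`."""
    matches = [v for v in verdicts if v.is_match and v.confidence >= min_confidence]
    matches.sort(key=lambda v: v.confidence, reverse=True)
    return matches[:target_count]


# Illustrative verdicts; paths and scores are made up.
verdicts = [
    Verdict("a.pdf", True, 0.72),
    Verdict("b.pdf", False, 0.95),
    Verdict("c.pdf", True, 0.91),
    Verdict("d.pdf", True, 0.30),
    Verdict("e.pdf", True, 0.85),
]
top = select_results(verdicts, target_count=2)
```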

Error Handling

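  • Classification Errors: the Receptionist returns the error state "unsupported" or "mismatched" in place of a classification keyword
  • SDK Fallback: automatically falls back to the legacy code path if the SDK is unavailable
  • MCP Errors: handled gracefully and reported as error messages
  • File Processing: timeout protection (30 s for snippet extraction, no limit for full extraction)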
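
Timeout protection for file processing could be implemented along these lines. This is a sketch with a hypothetical `run_with_timeout` helper, not the project's mechanism. It runs the extraction in a worker thread and gives up waiting after the limit; passing `None` waits indefinitely, as for full extraction. The demonstration uses short delays in place of the 30 s limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Optional


def run_with_timeout(func: Callable[[], str], timeout_s: Optional[float]) -> Optional[str]:
    """Run `func` in a worker thread; return None if it exceeds `timeout_s`."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return None
    finally:
        # A thread cannot be killed, only abandoned: the worker keeps running
        # in the background, and the interpreter still waits for it at exit.
        pool.shutdown(wait=False)


def slow_extract() -> str:
    time.sleep(0.3)  # stands in for a slow PDF or OCR extraction
    return "extracted text"


timed_out = run_with_timeout(slow_extract, timeout_s=0.05)
completed = run_with_timeout(slow_extract, timeout_s=None)
```

Because the abandoned worker keeps running, a process-based approach would be needed if stuck extractions must actually be terminated.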

Logging

  • Structured Logging (openhands_poc/logging_config.py)
    • Agent lifecycle tracking
    • LLM call logging
    • File processing statistics
    • Real-time log viewer in UI
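
Structured logging with an in-memory buffer that a UI log viewer could poll might be sketched with the standard `logging` module. The class names, the logger name, the `fields` key passed through `extra=`, and the logged values are all illustrative assumptions; the contents of `logging_config.py` are not shown in this document.

```python
import json
import logging
from collections import deque


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Values passed via `extra={"fields": {...}}` appear as record.fields.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


class BufferHandler(logging.Handler):
    """Keep the most recent log lines in memory for a UI log viewer to poll."""

    def __init__(self, capacity: int = 500) -> None:
        super().__init__()
        self.lines = deque(maxlen=capacity)

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


handler = BufferHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("agents.sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("llm_call", extra={"fields": {"agent": "receptionist", "duration_ms": 812}})
last_entry = json.loads(handler.lines[-1])
```

The bounded `deque` caps memory use, so a long-running session only keeps the newest lines for display.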

Performance Optimizations

  1. Smart Extraction: 70-90% token reduction via keyword filtering
  2. Parallel Verification: 5 workers with early stopping
  3. Wave Processing: 15 files per wave for verification
  4. Connection Pooling: reuses a persistent requests.Session
  5. Snippet-Based Classification: classifies from 2000-character snippets instead of full documents