# Gradio Demo Architecture
This document describes the architecture of the multi-agent Gradio application with two main workflows: Shop Drawing Comparison and Everything File Search.
## System Architecture Diagram

```mermaid
graph TB
subgraph "Gradio UI Layer"
UI[Gradio Web Interface]
CompUI[Shop Drawing Comparison UI]
SearchUI[Everything File Search UI]
end
subgraph "Main Application - gradio_demo.py"
HandleUser[handle_user_openhands]
HandleSearch[handle_search_interaction]
CreateReport[create_comparison_report]
ExtractSchedule[extract_schedule_section]
ChatFunc[chat_with_functions]
end
subgraph "Agent Executor - openhands_poc/executor.py"
Executor[AgentExecutor]
ClassifyCompare[classify_and_compare_with_sdk]
SearchSDK[search_with_sdk]
end
subgraph "Shop Drawing Comparison - SDK Agents"
direction TB
ReceptionistSDK[ReceptionistAgentSDK]
ExpertSDK[ExpertAgentSDK]
FileProc[file_processor.py]
ReceptionistSDK -->|delegates to| ExpertSDK
ExpertSDK -->|extracts sections| FileProc
end
subgraph "Everything File Search - SDK Agents"
direction TB
SearchAgent[SearchAgentSDK]
VerifierAgent[VerifierAgent]
MCPClient[FastMCP Client]
EverythingServer[Everything MCP Server]
SearchAgent -->|spawns parallel| VerifierAgent
SearchAgent -->|uses| MCPClient
MCPClient -->|stdio transport| EverythingServer
end
subgraph "External Services"
AzureOAI[Azure OpenAI API]
EverythingSDK[Everything SDK DLL]
end
subgraph "File Processing"
PDFExtract[PyMuPDF - PDF]
DOCXExtract[python-docx - DOCX]
OCRExtract[Tesseract OCR - PNG]
end
subgraph "Output Generation"
ExcelGen[openpyxl - Excel Report]
LogGen[Structured Logging]
end
UI --> CompUI
UI --> SearchUI
CompUI --> HandleUser
SearchUI --> HandleSearch
HandleUser --> Executor
HandleSearch --> Executor
Executor --> ClassifyCompare
Executor --> SearchSDK
ClassifyCompare --> ReceptionistSDK
ReceptionistSDK --> AzureOAI
ReceptionistSDK --> PDFExtract
ReceptionistSDK --> DOCXExtract
ReceptionistSDK --> OCRExtract
ExpertSDK --> AzureOAI
ExpertSDK --> FileProc
SearchSDK --> SearchAgent
SearchAgent --> MCPClient
SearchAgent --> VerifierAgent
VerifierAgent --> AzureOAI
EverythingServer --> EverythingSDK
ExpertSDK -->|markdown table| CreateReport
CreateReport --> ExcelGen
ReceptionistSDK --> LogGen
ExpertSDK --> LogGen
SearchAgent --> LogGen
```
## Workflow 1: Shop Drawing Comparison

```mermaid
sequenceDiagram
actor User
participant UI as Gradio UI
participant Handler as handle_user_openhands
participant Executor as AgentExecutor
participant Recept as ReceptionistAgentSDK
participant Expert as ExpertAgentSDK
participant FileProc as file_processor
participant Azure as Azure OpenAI
participant Excel as create_comparison_report
User->>UI: Upload 3 files (Schedule, Drawing, Spec)
UI->>Handler: File paths
Handler->>Executor: classify_and_compare_with_sdk()
Executor->>Recept: classify() with file snippets
Recept->>Azure: LLM call - classify equipment type
Azure-->>Recept: classification (e.g., "boiler")
Recept-->>Executor: classification
Executor->>Expert: compare() with classification & files
Expert->>FileProc: extract_relevant_sections(keyword)
FileProc-->>Expert: filtered text (70-90% reduction)
Expert->>Azure: LLM call - generate comparison table
Azure-->>Expert: markdown table (6 columns)
Expert-->>Executor: comparison_table + metadata
Executor-->>Handler: comparison_table + metadata
Handler->>Excel: create_comparison_report()
Excel->>Excel: Parse markdown → Excel rows
Excel-->>Handler: .xlsx file path
Handler-->>UI: Excel file + metadata
UI-->>User: Download link + logs
```
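The handler glue for this flow is small. Below is a minimal sketch of how `handle_user_openhands` might chain the executor and the Excel step; the argument names, the result keys (`classification`, `comparison_table`, `metadata`), and the error handling are assumptions inferred from the diagram, not the actual `gradio_demo.py` code.

```python
# Hypothetical sketch of the Workflow 1 handler; signatures and result keys
# are assumptions inferred from the sequence diagram, not the real code.
from openhands_poc.executor import AgentExecutor

def handle_user_openhands(schedule_path: str, drawing_path: str, spec_path: str):
    executor = AgentExecutor()

    # Receptionist classifies the documents, then the Expert compares them.
    result = executor.classify_and_compare_with_sdk(
        files=[schedule_path, drawing_path, spec_path]
    )

    if result["classification"] in ("unsupported", "mismatched"):
        return None, f"Cannot compare: {result['classification']}"

    # Convert the 6-column markdown table into a downloadable .xlsx report.
    xlsx_path = create_comparison_report(result["comparison_table"])
    return xlsx_path, result["metadata"]
```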
## Workflow 2: Everything File Search

```mermaid
sequenceDiagram
actor User
participant UI as Gradio Chatbot
participant Handler as handle_search_interaction
participant Executor as AgentExecutor
participant SearchAgent as SearchAgentSDK
participant MCP as FastMCP Client
participant Everything as Everything MCP Server
participant Verifier as VerifierAgent (Parallel)
participant Azure as Azure OpenAI
User->>UI: Enter search query + verification settings
UI->>Handler: query, history, enable_verification, target_count
Handler->>Executor: search_with_sdk()
alt Verification Enabled
Executor->>SearchAgent: search_with_verification()
else No Verification
Executor->>SearchAgent: search()
end
SearchAgent->>Azure: Parse query → Everything syntax
Azure-->>SearchAgent: search_query (e.g., "<boiler> <shop|drawing>")
SearchAgent->>MCP: call_tool("search", query)
MCP->>Everything: Execute search
Everything-->>MCP: file paths (max 100)
MCP-->>SearchAgent: search results
alt Verification Enabled
SearchAgent->>SearchAgent: Create VerifierAgent pool (5 workers)
loop Wave 1-N (15 files per wave)
par Parallel Verification
SearchAgent->>Verifier: verify(file_1)
SearchAgent->>Verifier: verify(file_2)
SearchAgent->>Verifier: verify(file_N)
end
Verifier->>FileProc: extract snippet
Verifier->>Azure: Check relevance
Azure-->>Verifier: confidence + reasoning
Verifier-->>SearchAgent: match status
alt Target reached
SearchAgent->>SearchAgent: Early stop
end
end
SearchAgent-->>Executor: verified matches + metadata
else No Verification
SearchAgent-->>Executor: raw results + metadata
end
Executor-->>Handler: response + metadata
Handler-->>UI: Updated chat history
UI-->>User: Display results
```
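As a counterpart, here is a hedged sketch of the search handler. The `verify` and `target_count` keyword arguments and the result dictionary keys are assumptions; only the component names (`handle_search_interaction`, `AgentExecutor`, `search_with_sdk`) come from the diagram above.

```python
# Hypothetical sketch of the Workflow 2 handler; the search_with_sdk keyword
# arguments and result keys are assumptions, not the real implementation.
from openhands_poc.executor import AgentExecutor

def handle_search_interaction(query: str, history: list,
                              enable_verification: bool, target_count: int):
    executor = AgentExecutor()

    # With verification on, a parallel VerifierAgent pool filters the raw hits;
    # otherwise the Everything results (up to 100 paths) are returned as-is.
    results, metadata = executor.search_with_sdk(
        query, verify=enable_verification, target_count=target_count
    )

    reply = "\n".join(
        f"{r['path']} (confidence: {r.get('confidence', 'n/a')})" for r in results
    )
    history.append((query, reply or "No matches found."))
    return history
```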
## Key Components

### 1. Shop Drawing Comparison Agents
**ReceptionistAgentSDK** (`openhands_poc/agents/receptionist_sdk.py`)
- Classifies documents by equipment type
- Analyzes snippets (2000 chars) for efficiency
- Returns: classification keyword or error state

**ExpertAgentSDK** (`openhands_poc/agents/expert_sdk.py`)
- Generates comparison tables
- Uses smart extraction (70-90% token reduction)
- Returns: 6-column markdown table

**CustomPromptAgent** (`openhands_poc/agents/custom_agent.py`)
- Base class for all SDK agents
- Loads system prompts from the local `prompts/` folder
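A rough sketch of how these pieces could fit together is below. The `llm.complete` interface, the prompt file naming, and the method signatures are assumptions; only the class names, the `prompts/` folder, and the 2000-character snippet budget come from the notes above.

```python
# Sketch only: prompt file layout and the llm.complete interface are assumed.
from pathlib import Path

class CustomPromptAgent:
    """Base class for the SDK agents: loads its system prompt from prompts/."""

    def __init__(self, prompt_name: str, llm):
        prompt_file = Path("prompts") / f"{prompt_name}.md"   # assumed layout
        self.system_prompt = prompt_file.read_text(encoding="utf-8")
        self.llm = llm

class ReceptionistAgentSDK(CustomPromptAgent):
    SNIPPET_CHARS = 2000  # only the head of each document goes to the LLM

    def classify(self, documents: list[str]) -> str:
        snippets = "\n---\n".join(doc[: self.SNIPPET_CHARS] for doc in documents)
        reply = self.llm.complete(system=self.system_prompt, user=snippets)
        # Expected answer: an equipment keyword such as "boiler", or an error
        # state such as "unsupported" / "mismatched".
        return reply.strip().lower()
```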
### 2. Everything File Search Agents
**SearchAgentSDK** (`openhands_poc/agents/search_sdk.py`)
- Translates natural language to Everything syntax
- Manages MCP connection via FastMCP
- Supports iterative refinement (max 3 rounds)
- Optional parallel verification

**VerifierAgent** (`openhands_poc/agents/verifier_agent.py`)
- Lightweight document relevance checker
- Parallel execution (ThreadPoolExecutor, 5 workers)
- Early stopping when target count is reached
- Returns: confidence scores + reasoning
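The wave-based verification loop could look roughly like the sketch below; the wave size, worker count, and early-stop rule come from this document, while the `verifier.verify()` call and its result shape are assumptions.

```python
# Sketch of wave-based parallel verification with early stopping; the
# verify() result dictionary is an assumed shape.
from concurrent.futures import ThreadPoolExecutor, as_completed

WAVE_SIZE = 15     # files verified per wave
MAX_WORKERS = 5    # VerifierAgent pool size

def verify_in_waves(verifier, query: str, paths: list[str], target_count: int):
    matches = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        for start in range(0, len(paths), WAVE_SIZE):
            wave = paths[start:start + WAVE_SIZE]
            futures = [pool.submit(verifier.verify, query, path) for path in wave]
            for future in as_completed(futures):
                result = future.result()  # e.g. {"path", "match", "confidence", "reasoning"}
                if result["match"]:
                    matches.append(result)
            if len(matches) >= target_count:
                break  # early stop: enough verified matches, skip remaining waves
    return matches[:target_count]
```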
### 3. File Processing
**file_processor.py** (`openhands_poc/utils/file_processor.py`)
- Smart keyword-based extraction
- Multi-format support (PDF, DOCX, PNG)
- Context window management
- Extraction statistics
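For the PDF path, the keyword-based extraction might look like the sketch below (PyMuPDF); the page-level filtering heuristic and the character budget are assumptions about `file_processor.py`, and the DOCX and PNG paths would follow the same pattern with python-docx and Tesseract.

```python
# Sketch of keyword-based PDF extraction with PyMuPDF; the page-level filter
# and the max_chars budget are assumptions.
import fitz  # PyMuPDF

def extract_relevant_sections(pdf_path: str, keyword: str, max_chars: int = 20000) -> str:
    doc = fitz.open(pdf_path)
    kept, total_chars = [], 0
    for page in doc:
        text = page.get_text()
        total_chars += len(text)
        # Keep only pages that mention the classification keyword (e.g. "boiler").
        if keyword.lower() in text.lower():
            kept.append(text)
    doc.close()
    filtered = "\n".join(kept)[:max_chars]  # stay inside the LLM context window
    reduction = 100 * (1 - len(filtered) / max(total_chars, 1))
    print(f"kept {len(filtered)} of {total_chars} chars ({reduction:.0f}% reduction)")
    return filtered
```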
### 4. Output Generation
**create_comparison_report()** (`gradio_demo.py`)
- Parses the markdown table with literal `\n` conversion
- Handles Rich console formatting artifacts
- Creates Excel with proper row/column layout
- Adds a blank row after the header for readability
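A reduced sketch of that conversion is below; the exact cleanup rules for Rich artifacts and literal `\n` sequences are assumptions based on the bullets above.

```python
# Sketch of markdown-table-to-Excel conversion with openpyxl; cleanup rules
# are assumptions, only the blank row after the header comes from the notes.
from openpyxl import Workbook

def create_comparison_report(markdown_table: str, out_path: str = "comparison.xlsx") -> str:
    wb = Workbook()
    ws = wb.active
    # Turn literal "\n" sequences back into real newlines before splitting rows.
    lines = markdown_table.replace("\\n", "\n").splitlines()
    rows = [line for line in lines if line.strip().startswith("|")]
    for i, row in enumerate(rows):
        stripped = row.replace("|", "").strip()
        if stripped and set(stripped) <= {"-", ":", " "}:
            continue  # markdown separator row (|---|---|...)
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        ws.append(cells)
        if i == 0:
            ws.append([])  # blank row after the header for readability
    wb.save(out_path)
    return out_path
```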
### 5. MCP Integration
- FastMCP Client with StdioTransport
- Launches Everything MCP server as a subprocess
- Tool argument wrapping: `{"base": {...}}`
- Custom environment variables
- Async context manager pattern
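A hedged sketch of the client side is shown below, assuming FastMCP's `Client` and `StdioTransport` classes and a Python launch command for the server. The launch command, the server script name, and the exact argument schema behind `{"base": {...}}` are assumptions; only `call_tool("search", ...)` and the wrapping pattern come from the notes above.

```python
# Sketch of the MCP integration; server launch command, argument schema, and
# result handling are assumptions.
import asyncio
import os

from fastmcp import Client
from fastmcp.client.transports import StdioTransport

async def everything_search(query: str):
    transport = StdioTransport(
        command="python",                        # assumed server launch command
        args=["everything_mcp_server.py"],       # assumed server script name
        env={"EVERYTHING_SDK_PATH": os.environ["EVERYTHING_SDK_PATH"]},
    )
    async with Client(transport) as client:      # async context manager pattern
        result = await client.call_tool(
            "search", {"base": {"query": query, "max_results": 100}}
        )
        return result  # content shape depends on the server's tool definition

if __name__ == "__main__":
    print(asyncio.run(everything_search("boiler shop drawing")))
```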
## Configuration

### Environment Variables (`.env`)

```
AZURE_OPENAI_API_KEY=<azure-openai-key>
AZURE_OPENAI_ENDPOINT=<endpoint-url>
AZURE_DEPLOYMENT_NAME=<deployment-name>
EVERYTHING_SDK_PATH=<path-to-everything-sdk-dll>
```
### Agent Configuration
- LLM: Azure OpenAI with 8192 max tokens
- Temperature: 0.7
- Reasoning Effort: Low (minimize reasoning tokens)
- API Version: 2025-03-01-preview
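Wired into the OpenAI Python SDK, that configuration could look roughly like this. The environment-variable plumbing and the reasoning/temperature switch are assumptions (a given deployment typically accepts one or the other); only the parameter values come from this section.

```python
# Sketch of the LLM configuration; only the values (api_version, 8192 tokens,
# temperature 0.7, reasoning effort "low") come from this section.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2025-03-01-preview",
)

params = {
    "model": os.environ["AZURE_DEPLOYMENT_NAME"],
    "messages": [{"role": "user", "content": "ping"}],
    "max_completion_tokens": 8192,
}
# Reasoning-capable deployments take reasoning_effort; others take temperature.
# USE_REASONING_MODEL is a hypothetical switch for this sketch.
if os.getenv("USE_REASONING_MODEL"):
    params["reasoning_effort"] = "low"
else:
    params["temperature"] = 0.7

response = client.chat.completions.create(**params)
print(response.choices[0].message.content)
```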
## Data Flow

### Shop Drawing Comparison
- User uploads 3 files
- Receptionist analyzes snippets (2000 chars each)
- Classification returned (e.g., "boiler")
- Expert extracts relevant sections by keyword
- Full comparison table generated
- Markdown converted to Excel
- User downloads report + views logs
### Everything File Search
- User enters search query
- SearchAgent translates to Everything syntax
- MCP server returns file paths
- (Optional) Verifier agents check relevance in parallel
- Results filtered to target count
- User sees matched files with confidence scores
## Error Handling
- Classification Errors: the Receptionist returns "unsupported" or "mismatched" instead of an equipment keyword
- SDK Fallback: automatic fallback to the legacy pipeline if the SDK agents are unavailable
- MCP Errors: graceful handling with error messages
- File Processing: timeout protection (30s for snippets, unlimited for full extraction)
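The SDK fallback described above is essentially an import guard; a minimal sketch follows, where `legacy_classify_and_compare` is a hypothetical stand-in for the pre-SDK comparison path.

```python
# Sketch of the SDK fallback; legacy_classify_and_compare is a hypothetical
# stand-in for the pre-SDK comparison path.
try:
    from openhands_poc.executor import AgentExecutor
    SDK_AVAILABLE = True
except ImportError:
    AgentExecutor = None
    SDK_AVAILABLE = False

def classify_and_compare(files: list[str]):
    if SDK_AVAILABLE:
        return AgentExecutor().classify_and_compare_with_sdk(files=files)
    return legacy_classify_and_compare(files)  # assumed legacy entry point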
## Logging

**Structured Logging** (`openhands_poc/logging_config.py`)
- Agent lifecycle tracking
- LLM call logging
- File processing statistics
- Real-time log viewer in the UI
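A minimal sketch of what such a setup might provide; the format string and handler choice are assumptions, and only the module path and the kinds of events logged come from the list above.

```python
# Sketch of a structured logger in the spirit of openhands_poc/logging_config.py;
# format and handler details are assumptions.
import logging

def get_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s | %(name)s | %(levelname)s | %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

log = get_logger("openhands_poc.agents.receptionist")
log.info("agent_start files=3 snippet_chars=2000")         # agent lifecycle
log.info("llm_call purpose=classification model=azure")    # LLM call logging
log.info("extraction kept=1800 total=24000 chars")          # file processing stats
```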
## Performance Optimizations
- Smart Extraction: 70-90% token reduction via keyword filtering
- Parallel Verification: 5 workers with early stopping
- Wave Processing: 15 files per wave for verification
- Connection Pooling: Persistent `requests.Session`
- Snippet-Based Classification: 2000 chars vs full document
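Of these, connection pooling is the only technique not illustrated earlier: a single `requests.Session` is kept alive and reused across HTTP calls instead of opening a new connection per request. A tiny sketch (the URL is a placeholder):

```python
# Sketch of connection pooling with a persistent requests.Session; the session
# keeps TCP/TLS connections alive and reuses them across requests.
import requests

session = requests.Session()

def fetch(url: str) -> int:
    response = session.get(url, timeout=30)  # reuses the pooled connection
    return response.status_code

for _ in range(3):
    print(fetch("https://example.com/health"))  # placeholder URL
```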