# Gradio Demo Architecture

This document describes the architecture of the multi-agent Gradio application, which has two main workflows: Shop Drawing Comparison and Everything File Search.

## System Architecture Diagram

```mermaid
graph TB
    subgraph "Gradio UI Layer"
        UI[Gradio Web Interface]
        CompUI[Shop Drawing Comparison UI]
        SearchUI[Everything File Search UI]
    end
    subgraph "Main Application - gradio_demo.py"
        HandleUser[handle_user_openhands]
        HandleSearch[handle_search_interaction]
        CreateReport[create_comparison_report]
        ExtractSchedule[extract_schedule_section]
        ChatFunc[chat_with_functions]
    end
    subgraph "Agent Executor - openhands_poc/executor.py"
        Executor[AgentExecutor]
        ClassifyCompare[classify_and_compare_with_sdk]
        SearchSDK[search_with_sdk]
    end
    subgraph "Shop Drawing Comparison - SDK Agents"
        direction TB
        ReceptionistSDK[ReceptionistAgentSDK]
        ExpertSDK[ExpertAgentSDK]
        FileProc[file_processor.py]
        ReceptionistSDK -->|delegates to| ExpertSDK
        ExpertSDK -->|extracts sections| FileProc
    end
    subgraph "Everything File Search - SDK Agents"
        direction TB
        SearchAgent[SearchAgentSDK]
        VerifierAgent[VerifierAgent]
        MCPClient[FastMCP Client]
        EverythingServer[Everything MCP Server]
        SearchAgent -->|spawns parallel| VerifierAgent
        SearchAgent -->|uses| MCPClient
        MCPClient -->|stdio transport| EverythingServer
    end
    subgraph "External Services"
        AzureOAI[Azure OpenAI API]
        EverythingSDK[Everything SDK DLL]
    end
    subgraph "File Processing"
        PDFExtract[PyMuPDF - PDF]
        DOCXExtract[python-docx - DOCX]
        OCRExtract[Tesseract OCR - PNG]
    end
    subgraph "Output Generation"
        ExcelGen[openpyxl - Excel Report]
        LogGen[Structured Logging]
    end
    UI --> CompUI
    UI --> SearchUI
    CompUI --> HandleUser
    SearchUI --> HandleSearch
    HandleUser --> Executor
    HandleSearch --> Executor
    Executor --> ClassifyCompare
    Executor --> SearchSDK
    ClassifyCompare --> ReceptionistSDK
    ReceptionistSDK --> AzureOAI
    ReceptionistSDK --> PDFExtract
    ReceptionistSDK --> DOCXExtract
    ReceptionistSDK --> OCRExtract
    ExpertSDK --> AzureOAI
    SearchSDK --> SearchAgent
    VerifierAgent --> AzureOAI
    EverythingServer --> EverythingSDK
    ExpertSDK -->|markdown table| CreateReport
    CreateReport --> ExcelGen
    ReceptionistSDK --> LogGen
    ExpertSDK --> LogGen
    SearchAgent --> LogGen
```

## Workflow 1: Shop Drawing Comparison

```mermaid
sequenceDiagram
    actor User
    participant UI as Gradio UI
    participant Handler as handle_user_openhands
    participant Executor as AgentExecutor
    participant Recept as ReceptionistAgentSDK
    participant Expert as ExpertAgentSDK
    participant FileProc as file_processor
    participant Azure as Azure OpenAI
    participant Excel as create_comparison_report
    User->>UI: Upload 3 files (Schedule, Drawing, Spec)
    UI->>Handler: File paths
    Handler->>Executor: classify_and_compare_with_sdk()
    Executor->>Recept: classify() with file snippets
    Recept->>Azure: LLM call - classify equipment type
    Azure-->>Recept: classification (e.g., "boiler")
    Recept-->>Executor: classification
    Executor->>Expert: compare() with classification & files
    Expert->>FileProc: extract_relevant_sections(keyword)
    FileProc-->>Expert: filtered text (70-90% reduction)
    Expert->>Azure: LLM call - generate comparison table
    Azure-->>Expert: markdown table (6 columns)
    Expert-->>Executor: comparison_table + metadata
    Executor-->>Handler: comparison_table + metadata
    Handler->>Excel: create_comparison_report()
    Excel->>Excel: Parse markdown → Excel rows
    Excel-->>Handler: .xlsx file path
    Handler-->>UI: Excel file + metadata
    UI-->>User: Download link + logs
```

## Workflow 2: Everything File Search

```mermaid
sequenceDiagram
    actor User
    participant UI as Gradio Chatbot
    participant Handler as handle_search_interaction
    participant Executor as AgentExecutor
    participant SearchAgent as SearchAgentSDK
    participant MCP as FastMCP Client
    participant Everything as Everything MCP Server
    participant Verifier as VerifierAgent (Parallel)
    participant FileProc as file_processor
    participant Azure as Azure OpenAI
    User->>UI: Enter search query + verification settings
    UI->>Handler: query, history, enable_verification, target_count
    Handler->>Executor: search_with_sdk()
    alt Verification Enabled
        Executor->>SearchAgent: search_with_verification()
    else No Verification
        Executor->>SearchAgent: search()
    end
    SearchAgent->>Azure: Parse query → Everything syntax
    Azure-->>SearchAgent: search_query (e.g., "<boiler> <shop|drawing>")
    SearchAgent->>MCP: call_tool("search", query)
    MCP->>Everything: Execute search
    Everything-->>MCP: file paths (max 100)
    MCP-->>SearchAgent: search results
    alt Verification Enabled
        SearchAgent->>SearchAgent: Create VerifierAgent pool (5 workers)
        loop Wave 1-N (15 files per wave)
            par Parallel Verification
                SearchAgent->>Verifier: verify(file_1)
                SearchAgent->>Verifier: verify(file_2)
                SearchAgent->>Verifier: verify(file_N)
            end
            Verifier->>FileProc: extract snippet
            Verifier->>Azure: Check relevance
            Azure-->>Verifier: confidence + reasoning
            Verifier-->>SearchAgent: match status
            alt Target reached
                SearchAgent->>SearchAgent: Early stop
            end
        end
        SearchAgent-->>Executor: verified matches + metadata
    else No Verification
        SearchAgent-->>Executor: raw results + metadata
    end
    Executor-->>Handler: response + metadata
    Handler-->>UI: Updated chat history
    UI-->>User: Display results
```

## Key Components

### 1. Shop Drawing Comparison Agents

- **ReceptionistAgentSDK** (`openhands_poc/agents/receptionist_sdk.py`)
  - Classifies documents by equipment type
  - Analyzes a 2000-character snippet of each file rather than the full document
  - Returns: classification keyword or error state
- **ExpertAgentSDK** (`openhands_poc/agents/expert_sdk.py`)
  - Generates comparison tables
  - Uses keyword-based extraction to cut input tokens by 70-90%
  - Returns: 6-column markdown table
- **CustomPromptAgent** (`openhands_poc/agents/custom_agent.py`)
  - Base class for all SDK agents
  - Loads system prompts from local `prompts/` folder

### 2. Everything File Search Agents

- **SearchAgentSDK** (`openhands_poc/agents/search_sdk.py`)
  - Translates natural language to Everything syntax
  - Manages MCP connection via FastMCP
  - Supports iterative refinement (max 3 rounds)
  - Optional parallel verification
- **VerifierAgent** (`openhands_poc/agents/verifier_agent.py`)
  - Lightweight document relevance checker
  - Parallel execution (ThreadPoolExecutor, 5 workers)
  - Early stopping when target count reached
  - Returns: confidence scores + reasoning
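
The wave-based verification described above can be sketched as follows. This is a minimal illustration, not the project's code: `verify_in_waves` and the boolean-returning `verify` callable are hypothetical names, and the real `VerifierAgent` returns confidence scores and reasoning rather than a plain flag. The wave size (15) and worker count (5) are taken from this document.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def verify_in_waves(
    paths: list[str],
    verify: Callable[[str], bool],  # hypothetical stand-in for VerifierAgent
    target_count: int,
    wave_size: int = 15,
    max_workers: int = 5,
) -> list[str]:
    """Verify files wave by wave and stop once enough matches are found."""
    matches: list[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for start in range(0, len(paths), wave_size):
            wave = paths[start:start + wave_size]
            # pool.map preserves input order, so results line up with `wave`.
            for path, is_match in zip(wave, pool.map(verify, wave)):
                if is_match:
                    matches.append(path)
            if len(matches) >= target_count:
                break  # early stop: later waves are never submitted
    return matches[:target_count]
```

Checking the target only between waves keeps the logic simple; the trade-off is that a wave already in flight always runs to completion, even if the target is reached partway through it.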

### 3. File Processing

- **file_processor.py** (`openhands_poc/utils/file_processor.py`)
  - Smart keyword-based extraction
  - Multi-format support (PDF, DOCX, PNG)
  - Context window management
  - Extraction statistics
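
To make keyword-based extraction concrete, here is a minimal sketch that assumes the text has already been pulled out of the PDF, DOCX, or image. The function name `extract_by_keyword`, the `context_lines` parameter, and the statistics keys are illustrative assumptions; the actual `file_processor.py` may select sections differently.

```python
def extract_by_keyword(text: str, keyword: str, context_lines: int = 2) -> tuple[str, dict]:
    """Keep only lines that mention the keyword, plus a few lines of context."""
    lines = text.splitlines()
    needle = keyword.lower()
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if needle in line.lower():
            lo = max(0, i - context_lines)
            hi = min(len(lines), i + context_lines + 1)
            keep.update(range(lo, hi))
    filtered = "\n".join(lines[i] for i in sorted(keep))
    # Simple extraction statistics (character-based, not token-based).
    stats = {
        "original_chars": len(text),
        "filtered_chars": len(filtered),
        "reduction_pct": round(100 * (1 - len(filtered) / len(text)), 1) if text else 0.0,
    }
    return filtered, stats
```

The reduction achieved depends entirely on how often the keyword appears, so the 70-90% figure quoted in this document should be read as an observed range for typical inputs rather than something this sketch guarantees.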

### 4. Output Generation

- **create_comparison_report()** (`gradio_demo.py`)
  - Parses the markdown table, converting literal `\n` sequences into real line breaks first
  - Handles Rich console formatting artifacts
  - Creates Excel with proper row/column layout
  - Adds blank row after header for readability
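
The parsing step can be illustrated with a short, standard-library-only sketch. The helper name `markdown_table_to_rows` is hypothetical, and the handling of Rich console artifacts is reduced here to skipping any line that is not a table row; the real `create_comparison_report()` may do more.

```python
def markdown_table_to_rows(markdown: str) -> list[list[str]]:
    """Turn a markdown table into a list of rows ready for a spreadsheet."""
    # Assumption: the LLM output may contain the two-character sequence "\n"
    # instead of real line breaks, so convert it first.
    text = markdown.replace("\\n", "\n")
    rows: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip prose or formatting artifacts around the table
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # skip the |---|---| separator row
        rows.append(cells)
        if len(rows) == 1:
            rows.append([""] * len(cells))  # blank row after the header
    return rows
```

With openpyxl, each returned row could then be written with `worksheet.append(row)` before saving the workbook.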

### 5. MCP Integration

- **FastMCP Client** with StdioTransport
  - Launches Everything MCP server as subprocess
  - Wraps tool arguments as `{"base": {...}}`
  - Passes custom environment variables to the server process
  - Uses the async context manager pattern
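
A rough sketch of this integration, based on the FastMCP 2.x client API as I understand it, is shown below. The field names inside `"base"` (`query`, `max_results`) and the `command`/`args` used to launch the server are assumptions that depend on the Everything MCP server actually installed; only the `{"base": {...}}` wrapper, the `search` tool name, and the 100-result cap come from this document.

```python
import os


def wrap_tool_args(query: str, max_results: int = 100) -> dict:
    """Wrap search parameters under the "base" key (inner field names are assumed)."""
    return {"base": {"query": query, "max_results": max_results}}


async def run_search(query: str, command: str, args: list[str]) -> object:
    """Launch the server over stdio and call its "search" tool once."""
    # Imported lazily so wrap_tool_args works without fastmcp installed.
    from fastmcp import Client
    from fastmcp.client.transports import StdioTransport

    # Pass the current environment through so EVERYTHING_SDK_PATH reaches the server.
    env = dict(os.environ)
    transport = StdioTransport(command=command, args=args, env=env)
    async with Client(transport) as client:
        return await client.call_tool("search", wrap_tool_args(query))
```

From synchronous code this would be driven with `asyncio.run(run_search(...))`, supplying whatever command starts the Everything MCP server in your environment.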

## Configuration

### Environment Variables (.env)

```
AZURE_OPENAI_API_KEY=<azure-openai-key>
AZURE_OPENAI_ENDPOINT=<endpoint-url>
AZURE_DEPLOYMENT_NAME=<deployment-name>
EVERYTHING_SDK_PATH=<path-to-everything-sdk-dll>
```

### Agent Configuration

- **LLM**: Azure OpenAI with 8192 max tokens
- **Temperature**: 0.7
- **Reasoning Effort**: Low (minimize reasoning tokens)
- **API Version**: 2025-03-01-preview
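
One way to load and validate these settings at startup is sketched below. The `AgentSettings` dataclass and `load_settings` function are hypothetical names, and treating all four variables as required is an assumption (a deployment that only uses the comparison workflow might not need `EVERYTHING_SDK_PATH`). The default values mirror the agent configuration listed above.

```python
import os
from dataclasses import dataclass

REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_DEPLOYMENT_NAME",
    "EVERYTHING_SDK_PATH",
)


@dataclass(frozen=True)
class AgentSettings:
    api_key: str
    endpoint: str
    deployment: str
    everything_sdk_path: str
    api_version: str = "2025-03-01-preview"
    max_tokens: int = 8192
    temperature: float = 0.7
    reasoning_effort: str = "low"


def load_settings(env=None) -> AgentSettings:
    """Read settings from a mapping (os.environ by default) and fail fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AgentSettings(
        api_key=env["AZURE_OPENAI_API_KEY"],
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        deployment=env["AZURE_DEPLOYMENT_NAME"],
        everything_sdk_path=env["EVERYTHING_SDK_PATH"],
    )
```

Failing at startup with the full list of missing names is usually easier to debug than an authentication error surfacing in the middle of a workflow.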

## Data Flow

### Shop Drawing Comparison

1. User uploads 3 files
2. Receptionist analyzes snippets (2000 chars each)
3. Classification returned (e.g., "boiler")
4. Expert extracts relevant sections by keyword
5. Full comparison table generated
6. Markdown converted to Excel
7. User downloads report + views logs

### Everything File Search

1. User enters search query
2. SearchAgent translates to Everything syntax
3. MCP server returns file paths
4. (Optional) Verifier agents check relevance in parallel
5. Results filtered to target count
6. User sees matched files with confidence scores

## Error Handling

- **Classification Errors**: The receptionist returns an error state (`"unsupported"` or `"mismatched"`) instead of an equipment type
- **SDK Fallback**: Falls back automatically to the legacy implementation if the SDK is unavailable
- **MCP Errors**: Handled gracefully and reported to the user as error messages
- **File Processing**: Timeout protection (30 s for snippet extraction; no timeout for full-document extraction)
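
The timeout behavior could be implemented in several ways. The sketch below uses a single worker thread and `Future.result(timeout=...)`; the helper name `run_with_timeout` and its `default` argument are illustrative, not taken from the project. Passing `timeout=None` waits indefinitely, which corresponds to the untimed full-document case.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_with_timeout(func, *args, timeout=30.0, default=None):
    """Run func(*args) in a worker thread; return `default` if it exceeds `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return default
    finally:
        # Do not block on a stuck worker. Python cannot kill the thread,
        # so it keeps running in the background until func returns.
        pool.shutdown(wait=False)
```

Because the worker thread cannot be cancelled, this protects the caller from hanging but does not reclaim the resources used by a stuck extraction; a subprocess would be needed for a hard kill.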

## Logging

- **Structured Logging** (`openhands_poc/logging_config.py`)
  - Agent lifecycle tracking
  - LLM call logging
  - File processing statistics
  - Real-time log viewer in UI

## Performance Optimizations

1. **Smart Extraction**: 70-90% token reduction via keyword filtering
2. **Parallel Verification**: 5 workers with early stopping
3. **Wave Processing**: 15 files per wave for verification
4. **Connection Pooling**: A persistent `requests.Session` reuses HTTP connections across calls
5. **Snippet-Based Classification**: Classifies from 2000-character snippets instead of full documents