# Gradio Demo Architecture

This document describes the architecture of the multi-agent Gradio application with two main workflows: Shop Drawing Comparison and Everything File Search.

## System Architecture Diagram

```mermaid
graph TB
    subgraph "Gradio UI Layer"
        UI[Gradio Web Interface]
        CompUI[Shop Drawing Comparison UI]
        SearchUI[Everything File Search UI]
    end

    subgraph "Main Application - gradio_demo.py"
        HandleUser[handle_user_openhands]
        HandleSearch[handle_search_interaction]
        CreateReport[create_comparison_report]
        ExtractSchedule[extract_schedule_section]
        ChatFunc[chat_with_functions]
    end

    subgraph "Agent Executor - openhands_poc/executor.py"
        Executor[AgentExecutor]
        ClassifyCompare[classify_and_compare_with_sdk]
        SearchSDK[search_with_sdk]
    end

    subgraph "Shop Drawing Comparison - SDK Agents"
        direction TB
        ReceptionistSDK[ReceptionistAgentSDK]
        ExpertSDK[ExpertAgentSDK]
        FileProc[file_processor.py]

        ReceptionistSDK -->|delegates to| ExpertSDK
        ExpertSDK -->|extracts sections| FileProc
    end

    subgraph "Everything File Search - SDK Agents"
        direction TB
        SearchAgent[SearchAgentSDK]
        VerifierAgent[VerifierAgent]
        MCPClient[FastMCP Client]
        EverythingServer[Everything MCP Server]

        SearchAgent -->|spawns parallel| VerifierAgent
        SearchAgent -->|uses| MCPClient
        MCPClient -->|stdio transport| EverythingServer
    end

    subgraph "External Services"
        AzureOAI[Azure OpenAI API]
        EverythingSDK[Everything SDK DLL]
    end

    subgraph "File Processing"
        PDFExtract[PyMuPDF - PDF]
        DOCXExtract[python-docx - DOCX]
        OCRExtract[Tesseract OCR - PNG]
    end

    subgraph "Output Generation"
        ExcelGen[openpyxl - Excel Report]
        LogGen[Structured Logging]
    end

    UI --> CompUI
    UI --> SearchUI

    CompUI --> HandleUser
    SearchUI --> HandleSearch

    HandleUser --> Executor
    HandleSearch --> Executor

    Executor --> ClassifyCompare
    Executor --> SearchSDK

    ClassifyCompare --> ReceptionistSDK
    ReceptionistSDK --> AzureOAI
    ReceptionistSDK --> PDFExtract
    ReceptionistSDK --> DOCXExtract
    ReceptionistSDK --> OCRExtract

    ExpertSDK --> AzureOAI
    ExpertSDK --> FileProc

    SearchSDK --> SearchAgent
    SearchAgent --> MCPClient
    SearchAgent --> VerifierAgent
    VerifierAgent --> AzureOAI

    EverythingServer --> EverythingSDK

    ExpertSDK -->|markdown table| CreateReport
    CreateReport --> ExcelGen

    ReceptionistSDK --> LogGen
    ExpertSDK --> LogGen
    SearchAgent --> LogGen
```

## Workflow 1: Shop Drawing Comparison

```mermaid
sequenceDiagram
    actor User
    participant UI as Gradio UI
    participant Handler as handle_user_openhands
    participant Executor as AgentExecutor
    participant Recept as ReceptionistAgentSDK
    participant Expert as ExpertAgentSDK
    participant FileProc as file_processor
    participant Azure as Azure OpenAI
    participant Excel as create_comparison_report

    User->>UI: Upload 3 files (Schedule, Drawing, Spec)
    UI->>Handler: File paths
    Handler->>Executor: classify_and_compare_with_sdk()

    Executor->>Recept: classify() with file snippets
    Recept->>Azure: LLM call - classify equipment type
    Azure-->>Recept: classification (e.g., "boiler")
    Recept-->>Executor: classification

    Executor->>Expert: compare() with classification & files
    Expert->>FileProc: extract_relevant_sections(keyword)
    FileProc-->>Expert: filtered text (70-90% reduction)
    Expert->>Azure: LLM call - generate comparison table
    Azure-->>Expert: markdown table (6 columns)
    Expert-->>Executor: comparison_table + metadata

    Executor-->>Handler: comparison_table + metadata
    Handler->>Excel: create_comparison_report()
    Excel->>Excel: Parse markdown → Excel rows
    Excel-->>Handler: .xlsx file path
    Handler-->>UI: Excel file + metadata
    UI-->>User: Download link + logs
```

## Workflow 2: Everything File Search

```mermaid
sequenceDiagram
    actor User
    participant UI as Gradio Chatbot
    participant Handler as handle_search_interaction
    participant Executor as AgentExecutor
    participant SearchAgent as SearchAgentSDK
    participant MCP as FastMCP Client
    participant Everything as Everything MCP Server
    participant Verifier as VerifierAgent (Parallel)
    participant FileProc as file_processor
    participant Azure as Azure OpenAI

    User->>UI: Enter search query + verification settings
    UI->>Handler: query, history, enable_verification, target_count
    Handler->>Executor: search_with_sdk()

    alt Verification Enabled
        Executor->>SearchAgent: search_with_verification()
    else No Verification
        Executor->>SearchAgent: search()
    end

    SearchAgent->>Azure: Parse query → Everything syntax
    Azure-->>SearchAgent: search_query (e.g., "<boiler> <shop|drawing>")

    SearchAgent->>MCP: call_tool("search", query)
    MCP->>Everything: Execute search
    Everything-->>MCP: file paths (max 100)
    MCP-->>SearchAgent: search results

    alt Verification Enabled
        SearchAgent->>SearchAgent: Create VerifierAgent pool (5 workers)

        loop Wave 1-N (15 files per wave)
            par Parallel Verification
                SearchAgent->>Verifier: verify(file_1)
                SearchAgent->>Verifier: verify(file_2)
                SearchAgent->>Verifier: verify(file_N)
            end

            Verifier->>FileProc: extract snippet
            Verifier->>Azure: Check relevance
            Azure-->>Verifier: confidence + reasoning
            Verifier-->>SearchAgent: match status

            alt Target reached
                SearchAgent->>SearchAgent: Early stop
            end
        end

        SearchAgent-->>Executor: verified matches + metadata
    else No Verification
        SearchAgent-->>Executor: raw results + metadata
    end

    Executor-->>Handler: response + metadata
    Handler-->>UI: Updated chat history
    UI-->>User: Display results
```

## Key Components

### 1. Shop Drawing Comparison Agents

- **ReceptionistAgentSDK** (`openhands_poc/agents/receptionist_sdk.py`)
  - Classifies documents by equipment type
  - Analyzes a 2000-character snippet of each file instead of the full document
  - Returns: classification keyword or error state

- **ExpertAgentSDK** (`openhands_poc/agents/expert_sdk.py`)
  - Generates comparison tables
  - Uses smart extraction (70-90% token reduction)
  - Returns: 6-column markdown table

- **CustomPromptAgent** (`openhands_poc/agents/custom_agent.py`)
  - Base class for all SDK agents
  - Loads system prompts from local `prompts/` folder

### 2. Everything File Search Agents

- **SearchAgentSDK** (`openhands_poc/agents/search_sdk.py`)
  - Translates natural language to Everything syntax
  - Manages MCP connection via FastMCP
  - Supports iterative refinement (max 3 rounds)
  - Optional parallel verification

- **VerifierAgent** (`openhands_poc/agents/verifier_agent.py`)
  - Lightweight document relevance checker
  - Parallel execution (ThreadPoolExecutor, 5 workers)
  - Early stopping when target count reached
  - Returns: confidence scores + reasoning
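
The wave-based verification described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `verify_in_waves` and the boolean `verify` callback are hypothetical names, and the real `VerifierAgent` returns confidence scores and reasoning rather than a plain flag. The defaults (5 workers, 15 files per wave) come from this document.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def verify_in_waves(
    paths: List[str],
    verify: Callable[[str], bool],  # hypothetical stand-in for a VerifierAgent call
    target_count: int,
    wave_size: int = 15,
    max_workers: int = 5,
) -> List[str]:
    """Verify files wave by wave and stop once enough matches are found."""
    matches: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for start in range(0, len(paths), wave_size):
            wave = paths[start:start + wave_size]
            # pool.map preserves input order, so results line up with `wave`.
            for path, is_match in zip(wave, pool.map(verify, wave)):
                if is_match:
                    matches.append(path)
            if len(matches) >= target_count:
                break  # early stop: later waves are never submitted
    return matches[:target_count]
```

Stopping between waves rather than mid-wave keeps the logic simple; the cost is that up to one full wave may be verified after the target has already been met.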

### 3. File Processing

- **file_processor.py** (`openhands_poc/utils/file_processor.py`)
  - Smart keyword-based extraction
  - Multi-format support (PDF, DOCX, PNG)
  - Context window management
  - Extraction statistics
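
As an illustration of keyword-based extraction, the sketch below keeps only the lines that mention the keyword plus a little surrounding context, then reports how much text was dropped. The signature, the `context_lines` parameter, and the `reduction_ratio` helper are assumptions made for this example; the real `file_processor.py` may select sections differently.

```python
def extract_relevant_sections(text: str, keyword: str, context_lines: int = 2) -> str:
    """Keep lines containing `keyword` plus `context_lines` lines on each side."""
    lines = text.splitlines()
    needle = keyword.lower()
    keep = set()
    for i, line in enumerate(lines):
        if needle in line.lower():
            lo = max(0, i - context_lines)
            hi = min(len(lines), i + context_lines + 1)
            keep.update(range(lo, hi))
    # Emit kept lines in their original order.
    return "\n".join(lines[i] for i in sorted(keep))


def reduction_ratio(original: str, extracted: str) -> float:
    """Fraction of characters removed by extraction (0.0 for empty input)."""
    if not original:
        return 0.0
    return 1.0 - len(extracted) / len(original)
```

A ratio of 0.8 would correspond to the 80% point of the 70-90% reduction range quoted in this document.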

### 4. Output Generation

- **create_comparison_report()** (`gradio_demo.py`)
  - Parses the markdown table, converting literal `\n` sequences to line breaks
  - Handles Rich console formatting artifacts
  - Creates Excel with proper row/column layout
  - Adds blank row after header for readability
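
The parsing step might look something like the sketch below. It is only an approximation of the behaviour listed above: `parse_markdown_table` is a hypothetical helper, it does not handle Rich console artifacts, and it returns plain Python lists rather than writing the workbook. Each resulting row could then be written to a worksheet with openpyxl.

```python
from typing import List


def parse_markdown_table(markdown: str) -> List[List[str]]:
    """Turn a markdown table into rows of cells, with a blank row after the header."""
    # Convert literal backslash-n sequences into real line breaks first.
    text = markdown.replace("\\n", "\n")
    rows: List[List[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the separator row, e.g. |---|:--:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    if rows:
        rows.insert(1, [])  # blank row after the header for readability
    return rows
```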

### 5. MCP Integration

- **FastMCP Client** with StdioTransport
  - Launches Everything MCP server as subprocess
  - Tool argument wrapping: `{"base": {...}}`
  - Custom environment variables
  - Async context manager pattern
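
A rough sketch of the argument wrapping and the async client pattern is shown below. It assumes FastMCP 2.x (`fastmcp.Client` and `fastmcp.client.transports.StdioTransport`); the server command, its arguments, and the `query` and `max_results` parameter names are hypothetical placeholders, since this document does not specify them.

```python
from typing import Any, Dict, List


def wrap_tool_args(**params: Any) -> Dict[str, Any]:
    """Nest tool parameters under the "base" key."""
    return {"base": params}


async def search_everything(
    query: str,
    server_cmd: str,          # hypothetical: executable that starts the MCP server
    server_args: List[str],   # hypothetical: its command-line arguments
    env: Dict[str, str],      # custom environment, e.g. EVERYTHING_SDK_PATH
) -> Any:
    # Imported lazily so wrap_tool_args works without fastmcp installed.
    from fastmcp import Client
    from fastmcp.client.transports import StdioTransport

    transport = StdioTransport(command=server_cmd, args=server_args, env=env)
    # The context manager starts the subprocess and closes it on exit.
    async with Client(transport) as client:
        # Parameter names are assumptions; only the "base" wrapper is from this document.
        return await client.call_tool("search", wrap_tool_args(query=query, max_results=100))
```

From synchronous code, the coroutine would be driven with `asyncio.run(search_everything(...))`.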

## Configuration

### Environment Variables (.env)
```
AZURE_OPENAI_API_KEY=<azure-openai-key>
AZURE_OPENAI_ENDPOINT=<endpoint-url>
AZURE_DEPLOYMENT_NAME=<deployment-name>
EVERYTHING_SDK_PATH=<path-to-everything-sdk-dll>
```
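
One way to read and validate these variables at startup is sketched below. The `Settings` dataclass and `load_settings` function are illustrative names, not part of the codebase; the sketch reads from `os.environ` and leaves loading the `.env` file itself to whatever mechanism the application already uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_DEPLOYMENT_NAME",
    "EVERYTHING_SDK_PATH",
)


@dataclass(frozen=True)
class Settings:
    api_key: str
    endpoint: str
    deployment_name: str
    everything_sdk_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from the environment, failing fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        api_key=env["AZURE_OPENAI_API_KEY"],
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        deployment_name=env["AZURE_DEPLOYMENT_NAME"],
        everything_sdk_path=env["EVERYTHING_SDK_PATH"],
    )
```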

### Agent Configuration
- **LLM**: Azure OpenAI with 8192 max tokens
- **Temperature**: 0.7
- **Reasoning Effort**: Low (minimize reasoning tokens)
- **API Version**: 2025-03-01-preview

## Data Flow

### Shop Drawing Comparison
1. User uploads 3 files
2. Receptionist analyzes snippets (2000 chars each)
3. Classification returned (e.g., "boiler")
4. Expert extracts relevant sections by keyword
5. Full comparison table generated
6. Markdown converted to Excel
7. User downloads report + views logs
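
Steps 2 through 5 can be pictured as a small two-stage pipeline. The sketch below is a simplification with hypothetical names (`run_comparison`, and the `classify` and `compare` callables standing in for the receptionist and expert agents); taking the snippet from the start of each document is also an assumption. It shows how an error state from classification skips the more expensive comparison call.

```python
from typing import Callable, Dict, Optional

SNIPPET_CHARS = 2000
ERROR_STATES = {"unsupported", "mismatched"}


def run_comparison(
    documents: Dict[str, str],
    classify: Callable[[Dict[str, str]], str],      # stand-in for the receptionist
    compare: Callable[[str, Dict[str, str]], str],  # stand-in for the expert
) -> Dict[str, Optional[str]]:
    """Classify on short snippets, then compare full documents if classification succeeded."""
    # Assumption: the snippet is taken from the start of each document.
    snippets = {name: text[:SNIPPET_CHARS] for name, text in documents.items()}
    classification = classify(snippets)
    if classification in ERROR_STATES:
        # Skip the comparison when classification reports an error state.
        return {"status": "error", "classification": classification, "table": None}
    table = compare(classification, documents)
    return {"status": "ok", "classification": classification, "table": table}
```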

### Everything File Search
1. User enters search query
2. SearchAgent translates to Everything syntax
3. MCP server returns file paths
4. (Optional) Verifier agents check relevance in parallel
5. Results filtered to target count
6. User sees matched files with confidence scores

## Error Handling

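- **Classification Errors**: Receptionist returns `"unsupported"` or `"mismatched"` in place of an equipment keyword
- **SDK Fallback**: Automatic fallback to the legacy implementation if the SDK is unavailable
- **MCP Errors**: Handled gracefully and surfaced as error messages
- **File Processing**: Timeout protection (30 s for snippet extraction, no limit for full extraction)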
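
Timeout protection for snippet extraction could be implemented along the lines below. This is one possible approach using the standard library, not necessarily what the project does; `run_with_timeout` is a hypothetical helper. Note that Python cannot forcibly stop a running thread, so a timed-out call keeps running in the background and only its result is discarded.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable


def run_with_timeout(func: Callable[..., Any], *args: Any,
                     timeout: float = 30.0, default: Any = None) -> Any:
    """Return func(*args), or `default` if it does not finish within `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return default
    finally:
        # Do not block on the worker; it may still be busy after a timeout.
        pool.shutdown(wait=False)
```

Full-document extraction, which this document describes as unlimited, would simply call the extraction function directly without this wrapper.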

## Logging

- **Structured Logging** (`openhands_poc/logging_config.py`)
  - Agent lifecycle tracking
  - LLM call logging
  - File processing statistics
  - Real-time log viewer in UI

## Performance Optimizations

1. **Smart Extraction**: 70-90% token reduction via keyword filtering
2. **Parallel Verification**: 5 workers with early stopping
3. **Wave Processing**: 15 files per wave for verification
4. **Connection Pooling**: Persistent `requests.Session`
5. **Snippet-Based Classification**: 2000 chars vs full document