# NexaSci Agent Kit - Project Summary

## What This Project Does

A complete local-first scientific agent system that:
- Uses a 10B Falcon model (NexaSci Assistant) for reasoning and tool calling
- Executes Python code in a sandboxed environment
- Searches and retrieves scientific papers from arXiv
- Performs semantic search over a local corpus of embedded papers
- Orchestrates multi-turn agent loops with tool usage

## Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Agent Window   β”‚  ← User interface (CLI or Web UI)
β”‚  (Controller)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β”œβ”€β”€β†’ Model Server (port 8001) ──→ GPU (10B Model)
         β”‚
         └──→ Tool Server (port 8000)
              β”œβ”€β”€ python.run (sandboxed execution)
              β”œβ”€β”€ papers.search (arXiv API)
              β”œβ”€β”€ papers.fetch (paper metadata)
              └── papers.search_corpus (local semantic search)
```
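
As a concrete picture of those two hops, here is a minimal Python sketch of a client talking to both servers. The endpoint paths (`/generate`, `/call`) and payload shapes are illustrative assumptions, not the project's actual API:

```python
# Sketch of the two HTTP hops in the diagram. Routes and payload
# shapes are assumptions for illustration, not the real API.
import requests

MODEL_URL = "http://localhost:8001"
TOOL_URL = "http://localhost:8000"

def generate(messages: list[dict]) -> str:
    """Ask the model server for the next assistant turn."""
    resp = requests.post(f"{MODEL_URL}/generate", json={"messages": messages})
    resp.raise_for_status()
    return resp.json()["text"]

def call_tool(name: str, arguments: dict) -> dict:
    """Invoke a tool (e.g. python.run) on the tool server."""
    resp = requests.post(f"{TOOL_URL}/call", json={"name": name, "arguments": arguments})
    resp.raise_for_status()
    return resp.json()
```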

## Key Components

1. **Model Server** (`agent/model_server.py`)
   - HTTP API for model inference
   - Loads the model once, serves multiple requests
   - GPU-accelerated

2. **Tool Server** (`tools/server.py`)
   - FastAPI server exposing all tools
   - Python sandbox with resource limits
   - Paper search and retrieval

3. **Agent Controller** (`agent/controller.py`)
   - Orchestrates the LLM ↔ tool loop (sketched after this list)
   - Parses tool calls and final responses
   - Manages conversation state

4. **Model Client** (`agent/client_llm.py`, `agent/client_llm_remote.py`)
   - Local model loading
   - Remote model server client
   - Message formatting and generation
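
The controller's loop can be pictured as follows. This is a hedged sketch, not the code in `agent/controller.py`: it reuses the hypothetical `generate()` and `call_tool()` helpers from the architecture sketch above, and the JSON tool-call convention is an assumption the real parser may not share.

```python
import json

def run_agent(user_prompt: str, max_turns: int = 8) -> str:
    """One possible LLM <-> tool loop; see agent/controller.py for the real one."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            # Assumed convention: a tool call is a JSON object like
            # {"tool": "papers.search", "arguments": {"query": "..."}}
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # not a tool call, so treat it as the final answer
        result = call_tool(call["tool"], call["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped after max_turns without a final answer."
```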

## File Structure

```
Agent_kit/
β”œβ”€β”€ agent/                   # Core agent code
β”‚   β”œβ”€β”€ model_server.py      # HTTP model server
β”‚   β”œβ”€β”€ controller.py        # Agent orchestration
β”‚   β”œβ”€β”€ client_llm.py        # Local model client
β”‚   β”œβ”€β”€ client_llm_remote.py # Remote model client
β”‚   └── config.yaml          # Configuration
β”œβ”€β”€ tools/                   # Tool implementations
β”‚   β”œβ”€β”€ server.py            # FastAPI tool server
β”‚   β”œβ”€β”€ python_sandbox.py    # Sandboxed Python executor
β”‚   └── paper_sources/       # Paper search clients
β”œβ”€β”€ examples/                # Example scripts and prompts
β”œβ”€β”€ scripts/                 # Utility scripts
β”œβ”€β”€ pipeline/                # Corpus building pipeline
β”œβ”€β”€ Dockerfile               # Docker image definition
β”œβ”€β”€ docker-compose.yml       # Docker orchestration
└── README.md                # Main documentation
```
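
For a sense of what `tools/python_sandbox.py` has to do, here is one common way to cap a child Python process with only the standard library (POSIX only). This is a sketch of the general technique; the project's actual limits and mechanism may differ:

```python
import resource
import subprocess
import sys

def run_sandboxed(code: str, cpu_seconds: int = 5, mem_bytes: int = 256 * 2**20) -> str:
    """Run untrusted code in a resource-limited child interpreter."""
    def apply_limits() -> None:
        # Runs in the child between fork and exec.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=cpu_seconds + 2,  # wall-clock backstop on top of the CPU limit
    )
    return proc.stdout or proc.stderr
```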

## Setup Options

1. **Docker** (Recommended for reproducibility)
   - `docker-compose up` starts everything
   - GPU support via NVIDIA Container Toolkit

2. **Manual** (For development)
   - Three-terminal setup
   - Model server, tool server, agent window

## Configuration

All settings live in `agent/config.yaml`:
- Model paths and settings
- Generation parameters
- Tool server URLs
- Sandbox limits
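
Reading the file from Python is straightforward with PyYAML; the key names below are assumptions for illustration, so check `agent/config.yaml` for the real schema:

```python
import yaml  # PyYAML

with open("agent/config.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical keys, shown only to illustrate the shape of the file:
model_path = cfg["model"]["path"]
max_new_tokens = cfg["generation"]["max_new_tokens"]
tool_server_url = cfg["tools"]["server_url"]
```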

## Testing

- `examples/test_model_server.py` - Test model server connection
- `examples/simple_test.py` - Basic generation test
- `examples/demo_agent.py` - Full agent demo with tool usage
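
A connectivity check in the spirit of `examples/test_model_server.py` can be as small as the sketch below; the `/health` route is an assumption, so substitute whatever the servers actually expose:

```python
import requests

def check_server(url: str) -> None:
    """Fail loudly if a server is down or unhealthy."""
    resp = requests.get(f"{url}/health", timeout=5)  # assumed health route
    resp.raise_for_status()
    print(f"{url}: OK")

if __name__ == "__main__":
    check_server("http://localhost:8001")  # model server
    check_server("http://localhost:8000")  # tool server
```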

## Deployment

See `DEPLOYMENT.md` for:
- Docker deployment
- Remote GPU box setup
- Health checks and monitoring

## Next Steps

1. Push to remote repository
2. Set up CI/CD (optional)
3. Add more tools (optional)
4. Build corpus for local search (optional)