# NexaSci Agent Kit - Project Summary

## What This Project Does

A complete local-first scientific agent system that:

- Uses a 10B Falcon model (NexaSci Assistant) for reasoning and tool calling
- Executes Python code in a sandboxed environment
- Searches and retrieves scientific papers from arXiv
- Performs semantic search over a local corpus of embedded papers
- Orchestrates multi-turn agent loops with tool usage

## Architecture

```
┌─────────────────┐
│  Agent Window   │ ← User interface (CLI or Web UI)
│  (Controller)   │
└────────┬────────┘
         │
         ├──→ Model Server (port 8001) ──→ GPU (10B Model)
         │
         └──→ Tool Server (port 8000)
              ├── python.run           (sandboxed execution)
              ├── papers.search        (arXiv API)
              ├── papers.fetch         (paper metadata)
              └── papers.search_corpus (local semantic search)
```

## Key Components

1. **Model Server** (`agent/model_server.py`)
   - HTTP API for model inference
   - Loads the model once, serves multiple requests
   - GPU-accelerated
2. **Tool Server** (`tools/server.py`)
   - FastAPI server exposing all tools
   - Python sandbox with resource limits
   - Paper search and retrieval
3. **Agent Controller** (`agent/controller.py`)
   - Orchestrates the LLM ↔ tool loop
   - Parses tool calls and final responses
   - Manages conversation state
4. **Model Client** (`agent/client_llm.py`, `agent/client_llm_remote.py`)
   - Local model loading
   - Remote model server client
   - Message formatting and generation

## File Structure

```
Agent_kit/
├── agent/                    # Core agent code
│   ├── model_server.py       # HTTP model server
│   ├── controller.py         # Agent orchestration
│   ├── client_llm.py         # Local model client
│   ├── client_llm_remote.py  # Remote model client
│   └── config.yaml           # Configuration
├── tools/                    # Tool implementations
│   ├── server.py             # FastAPI tool server
│   ├── python_sandbox.py     # Sandboxed Python executor
│   └── paper_sources/        # Paper search clients
├── examples/                 # Example scripts and prompts
├── scripts/                  # Utility scripts
├── pipeline/                 # Corpus-building pipeline
├── Dockerfile                # Docker image definition
├── docker-compose.yml        # Docker orchestration
└── README.md                 # Main documentation
```

## Setup Options

1. **Docker** (recommended for reproducibility)
   - `docker-compose up` starts everything
   - GPU support via the NVIDIA Container Toolkit
2. **Manual** (for development)
   - Three-terminal setup: model server, tool server, agent window

## Configuration

All settings live in `agent/config.yaml`:

- Model paths and settings
- Generation parameters
- Tool server URLs
- Sandbox limits

## Testing

- `examples/test_model_server.py` - Test the model server connection
- `examples/simple_test.py` - Basic generation test
- `examples/demo_agent.py` - Full agent demo with tool usage

## Deployment

See `DEPLOYMENT.md` for:

- Docker deployment
- Remote GPU box setup
- Health checks and monitoring

## Next Steps

1. Push to a remote repository
2. Set up CI/CD (optional)
3. Add more tools (optional)
4. Build a corpus for local search (optional)
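The Configuration section lists four groups of settings in `agent/config.yaml`. A hypothetical shape for that file is sketched below; every key name and value here is an illustrative assumption, not the kit's actual schema.

```yaml
# Hypothetical config.yaml layout - consult agent/config.yaml for real keys
model:
  path: /models/nexasci-10b        # model weights location
  device: cuda

generation:
  max_new_tokens: 1024
  temperature: 0.2

servers:
  model_url: http://localhost:8001  # Model Server (port 8001)
  tool_url: http://localhost:8000   # Tool Server (port 8000)

sandbox:
  timeout_seconds: 5
  memory_limit_mb: 512
```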
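The LLM ↔ tool loop handled by the Agent Controller can be sketched as follows. This is a minimal sketch, not the actual code in `agent/controller.py`: the function names, the JSON tool-call protocol, and the message roles are all assumptions made for illustration.

```python
import json


def run_agent(user_message, call_model, call_tool, max_turns=5):
    """Minimal agent loop: alternate between the model and the tool server.

    call_model(messages) -> str : model completion for the conversation so far.
    call_tool(name, args) -> str: result of executing the named tool.

    Assumed protocol (hypothetical): the model emits a JSON object like
    {"tool": "python.run", "args": {"code": "..."}} to call a tool, and
    plain text to give its final answer.
    """
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)
        except ValueError:
            return reply  # not JSON -> treat as the final answer
        result = call_tool(call["tool"], call["args"])
        messages.append({"role": "tool", "content": result})
    return "Max turns reached without a final answer."
```

The loop terminates either when the model stops requesting tools or when the turn budget runs out, which is what keeps a misbehaving model from spinning forever.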
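The Tool Server's Python sandbox is described as enforcing resource limits. One common way to do that on POSIX systems is to run the code in a subprocess and apply `resource.setrlimit` in a pre-exec hook; the sketch below shows that pattern, and is not the actual implementation in `tools/python_sandbox.py` (the function name and limit values are assumptions).

```python
import resource
import subprocess
import sys


def run_sandboxed(code, timeout=5, mem_bytes=512 * 1024 * 1024):
    """Run untrusted code in a child process with CPU and memory caps (POSIX only)."""

    def limit_resources():
        # Runs in the child just before exec: cap CPU seconds and address space.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, no site/user paths
        capture_output=True,
        text=True,
        timeout=timeout + 1,  # wall-clock backstop on top of the CPU limit
        preexec_fn=limit_resources,
    )
    return {"stdout": proc.stdout, "stderr": proc.stderr, "exit_code": proc.returncode}
```

Note that rlimits alone do not provide filesystem or network isolation; a production sandbox would layer this under a container or seccomp profile, which is one reason the Docker setup is recommended.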
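The `papers.search` tool is backed by the arXiv API, whose public query endpoint returns an Atom feed. A sketch of such a client is below; the function names are hypothetical and the kit's actual client under `tools/paper_sources/` may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace


def build_query_url(query, max_results=5):
    """Build an arXiv API query URL for a free-text search."""
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"


def search_arxiv(query, max_results=5):
    """Fetch matching papers from the arXiv Atom feed and extract basic metadata."""
    with urlopen(build_query_url(query, max_results)) as resp:
        feed = ET.fromstring(resp.read())
    return [
        {
            "title": entry.findtext(f"{ATOM}title", "").strip(),
            "id": entry.findtext(f"{ATOM}id", "").strip(),
            "summary": entry.findtext(f"{ATOM}summary", "").strip(),
        }
        for entry in feed.findall(f"{ATOM}entry")
    ]
```

For example, `search_arxiv("quantum error correction", max_results=3)` would return a list of three dicts with `title`, `id`, and `summary` keys.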
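The `papers.search_corpus` tool performs semantic search over locally embedded papers. Assuming the corpus pipeline produces one embedding vector per paper, the retrieval step reduces to cosine similarity against the query embedding, as in this sketch (names and shapes are assumptions, not the kit's API):

```python
import numpy as np


def search_corpus(query_vec, doc_vecs, doc_ids, top_k=3):
    """Rank documents by cosine similarity to a query embedding.

    query_vec: (d,) query embedding.
    doc_vecs : (n, d) matrix of document embeddings.
    doc_ids  : list of n identifiers, aligned with doc_vecs rows.
    Returns the top_k (doc_id, score) pairs, best first.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity, since both sides are unit-normalized
    order = np.argsort(scores)[::-1][:top_k]
    return [(doc_ids[i], float(scores[i])) for i in order]
```

A brute-force matrix product like this is fine for a few thousand papers; larger corpora would typically swap in an approximate nearest-neighbor index.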