# Quick Start Guide - NexaSci Agent Kit

This guide shows you how to run the complete agent system across three tmux panes.

## Architecture Overview

The system runs in three coordinated tmux panes **on the GPU box**:

1. **Agent Tmux (GPU Box)** - Loads and serves the merged NexaSci model via HTTP API (port 8001)
2. **Tool Server Tmux** - FastAPI server exposing tools (Python sandbox, paper search, etc.) on port 8000
3. **Agent Window Tmux** - Runs the agent controller, which connects to the model server and the tool server

**Important:** All three panes should be on the **same GPU box** (SSH into the box first).

## Setup Instructions

### Prerequisites

- GPU server with CUDA (RTX 5090 with 32GB VRAM recommended)
- Python 3.10+ with dependencies installed
- Three tmux panes/sessions **on the GPU box**

### Step 0: SSH into the GPU Box

```bash
ssh root@38.80.152.77 -p 30785
```

**All commands below should be run on the GPU box after SSH'ing in.**

### Step 1: Merge the Model (One-time setup)

In your **agent tmux pane**:

```bash
cd /root/Agent_kit
source .venv/bin/activate

# Merge the post-trained LoRA adapter with the distilled model
python scripts/merge_model.py \
  --base-model "Allanatrix/Nexa_Sci_distilled_Falcon-10B" \
  --adapter-path models/adapter_model.safetensors \
  --output-dir models/merged \
  --torch-dtype bfloat16
```

This will:

- Download the distilled model (~20GB, first time only)
- Detect the LoRA rank (32) from your adapter
- Merge the adapter into the model
- Save the result to `models/merged/`

**Update the config to use the merged model:**

```bash
# Edit agent/config.yaml and set:
# merged_path: "./models/merged"
```

### Step 2: Start the Model Server (Agent Tmux - GPU Box)

In your **agent tmux pane** (where the GPU is):

```bash
cd /root/Agent_kit
source .venv/bin/activate

# Start the model server (loads the model and serves it via HTTP)
uvicorn agent.model_server:app --host 0.0.0.0 --port 8001
```

You should see:

```
Loading NexaSci model (this may take 30-60 seconds)...
✓ CUDA available: NVIDIA GeForce RTX 5090
✓ Model loaded on GPU: cuda:0
Model server ready! Listening on http://0.0.0.0:8001
```

**Keep this running** - this is where the model lives.

### Step 3: Start the Tool Server

In your **tool server tmux pane**:

```bash
cd /root/Agent_kit
source .venv/bin/activate

# Start the FastAPI tool server
uvicorn tools.server:app --host 0.0.0.0 --port 8000
```

You should see:

```
INFO: Started server process
INFO: Uvicorn running on http://0.0.0.0:8000
```

**Keep this running** - the agent will connect to it.

### Step 4: Enable the Remote Model in Config

The config should already contain:

```yaml
model_server:
  base_url: "http://127.0.0.1:8001"
  enabled: true
```

**Important:** If you are running everything on the same box (which you should be), `127.0.0.1` is correct.

### Step 5: Test the Model Server Connection

In your **agent window tmux pane**:

```bash
cd /root/Agent_kit
source .venv/bin/activate

# Test the connection
python examples/test_model_server.py
```

You should see:

```
✓ Health check passed
✓ Generation successful
✓ Remote client works
✓ All tests passed!
```

### Step 6: Run the Agent (Agent Window Tmux)

In your **agent window tmux pane**:

```bash
cd /root/Agent_kit
source .venv/bin/activate

# Run the interactive demo (connects to the model server)
python examples/demo_agent.py \
  --prompt "Design a Python simulation to model nanoparticle diffusion. Include visualization and cite relevant literature."
```

The agent will:

- Connect to the model server (port 8001) - no local model loading
- Connect to the tool server (port 8000)
- Show all reasoning, tool calls, and results

## Troubleshooting

### Connection Refused

If you get "Connection refused" when testing:

1. **Make sure you're on the GPU box:**

   ```bash
   # SSH into the box first
   ssh root@38.80.152.77 -p 30785
   ```

2. **Check whether the servers are running:**

   ```bash
   # Check the model server
   curl http://127.0.0.1:8001/health

   # Check the tool server
   curl http://127.0.0.1:8000/docs
   ```

3. **Verify the servers are listening:**

   ```bash
   netstat -tlnp | grep -E '8000|8001'
   ```

### GPU Not Available

If `gpu_available: false`:

1. Check CUDA:

   ```bash
   python scripts/check_cuda.py
   nvidia-smi
   ```

2. Reinstall PyTorch with CUDA:

   ```bash
   pip uninstall -y torch torchvision torchaudio
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
   ```

### Model Server Not Loading

- Check GPU memory: `nvidia-smi`
- Check the logs in the agent tmux pane for errors
- Verify the model path in `agent/config.yaml`

## Architecture Diagram

```
┌─────────────────┐
│  Agent Window   │
│  (Controller)   │
└────────┬────────┘
         │
         ├──→ Remote Client ──→ Model Server (port 8001) ──→ GPU Box
         │
         └──→ Tool Client ──→ Tool Server (port 8000)
                                  ├── python.run
                                  ├── papers.search
                                  ├── papers.fetch
                                  └── papers.search_corpus
```
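The connectivity checks in the Troubleshooting section (curl against ports 8001 and 8000, then netstat) can be scripted so you can rerun them in one shot. A minimal sketch in Python, using only the standard library - the function names here are illustrative, not part of the kit:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds,
    i.e. something is listening there (same signal as netstat/curl)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_servers(host: str = "127.0.0.1") -> dict:
    """Probe the two ports used by the kit: the model server (8001)
    and the tool server (8000)."""
    return {
        "model_server": port_open(host, 8001),
        "tool_server": port_open(host, 8000),
    }

if __name__ == "__main__":
    for name, up in check_servers().items():
        print(f"{name}: {'listening' if up else 'NOT listening'}")
```

Note that an open port only proves the process is up; for the model server, follow up with `curl http://127.0.0.1:8001/health` to confirm the model actually loaded.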