---
license: apache-2.0
title: Nexa_labs
sdk: gradio
emoji: 💻
colorFrom: yellow
colorTo: indigo
pinned: true
sdk_version: 6.0.0
short_description: A discovery engine
---

# NexaSci Agent Kit

A local-first scientific agent stack featuring the NexaSci Assistant (a 10B-parameter model), tool calling, sandboxed Python execution, and scientific paper retrieval.

## Quick Start

### Prerequisites

- **GPU**: NVIDIA GPU with CUDA support (RTX 5090 32 GB recommended)
- **Docker**: Docker and Docker Compose with the NVIDIA Container Toolkit
- **Python**: 3.10+ (if running without Docker)

### Option 1: Docker (Recommended)

1. **Clone the repository:**

   ```bash
   git clone
   cd Agent_kit
   ```

2. **Build and start services:**

   ```bash
   docker-compose up --build
   ```

   This starts:
   - the model server on port 8001 (GPU-accelerated)
   - the tool server on port 8000
   - the agent demo (a one-time run)

3. **Run the agent interactively:**

   ```bash
   docker-compose run --rm agent python examples/demo_agent.py \
     --prompt "Your prompt here"
   ```

### Option 2: Manual Setup

1. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Merge the model (one-time step):**

   ```bash
   python scripts/merge_model.py \
     --base-model "Allanatrix/Nexa_Sci_distilled_Falcon-10B" \
     --adapter-path models/adapter_model.safetensors \
     --output-dir models/merged \
     --torch-dtype bfloat16
   ```

3. **Start the model server (Terminal 1):**

   ```bash
   uvicorn agent.model_server:app --host 0.0.0.0 --port 8001
   ```

4. **Start the tool server (Terminal 2):**

   ```bash
   uvicorn tools.server:app --host 0.0.0.0 --port 8000
   ```

5. **Run the agent (Terminal 3):**

   ```bash
   # Enable the remote model in agent/config.yaml:
   # set model_server.enabled: true
   python examples/demo_agent.py --prompt "Your prompt here"
   ```

## Components

- **NexaSci Assistant (LLM)**: a 10B Falcon model post-trained for tool calling
- **Model Server**: HTTP API for model inference (port 8001)
- **Tool Server**: FastAPI server exposing the tools (port 8000)
  - `python.run`: sandboxed Python execution
  - `papers.search`: arXiv paper search
  - `papers.fetch`: fetch paper metadata
  - `papers.search_corpus`: semantic search over the local corpus
- **Agent Controller**: orchestrates the LLM ↔ tool-server loop

## Tools

### Python Sandbox

Executes Python code with resource limits (see the request sketch after this list):

- Timeout: 10 seconds
- Memory: 2 GB
- Allowed modules: numpy, scipy, pandas, matplotlib, sympy, seaborn
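To exercise the sandbox directly, without going through the agent loop, you can POST code to the tool server. This is a minimal sketch only: the `/tools/python.run` route and the `code` field are illustrative assumptions, so check `tools/server.py` for the actual route and request schema.

```bash
# Assumed route and payload shape -- verify against tools/server.py.
curl -X POST http://localhost:8000/tools/python.run \
  -H "Content-Type: application/json" \
  -d '{"code": "import numpy as np\nprint(np.mean([1, 2, 3]))"}'
```

Requests that exceed the 10-second or 2 GB limits, or that import modules outside the allow-list, should be rejected by the sandbox.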
### Paper Search

- **arXiv Search**: search papers by query
- **Paper Fetch**: get detailed metadata
- **Corpus Search**: semantic search over local SPECTER2-embedded papers

## Example Prompts

See `examples/sample_prompts.md` for example prompts that showcase:

- Python code generation
- Literature search and citation
- Experimental design
- Combined reasoning workflows

## Configuration

Edit `agent/config.yaml` to configure:

- Model paths and settings
- Generation parameters
- Tool server URLs
- Sandbox limits

## Project Structure

```
Agent_kit/
├── agent/                 # Agent controller and model client
│   ├── model_server.py    # HTTP server for model inference
│   ├── controller.py      # Agent orchestration loop
│   └── config.yaml        # Configuration
├── tools/                 # Tool implementations
│   ├── server.py          # FastAPI tool server
│   ├── python_sandbox.py
│   └── paper_sources/     # arXiv and corpus clients
├── examples/              # Example scripts and prompts
├── scripts/               # Utility scripts (model merging, etc.)
├── pipeline/              # Corpus-building pipeline
└── docker-compose.yml     # Docker orchestration
```

## Docker Details

The Docker setup includes:

- **Base image**: `nvidia/cuda:12.1.0-runtime-ubuntu22.04`
- **GPU support**: NVIDIA Container Toolkit required
- **Volumes**: model weights, cache, and data directories

### Build the Image

```bash
docker build -t nexasci-agent:latest .
```

### Run the Services

```bash
docker-compose up
```

## Troubleshooting

### GPU Not Available

```bash
# Check CUDA
python scripts/check_cuda.py
nvidia-smi

# Reinstall PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

### Connection Refused

- Ensure all services are running on the same machine
- Check that ports 8000 and 8001 are not already in use
- Verify `model_server.enabled: true` in `agent/config.yaml`

### Model Loading Issues

- Check GPU memory: `nvidia-smi`
- Verify the model path in the config
- Check the Hugging Face cache: `~/.cache/huggingface/`

## Documentation

- **Quick Start**: see `QUICKSTART.md` for detailed setup
- **Specification**: see `Spec.md` for architecture details
- **Sample Prompts**: see `examples/sample_prompts.md`

## Acknowledgments

- Model: `Allanatrix/Nexa_Sci_distilled_Falcon-10B`
- SPECTER2: `allenai/specter2_base`
- Built with FastAPI, Transformers, and PyTorch
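As a final sanity check after setup, you can confirm that both services respond before running the agent. The tool server is FastAPI, which serves its interactive docs at `/docs` by default; whether the model server exposes the same route depends on `agent/model_server.py`, so treat both URLs as assumptions.

```bash
# Assumes the default FastAPI /docs routes; adjust if the servers
# define different paths.
curl -sf http://localhost:8000/docs > /dev/null && echo "tool server: up"
curl -sf http://localhost:8001/docs > /dev/null && echo "model server: up"
```

If either check fails, see the Troubleshooting section above.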