---
license: apache-2.0
title: Nexa_labs
sdk: gradio
emoji: 💻
colorFrom: yellow
colorTo: indigo
pinned: true
sdk_version: 6.0.0
short_description: A discovery engine
---

# NexaSci Agent Kit

A local-first scientific agent stack featuring the NexaSci Assistant (a 10B-parameter model), tool calling, sandboxed Python execution, and scientific paper retrieval.

## Quick Start

### Prerequisites

- **GPU**: NVIDIA GPU with CUDA support (RTX 5090 32 GB recommended)
- **Docker**: Docker and Docker Compose with the NVIDIA Container Toolkit
- **Python**: 3.10+ (if running without Docker)

### Option 1: Docker (Recommended)

1. **Clone the repository:**

   ```bash
   git clone
   cd Agent_kit
   ```

2. **Build and start services:**

   ```bash
   docker-compose up --build
   ```

   This starts:
   - the model server on port 8001 (GPU-accelerated)
   - the tool server on port 8000
   - the agent demo (a one-time run)

3. **Run the agent interactively:**

   ```bash
   docker-compose run --rm agent python examples/demo_agent.py \
     --prompt "Your prompt here"
   ```

### Option 2: Manual Setup

1. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Merge the model (one-time step):**

   ```bash
   python scripts/merge_model.py \
     --base-model "Allanatrix/Nexa_Sci_distilled_Falcon-10B" \
     --adapter-path models/adapter_model.safetensors \
     --output-dir models/merged \
     --torch-dtype bfloat16
   ```

3. **Start the model server (Terminal 1):**

   ```bash
   uvicorn agent.model_server:app --host 0.0.0.0 --port 8001
   ```

4. **Start the tool server (Terminal 2):**

   ```bash
   uvicorn tools.server:app --host 0.0.0.0 --port 8000
   ```

5. **Run the agent (Terminal 3):**

   ```bash
   # Enable the remote model in agent/config.yaml:
   # set model_server.enabled: true
   python examples/demo_agent.py --prompt "Your prompt here"
   ```

## Components

- **NexaSci Assistant (LLM)**: a 10B Falcon model post-trained for tool calling
- **Model Server**: HTTP API for model inference (port 8001)
- **Tool Server**: FastAPI server exposing the tools (port 8000)
  - `python.run`: sandboxed Python execution
  - `papers.search`: arXiv paper search
  - `papers.fetch`: fetch paper metadata
  - `papers.search_corpus`: semantic search over the local corpus
- **Agent Controller**: orchestrates the LLM ↔ tool-server loop

## Tools

### Python Sandbox

Executes Python code with resource limits (see the request sketch after this list):

- Timeout: 10 seconds
- Memory: 2 GB
- Allowed modules: numpy, scipy, pandas, matplotlib, sympy, seaborn
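To exercise the sandbox directly, without going through the agent loop, you can POST code to the tool server. This is a minimal sketch only: the `/tools/python.run` route and the `code` field are illustrative assumptions, so check `tools/server.py` for the actual route and request schema.

```bash
# Assumed route and payload shape -- verify against tools/server.py.
curl -X POST http://localhost:8000/tools/python.run \
  -H "Content-Type: application/json" \
  -d '{"code": "import numpy as np\nprint(np.mean([1, 2, 3]))"}'
```

Requests that exceed the 10-second or 2 GB limits, or that import modules outside the allow-list, should be rejected by the sandbox.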
### Paper Search

- **arXiv Search**: search papers by query
- **Paper Fetch**: get detailed metadata
- **Corpus Search**: semantic search over local SPECTER2-embedded papers

## Example Prompts

See `examples/sample_prompts.md` for example prompts that showcase:

- Python code generation
- Literature search and citation
- Experimental design
- Combined reasoning workflows

## Configuration

Edit `agent/config.yaml` to configure:

- Model paths and settings
- Generation parameters
- Tool server URLs
- Sandbox limits

## Project Structure

```
Agent_kit/
├── agent/                 # Agent controller and model client
│   ├── model_server.py    # HTTP server for model inference
│   ├── controller.py      # Agent orchestration loop
│   └── config.yaml        # Configuration
├── tools/                 # Tool implementations
│   ├── server.py          # FastAPI tool server
│   ├── python_sandbox.py
│   └── paper_sources/     # arXiv and corpus clients
├── examples/              # Example scripts and prompts
├── scripts/               # Utility scripts (model merging, etc.)
├── pipeline/              # Corpus-building pipeline
└── docker-compose.yml     # Docker orchestration
```

## Docker Details

The Docker setup includes:

- **Base image**: `nvidia/cuda:12.1.0-runtime-ubuntu22.04`
- **GPU support**: NVIDIA Container Toolkit required
- **Volumes**: model weights, cache, and data directories

### Build the Image

```bash
docker build -t nexasci-agent:latest .
```

### Run the Services

```bash
docker-compose up
```

## Troubleshooting

### GPU Not Available

```bash
# Check CUDA
python scripts/check_cuda.py
nvidia-smi

# Reinstall PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

### Connection Refused

- Ensure all services are running on the same machine
- Check that ports 8000 and 8001 are not already in use
- Verify `model_server.enabled: true` in `agent/config.yaml`

### Model Loading Issues

- Check GPU memory: `nvidia-smi`
- Verify the model path in the config
- Check the Hugging Face cache: `~/.cache/huggingface/`

## Documentation

- **Quick Start**: see `QUICKSTART.md` for detailed setup
- **Specification**: see `Spec.md` for architecture details
- **Sample Prompts**: see `examples/sample_prompts.md`

## Acknowledgments

- Model: `Allanatrix/Nexa_Sci_distilled_Falcon-10B`
- SPECTER2: `allenai/specter2_base`
- Built with FastAPI, Transformers, and PyTorch
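As a final sanity check after setup, you can confirm that both services respond before running the agent. The tool server is FastAPI, which serves its interactive docs at `/docs` by default; whether the model server exposes the same route depends on `agent/model_server.py`, so treat both URLs as assumptions.

```bash
# Assumes the default FastAPI /docs routes; adjust if the servers
# define different paths.
curl -sf http://localhost:8000/docs > /dev/null && echo "tool server: up"
curl -sf http://localhost:8001/docs > /dev/null && echo "model server: up"
```

If either check fails, see the Troubleshooting section above.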