
NexaSci Agent Kit - Project Summary

What This Project Does

A complete local-first scientific agent system that:

  • Uses a 10B Falcon model (NexaSci Assistant) for reasoning and tool calling
  • Executes Python code in a sandboxed environment
  • Searches and retrieves scientific papers from arXiv
  • Performs semantic search over a local corpus of embedded papers
  • Orchestrates multi-turn agent loops with tool usage (illustrated below)
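
The contract between the model and the controller is a structured tool call. As a rough illustration, a single call can be pictured as the following Python dict; the field names here are assumptions for illustration, since the real schema lives in agent/controller.py:

```python
# Hypothetical tool-call shape. The field names ("tool", "arguments") are
# illustrative assumptions, not the project's actual schema.
tool_call = {
    "tool": "papers.search",                      # one of the four exposed tools
    "arguments": {"query": "protein folding", "max_results": 5},
}
# The controller executes the call, appends the tool output to the
# conversation, and prompts the model for its next step or final answer.
```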

Architecture

┌─────────────────┐
│  Agent Window   │  ← User interface (CLI or Web UI)
│  (Controller)   │
└────────┬────────┘
         │
         ├──→ Model Server (port 8001) ──→ GPU (10B Model)
         │
         └──→ Tool Server (port 8000)
              ├── python.run (sandboxed execution)
              ├── papers.search (arXiv API)
              ├── papers.fetch (paper metadata)
              └── papers.search_corpus (local semantic search)
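
As a concrete sketch of the wiring above, a client can reach both servers over plain HTTP. The endpoint paths and payloads below are assumptions for illustration; the real routes are defined in agent/model_server.py and tools/server.py.

```python
import requests

MODEL_URL = "http://localhost:8001"  # model server (GPU-backed 10B model)
TOOLS_URL = "http://localhost:8000"  # FastAPI tool server

# Hypothetical /generate endpoint on the model server.
reply = requests.post(
    f"{MODEL_URL}/generate",
    json={"messages": [{"role": "user", "content": "Summarize arXiv:1706.03762"}]},
    timeout=120,
).json()

# Hypothetical /run endpoint on the tool server, invoking the Python sandbox.
result = requests.post(
    f"{TOOLS_URL}/run",
    json={"tool": "python.run", "arguments": {"code": "print(2 + 2)"}},
    timeout=60,
).json()
```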

Key Components

  1. Model Server (agent/model_server.py)

    • HTTP API for model inference
    • Loads the model once, serves multiple requests
    • GPU-accelerated
  2. Tool Server (tools/server.py)

    • FastAPI server exposing all tools
    • Python sandbox with resource limits
    • Paper search and retrieval
  3. Agent Controller (agent/controller.py)

    • Orchestrates the LLM ↔ tool loop (see the sketch after this list)
    • Parses tool calls and final responses
    • Manages conversation state
  4. Model Client (agent/client_llm.py, agent/client_llm_remote.py)

    • Local model loading
    • Remote model server client
    • Message formatting and generation
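
Putting these components together, the controller's orchestration reduces to a small loop. The sketch below is a compressed approximation; generate(), parse_action(), and call_tool() are placeholders standing in for the real helpers in agent/controller.py and the client modules.

```python
def run_agent(user_message: str, max_turns: int = 8) -> str:
    """Drive the LLM <-> tool loop until the model emits a final answer."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = generate(messages)        # placeholder: call the model server
        action = parse_action(reply)      # placeholder: tool call or final answer?
        if action["type"] == "final":
            return action["content"]
        observation = call_tool(action["tool"], action["arguments"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": observation})
    return "Stopped: turn limit reached without a final answer."
```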

File Structure

Agent_kit/
├── agent/                   # Core agent code
│   ├── model_server.py      # HTTP model server
│   ├── controller.py        # Agent orchestration
│   ├── client_llm.py        # Local model client
│   ├── client_llm_remote.py # Remote model client
│   └── config.yaml          # Configuration
├── tools/                   # Tool implementations
│   ├── server.py            # FastAPI tool server
│   ├── python_sandbox.py    # Sandboxed Python executor
│   └── paper_sources/       # Paper search clients
├── examples/                # Example scripts and prompts
├── scripts/                 # Utility scripts
├── pipeline/                # Corpus building pipeline
├── Dockerfile               # Docker image definition
├── docker-compose.yml       # Docker orchestration
└── README.md                # Main documentation

Setup Options

  1. Docker (Recommended for reproducibility)

    • docker-compose up starts everything
    • GPU support via NVIDIA Container Toolkit
  2. Manual (For development)

    • Three-terminal setup: model server, tool server, and agent window (see the command sketch below)
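
For reference, the two paths look roughly like this. The manual commands are assumed invocations (check README.md for the exact ones); docker-compose up is as documented above.

```bash
# Option 1: Docker -- one command starts model server, tool server, and UI.
docker-compose up

# Option 2: Manual, three terminals (assumed invocations):
python agent/model_server.py          # terminal 1: model server on :8001
uvicorn tools.server:app --port 8000  # terminal 2: FastAPI tool server
python -m agent.controller            # terminal 3: agent window
```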

Configuration

All settings in agent/config.yaml:

  • Model paths and settings
  • Generation parameters
  • Tool server URLs
  • Sandbox limits
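
An illustrative sketch of what such a file might contain follows; every key name here is a guess based on the bullets above, not the actual schema.

```yaml
# Illustrative only -- consult agent/config.yaml for the real keys.
model:
  path: /models/nexasci-10b        # model path (hypothetical)
  device: cuda
generation:
  max_new_tokens: 1024
  temperature: 0.2
tools:
  server_url: http://localhost:8000
sandbox:
  timeout_seconds: 30
  memory_limit_mb: 512
```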

Testing

  • examples/test_model_server.py - Test model server connection
  • examples/simple_test.py - Basic generation test
  • examples/demo_agent.py - Full agent demo with tool usage
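
Assuming both servers are running, the scripts can be invoked directly from the repository root:

```bash
python examples/test_model_server.py  # checks the model server is reachable
python examples/simple_test.py        # single-prompt generation
python examples/demo_agent.py         # full agent loop, including tool calls
```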

Deployment

See DEPLOYMENT.md for:

  • Docker deployment
  • Remote GPU box setup
  • Health checks and monitoring

Next Steps

  1. Push to remote repository
  2. Set up CI/CD (optional)
  3. Add more tools (optional)
  4. Build corpus for local search (optional)