Tools Architecture

DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.

SearchTool Protocol

All tools implement the SearchTool protocol from src/tools/base.py:

from typing import Protocol

class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(
        self,
        query: str,
        max_results: int = 10
    ) -> list[Evidence]: ...
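
Because SearchTool is a typing.Protocol, a tool does not need to inherit from it; any class with a matching name property and search() coroutine qualifies. The sketch below illustrates this with a toy tool. EchoTool, run, and the one-field Evidence dataclass are hypothetical stand-ins, not part of the project.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Evidence:
    """Hypothetical stand-in for the project's Evidence model."""
    content: str


class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...


class EchoTool:
    """Toy tool: satisfies SearchTool structurally, without inheriting from it."""

    @property
    def name(self) -> str:
        return "echo"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # A real tool would call an external API here.
        return [Evidence(content=f"{query} #{i}") for i in range(max_results)]


async def run(tool: SearchTool, query: str) -> list[Evidence]:
    # A type checker accepts any object with a matching name and search().
    return await tool.search(query, max_results=2)
```

Calling asyncio.run(run(EchoTool(), "aspirin")) returns two Evidence items produced by the toy tool.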

Rate Limiting

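All tools wrap search() in the @retry decorator from tenacity, which retries a failed call up to three times with exponential backoff between attempts: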

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(...)
)
async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    # Implementation
    ...

Tools that call rate-limited APIs implement a _rate_limit() method and use the shared rate limiters from src/tools/rate_limiter.py.
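
The contents of src/tools/rate_limiter.py are not shown here, so the following is only a minimal sketch of one way a shared limiter could enforce a minimum gap between requests. MinIntervalLimiter and its wait() method are hypothetical names.

```python
import asyncio
import time


class MinIntervalLimiter:
    """Hypothetical limiter: enforces a minimum gap between consecutive calls."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call = float("-inf")

    async def wait(self) -> None:
        # The lock serializes callers, so concurrent tasks queue up
        # instead of all firing at the same instant.
        async with self._lock:
            elapsed = time.monotonic() - self._last_call
            if elapsed < self.min_interval:
                await asyncio.sleep(self.min_interval - elapsed)
            self._last_call = time.monotonic()
```

A tool's _rate_limit() method could then simply await limiter.wait() before each request. Create the limiter inside the running event loop if older Python versions need to be supported.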

Error Handling

Tools raise custom exceptions:

  • SearchError: General search failures
  • RateLimitError: Rate limit exceeded

Tools handle HTTP 429 and 500 responses as well as request timeouts. On non-critical errors they log a warning and return an empty list instead of raising.
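
As a rough illustration of this policy, the sketch below maps a status code to one of the three outcomes. The class hierarchy (RateLimitError deriving from SearchError), the helper results_for_status, and the choice of which statuses count as non-critical are all assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)


class SearchError(Exception):
    """General search failure."""


class RateLimitError(SearchError):
    """Rate limit exceeded (assumed here to subclass SearchError)."""


def results_for_status(status: int, items: list[str]) -> list[str]:
    """Hypothetical helper: decide what a tool does for a given HTTP status."""
    if status == 429:
        raise RateLimitError("rate limit exceeded")
    if status >= 500:
        # Assumption: server-side errors are treated as non-critical.
        logger.warning("upstream returned HTTP %s; returning no results", status)
        return []
    if status >= 400:
        raise SearchError(f"search failed with HTTP {status}")
    return items
```

Raising RateLimitError rather than swallowing it gives a retry decorator the chance to back off and try again.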

Query Preprocessing

Tools use preprocess_query() from src/tools/query_utils.py to:

  • Remove noise from queries
  • Expand synonyms
  • Normalize query format
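
The three steps can be sketched as follows. This is not the code in src/tools/query_utils.py: the noise-word set, the synonym table, and the OR-group output format are invented for illustration.

```python
import re

# Hypothetical noise words and synonym table, for illustration only.
NOISE_WORDS = {"what", "is", "the", "of", "for", "a", "an", "please", "find"}
SYNONYMS = {"heart attack": ["myocardial infarction"]}


def preprocess_query(query: str) -> str:
    # Normalize: lowercase, replace punctuation with spaces, collapse whitespace.
    text = re.sub(r"[^\w\s-]", " ", query.lower())
    # Remove noise: drop filler words that carry no search signal.
    cleaned = " ".join(w for w in text.split() if w not in NOISE_WORDS)
    # Expand: replace each known phrase with an OR group of its synonyms.
    for phrase, alternatives in SYNONYMS.items():
        if phrase in cleaned:
            group = " OR ".join(f'"{t}"' for t in [phrase, *alternatives])
            cleaned = cleaned.replace(phrase, f"({group})")
    return cleaned


print(preprocess_query("What is the treatment for heart attack?"))
# prints: treatment ("heart attack" OR "myocardial infarction")
```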

Evidence Conversion

All tools convert API responses to Evidence objects with:

  • Citation: Title, URL, date, authors
  • content: Evidence text
  • relevance_score: Relevance score between 0.0 and 1.0
  • metadata: Additional metadata

Missing fields are handled gracefully with defaults.
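
A minimal sketch of such a conversion is shown below. The dataclasses mirror the fields listed above, but the raw record keys ("title", "abstract", "score", and so on), the default values, and the to_evidence helper are assumptions, not the project's actual models.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Citation:
    title: str
    url: str
    date: str
    authors: list[str] = field(default_factory=list)


@dataclass
class Evidence:
    citation: Citation
    content: str
    relevance_score: float = 0.0
    metadata: dict[str, Any] = field(default_factory=dict)


def to_evidence(raw: dict[str, Any], source: str) -> Evidence:
    """Convert one raw API record, substituting defaults for missing fields."""
    citation = Citation(
        title=raw.get("title") or "Untitled",
        url=raw.get("url") or "",
        date=raw.get("date") or "Unknown",
        authors=list(raw.get("authors") or []),
    )
    # Clamp the score into the documented 0.0-1.0 range.
    score = min(1.0, max(0.0, float(raw.get("score") or 0.0)))
    return Evidence(
        citation=citation,
        content=raw.get("abstract") or "",
        relevance_score=score,
        metadata={"source": source},
    )
```

Using `raw.get(key) or default` rather than `raw.get(key, default)` also covers keys that are present but set to None or an empty string.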

Tool Implementations

PubMed Tool

File: src/tools/pubmed.py

API: NCBI E-utilities (ESearch → EFetch)

Rate Limiting:

  • 0.34s between requests (3 req/sec without API key)
  • 0.1s between requests (10 req/sec with NCBI API key)

Features:

  • XML parsing with xmltodict
  • Handles single vs. multiple articles
  • Query preprocessing
  • Evidence conversion with metadata extraction
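
The single-versus-multiple case arises because xmltodict returns a dict for an element that occurs once and a list when it repeats. A small normalizing helper, sketched here with the hypothetical names as_list and extract_articles, lets the rest of the parser always iterate over a list. Plain dicts stand in for xmltodict output.

```python
from typing import Any


def as_list(value: Any) -> list[Any]:
    """Normalize a parsed XML node: None -> [], single item -> [item]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def extract_articles(parsed: dict[str, Any]) -> list[dict[str, Any]]:
    # PubmedArticleSet / PubmedArticle are the element names in EFetch XML.
    # An empty result set parses to None, hence the "or {}".
    article_set = parsed.get("PubmedArticleSet") or {}
    return as_list(article_set.get("PubmedArticle"))
```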

ClinicalTrials Tool

File: src/tools/clinicaltrials.py

API: ClinicalTrials.gov API v2

Important: Uses the requests library (not httpx) because the site's WAF blocks the httpx TLS fingerprint.

Execution: requests is blocking, so each call runs in a worker thread: await asyncio.to_thread(requests.get, ...)
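
The pattern can be sketched without network access by substituting a blocking stand-in for requests.get. Here blocking_fetch and fetch_all are hypothetical names, and the sleep merely simulates latency.

```python
import asyncio
import time


def blocking_fetch(url: str, timeout: float = 30.0) -> dict:
    """Stand-in for a blocking call such as requests.get(url, timeout=...)."""
    time.sleep(0.05)  # simulate network latency
    return {"url": url, "status": 200}


async def fetch_all(urls: list[str]) -> list[dict]:
    # Each blocking call runs in a worker thread, so the event loop stays
    # responsive and the calls can overlap instead of running back to back.
    tasks = [asyncio.to_thread(blocking_fetch, url, timeout=10.0) for url in urls]
    return await asyncio.gather(*tasks)
```

asyncio.to_thread forwards positional and keyword arguments to the wrapped function, so parameters such as params= or timeout= can be passed through unchanged. It requires Python 3.9 or later.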

Filtering:

  • Only interventional studies
  • Status: COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, ENROLLING_BY_INVITATION

Features:

  • Parses nested JSON structure
  • Extracts trial metadata
  • Evidence conversion
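
The sketch below shows one way the filtering and nested parsing could look on the client side. It assumes the v2 response layout in which each study carries a protocolSection with identificationModule, statusModule, and designModule. Whether the tool filters client-side or through query parameters is not stated here, and dig, keep_study, and summarize are hypothetical helpers.

```python
from typing import Any

ALLOWED_STATUSES = {
    "COMPLETED",
    "ACTIVE_NOT_RECRUITING",
    "RECRUITING",
    "ENROLLING_BY_INVITATION",
}


def dig(data: dict[str, Any], *keys: str, default: Any = None) -> Any:
    """Walk nested dicts, returning default as soon as a key is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def keep_study(study: dict[str, Any]) -> bool:
    # Keep only interventional studies in one of the allowed statuses.
    study_type = dig(study, "protocolSection", "designModule", "studyType")
    status = dig(study, "protocolSection", "statusModule", "overallStatus")
    return study_type == "INTERVENTIONAL" and status in ALLOWED_STATUSES


def summarize(study: dict[str, Any]) -> dict[str, str]:
    # Pull a few metadata fields out of the nested structure, with defaults.
    ident = dig(study, "protocolSection", "identificationModule", default={})
    return {
        "nct_id": ident.get("nctId", ""),
        "title": ident.get("briefTitle", "Untitled"),
        "status": dig(study, "protocolSection", "statusModule", "overallStatus", default="UNKNOWN"),
    }
```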

Europe PMC Tool

File: src/tools/europepmc.py

API: Europe PMC REST API

Features:

  • Handles preprint markers: [PREPRINT - Not peer-reviewed]
  • Builds URLs from DOI or PMID
  • Checks pubTypeList for preprint detection
  • Includes both preprints and peer-reviewed articles
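
A sketch of how these pieces might fit together follows. The record field names (doi, pmid, abstractText, and the pubTypeList.pubType nesting) are assumed from typical Europe PMC search results, preferring the DOI over the PMID is an assumption, and all three helper functions are hypothetical.

```python
from typing import Any

PREPRINT_MARKER = "[PREPRINT - Not peer-reviewed]"


def is_preprint(record: dict[str, Any]) -> bool:
    # Assumed shape: {"pubTypeList": {"pubType": ["Preprint", ...]}}
    pub_types = (record.get("pubTypeList") or {}).get("pubType") or []
    if isinstance(pub_types, str):
        pub_types = [pub_types]
    return any(t.lower() == "preprint" for t in pub_types)


def build_url(record: dict[str, Any]) -> str:
    # Assumption: prefer the DOI, fall back to the PubMed page for the PMID.
    if record.get("doi"):
        return f"https://doi.org/{record['doi']}"
    if record.get("pmid"):
        return f"https://pubmed.ncbi.nlm.nih.gov/{record['pmid']}/"
    return ""


def label_content(record: dict[str, Any]) -> str:
    # Prefix preprint text with the marker so it is never mistaken
    # for peer-reviewed evidence.
    text = record.get("abstractText") or ""
    if is_preprint(record):
        return f"{PREPRINT_MARKER} {text}".strip()
    return text
```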

RAG Tool

File: src/tools/rag_tool.py

Purpose: Semantic search within collected evidence

Implementation: Wraps LlamaIndexRAGService

Features:

  • Returns Evidence from RAG results
  • Handles evidence ingestion
  • Semantic similarity search
  • Metadata preservation

Search Handler

File: src/tools/search_handler.py

Purpose: Orchestrates parallel searches across multiple tools

Features:

  • Uses asyncio.gather() with return_exceptions=True
  • Aggregates results into SearchResult
  • Handles tool failures gracefully
  • Deduplicates results by URL
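
The orchestration can be sketched as below. This is an illustration under assumptions, not the code in src/tools/search_handler.py: the Evidence and SearchResult dataclasses are simplified stand-ins, and search_all is a hypothetical function name.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Evidence:
    url: str
    content: str


@dataclass
class SearchResult:
    evidence: list[Evidence] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


async def search_all(tools, query: str, max_results: int = 10) -> SearchResult:
    # return_exceptions=True turns a failing tool into an exception object
    # in the results list instead of cancelling the other searches.
    outcomes = await asyncio.gather(
        *(tool.search(query, max_results) for tool in tools),
        return_exceptions=True,
    )
    result = SearchResult()
    seen: set[str] = set()
    for tool, outcome in zip(tools, outcomes):
        if isinstance(outcome, BaseException):
            result.errors.append(f"{tool.name}: {outcome}")
            continue
        for item in outcome:
            # Deduplicate by URL, keeping the first occurrence.
            if item.url not in seen:
                seen.add(item.url)
                result.evidence.append(item)
    return result
```

Because asyncio.gather returns results in the order the awaitables were passed, zipping the outcomes with the tool list attributes each failure to the right tool.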

Tool Registration

Tools are registered in the search handler:

from src.tools.pubmed import PubMedTool
from src.tools.clinicaltrials import ClinicalTrialsTool
from src.tools.europepmc import EuropePMCTool

search_handler = SearchHandler(
    tools=[
        PubMedTool(),
        ClinicalTrialsTool(),
        EuropePMCTool(),
    ]
)

See Also