DETERMINATOR / PDF_REPORT_INTEGRATION.md
Joseph Pollack

PDF Report Generation Integration

Summary

Integrated PDF generation into the report file service, using utilities from folder/utils copy. When PDF output is enabled, each report is converted to PDF automatically as the final step.

Changes Made

1. Added PDF Conversion Utilities

Files Created:

  • src/utils/md_to_pdf.py - Markdown to PDF conversion utility
  • src/utils/markdown.css - CSS styling for PDF output

Features:

  • Uses md2pdf library for conversion
  • Includes error handling and graceful fallback
  • Supports custom CSS styling
  • Logs conversion status
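
The features listed above can be sketched as a small wrapper. This is a minimal illustration, not the contents of src/utils/md_to_pdf.py: the function name convert_markdown_to_pdf and the converter parameter are hypothetical. The actual md2pdf call is injected as a callable here because its exact signature depends on the installed version.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)

# Hypothetical converter type: (markdown_text, pdf_path, css_path) -> None.
# In the real utility this role would be played by a call into md2pdf.
Converter = Callable[[str, Path, Optional[Path]], None]


def convert_markdown_to_pdf(
    markdown_text: str,
    pdf_path: Path,
    converter: Converter,
    css_path: Optional[Path] = None,
) -> Optional[Path]:
    """Convert markdown to PDF; return the PDF path, or None on failure."""
    # Only pass the stylesheet along if it actually exists on disk.
    css = css_path if css_path is not None and css_path.is_file() else None
    try:
        converter(markdown_text, pdf_path, css)
    except Exception as exc:
        # Graceful fallback: log and report failure instead of raising.
        logger.warning("PDF conversion failed: %s", exc)
        return None
    logger.info("PDF written to %s", pdf_path)
    return pdf_path
```

Returning None rather than raising lets the caller decide whether a missing PDF matters, which matches the fallback behavior described in this document.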

2. Enhanced ReportFileService

File: src/services/report_file_service.py

Changes:

  • Added _save_pdf() method to generate PDF from markdown
  • Updated save_report_multiple_formats() to implement PDF generation
  • PDF is generated when report_file_format is set to "md_pdf"
  • Both markdown and PDF files are saved and returned

Method Signature:

def _save_pdf(
    self,
    report_content: str,
    query: str | None = None,
) -> str:
    """Save report as PDF. Returns path to PDF file."""

3. Updated Graph Orchestrator

File: src/orchestrator/graph_orchestrator.py

Changes:

  • Updated synthesizer node to use save_report_multiple_formats()
  • Updated writer node to use save_report_multiple_formats()
  • Both nodes now return PDF paths in result dict when available
  • Result includes both file (markdown) and files (both formats) keys

Result Format:

{
    "message": final_report,  # Report content
    "file": "/path/to/report.md",  # Markdown file
    "files": ["/path/to/report.md", "/path/to/report.pdf"]  # Both formats
}
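
A consumer of this result can pick out the PDF path as sketched below. The helper name pdf_path_from_result is hypothetical, and the sketch assumes that "files" may be missing or may hold only the markdown path when no PDF was produced.

```python
from typing import Any, Optional


def pdf_path_from_result(result: dict[str, Any]) -> Optional[str]:
    """Return the first .pdf path listed under "files", if any."""
    # Assumption: "files" can be absent when only markdown was saved.
    for path in result.get("files", []):
        if path.lower().endswith(".pdf"):
            return path
    return None


result = {
    "message": "# Report",
    "file": "/path/to/report.md",
    "files": ["/path/to/report.md", "/path/to/report.pdf"],
}
print(pdf_path_from_result(result))  # /path/to/report.pdf
```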

Configuration

PDF generation is controlled by the report_file_format setting in src/utils/config.py:

report_file_format: Literal["md", "md_html", "md_pdf"] = Field(
    default="md",
    description="File format(s) to save reports in."
)

Options:

  • "md" - Save only markdown (default)
  • "md_html" - Save markdown + HTML (not yet implemented)
  • "md_pdf" - Save markdown + PDF ✅ Now implemented

Usage

Enable PDF Generation

Set the environment variable or update settings:

REPORT_FILE_FORMAT=md_pdf

Or in code:

from src.utils.config import settings
settings.report_file_format = "md_pdf"
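
To illustrate how the environment variable relates to the allowed values, here is a standard-library sketch. It is not how src/utils/config.py is implemented (the project's settings class presumably handles this itself); the function name read_report_file_format is hypothetical.

```python
import os
from typing import Literal, Mapping, Optional, get_args

ReportFileFormat = Literal["md", "md_html", "md_pdf"]


def read_report_file_format(env: Optional[Mapping[str, str]] = None) -> str:
    """Read REPORT_FILE_FORMAT, falling back to the documented default "md"."""
    env = os.environ if env is None else env
    value = env.get("REPORT_FILE_FORMAT", "md")
    allowed = get_args(ReportFileFormat)
    # Reject anything outside the documented options.
    if value not in allowed:
        raise ValueError(f"REPORT_FILE_FORMAT must be one of {allowed}, got {value!r}")
    return value
```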

Dependencies

PDF generation requires the md2pdf library:

pip install md2pdf

If md2pdf is not installed, the system will:

  • Log a warning
  • Continue with markdown-only saving
  • Not fail the report generation
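
One way to implement this kind of optional-dependency check is shown below. It is a sketch only; the helper name pdf_support_available is hypothetical and the project may detect md2pdf differently (for example, with a guarded import).

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def pdf_support_available(module_name: str = "md2pdf") -> bool:
    """Return True if the PDF dependency can be found."""
    # find_spec locates a top-level module without importing it.
    if importlib.util.find_spec(module_name) is None:
        logger.warning("%s is not installed; saving markdown only", module_name)
        return False
    return True
```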

File Output

When PDF generation is enabled:

  1. Markdown file is always saved first
  2. PDF is generated from the markdown content
  3. Both file paths are returned in the result
  4. Gradio interface can display/download both files
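
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code of save_report_multiple_formats(): the function name save_markdown_and_pdf, the make_pdf callable, and the stem parameter are hypothetical.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def save_markdown_and_pdf(
    report_content: str,
    output_dir: Path,
    make_pdf: Optional[Callable[[str, Path], None]] = None,
    stem: str = "report",
) -> dict:
    """Save markdown first, then optionally a PDF; return the result dict."""
    output_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: the markdown file is always written first.
    md_path = output_dir / f"{stem}.md"
    md_path.write_text(report_content, encoding="utf-8")
    files = [str(md_path)]

    # Step 2: the PDF is generated from the same markdown content, if enabled.
    if make_pdf is not None:
        pdf_path = output_dir / f"{stem}.pdf"
        try:
            make_pdf(report_content, pdf_path)
            files.append(str(pdf_path))
        except Exception as exc:
            # A PDF failure should not cost us the markdown report.
            logger.warning("PDF generation failed, keeping markdown only: %s", exc)

    # Step 3: return paths in the same shape as the orchestrator result.
    return {"message": report_content, "file": str(md_path), "files": files}
```

Passing make_pdf=None corresponds to the "md" setting, while passing a converter corresponds to "md_pdf".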

Error Handling

  • If PDF generation fails, markdown file is still saved
  • Errors are logged but don't interrupt report generation
  • Graceful fallback ensures reports are always available

Integration Points

PDF generation is triggered automatically whenever save_report_multiple_formats() is called while report_file_format is set to "md_pdf". The graph orchestrator makes this call when:

  1. The synthesizer node completes
  2. The writer node completes

Future Enhancements

  • HTML format support (md_html)
  • Custom PDF templates
  • PDF metadata (title, author, keywords)
  • PDF compression options
  • Batch PDF generation