Dynamic Compilation Best Practices

This document explains the best practices for dynamically compiling and executing C# code at runtime, based on Laurent Kempé's article "Dynamically compile and run code using .NET Core 3.0" (article link).

Overview

Dynamic compilation enables scenarios such as:

  • Plugin architectures
  • REPL (Read-Eval-Print Loop) implementations
  • Code evaluation services
  • Hot-reloading of code without restarting the application
  • Runtime code generation and execution

Key Concepts

1. AssemblyLoadContext (Critical for .NET Core 3.0+)

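What it is: A type (System.Runtime.Loader.AssemblyLoadContext) that provides control over assembly loading. It has existed since .NET Core 1.0; collectible contexts, which enable assembly unloading, were added in .NET Core 3.0.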

Why it matters:

  • Memory Management: Without proper unloading, dynamically loaded assemblies stay in memory for the lifetime of the process
  • Isolation: Each context provides isolation between different versions of assemblies
  • Hot Reload: Enables recompilation and reloading of code at runtime
  • Resource Cleanup: Properly releases memory when assemblies are no longer needed

Implementation:

using System.Reflection;
using System.Runtime.Loader;

public class UnloadableAssemblyLoadContext : AssemblyLoadContext
{
    public UnloadableAssemblyLoadContext()
        : base(isCollectible: true) // CRITICAL: isCollectible must be true
    { }

    protected override Assembly? Load(AssemblyName assemblyName)
    {
        // Return null to use default loading behavior
        // This delegates to the default context for framework assemblies
        return null;
    }
}

Key Points:

  • isCollectible: true: This is the critical parameter that enables assembly unloading
  • Must be used for any dynamically loaded assemblies that should be unloadable
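  • Assemblies loaded in collectible contexts can be garbage collected after Unload() is called, but only once nothing references the context, its assemblies, or any types and instances that come from them (unloading is cooperative)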

2. WeakReference for Tracking Unloading

Purpose: Verify that assemblies are actually unloaded and garbage collected.

Implementation:

var context = new UnloadableAssemblyLoadContext();
WeakReference contextWeakRef = new(context, trackResurrection: true);

try
{
    // Load and execute assembly
    var assembly = context.LoadFromStream(assemblyStream);
    // ... execute code ...
}
finally
{
    // Unload the context
    context.Unload();
    
    // Verify unloading by forcing garbage collection
    for (int i = 0; i < 10 && contextWeakRef.IsAlive; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
    
    // If contextWeakRef.IsAlive is still true, something is holding a reference
    // (in unoptimized builds this can be a local such as 'context' or 'assembly' in this very method)
}

Key Points:

  • trackResurrection: true: Creates a long weak reference, which keeps tracking the object after its finalizer has run, until the object is actually reclaimed
  • After Unload(), the weak reference should become dead after garbage collection
  • If the weak reference stays alive, it indicates a memory leak (something is holding a reference)
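
One caveat with the snippet above: the GC loop runs in the same method that declared `context`, so that local can itself keep the context alive, particularly in unoptimized builds. A common workaround is to create, use, and unload the context in a separate non-inlined method and run the GC loop in the caller. The following is a minimal sketch of that idea; `UnloadProbe`, `ProbeContext`, `CreateAndUnload`, and `VerifyUnload` are hypothetical names, and nothing is loaded into the context so the example stays self-contained.

```csharp
#nullable enable
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

public static class UnloadProbe
{
    // Hypothetical minimal collectible context, mirroring UnloadableAssemblyLoadContext.
    private sealed class ProbeContext : AssemblyLoadContext
    {
        public ProbeContext() : base(isCollectible: true) { }
        protected override Assembly? Load(AssemblyName assemblyName) => null;
    }

    // NoInlining confines the 'context' local to this stack frame, so the
    // caller cannot accidentally keep it alive after this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateAndUnload()
    {
        var context = new ProbeContext();
        var weakRef = new WeakReference(context, trackResurrection: true);
        // ... load and execute the assembly here ...
        context.Unload();
        return weakRef;
    }

    // Returns true if the context was collected within maxAttempts GC cycles.
    public static bool VerifyUnload(int maxAttempts = 10)
    {
        WeakReference weakRef = CreateAndUnload();
        for (int i = 0; i < maxAttempts && weakRef.IsAlive; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        return !weakRef.IsAlive;
    }
}
```

The same split applies to real workloads: anything that touches the loaded assembly (types, instances, delegates) should stay inside the non-inlined method so no reference escapes to the caller.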

3. Roslyn Compilation API

Two Approaches:

A. Roslyn Scripting API (Simpler, for REPL)

using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

var options = ScriptOptions.Default
    .WithReferences(typeof(object).Assembly)
    .WithImports("System", "System.Linq");

var result = await CSharpScript.RunAsync("1 + 1", options);
// result.ReturnValue holds the value of the last expression (here the boxed int 2)

Pros:

  • Very simple API
  • Built-in state management between executions
  • Good for REPL scenarios

Cons:

  • Less control over compilation
  • Cannot easily unload assemblies
  • Not suitable when assembly isolation is needed
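
The "built-in state management" point can be illustrated with `ScriptState.ContinueWithAsync`, which runs a new submission on top of an earlier one. This is a small sketch, assuming the Microsoft.CodeAnalysis.CSharp.Scripting package is referenced; `ReplStateDemo` is a hypothetical name.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

public static class ReplStateDemo
{
    // Runs two submissions; the second one sees the variable declared by the first.
    public static async Task<object> RunAsync()
    {
        ScriptState<object> state = await CSharpScript.RunAsync("int x = 20;");
        state = await state.ContinueWithAsync("x + 22");
        return state.ReturnValue; // value of the last expression, boxed
    }
}
```

A REPL would typically keep the latest `ScriptState` between inputs and call `ContinueWithAsync` for each new line.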

B. CSharpCompilation API (More Control)

using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

var syntaxTree = CSharpSyntaxTree.ParseText(code);

var references = new[]
{
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Console).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location)
};

var compilation = CSharpCompilation.Create(
    "DynamicAssembly",
    syntaxTrees: new[] { syntaxTree },
    references: references,
    options: new CSharpCompilationOptions(
        OutputKind.DynamicallyLinkedLibrary,
        optimizationLevel: OptimizationLevel.Release
    )
);

using var ms = new MemoryStream();
var emitResult = compilation.Emit(ms);

if (emitResult.Success)
{
    ms.Seek(0, SeekOrigin.Begin);
    // 'context' is an UnloadableAssemblyLoadContext instance (see section 1)
    var assembly = context.LoadFromStream(ms);
}

Pros:

  • Full control over compilation process
  • Can emit to memory streams for loading in custom contexts
  • Better error diagnostics
  • Suitable for production scenarios

Cons:

  • More verbose
  • Requires manual reference management

4. Reference Management

Critical: All assemblies and types used in the dynamic code must have their metadata references added to the compilation.

Common References:

var references = new List<MetadataReference>
{
    // Core runtime
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Console).Assembly.Location),
    
    // LINQ
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location),
    
    // System.Runtime (critical for .NET Core)
    MetadataReference.CreateFromFile(Assembly.Load("System.Runtime").Location),
    
    // Collections
    MetadataReference.CreateFromFile(Assembly.Load("System.Collections").Location),
    
    // For async/await
    MetadataReference.CreateFromFile(typeof(Task).Assembly.Location)
};

Finding Additional References:

// For a specific type you need
var type = typeof(SomeType);
var typeReference = MetadataReference.CreateFromFile(type.Assembly.Location);

// For framework assemblies
var assembly = Assembly.Load("AssemblyName");
var frameworkReference = MetadataReference.CreateFromFile(assembly.Location);
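
As an alternative to adding references one by one, the runtime exposes the list of assemblies it was started with through the `TRUSTED_PLATFORM_ASSEMBLIES` app-context value. This technique is not from the original article; the sketch below assumes a regular framework-dependent or self-contained app (the value may be empty in single-file deployments), and `PlatformAssemblies` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PlatformAssemblies
{
    // Returns the paths of the assemblies the host passed to the runtime at startup.
    public static IReadOnlyList<string> GetPaths()
    {
        var raw = AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string;
        if (string.IsNullOrEmpty(raw))
            return Array.Empty<string>();

        // The value is a single string of paths joined by the platform's path separator.
        return raw.Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Each path can then be passed to `MetadataReference.CreateFromFile`. This trades precision for convenience, which conflicts with the "only add necessary references" advice for untrusted code, so it is best suited to trusted scenarios.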

5. Entry Point Discovery

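When executing compiled assemblies, you need to find the entry point. Note that code using top-level statements only compiles when the compilation's output kind is an executable (for example OutputKind.ConsoleApplication); with OutputKind.DynamicallyLinkedLibrary the compiler reports error CS8805.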

private static MethodInfo? FindEntryPoint(Assembly assembly)
{
    // Traditional Main method
    var programType = assembly.GetTypes()
        .FirstOrDefault(t => t.Name == "Program");
    if (programType != null)
    {
        var mainMethod = programType.GetMethod("Main", 
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);
        if (mainMethod != null)
            return mainMethod;
    }
    
    // Top-level statements (C# 9+)
    var entryPoint = assembly.GetTypes()
        .SelectMany(t => t.GetMethods(
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic))
        .FirstOrDefault(m => m.Name == "<Main>$");
    
    return entryPoint;
}

Execution:

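// FindEntryPoint returns null when the assembly has no recognizable entry point
var entryPoint = FindEntryPoint(assembly)
    ?? throw new InvalidOperationException("No entry point found in the compiled assembly.");

// Main may be declared with or without a string[] args parameter
var parameters = entryPoint.GetParameters().Length == 0
    ? null
    : new object[] { Array.Empty<string>() };

var result = entryPoint.Invoke(null, parameters);

// Handle async returns (Task and Task<int> entry points)
if (result is Task task)
{
    await task;
}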

6. Console Output Capture

For REPL scenarios, capture Console output:

var outputBuilder = new StringBuilder();
var originalOut = Console.Out;

try
{
    using var outputWriter = new StringWriter(outputBuilder);
    Console.SetOut(outputWriter);
    
    // Execute code
    
    await outputWriter.FlushAsync();
    var output = outputBuilder.ToString();
}
finally
{
    Console.SetOut(originalOut);
}
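
Because `Console.SetOut` replaces the writer for the whole process, two executions that capture output at the same time would interleave or lose text. One way to handle this, sketched below under the assumption that serializing executions is acceptable, is to guard the capture with a semaphore. `ConsoleCapture` and `RunAsync` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class ConsoleCapture
{
    // Console.SetOut is process-wide, so captures are serialized through a single gate.
    private static readonly SemaphoreSlim Gate = new(1, 1);

    // Runs 'action' and returns everything it wrote to Console.Out.
    public static async Task<string> RunAsync(Func<Task> action)
    {
        await Gate.WaitAsync();
        var originalOut = Console.Out;
        using var writer = new StringWriter();
        try
        {
            Console.SetOut(writer);
            await action();
            await writer.FlushAsync();
            return writer.ToString();
        }
        finally
        {
            // Always restore the original writer, even if the action throws.
            Console.SetOut(originalOut);
            Gate.Release();
        }
    }
}
```

Code that writes to the console from other threads while a capture is active will still end up in the captured text; the gate only prevents two captures from overlapping.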

Implementation in RoslynStone

Architecture Decision

We use both approaches strategically:

  1. RoslynScriptingService (Scripting API)

    • Used for REPL functionality
    • State preservation between executions
    • Simple expression evaluation
    • Quick prototyping
  2. CompilationService + AssemblyExecutionService (Compilation API)

    • Used for file execution
    • Proper assembly unloading
    • Memory isolation
    • Production-grade execution

Services Created

CompilationService

// Compiles C# code to in-memory assemblies
public class CompilationService
{
    public CompilationResult Compile(string code, string? assemblyName = null)
    {
        // Uses CSharpCompilation API
        // Returns MemoryStream with compiled assembly
    }
}

AssemblyExecutionService

// Executes assemblies in unloadable contexts
public class AssemblyExecutionService
{
    public async Task<AssemblyExecutionResult> ExecuteFileAsync(
        string filePath, 
        CancellationToken cancellationToken = default)
    {
        // 1. Compile code
        // 2. Create UnloadableAssemblyLoadContext
        // 3. Load assembly from stream
        // 4. Find and invoke entry point
        // 5. Unload context
        // 6. Verify unloading with WeakReference
    }
}

UnloadableAssemblyLoadContext

// Custom context for assembly isolation
public class UnloadableAssemblyLoadContext : AssemblyLoadContext
{
    public UnloadableAssemblyLoadContext() 
        : base(isCollectible: true) { }
}

Best Practices Summary

✅ DO

  1. Use AssemblyLoadContext for any dynamically loaded assemblies
  2. Set isCollectible: true when creating the context
  3. Use WeakReference to verify unloading
  4. Call Unload() and force garbage collection
  5. Manage metadata references carefully
  6. Capture and handle compilation errors properly
  7. Find entry points for both traditional and top-level statements
  8. Handle async return types (Task, Task<T>)
  9. Capture console output if needed
  10. Dispose MemoryStreams after loading assemblies

❌ DON'T

  1. Don't load assemblies in the default context if you need to unload them
  2. Don't forget to call Unload() on the context
  3. Don't hold references to objects from the unloaded context
  4. Don't use the Scripting API when you need assembly unloading
  5. Don't forget required assembly references (System.Runtime is critical)
  6. Don't emit to disk unless necessary (use MemoryStream)
  7. Don't forget to reset Console.Out after capturing output
  8. Don't ignore compilation diagnostics
  9. Don't assume synchronous execution (handle Task returns)
  10. Don't forget to flush output writers before reading captured output

Memory Management Pattern

// Correct pattern for dynamic compilation and execution
var context = new UnloadableAssemblyLoadContext();
WeakReference weakRef = new(context, trackResurrection: true);

try
{
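    // 1. Compile
    var compilation = /* ... */;
    using var ms = new MemoryStream();
    var result = compilation.Emit(ms);
    if (!result.Success)
        throw new InvalidOperationException("Compilation failed; inspect result.Diagnostics");
    
    // 2. Load
    ms.Seek(0, SeekOrigin.Begin);
    var assembly = context.LoadFromStream(ms);
    
    // 3. Execute
    var entryPoint = FindEntryPoint(assembly)
        ?? throw new InvalidOperationException("No entry point found");
    var parameters = entryPoint.GetParameters().Length == 0
        ? null
        : new object[] { Array.Empty<string>() };
    entryPoint.Invoke(null, parameters);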
}
finally
{
    // 4. Unload
    context.Unload();
    
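    // 5. Verify unloading (in unoptimized builds, locals in this method such as
    //    'assembly' can keep the context alive; a separate non-inlined method avoids that)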
    for (int i = 0; i < 10 && weakRef.IsAlive; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
    
    if (weakRef.IsAlive)
    {
        // Memory leak detected - something is still holding a reference
        Console.WriteLine("Warning: Assembly context was not unloaded");
    }
}

Security Considerations

  1. Code Execution Risk: Dynamic compilation executes arbitrary code

    • Run in sandboxed environments
    • Implement code review/validation
    • Use least-privilege execution
  2. Resource Limits:

    • Set execution timeouts
    • Monitor memory usage
    • Limit CPU usage
  3. Assembly References:

    • Only add necessary references
    • Avoid loading privileged assemblies
    • Validate assembly sources
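
The "set execution timeouts" item can be sketched as follows. This is only an illustration with hypothetical names (`TimedExecution`, `TryRunAsync`): it stops waiting after the timeout but does not stop the code itself, because .NET Core offers no supported way to abort a running thread. Hard limits on untrusted code generally require a separate process or container.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimedExecution
{
    // Returns true if 'work' finished within 'timeout', false otherwise.
    // Caveat: on timeout the work keeps running in the background.
    public static async Task<bool> TryRunAsync(Action work, TimeSpan timeout)
    {
        Task workTask = Task.Run(work);
        Task finished = await Task.WhenAny(workTask, Task.Delay(timeout));
        if (finished != workTask)
            return false;

        await workTask; // rethrows any exception thrown by the work
        return true;
    }
}
```

For example, `await TimedExecution.TryRunAsync(() => entryPoint.Invoke(null, parameters), TimeSpan.FromSeconds(5))` would give up waiting after five seconds, where `entryPoint` and `parameters` come from the entry point discovery step.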

Performance Considerations

  1. First Compilation: ~500-1000ms (includes JIT)
  2. Subsequent Compilations: ~200-300ms
  3. Unloading: ~50-100ms (with forced GC)
  4. Memory: Each loaded assembly context adds ~1-5MB overhead

Optimization Tips:

  • Cache compilation results when possible
  • Reuse AssemblyLoadContext instances for similar operations
  • Batch multiple compilations
  • Use OptimizationLevel.Release for production
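
The first tip, caching compilation results, might look like the sketch below. It keys the cache on a hash of the source text and stores the emitted assembly bytes rather than a loaded Assembly, so each execution can still load them into a fresh collectible context. `CompilationCache` is a hypothetical name, and the `compile` delegate stands in for whatever produces the bytes (for example, a wrapper around `CSharpCompilation.Emit`).

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;

public sealed class CompilationCache
{
    private readonly ConcurrentDictionary<string, byte[]> _cache = new();
    private readonly Func<string, byte[]> _compile;

    // 'compile' is a caller-supplied delegate that turns source text into assembly bytes.
    public CompilationCache(Func<string, byte[]> compile) => _compile = compile;

    // Returns cached bytes for identical source text, compiling only on a cache miss.
    // Under contention GetOrAdd may invoke the delegate more than once for the same key.
    public byte[] GetOrCompile(string code)
    {
        string key = HashSource(code);
        return _cache.GetOrAdd(key, _ => _compile(code));
    }

    private static string HashSource(string code)
    {
        using var sha = SHA256.Create();
        return Convert.ToBase64String(sha.ComputeHash(Encoding.UTF8.GetBytes(code)));
    }
}
```

A real cache would also need an eviction policy, and the key should include anything else that affects the output, such as the reference set or optimization level.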

Testing

Essential tests to include:

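// Requires: using System.Runtime.CompilerServices;
[Fact]
public void Assembly_CanBeUnloaded()
{
    WeakReference weakRef = LoadExecuteAndUnload();

    // Force GC until the context is collected (or give up after 10 attempts)
    for (int i = 0; i < 10 && weakRef.IsAlive; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }

    Assert.False(weakRef.IsAlive, "Assembly context was not unloaded");
}

// Kept in a separate, non-inlined method so that no local variable
// keeps the context alive while the test forces garbage collection
[MethodImpl(MethodImplOptions.NoInlining)]
private static WeakReference LoadExecuteAndUnload()
{
    var context = new UnloadableAssemblyLoadContext();
    var weakRef = new WeakReference(context, trackResurrection: true);

    // Load and execute assembly

    context.Unload();
    return weakRef;
}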

Conclusion

The key insight from Laurent Kempé's approach is that AssemblyLoadContext with isCollectible: true is essential for proper memory management in dynamic compilation scenarios. Without it, every dynamically loaded assembly stays in memory for the lifetime of the process, which amounts to a memory leak in long-running applications.

Combined with proper use of the following, this approach enables production-grade dynamic code execution with full control over memory lifecycle:

  • Roslyn's CSharpCompilation API
  • WeakReference for verification
  • Correct reference management
  • Proper entry point discovery