Dynamic Compilation Best Practices

This document explains the best practices for dynamically compiling and executing C# code at runtime, based on Laurent Kempé's article "Dynamically compile and run code using .NET Core 3.0" (article link).

Overview

Dynamic compilation enables scenarios such as:

Plugin architectures
REPL (Read-Eval-Print Loop) implementations
Code evaluation services
Hot-reloading of code without restarting the application
Runtime code generation and execution

Key Concepts

1. AssemblyLoadContext (Critical for .NET Core 3.0+)

What it is: A mechanism introduced in .NET Core that provides control over assembly loading. Since .NET Core 3.0, a context created as collectible can also be unloaded.

Why it matters:

Memory Management: Without proper unloading, dynamically loaded assemblies stay in memory for the lifetime of the process
Isolation: Each context provides isolation between different versions of assemblies
Hot Reload: Enables recompilation and reloading of code at runtime
Resource Cleanup: Properly releases memory when assemblies are no longer needed

Implementation:

using System.Reflection;
using System.Runtime.Loader;

public class UnloadableAssemblyLoadContext : AssemblyLoadContext
{
    public UnloadableAssemblyLoadContext() 
        : base(isCollectible: true) // CRITICAL: isCollectible must be true
    { }

    protected override Assembly? Load(AssemblyName assemblyName)
    {
        // Return null to use default loading behavior
        // This delegates to the default context for framework assemblies
        return null;
    }
}

Key Points:

isCollectible: true: This is the critical parameter that enables assembly unloading
Must be used for any dynamically loaded assemblies that should be unloadable
Unload() only initiates unloading; assemblies loaded in a collectible context are garbage collected once nothing references the context, its assemblies, or objects created from them

2. WeakReference for Tracking Unloading

Purpose: Verify that assemblies are actually unloaded and garbage collected.

Implementation:

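// Requires: using System.Runtime.CompilerServices;

// Keep every strong reference to the context inside a non-inlined method,
// so none is still live on the stack when unloading is verified.
[MethodImpl(MethodImplOptions.NoInlining)]
static WeakReference LoadAndExecute(Stream assemblyStream)
{
    var context = new UnloadableAssemblyLoadContext();
    var weakRef = new WeakReference(context, trackResurrection: true);

    try
    {
        // Load and execute assembly
        var assembly = context.LoadFromStream(assemblyStream);
        // ... execute code ...
    }
    finally
    {
        // Unload the context (this only initiates unloading)
        context.Unload();
    }

    return weakRef;
}

var contextWeakRef = LoadAndExecute(assemblyStream);

// Verify unloading by forcing garbage collection
for (int i = 0; i < 10 && contextWeakRef.IsAlive; i++)
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
}

// If contextWeakRef.IsAlive is still true, something is holding a reference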

Key Points:

trackResurrection: true: Keeps tracking the object after its finalizer has run, until the object is actually collected
After Unload(), the weak reference should become dead after garbage collection
If the weak reference stays alive, it indicates a memory leak (something is holding a reference)

3. Roslyn Compilation API

Two Approaches:

A. Roslyn Scripting API (Simpler, for REPL)

using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

var options = ScriptOptions.Default
    .WithReferences(typeof(object).Assembly)
    .WithImports("System", "System.Linq");

var result = await CSharpScript.RunAsync("1 + 1", options);

Pros:

Very simple API
Built-in state management between executions
Good for REPL scenarios

Cons:

Less control over compilation
Cannot easily unload assemblies
Not suitable when assembly isolation is needed
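The state management mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual REPL code: the class and method names (ReplStateExample, RunTwoSubmissionsAsync) are hypothetical, and it assumes the Microsoft.CodeAnalysis.CSharp.Scripting package is referenced.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical helper showing how one submission can build on the previous one.
public static class ReplStateExample
{
    public static async Task<object> RunTwoSubmissionsAsync()
    {
        var options = ScriptOptions.Default.WithImports("System");

        // First submission declares a variable; the returned ScriptState carries it forward.
        ScriptState<object> state = await CSharpScript.RunAsync("int x = 20;", options);

        // Second submission continues from that state, so x is still in scope.
        state = await state.ContinueWithAsync("x + 22");

        // ReturnValue holds the value of the last expression (a boxed int here).
        return state.ReturnValue;
    }
}
```

Each submission is compiled into its own assembly that cannot be unloaded individually, which is why the cons above matter for long-running REPL sessions.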

B. CSharpCompilation API (More Control)

using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

var syntaxTree = CSharpSyntaxTree.ParseText(code);

var references = new[]
{
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Console).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location)
};

var compilation = CSharpCompilation.Create(
    "DynamicAssembly",
    syntaxTrees: new[] { syntaxTree },
    references: references,
    options: new CSharpCompilationOptions(
        OutputKind.DynamicallyLinkedLibrary,
        optimizationLevel: OptimizationLevel.Release
    )
);

using var ms = new MemoryStream();
var emitResult = compilation.Emit(ms);

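if (emitResult.Success)
{
    // Rewind before loading: Emit leaves the stream positioned at the end
    ms.Seek(0, SeekOrigin.Begin);
    var assembly = context.LoadFromStream(ms); // context: an UnloadableAssemblyLoadContext
}
else
{
    // Surface compiler errors instead of silently ignoring them
    foreach (var diagnostic in emitResult.Diagnostics)
        Console.Error.WriteLine(diagnostic);
}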

Pros:

Full control over compilation process
Can emit to memory streams for loading in custom contexts
Better error diagnostics
Suitable for production scenarios

Cons:

More verbose
Requires manual reference management
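To make the "better error diagnostics" point concrete, here is a small sketch that compiles a snippet and turns compiler errors into readable strings. The names DiagnosticsExample and CollectErrors are hypothetical, and the output format is just one reasonable choice.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper: compiles code and returns one formatted line per compiler error.
public static class DiagnosticsExample
{
    public static string[] CollectErrors(string code)
    {
        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        var compilation = CSharpCompilation.Create(
            "DiagnosticsDemo",
            new[] { syntaxTree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using var ms = new MemoryStream();
        var emitResult = compilation.Emit(ms);

        // Keep errors only; warnings and hidden diagnostics are dropped here.
        return emitResult.Diagnostics
            .Where(d => d.Severity == DiagnosticSeverity.Error)
            .Select(d =>
            {
                // Line and character are zero-based, so add 1 for display.
                var start = d.Location.GetLineSpan().StartLinePosition;
                return $"{d.Id} ({start.Line + 1},{start.Character + 1}): {d.GetMessage()}";
            })
            .ToArray();
    }
}
```

For example, passing `class C { void M() { int x = "text"; } }` should report a CS0029 conversion error, while valid code yields an empty array.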

4. Reference Management

Critical: Every assembly that defines a type used by the dynamic code must be added to the compilation as a metadata reference; a missing reference shows up as a compile error.

Common References:

var references = new List<MetadataReference>
{
    // Core runtime
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Console).Assembly.Location),
    
    // LINQ
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location),
    
    // System.Runtime (critical for .NET Core)
    MetadataReference.CreateFromFile(Assembly.Load("System.Runtime").Location),
    
    // Collections
    MetadataReference.CreateFromFile(Assembly.Load("System.Collections").Location),
    
    // For async/await
    MetadataReference.CreateFromFile(typeof(Task).Assembly.Location)
};

Finding Additional References:

// For a specific type you need
var type = typeof(SomeType);
var typeReference = MetadataReference.CreateFromFile(type.Assembly.Location);

// For framework assemblies
var assembly = Assembly.Load("AssemblyName");
var frameworkReference = MetadataReference.CreateFromFile(assembly.Location);
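An alternative to adding references one by one is to reference everything the host runtime already trusts. The sketch below is not from the original article; it assumes the TRUSTED_PLATFORM_ASSEMBLIES value that the .NET host normally exposes through AppContext (it can be empty in some deployments, such as single-file apps), and the class name FrameworkReferences is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;

// Hypothetical helper: builds metadata references from the host's trusted assembly list.
public static class FrameworkReferences
{
    public static IReadOnlyList<MetadataReference> FromTrustedPlatformAssemblies()
    {
        // The host provides a list of assembly paths separated by the platform path separator.
        var trusted = AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string ?? string.Empty;

        return trusted
            .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries)
            .Select(path => (MetadataReference)MetadataReference.CreateFromFile(path))
            .ToList();
    }
}
```

This trades precision for convenience: the dynamic code can see every framework assembly, which conflicts with the "only add necessary references" advice for untrusted input, so it fits trusted or exploratory scenarios best.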

5. Entry Point Discovery

When executing compiled assemblies, you need to find the entry point:

private static MethodInfo? FindEntryPoint(Assembly assembly)
{
    // Traditional Main method
    var programType = assembly.GetTypes()
        .FirstOrDefault(t => t.Name == "Program");
    if (programType != null)
    {
        var mainMethod = programType.GetMethod("Main", 
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);
        if (mainMethod != null)
            return mainMethod;
    }
    
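    // Top-level statements (C# 9+): the compiler emits a synthesized "<Main>$" method.
    // Note: these only compile when OutputKind is an executable (e.g. ConsoleApplication).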
    var entryPoint = assembly.GetTypes()
        .SelectMany(t => t.GetMethods(
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic))
        .FirstOrDefault(m => m.Name == "<Main>$");
    
    return entryPoint;
}

Execution:

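// FindEntryPoint can return null, so fail early with a clear message
var entryPoint = FindEntryPoint(assembly)
    ?? throw new InvalidOperationException("No entry point found in the compiled assembly");

// Main may take no arguments or a single string[] args parameter
var parameters = entryPoint.GetParameters().Length == 0 
    ? null 
    : new object[] { Array.Empty<string>() };

var result = entryPoint.Invoke(null, parameters);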

// Handle async returns
if (result is Task task)
{
    await task;
}

6. Console Output Capture

For REPL scenarios, capture Console output:

var outputBuilder = new StringBuilder();
var originalOut = Console.Out;

try
{
    using var outputWriter = new StringWriter(outputBuilder);
    Console.SetOut(outputWriter);
    
    // Execute code (Console.SetOut is process-wide, so avoid running captures concurrently)
    
    await outputWriter.FlushAsync();
    var output = outputBuilder.ToString();
}
finally
{
    Console.SetOut(originalOut);
}
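The same pattern can be wrapped in a small reusable helper. This is a sketch with hypothetical names (ConsoleCapture, Capture); it handles synchronous actions only and, like the snippet above, redirects output for the whole process while it runs.

```csharp
using System;
using System.IO;

// Hypothetical helper: runs an action and returns whatever it wrote to Console.Out.
public static class ConsoleCapture
{
    public static string Capture(Action action)
    {
        var originalOut = Console.Out;
        using var writer = new StringWriter();

        try
        {
            Console.SetOut(writer);
            action();

            // Flush before reading so buffered text is not lost.
            writer.Flush();
            return writer.ToString();
        }
        finally
        {
            // Always restore the original writer, even if the action throws.
            Console.SetOut(originalOut);
        }
    }
}
```

Usage: `string output = ConsoleCapture.Capture(() => Console.Write("hi"));` leaves `output` equal to "hi".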

Implementation in RoslynStone

Architecture Decision

We use both approaches strategically:

RoslynScriptingService (Scripting API)
- Used for REPL functionality
- State preservation between executions
- Simple expression evaluation
- Quick prototyping
CompilationService + AssemblyExecutionService (Compilation API)
- Used for file execution
- Proper assembly unloading
- Memory isolation
- Production-grade execution

Services Created

CompilationService

// Compiles C# code to in-memory assemblies
public class CompilationService
{
    public CompilationResult Compile(string code, string? assemblyName = null)
    {
        // Uses CSharpCompilation API
        // Returns MemoryStream with compiled assembly
    }
}

AssemblyExecutionService

// Executes assemblies in unloadable contexts
public class AssemblyExecutionService
{
    public async Task<AssemblyExecutionResult> ExecuteFileAsync(
        string filePath, 
        CancellationToken cancellationToken = default)
    {
        // 1. Compile code
        // 2. Create UnloadableAssemblyLoadContext
        // 3. Load assembly from stream
        // 4. Find and invoke entry point
        // 5. Unload context
        // 6. Verify unloading with WeakReference
    }
}

UnloadableAssemblyLoadContext

// Custom context for assembly isolation
public class UnloadableAssemblyLoadContext : AssemblyLoadContext
{
    public UnloadableAssemblyLoadContext() 
        : base(isCollectible: true) { }
}

Best Practices Summary

✅ DO

Use AssemblyLoadContext for any dynamically loaded assemblies
Set isCollectible: true when creating the context
Use WeakReference to verify unloading
Call Unload() and force garbage collection
Manage metadata references carefully
Capture and handle compilation errors properly
Find entry points for both traditional and top-level statements
Handle async return types (Task, Task<T>)
Capture console output if needed
Dispose MemoryStreams after loading assemblies

❌ DON'T

Don't load assemblies in the default context if you need to unload them
Don't forget to call Unload() on the context
Don't hold references to objects from the unloaded context
Don't use the Scripting API when you need assembly unloading
Don't forget required assembly references (System.Runtime is critical)
Don't emit to disk unless necessary (use MemoryStream)
Don't forget to reset Console.Out after capturing output
Don't ignore compilation diagnostics
Don't assume synchronous execution (handle Task returns)
Don't forget to flush output writers before reading captured output

Memory Management Pattern

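// Correct pattern for dynamic compilation and execution.
// Requires: using System.Runtime.CompilerServices;
// All strong references (context, assembly, entry point) live inside a
// non-inlined method so they are out of scope before unloading is verified.
[MethodImpl(MethodImplOptions.NoInlining)]
static WeakReference CompileAndRun(CSharpCompilation compilation)
{
    var context = new UnloadableAssemblyLoadContext();
    var contextRef = new WeakReference(context, trackResurrection: true);

    try
    {
        // 1. Compile
        using var ms = new MemoryStream();
        var result = compilation.Emit(ms);
        if (!result.Success)
            throw new InvalidOperationException("Compilation failed; inspect result.Diagnostics");

        // 2. Load
        ms.Seek(0, SeekOrigin.Begin);
        var assembly = context.LoadFromStream(ms);

        // 3. Execute
        var entryPoint = FindEntryPoint(assembly)
            ?? throw new InvalidOperationException("No entry point found");
        var parameters = entryPoint.GetParameters().Length == 0
            ? null
            : new object[] { Array.Empty<string>() };
        entryPoint.Invoke(null, parameters);
    }
    finally
    {
        // 4. Unload (initiates unloading; collection happens later)
        context.Unload();
    }

    return contextRef;
}

var weakRef = CompileAndRun(/* your CSharpCompilation */ compilation);

// 5. Verify unloading
for (int i = 0; i < 10 && weakRef.IsAlive; i++)
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
}

if (weakRef.IsAlive)
{
    // Memory leak detected - something is still holding a reference
    Console.WriteLine("Warning: Assembly context was not unloaded");
}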

Security Considerations

Code Execution Risk: Dynamic compilation executes arbitrary code
- Run in sandboxed environments
- Implement code review/validation
- Use least-privilege execution
Resource Limits:
- Set execution timeouts
- Monitor memory usage
- Limit CPU usage
Assembly References:
- Only add necessary references
- Avoid loading privileged assemblies
- Validate assembly sources
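As one way to approach the "set execution timeouts" item, the sketch below races the work against a delay. The names ExecutionLimits and RunWithTimeoutAsync are hypothetical. Note the limitation: cancellation is cooperative, so code that ignores the token keeps running in the background, and hard limits need process-level isolation instead.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: reports whether the work finished within the timeout.
public static class ExecutionLimits
{
    public static async Task<bool> RunWithTimeoutAsync(
        Func<CancellationToken, Task> work,
        TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource();

        // Capture the token up front so the lambda never touches a disposed source.
        CancellationToken token = cts.Token;

        Task workTask = Task.Run(() => work(token));
        Task finished = await Task.WhenAny(workTask, Task.Delay(timeout));

        if (finished == workTask)
        {
            // Await again so exceptions thrown by the work propagate to the caller.
            await workTask;
            return true;
        }

        // Timed out: request cancellation. This cannot forcibly stop the work.
        cts.Cancel();
        return false;
    }
}
```

A caller would typically wrap the entry point invocation in the work delegate and treat a false result as a failed execution.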

Performance Considerations

First Compilation: ~500-1000ms (includes JIT)
Subsequent Compilations: ~200-300ms
Unloading: ~50-100ms (with forced GC)
Memory: Each loaded assembly context adds ~1-5MB overhead

Optimization Tips:

Cache compilation results when possible
Reuse AssemblyLoadContext instances for similar operations
Batch multiple compilations
Use OptimizationLevel.Release for production
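The first tip, caching compilation results, could look roughly like this. It is a sketch under assumptions: CompilationCache and GetOrCompile are hypothetical names, the compile delegate stands in for whatever produces the emitted assembly bytes, and Convert.ToHexString and SHA256.HashData require .NET 5 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;

// Hypothetical cache: maps a hash of the source text to the emitted assembly bytes.
public sealed class CompilationCache
{
    private readonly ConcurrentDictionary<string, byte[]> _images = new();

    public byte[] GetOrCompile(string code, Func<string, byte[]> compile)
    {
        // Identical source text always produces the same key.
        string key = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(code)));

        // Under contention the factory may run more than once; only one result is stored.
        return _images.GetOrAdd(key, _ => compile(code));
    }
}
```

Caching the raw bytes rather than a loaded Assembly keeps the cache independent of any load context, so each execution can still load into a fresh collectible context and be unloaded afterwards.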

Testing

Essential tests to include:

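[Fact]
public void Assembly_CanBeUnloaded()
{
    // The helper owns every strong reference to the context
    WeakReference weakRef = LoadExecuteAndUnload();

    // Force GC
    for (int i = 0; i < 10 && weakRef.IsAlive; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }

    Assert.False(weakRef.IsAlive, "Assembly context was not unloaded");
}

// NoInlining keeps the context local out of the test method's frame,
// where it could stay reachable (especially in Debug builds) and defeat the check
[MethodImpl(MethodImplOptions.NoInlining)]
private static WeakReference LoadExecuteAndUnload()
{
    var context = new UnloadableAssemblyLoadContext();
    var weakRef = new WeakReference(context, trackResurrection: true);

    // Load and execute assembly

    context.Unload();
    return weakRef;
}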

Conclusion

The key insight from Laurent Kempé's approach is that AssemblyLoadContext with isCollectible: true is essential for proper memory management in dynamic compilation scenarios. Without it, every dynamically loaded assembly stays in memory forever, leading to memory leaks.

Combine it with proper use of:

Roslyn's CSharpCompilation API
WeakReference for verification
Correct reference management
Proper entry point discovery

Together, these enable production-grade dynamic code execution with full control over the memory lifecycle.