Qwen3-Coder-30B-A3B-n8n-Workflow-Generator

n8nbuilder.dev

Fine-tuned Qwen3-Coder-30B-A3B-Instruct model specialized for generating n8n workflow JSONs from natural language descriptions.

Model Description

This model is a QLoRA fine-tune of Qwen/Qwen3-Coder-30B-A3B-Instruct on the n8nbuilder-n8n-workflows-dataset, a collection of 2,500+ n8n workflow templates.

Key Features:

  • MoE Architecture: Mixture of Experts (A3B) design for efficient inference
  • 30B Parameters: Large-scale model with 30 billion total parameters
  • Faster Inference: the MoE design yields faster inference than a comparably sized dense model
  • Apple Silicon Optimized: MLX Q4 quantized version available for Mac M4 Pro and other Apple Silicon devices

Training Details:

  • Base Model: Qwen/Qwen3-Coder-30B-A3B-Instruct
  • Method: QLoRA (4-bit quantization)
  • LoRA Rank: 8
  • LoRA Alpha: 16
  • LoRA Dropout: 0.05
  • Training Steps: 451 (3 epochs)
  • Sequence Length: 8192 tokens
  • Learning Rate: 1e-4
  • Total Sequences: 2,308
  • Total Tokens: 28,426,032
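
For reference, these hyperparameters map onto a PEFT LoraConfig roughly as follows. This is a sketch rather than the published training script; in particular, the target module list is an assumption (the projection layers commonly adapted in Qwen-family models), not something documented above.

from peft import LoraConfig

# LoRA settings matching the reported hyperparameters.
# target_modules is an assumption; the actual training script is not published.
lora_config = LoraConfig(
    r=8,                # LoRA rank
    lora_alpha=16,      # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)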

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "mbakgun/qwen3-coder-30b-a3b-n8n-workflow-generator"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

system_prompt = "You are an expert n8n workflow generation assistant. Your goal is to create valid, efficient, and functional n8n workflow configurations."

user_input = "Create a workflow that monitors an RSS feed and sends new items to Discord."

# Instruct models expect the chat template rather than a raw concatenated prompt
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.7,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
workflow_json = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(workflow_json)
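
Since generated workflows may require manual validation (see Limitations), a quick sanity check is to parse the output as JSON before importing it into n8n. This is a minimal sketch; it assumes the model emits the workflow as a single JSON object, possibly surrounded by prose:

import json

# Extract the outermost JSON object from the generated text and parse it
start = workflow_json.find("{")
end = workflow_json.rfind("}")
if start != -1 and end > start:
    try:
        workflow = json.loads(workflow_json[start:end + 1])
        print(f"Parsed workflow with {len(workflow.get('nodes', []))} nodes")
    except json.JSONDecodeError as err:
        print(f"Invalid JSON, review manually: {err}")
else:
    print("No JSON object found in the output")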

MLX (Apple Silicon)

# Generate with the MLX Q4 quantized model (fetched from the Hub on first run).
# $'...' quoting lets the shell expand the \n escapes into real newlines.
mlx_lm.generate \
  --model mbakgun/qwen3-coder-30b-a3b-n8n-workflow-generator/mlx-q4 \
  --prompt $'You are an expert n8n workflow generation assistant. Your goal is to create valid, efficient, and functional n8n workflow configurations.\n\nCreate a workflow that sends Slack notifications when GitHub issues are created.' \
  --max-tokens 4096 \
  --temp 0.7
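
The same model can also be driven from Python via the mlx-lm package. A minimal sketch, mirroring the Transformers example above; the model path is the mlx-q4 directory published in this repo, and temperature is left at the library default since the sampling API has changed across mlx-lm versions:

from mlx_lm import load, generate

# Load the Q4-quantized weights (path as published in this repo)
model, tokenizer = load("mbakgun/qwen3-coder-30b-a3b-n8n-workflow-generator/mlx-q4")

messages = [
    {"role": "system", "content": "You are an expert n8n workflow generation assistant."},
    {"role": "user", "content": "Create a workflow that sends Slack notifications when GitHub issues are created."},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=4096)
print(text)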

Using LoRA Adapter

If you want to load the base model and apply the LoRA adapter separately:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_name = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
adapter_name = "mbakgun/qwen3-coder-30b-a3b-n8n-workflow-generator"

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

model = PeftModel.from_pretrained(base_model, adapter_name)
model = model.merge_and_unload()  # Optional: merge adapter into base model
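
After merging, the adapter weights are folded into the base model, so the result can be saved and reloaded as a standalone checkpoint without peft installed (the output path below is illustrative):

# Persist the merged model; the directory name is illustrative
model.save_pretrained("qwen3-coder-30b-n8n-merged")
tokenizer.save_pretrained("qwen3-coder-30b-n8n-merged")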

Training Data

This model was fine-tuned on the n8nbuilder-n8n-workflows-dataset, which contains 2,500+ public n8n workflow templates, yielding 2,308 training sequences and 28,426,032 tokens in total.
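
The dataset is published on the Hugging Face Hub and can be inspected with the datasets library (the "train" split name is an assumption):

from datasets import load_dataset

# Dataset id as cited in this card; the split name is assumed to be "train"
ds = load_dataset("mbakgun/n8nbuilder-n8n-workflows-dataset", split="train")
print(ds)      # schema and row count
print(ds[0])   # a single workflow template record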

Architecture: Mixture of Experts (MoE)

This model uses the A3B ("Active 3 Billion") MoE architecture, which means:

  • Total Parameters: 30B parameters
  • Active Parameters: Only ~3B parameters are activated per token during inference
  • Efficiency: Significantly faster inference compared to dense 30B models
  • Expert Routing: Intelligent routing mechanism selects the most relevant experts for each token

The MoE architecture enables this model to maintain the quality of a 30B parameter model while achieving inference speeds closer to a 3B parameter model.
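
As a back-of-envelope illustration (an estimate derived from the figures above, not a benchmark): per-token decode compute scales with the number of active parameters, at roughly 2 FLOPs per active parameter per generated token.

# Rough per-token compute comparison: dense 30B vs. this MoE model (A3B)
total_params = 30e9    # all experts combined
active_params = 3e9    # parameters actually routed per token

dense_flops = 2 * total_params    # ~60 GFLOPs/token if all 30B were dense
moe_flops = 2 * active_params     # ~6 GFLOPs/token with ~3B active
print(f"~{dense_flops / moe_flops:.0f}x less compute per generated token")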

Performance

  • Training: Fine-tuned on Fireworks.ai infrastructure
  • Inference: Optimized for fast inference with MoE architecture
  • MLX Performance: Excellent performance on Apple Silicon (M4 Pro) with Q4 quantization
  • Memory Efficient: QLoRA training enables efficient fine-tuning with minimal memory overhead

Model Files

This repository contains:

  • Adapter: LoRA adapter weights (adapter_model.safetensors, adapter_config.json)
  • Merged Model: Full fine-tuned model (base + adapter merged)
  • MLX Q4: Quantized model optimized for Apple Silicon (mlx-q4/ directory)

Limitations

  • Generated workflows may require manual validation
  • Long workflows (>8192 tokens) may be truncated
  • Model trained on public templates only
  • MoE routing may occasionally select suboptimal experts

Citation

@misc{qwen3_coder_n8n_2025,
  title={Qwen3-Coder-30B-A3B-n8n-Workflow-Generator},
  author={mbakgun},
  year={2025},
  howpublished={\url{https://huggingface.co/mbakgun/qwen3-coder-30b-a3b-n8n-workflow-generator}},
  note={Base model: Qwen/Qwen3-Coder-30B-A3B-Instruct; dataset: mbakgun/n8nbuilder-n8n-workflows-dataset}
}

Acknowledgments

  • Qwen Team for the base model
  • n8n for the workflow automation platform
  • n8n-mcp for template indexing

License

Apache 2.0
