WUSH: Near-Optimal Adaptive Transforms for LLM Quantization • Paper • 2512.00956 • Published 7 days ago • 17
TiDAR: Think in Diffusion, Talk in Autoregression • Paper • 2511.08923 • Published 25 days ago • 110
A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • Article • Feb 4 • 27
The Smol Training Playbook 📚: The secrets to building world-class LLMs • Running on CPU Upgrade • Featured • 2.53k
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-hadamard-transform 17B • Updated Nov 5 • 8
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-identity-transform 17B • Updated Nov 5 • 9
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-rtn-hadamard-transform 17B • Updated Nov 5 • 7
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-rtn-identity-transform 17B • Updated Nov 5 • 5
ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-hadamard-transform 7B • Updated Nov 5 • 113
ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-identity-transform-actorder 7B • Updated Nov 5 • 5
ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-hadamard-transform-actorder 7B • Updated Nov 5 • 5
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-hadamard-transform-fake_quant Updated Nov 3 • 4 • 1