MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper โข 2509.16197 โข Published Sep 19 โข 56
TransMLA: Multi-head Latent Attention Is All You Need Paper โข 2502.07864 โข Published Feb 11 โข 58