Aditya Kumar Singh

rodo

http://rodosingh.github.io/

AI & ML interests

Multimodal Learning

upvoted 5 papers 9 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 22

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 96

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17 • 17

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17 • 30

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 171

upvoted a collection about 1 year ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 348