Aditya Kumar Singh

rodo · http://rodosingh.github.io/
  • rodosingh23
  • rodosingh

AI & ML interests

Multimodal Learning

Organizations

  • AMD
  • AIG-GenAI

upvoted 5 papers 9 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 22

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 96

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17 • 17

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17 • 30

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 171
upvoted a collection about 1 year ago

Qwen2.5-Coder

Collection
Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 348