Rethinking Multimodality from an Industry Perspective: Captioning Is Far More Important Than You Think Article • 9 days ago • 3
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published 13 days ago • 25
SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers Paper • 2507.20527 • Published Jul 28 • 5
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 26
HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption Paper • 2310.01779 • Published Oct 3, 2023 • 4