Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Updated Apr 6, 2025
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
We introduce YesBut-v2, a benchmark for assessing AI's ability to interpret juxtaposed comic panels with contradictory narratives. Unlike existing benchmarks, it emphasizes visual understanding, comparative reasoning, and social knowledge.