Introducing All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages (ALM-Bench), a culturally diverse multilingual and multimodal VQA benchmark covering 100 languages with 22.7K question-answer pairs. ALM-Bench spans 19 generic and culture-specific domains for each language, enriched with four diverse question types. Built with over 800 hours of human annotation and meticulously verified by native-language experts, ALM-Bench assesses the next generation of massively multilingual multimodal models in a standardized way, pushing the boundaries of LMMs toward better cultural understanding and inclusivity.
Project: https://mbzuai-oryx.github.io/ALM-Bench/
Paper: https://arxiv.org/abs/2411.16508
Dataset: https://huggingface.co/datasets/MBZUAI/ALM-Bench
GitHub: https://github.com/mbzuai-oryx/ALM-Bench
Thanks to all the volunteers and the entire core team! Ashmal Vayani, Dinura Dissanayake, Hasindri Watawana, Noor Ahsan, Nevasini Sasikumar, Omkar Thawakar, Aman Chadha, Hisham Cholakkal, Rao Muhammad Anwer, Michael Felsberg, Jorma Laaksonen, Thamar Solorio, Monojit Choudhury, Ivan Laptev, Mubarak Shah, Salman Khan, and Fahad Khan.