About Me
I am Zhiyu Zhou (周志宇 in Chinese), a senior undergraduate student in Software Engineering at Jilin University, advised by Assoc. Prof. Hongxia Xie.
My current research focuses on multimodal large language models (MLLMs), embodied AI, and computer vision. I am particularly interested in the synergy among perception, understanding, and action, with the goal of building intelligent systems that can interact more naturally with the physical world and better align with human-like cognitive processes.
My long-term goal is to advance artificial intelligence from understanding toward perception and action in the real world.
Email: zhouzy1622@mails.jlu.edu.cn
News
- 2026.02: Our work MindPower was accepted to CVPR 2026.
Publications
PinpointQA: A Dataset and Benchmark for Small Object-Centric Spatial Understanding in Indoor Videos
PinpointQA focuses on small object-centric spatial understanding in indoor videos, providing a dataset and benchmark for evaluating fine-grained spatial perception and reasoning.
MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents
CVPR 2026. MindPower explores Theory-of-Mind reasoning in vision-language-model-based embodied agents, aiming to improve how agents infer intentions, beliefs, and goals during embodied interaction.
Education
Expected
B.Eng. in Software Engineering
Experience
Affective Vision Computing (AVC) Lab, Jilin University
Honors and Awards
Jilin University