Zheng, K. (CSE) – Towards Generalist Embodied World Models: From Neuro-Symbolic Interaction to Self-Evolving 3D World Generation
Artificial intelligence is moving beyond passive perception toward systems that can understand, interact with, and generate the world. This dissertation studies generalist embodied world models that connect language, vision, action, and 3D scene representations. It explores how multimodal systems can ground human instructions in physical environments, reason over long-horizon tasks, generate coherent text-and-visual content, and […]