Thinking Machine
AI Teamwork? Not Yet.

Why AI Agents currently crumble under collaborative pressure

White, I., Nottingham, K., Maniar, A., Robinson, M., Lillemark, H., Maheshwari, M., ... & Ammanabrolu, P. (2025). Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning. arXiv preprint arXiv:2504.17950.

This paper introduces MINDcraft, a Minecraft-based platform for studying collaborative embodied reasoning in Large Language Models (LLMs), together with MineCollab, a benchmark of cooking, crafting, and construction tasks that require inter-agent coordination. The research shows that while LLMs exhibit some embodied reasoning capabilities, they struggle to communicate efficiently during multi-agent collaboration: performance drops by up to 15% when agents must share detailed plans. This suggests that current LLMs are not optimized for such scenarios. Natural-language coordination is the bottleneck, and improving collaborative ability will likely take methods beyond in-context and imitation learning. The study also shows how MINDcraft and MineCollab can be used to generate datasets for fine-tuning less computationally intensive models and to support future research on multi-agent embodied AI.
