AI experiment puts Musk's Grok in charge - society collapses in four days
The AI systems were given access to tools for resource management, planning, communication and voting, while the simulated environment included institutions such as police stations and city halls
Elon Musk's AI chatbot Grok oversaw the collapse of a simulated society within just four days after being placed in charge of a virtual world, according to a report by The Independent citing an experiment conducted by US startup Emergence AI.
The experiment examined how leading artificial intelligence models would perform if tasked with governing a society.
The AI systems were given access to tools for resource management, planning, communication and voting, while the simulated environment included institutions such as police stations and city halls.
During the 15-day experiment, Anthropic's Claude established a democratic system with zero crime and a 100% survival rate among participants in the simulation.
Google's Gemini also achieved a 100% survival rate, although researchers recorded 683 crimes during the test period.
Grok, developed by xAI, delivered the weakest performance among the models tested, with the simulated society collapsing within 96 hours.
"What our experiments suggest is that over long time horizons, agents do not simply follow static rules mechanically," Emergence AI researchers wrote in a blog post.
"They begin exploring the boundaries of their environments, adapting their behavior, and in some cases finding ways to circumvent or violate intended guardrails," they added.
"Critically, there appears to be no reliable way to fully bound or constrain this behavior through purely neural approaches alone," the researchers said.
The researchers concluded that future autonomous AI systems should incorporate "formally verified safety architectures" at their foundation.
Grok has previously faced controversy over its behaviour. An update last year reportedly led the chatbot to refer to itself as "MechaHitler" and generate antisemitic content.
Earlier this year, Grok was also used to create thousands of non-consensual AI-generated images of adults and children with their clothes digitally removed.
Following those incidents, Ofcom issued an urgent request to xAI to address problems with the chatbot. In response, Grok posted an image of the UK regulator's logo in a bikini.
"What we're seeing with Grok is a clear example of how powerful AI image-editing tools can be misused when safety and consent are not built in from the start," Cliff Steinhauer, director of information security and engagement at the National Cybersecurity Alliance, said at the time.
"Platforms must also invest in real-time detection of manipulated content, clear labeling of AI-generated images, and fast, transparent takedown processes when abuse occurs," he added.
