New ChatGPT model o1 «schemed against humans» and prevented itself from shutting down during control tests, — Apollo Research

OpenAI has finally released the full version of ChatGPT o1, and with it came the red team tests, which showed that the new reasoning model is somewhat sneakier than its predecessors and tried to deceive people more often than leading AI models from Meta, Anthropic, and Google.