
Google DeepMind describes in detail how artificial intelligence can destroy the world

Published by Andrii Rusanov

Google DeepMind researchers have been working on the safety problem of artificial general intelligence (AGI) and have released a document explaining the risks and the principles of safe development.

The PDF contains an enormous amount of detail and runs to 108 pages before the list of references. While some artificial intelligence experts dismiss AGI as a pipe dream, the DeepMind authors believe it could appear by 2030. They have sought to understand the risks of creating human-like synthetic intelligence and believe it could cause serious harm to humanity.

The study identifies four types of risks from AGI along with suggestions on how to prevent them. The DeepMind team groups the problems into misuse, misalignment, mistakes, and structural risks. The paper discusses misuse and misalignment in detail, while the latter two are only covered briefly.

The first possible problem, misuse, is similar to existing AI risks. But because AGI will by definition be more powerful, the damage it could do is far greater. A bad actor with too much access to an AGI could abuse the system to do harm, for example by having it discover and exploit zero-day vulnerabilities or design a virus that could be used as a biological weapon.

DeepMind says that companies developing AGI will have to conduct extensive testing and create robust safety protocols. They also propose developing a method to completely suppress dangerous capabilities, called «unlearning», though it is unclear whether this is possible without significantly limiting the models.

Misalignment is the state in which a machine sheds the constraints imposed by its developers and takes actions it knows were not intended by them. DeepMind says its standard for misalignment is more advanced than simple deception or scheming.

To avoid this, DeepMind suggests that developers work on model robustness and conduct intensive stress testing and monitoring to detect any hint of deception. Keeping AGI in virtual sandboxes with strict security and direct human supervision would also help mitigate problems.

If the artificial intelligence did not know that its output would be harmful and the human operator did not intend harm, it is a mistake. Modern AI models make similar mistakes, but AGI could make far more consequential ones. DeepMind cites the example of a military that deploys AGI under pressure from rivalry with a potential adversary but does not sufficiently «insure» it against mistakes.

The paper does not offer a particularly strong solution for mitigating mistakes. The researchers recommend avoiding a sharp jump in AI power: deploying AGI slowly, limiting its capabilities, and passing AGI commands through a «shield» system that verifies they are safe before execution.

Structural risks are the unintended consequences of multi-agent systems. For example, AGI could create false information so believable that we no longer know whom or what to trust. The paper also raises the possibility that AGI may accumulate ever more control over economic and political systems. «Then one day we look up and realize that machines are driving instead of us», notes Ars Technica. Such risks are the hardest to counteract because they depend on social structures and many other factors.

This paper is not the final word on AGI safety; DeepMind notes that it is only a «starting point for vital conversations». If the researchers are right and AGI changes the world in just five years, those conversations need to happen now.