



This is a continuation of the article on artificial intelligence and global risks.

...

brought the control rod to its last position. Fermi made several measurements and calculations, and then began the process of withdrawing the rod again in small steps. At 3:25 pm Fermi ordered the rod withdrawn another 12 inches. "This should do it," said Fermi. "Now it will become self-sustaining. The trace will climb and keep climbing; it will not level off."

Herbert Anderson recounts (Rhodes, 1986):

"At first you could hear the sound of the neutron counter, click-clack, click-clack. Then the clicks came more and more rapidly, and after a while they merged into a roar; the counter could no longer keep up. It was time to switch to the chart recorder. But when the switch was made, everyone stared in sudden silence at the mounting deflection of the recorder's pen. It was a significant silence. Everyone understood the significance of that switch; we were in a higher-intensity regime and the counters could no longer handle the situation. Again and again the scale of the recorder had to be changed to accommodate the neutron intensity, which was growing more and more rapidly. Suddenly Fermi raised his hand. 'The pile has gone critical,' he announced. No one present had any doubt about it."

Fermi let the pile run for 28 minutes, with the neutron intensity doubling every two minutes. That first critical reaction had k, the effective neutron multiplication factor, of 1.0006. But even at k = 1.0006 the pile was controllable only because some of the neutrons from uranium fission are delayed - they come from the decay of short-lived fission byproducts. For every 100 fissions of U235, 242 neutrons are emitted almost immediately (within 0.0001 s), and 1.58 neutrons are emitted an average of ten seconds later. The average neutron lifetime was ~0.1 seconds, which implies 1200 generations in 2 minutes, and a doubling time of 2 minutes because 1.0006 raised to the 1200th power is approximately 2. A nuclear reaction that is prompt critical is critical without the contribution of delayed neutrons. If Fermi's pile had been prompt critical with k = 1.0006, neutron intensity would have doubled roughly every tenth of a second.
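To make the arithmetic concrete, here is a small Python sketch (my own illustration, not part of the original text) reproducing the figures above: 1.0006 raised to the 1200th power is roughly 2, which with ~0.1-second neutron generations gives a doubling time of about two minutes, while the same k applied to prompt (~0.0001-second) generations would double the intensity in roughly a tenth of a second.

    import math

    k = 1.0006                                  # effective neutron multiplication factor
    gens_to_double = math.log(2) / math.log(k)  # generations needed to double intensity (~1156)

    print(k ** 1200)                            # ~2.05: 1200 generations of ~0.1 s = 2 minutes per doubling
    print(gens_to_double * 0.1)                 # ~116 s: doubling time with ~0.1 s (delayed-dominated) generations
    print(gens_to_double * 0.0001)              # ~0.12 s: doubling time if the chain were prompt critical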

The first moral of this story is that confusing the speed of AI research with the speed of an actual AI, once built, is like confusing the speed of physics research with the speed of nuclear reactions. It mixes up the map with the territory. It took years to build that first pile, through the efforts of a small group of physicists who did not issue many press releases. But once the pile was built, interesting events occurred on the timescale of nuclear interactions, not on the timescale of human communication. In the nuclear domain, elementary interactions happen much faster than human neurons fire. The same can be said of transistors.

Another moral is that there is a colossal difference between one self-improvement triggering an average of 0.9994 further self-improvements and one self-improvement triggering an average of 1.0006 further self-improvements. The nuclear pile did not cross the threshold of criticality because the physicists suddenly shoveled in a lot of extra material. The physicists added the material slowly and steadily. Even if a brain's intelligence is a smooth function of the optimization pressure previously exerted on that brain, the curve of recursive self-improvement may contain a huge jump.
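A minimal numerical sketch (again my own illustration, under the simplifying assumption that each round of self-improvement scales the cascade by a constant factor) of why 0.9994 and 1.0006 sit on opposite sides of a qualitative divide: iterated long enough, one cascade dies out and the other grows without bound.

    def cascade_size(multiplier: float, steps: int, start: float = 1.0) -> float:
        """Size of a chain reaction after `steps` rounds, each scaling by `multiplier`."""
        size = start
        for _ in range(steps):
            size *= multiplier
        return size

    for m in (0.9994, 1.0006):
        print(m, cascade_size(m, 10_000))  # 0.9994 -> ~0.0025 (fizzles out); 1.0006 -> ~400 (runs away)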

There are other reasons why an AI might make a sudden huge leap in intelligence. The species Homo sapiens made a large jump in the effectiveness of intelligence as a result of natural selection exerting more or less uniform pressure on hominids for millions of years, gradually expanding the brain and prefrontal cortex, tuning the software architecture. A few tens of thousands of years ago, hominid intelligence crossed some key threshold and made a huge leap in real-world effectiveness; we went from caves to skyscrapers in the blink of an evolutionary eye. This happened under constant selection pressure - there was no big jump in the optimization power of evolution when humans appeared. The underlying brain architecture also developed smoothly - our cranial capacity did not suddenly increase by two orders of magnitude. So it may be that, even while an AI is being developed from the outside by human engineers, the curve of its intellectual effectiveness makes a sharp jump.

Or perhaps someone will build a prototype AI that shows some promising results, the demo will attract an additional $100 million in venture capital, and with that money a thousand times as much supercomputing power will be purchased. I doubt that a thousandfold increase in hardware would produce anything like a thousandfold increase in intellectual capability - but that doubt itself is not reliable in the absence of any way to do an analytical calculation.

Compared to chimpanzees, humans have a threefold advantage in brain and a sixfold advantage in prefrontal cortex, which suggests (a) that software is more important than hardware and (b) that small increases in hardware can support large improvements in software. And there is one more thing to consider.

Finally, an AI may make an apparently sharp leap in intelligence purely as a result of anthropomorphism - the human tendency to think of the "village idiot" and Einstein as the extreme ends of the intelligence scale, rather than as nearly indistinguishable points on the scale of minds in general.
Everything dumber than a dumb human may appear to us as simply "dumb". One can imagine the "AI arrow" creeping slowly up the scale of intelligence, passing the levels of mouse and chimpanzee, with the AI still counted as "dumb" because it cannot speak fluently or write scientific papers, and then the AI arrow crosses the thin line between infra-idiot and ultra-Einstein within a month or some similarly short period. I do not think this exact scenario is convincing, mainly because I do not expect the curve of recursive self-improvement to creep along linearly. But I am not the first to point out that "AI" is a moving target. As soon as a milestone is actually reached, it ceases to count as "AI". This can only encourage procrastination.

Let us say, for the sake of discussion, that - based on everything we know (and it seems genuinely possible to me) - an AI has the potential to make a sudden, sharp, huge leap in intelligence. What follows from this? First and foremost: it follows that the reaction I often hear - "We do not need to worry about Friendly AI, because we do not yet have AI" - is mistaken, or simply suicidal. We cannot count on receiving advance warning before an AI is created; past technological revolutions usually did not telegraph themselves to the people alive at the time, whatever was said in hindsight. The mathematics and technique of Friendly AI will not materialize from nowhere when they are needed; it takes years to lay solid foundations. And we need to solve the Friendly AI problem before general AI appears, not after; this should go without saying. There will be difficulties with Friendly AI, because the field of AI itself is in a state of low agreement and high entropy. But that does not mean we should not worry about Friendly AI. It means there will be difficulties. The two statements, unfortunately, are not even remotely equivalent.

The possibility of a sharp jump in intelligence also demands high standards for Friendly AI. The technique cannot rely on the programmers' ability to monitor the AI against its will, to rewrite the AI against its will, to threaten it with superior military force, or on the programmers controlling a "reward button" that a smarter AI could simply take away from them, and so on. Indeed, no one should be relying on such assumptions anyway. The indispensable protection is an AI that does not want to hurt you. Without that, no additional safeguard is safe. No system is secure that searches for ways to defeat its own security. If the AI would hurt humanity in any sense, you must have done something wrong on a very deep level, gotten your basic premises twisted. You are building a shotgun, pointing it at your foot, and pulling the trigger. You are deliberately setting in motion a cognitive dynamic that, under some circumstances, will seek to harm you. That is the wrong behavior for the dynamic; write code that does something else instead.

For much the same reason, Friendly AI programmers should assume that the AI has full access to its own source code. If the AI wants to modify itself to be no longer Friendly, then Friendliness has already failed at the moment the AI forms that intention. Any solution that relies on the AI being unable to modify itself will be defeated one way or another, and will fail even if the AI never decides to modify itself at all. I am not saying this should be the only precaution, but the primary and irreplaceable precaution is that you build an AI that does not want to harm humanity.

To avoid the Giant Cheesecake Fallacy, we should note that the ability to self-improve does not imply the choice to do so. A successful application of Friendly AI technique might create an AI that has the potential to grow faster, but chooses instead to grow more slowly, along a more manageable curve.

Even so, after the AI passes the critical threshold of recursive self-improvement, you are operating in a far more dangerous regime. If Friendliness fails, the AI may decide to rush full speed ahead on self-improvement - metaphorically speaking, to go prompt critical.

I tend to assume arbitrarily large potential jumps in intelligence because (a) it is the conservative assumption; (b) it discourages proposals to build AI without really understanding it; and (c) large potential jumps strike me as most probable in the real world. If I encountered a domain where the conservative assumption, from a risk-management standpoint, was slow improvement of the AI, then I would demand that the plan not break down catastrophically if the AI lingers at a near-human stage for years or longer. This is not a domain over which I am willing to offer narrow confidence intervals.


8: Equipment. (Hardware.)


People tend to think of large computers as the key enabling factor for AI. This is, to put it mildly, a highly questionable assumption. People discussing AI usually talk about progress in computing hardware because it is easy to measure, unlike progress in understanding intelligence. It is not that there is no progress in understanding, but that such progress cannot be charted on neat presentation graphs. Improvements in understanding are harder to report, and so less gets reported. Rather than thinking about the "minimum" hardware "required" for AI, think instead of a minimum level of researcher understanding that decreases as hardware improves. The better the computing hardware, the less understanding you need to build an AI. The extreme case is natural selection, which used stupendous amounts of brute computational force to create human intelligence with no understanding at all, only the nonrandom retention of random mutations.

Increasing computing power makes it easier to build AI, but there is no obvious reason why increasing computing power would help make that AI Friendly. Increasing computing power makes it easier to use brute force, and easier to combine poorly understood techniques that happen to work. Moore's Law steadily lowers the barrier that keeps us from building AI without a deep understanding of cognition.

It is acceptable to fail at building both AI and Friendly AI. It is acceptable to succeed at both AI and Friendly AI. What is not acceptable is to succeed at AI and fail at Friendly AI. Moore's Law makes the latter much easier. "Easier," but thank goodness not easy. I doubt that AI will be easy at the time it is finally built, simply because there are groups who will exert tremendous effort to build AI, and one of them will succeed as soon as AI first becomes possible to build through tremendous effort.

Moore's Law also mediates the interaction between Friendly AI and other technologies, adding an often-overlooked global risk to those other technologies. We can imagine molecular nanotechnology being developed by a benign multinational governmental consortium that successfully averts the physical-layer dangers of nanotechnology. They have directly prevented accidental replicator releases and, with far greater difficulty, put in place a worldwide defense against hostile replicators; they have restricted access to root-level nanotechnology while distributing configurable nanoblocks, and so on. (See the chapter by Phoenix and Treder in this volume.) Nonetheless, nanocomputers become widely available, either because the attempted restrictions are bypassed or because no restrictions are attempted. And then someone brute-forces an AI that is not Friendly, and the game is over. This scenario is especially worrying because incredibly powerful nanocomputers would be among the first, easiest, and seemingly safest applications of nanotechnology.

What about regulatory control of supercomputers? I would certainly not rely on it to prevent the creation of AI; yesterday's supercomputers are tomorrow's laptops. The standard reply to a regulatory proposal is that when nanocomputers are outlawed, only outlaws will have nanocomputers.

It is hard enough to show that the perceived benefits of restricting proliferation outweigh the inevitable risks of uneven distribution. I would certainly not myself advocate regulatory restrictions on the use of supercomputers for AI research; it is a proposal of dubious benefit that would be opposed by the entire AI community. But in the unlikely event that such a proposal were adopted - very far from the current political process - I would not spend significant effort fighting it, because I do not think the good guys need access to today's supercomputers. Friendly AI is not about brute force.

I can imagine regulators effectively controlling the small set of ultra-expensive computing resources that are now called supercomputers. But computers are everywhere. This is not like nuclear nonproliferation, where the main lever is control of plutonium and enriched uranium. The raw materials for AI are already everywhere. That cat is so far out of the bag that it is in your wristwatch, your cell phone, and your dishwasher. This, too, is a special and unusual feature of AI as a global risk. We are separated from the risky process not by large visible installations such as isotope centrifuges or particle accelerators, but only by missing knowledge. To use a perhaps over-dramatic metaphor: it is as if subcritical masses of enriched uranium powered cars and ships throughout the world before Leo Szilard first thought of the chain reaction.

9: Threats and prospects. (Threats and promises.)

It is a risky intellectual endeavor to try to predict specifically how a benevolent AI would help humanity, or how an unfriendly AI would harm it. There is the risk of the conjunction fallacy: every added detail necessarily reduces the joint probability of the entire story, yet subjects tend to assign higher probabilities to stories that include well-specified added details. (See the chapter by Yudkowsky on cognitive biases potentially affecting judgment of global risks, in this volume.) There is the risk - almost a certainty - of a failure of imagination in trying to envision a future scenario; and there is the risk of the Giant Cheesecake Fallacy, which leaps from capability to motive.
Nevertheless, I will try to outline the threats and the promises. The future has a reputation for accomplishing feats the past thought impossible. Future civilizations have even broken what past civilizations considered (incorrectly, of course) to be laws of physics. If prophets of the year 1900 - never mind the year 1000 - had tried to bound the powers of human civilization a billion years hence, some of the impossibilities they named would have been accomplished before the century was out; transmuting lead into gold, for example. Because we remember future civilizations surprising past civilizations, it has become a cliché that we cannot put limits on our great-great-grandchildren. And yet everyone in the twentieth century, in the nineteenth century, and in the eleventh century, was human.
We can distinguish three families of unreliable metaphors for imagining the capabilities of an AI smarter than human:
- G-factor metaphor: inspired by differences in individual intelligence between humans. AIs will patent new technologies, publish breakthrough papers, make money on the stock market, or lead political power blocs.
- Historical metaphor: inspired by the differences between past and future human civilizations. AIs will swiftly introduce the set of capabilities usually associated with human civilization a century or a millennium from now: molecular nanotechnology; interstellar travel; computers performing 10^25 operations per second.

- Species metaphor: inspired by differences in brain architecture between species. AIs will master magic.

The G-factor metaphor is the most popular in modern futurism: when people think of intelligence, they think of human geniuses rather than humans in general. In stories of hostile AI, G-factor metaphors make for a "good story" in Bostrom's sense: an opponent powerful enough to create dramatic tension, but not powerful enough to instantly squash the heroes like flies, and ultimately weak enough to lose in the final chapters of the book. Goliath against David is a good story; Goliath against a fruit fly is not.

If we accept the G-factor metaphor, then the global catastrophic risks of this scenario are relatively mild: a hostile AI is not much more of a threat than a hostile human genius.

If we assume a multiplicity of AIs, then we have a metaphor of conflict between an AI tribe and the human tribe. If the AI tribe wins the military conflict and wipes out the humans, that is a global catastrophe of the Bang variety (Bostrom, 2001). If the AI tribe comes to dominate the world economically and gains effective control over the destiny of Earth-originating intelligent life, but the AIs' goals are not interesting or worthwhile from our standpoint, that is a Shriek, a Whimper, or a Crunch. But how likely is it that an AI will cross the entire vast gap from amoeba to village idiot, and then stop at the level of human genius? The fastest observed neurons fire 1000 times per second; the fastest axons transmit signals at 150 meters per second, a half-millionth of the speed of light; each synaptic operation dissipates around 15,000 attojoules, more than a million times the thermodynamic minimum for irreversible computation at room temperature (kT ln 2 at 300 K, about 0.003 attojoules per bit). It would be physically possible to build a brain that computes a million times faster than a human brain, without shrinking its size, running it at lower temperatures, or invoking reversible or quantum computing. If a human mind were thus accelerated, a subjective year of thinking would elapse every 31 physical seconds in the outside world, and a millennium would fly by in eight and a half hours. Vinge (1993) called such sped-up minds "weak superintelligence": a mind that thinks like a human, but much faster.
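The speedup and energy figures in this paragraph are easy to check; the short Python sketch below (my own back-of-the-envelope illustration, assuming a flat millionfold speedup and the standard Landauer bound) reproduces them.

    import math

    SPEEDUP = 1e6                                # assumed millionfold acceleration
    YEAR_S = 365.25 * 24 * 3600                  # seconds in a year

    print(YEAR_S / SPEEDUP)                      # ~31.6 physical seconds per subjective year
    print(1000 * YEAR_S / SPEEDUP / 3600)        # ~8.8 h: roughly the eight and a half hours quoted above

    k_B, T = 1.380649e-23, 300.0                 # Boltzmann constant (J/K), room temperature (K)
    landauer_aJ = k_B * T * math.log(2) * 1e18   # Landauer limit, in attojoules per bit erased
    print(landauer_aJ)                           # ~0.003 aJ per bit
    print(15_000 / landauer_aJ)                  # ~5e6: 15,000 aJ per synaptic event is indeed
                                                 # more than a million times the thermodynamic minimum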

Suppose such an extremely fast mind arises, embedded in the midst of the human technological civilization existing at that time. A failure of imagination would be to say: "No matter how fast it thinks, it can only affect the world at the speed of its manipulators; it cannot operate machinery faster than it can order human hands to work; therefore a fast mind is no great danger." There is no law of nature requiring physical operations to crawl along at the pace of seconds. Characteristic times for molecular reactions are measured in femtoseconds, sometimes picoseconds.

Drexler (1992) analyzed controllable molecular manipulators that would complete more than 10^6 molecular operations per second - note this in connection with the earlier theme of millionfold speedups. (The smallest physically meaningful increment of time is usually taken to be the Planck interval, 5·10^-44 seconds, on which scale even dancing quarks are statues.)

Imagine that humanity were locked in a box and could affect the outside world only through glacially slow movements of alien tentacles, or mechanical arms, moving at a few microns per second. We would then concentrate all our creative power on finding the shortest path to building fast manipulators in the outside world. When one thinks of fast manipulators, molecular nanotechnology immediately comes to mind, though there may be other ways. What is the shortest path that could get you to nanotechnology in the slow outside world, if you had eons of time to think over every move? The answer is that I do not know, because I do not have eons of time to think. Here is one imaginable fast pathway:

- Crack the protein folding problem, to the extent of being able to generate DNA strings whose folded peptide sequences fill specific functional roles in a complex chemical interaction.

- Email sets of DNA strings to one or more online laboratories offering DNA synthesis, peptide sequencing, and FedEx delivery. (Many laboratories currently offer such services, and some advertise 72-hour turnaround times.)

- Find at least one human connected to the Internet who can be paid, blackmailed, or fooled with the right cover story into receiving the FedExed vials and mixing them in a specified environment.

- The synthesized proteins form a very primitive "wet" nanosystem which, ribosome-like, is capable of accepting external instructions; perhaps patterned acoustic vibrations delivered by a speaker attached to the beaker.

- Use this extremely primitive nanosystem to build more sophisticated systems, bootstrapping to molecular nanotechnology - or beyond.

The elapsed time for the whole procedure would plausibly be on the order of a week from the moment the fast intelligence cracked the protein folding problem. Of course, I invented this scenario entirely. Perhaps in 19,500 years of subjective time (one week of physical time at a millionfold speedup) I would think of a simpler way. Perhaps one can pay for faster courier delivery than FedEx. Perhaps there are existing technologies, or slight modifications of existing technologies, that combine synergistically with simple protein machinery. Perhaps, if you are smart enough, you can use patterned electric fields to alter reaction pathways in existing biochemical processes. I do not know. I am not that smart.

The challenge is to chain your capabilities together - the analogue, in the real world, of combining weak vulnerabilities in a computer system to obtain root access. If one path is blocked, you choose another, always seeking to increase your capabilities and use them in synergy. The presumptive goal is to build fast infrastructure: means of manipulating the external world on a large scale in a short time. Molecular nanotechnology fits this criterion, first because its elementary operations are fast, and second because there is a ready supply of perfect parts - atoms - that can be used for self-replication and exponential growth of the nanotechnological infrastructure. The pathway discussed above has the AI obtaining fast infrastructure within a week, which sounds fast to a human with 200 Hz neurons but is a vastly longer time for the AI.

Once the AI possesses fast infrastructure, further events happen on the AI's timescale, not on a human timescale (unless the AI prefers to act on a human timescale). With molecular nanotechnology, the AI could (potentially) rewrite the solar system unopposed.

An unFriendly AI with molecular infrastructure (or other fast infrastructure) need not bother with marching robot armies, blackmail, or subtle economic coercion. The unFriendly AI has the ability to repattern all the matter in the solar system according to its optimization target. This is fatal for us if the AI does not choose specifically according to the criterion of how that transformation affects existing patterns such as biology and people. The AI neither hates you nor loves you, but you are made of atoms it can use for something else. The AI runs on a different timescale than you do; by the time your neurons finish thinking the words "I should do something," you have already lost. A Friendly AI plus molecular nanotechnology is presumably powerful enough to solve any problem that can be solved either by moving atoms or by creative thinking. One should beware of failures of imagination: curing cancer is a popular contemporary target for philanthropy, but it does not follow that a Friendly AI with molecular nanotechnology would say to itself, "Now I shall cure cancer." Perhaps a better way to frame the problem is that human cells are not programmable. Solving that problem cures cancer as a special case, along with diabetes and obesity. A fast, positive intelligence wielding molecular nanotechnology has the power to get rid of disease in general, not just cancer.

The last family of metaphors is the species metaphor, based on between-species differences in intelligence. Such an AI has "magic" - not in the sense of spells or potions, but in the sense that a wolf cannot understand how a gun works, or what kind of effort it takes to make a gun, or the nature of the human power that lets us invent guns.

Vinge (1993) writes: "Strong superhumanity would be more than cranking up the clock speed on a human-equivalent mind. It's hard to say precisely what strong superhumanity would be like, but the difference appears to be profound. Imagine running a dog mind at very high speed. Would a thousand years of doggy living add up to any human insight?"

The species metaphor is the closest analogy a priori, but it does not lend itself to constructing detailed stories. The main advice this metaphor gives us is that we had better get Friendly AI right, which is good advice in any case. The only defense it suggests against hostile AI is not to build it in the first place, which is also very valuable advice. Absolute power is a conservative engineering assumption with respect to a Friendly AI that has been incorrectly designed. If an AI hurts you with magic, its Friendliness was flawed in any case.


10: Local and majoritarian strategies. (Local and majoritarian strategies.)


You can classify the proposed risk reduction strategies as follows:

  • Strategies that require unanimous cooperation: strategies that can be catastrophically defeated by individual defectors or small groups.
  • Strategies that require majority action: a majority of legislators in one country, or a majority of voters in a country, or a majority of the countries in the UN - strategies that require most, but not all, of some large group to act in a certain way.
  • Strategies that require local action: a concentration of will, talent, and funding that crosses a threshold for some specific task.

Unanimous strategies are not workable, which does not stop people from proposing them.

Majoritarian strategies sometimes work, if you have decades in which to do your job. A movement must be built, and years pass before it is acknowledged as a force in public policy and before it overcomes opposing factions. Majoritarian strategies take considerable time and demand enormous effort. People have tried such things before, and history records some successes. But beware: history books tend to focus selectively on the movements that had an impact, as opposed to the majority that never amounted to anything. There is an element of luck involved, and of the public's prior willingness to listen. The critical moments of such a strategy involve events beyond our control. If you are not willing to devote your whole life to pushing a particular majoritarian strategy, do not bother; and even a whole devoted lifetime may not be enough.

Local strategies are usually the most plausible. It is not easy to obtain a hundred million dollars in funding, and it is not easy to push through a global political change, but it is still vastly easier to obtain a hundred million dollars than to push through a global political change. Two assumptions that would favor a majoritarian strategy for AI are:

- A majority of Friendly AIs can effectively protect the human species from a small number of unFriendly AIs.

- The first AI to be built cannot by itself cause catastrophic damage.

This essentially recapitulates the situation of human civilization before the development of nuclear and biological weapons: most people cooperate within the overall social structure, and defectors can do some damage, but not catastrophic damage. Most AI researchers will not want to build unFriendly AIs. If someone knows how to build a stable Friendly AI - if the problem is not completely beyond contemporary knowledge and technique - researchers will learn of successful results from one another and repeat them. Legislation could (for example) require researchers to publish their Friendliness strategies, or penalize researchers whose AIs cause damage; and while such laws will not prevent all mistakes, they may suffice to ensure that a majority of AIs are built Friendly.

We can also present a scenario that involves a simple local strategy:

- The first AI cannot by itself cause catastrophic damage.

- If even a single Friendly AI exists, that AI plus human institutions can fend off any number of unFriendly AIs.

This easy scenario holds if human institutions can reliably distinguish Friendly AIs from unFriendly ones, and grant revocable power into the hands of Friendly AIs. Then we could pick and choose our allies. The only requirement is that the Friendly AI problem be solvable (as opposed to being completely beyond human ability).

Both of the above scenarios assume that the first AI (the first powerful, general AI) cannot by itself cause globally catastrophic damage. Most concrete visualizations implying this use a G-factor metaphor: AI as an analogue of an exceptionally gifted human. In Section 7, on rates of intelligence increase, I listed several reasons to suspect a huge, fast jump in intelligence:

- The distance from the village idiot to Einstein, which looms large to us, is a small dot on the scale of minds in general.

- Hominids made a sharp jump in real-world effectiveness of intelligence, even though natural selection exerted roughly uniform pressure on their genome.

- An AI may absorb an enormous amount of additional hardware after reaching some threshold of competence (that is, eat the Internet).

- There is a critical threshold of recursive self-improvement. One self-improvement triggering an average of 1.0006 further self-improvements is qualitatively different from one self-improvement triggering an average of 0.9994 further self-improvements.

As described in Section 9, a sufficiently powerful AI may need only a very short time (from a human perspective) to achieve molecular nanotechnology, or some other form of fast infrastructure. We can now see the full significance of the first-mover effect in superintelligence. The first-mover effect is that the outcome for Earth-originating intelligent life depends primarily on the makeup of whichever mind first reaches some key threshold of intelligence, such as criticality of self-improvement. The two necessary assumptions are:

- The first AI to reach some key threshold (e.g., criticality of self-improvement), if it is not Friendly, can wipe out the human species.

- The first AI to reach that threshold, if it is Friendly, can prevent hostile AIs from arising or from harming the human species, or find other creative ways to ensure the survival and prosperity of Earth-originating intelligent life.

More than one scenario qualifies as a first-mover effect. Each of the following examples reflects a different key threshold:

- A post-criticality, self-improving AI reaches superintelligence within weeks or less. AI projects are sufficiently sparse that no other AI reaches criticality before the first mover becomes powerful enough to overcome all opposition. The key threshold is criticality of recursive self-improvement.

- AI-1 cracks protein folding three days before AI-2. AI-1 achieves molecular nanotechnology six hours before AI-2. With fast manipulators, AI-1 can (potentially) disable AI-2's research and development before AI-2 matures. The runners are close, but whoever crosses the finish line first wins. The key threshold is fast infrastructure.

- The AI that first absorbs the Internet can (potentially) keep other AIs out of it. Afterward, through economic domination, covert action, blackmail, or superior ability at social manipulation, the first AI halts or slows other AI projects, so that no other AI ever arises. The key threshold is absorption of a unique resource.

The human species, Homo sapiens, is a first mover. From an evolutionary perspective, our cousins the chimpanzees lag behind us by only a hairsbreadth. Homo sapiens got all the technological marvels because we got here a little earlier. Evolutionary biologists are still trying to work out the order of the key thresholds, because the first-mover species was first to cross so many of them: speech, technology, abstract thought. We are still trying to understand which domino knocked over which. The upshot is that Homo sapiens is the first mover, with no rival looming behind. The first-mover effect implies a theoretically local strategy (a task that can, in principle, be carried out by a strictly local effort), but it invokes a technical challenge of extreme difficulty. We only need to get Friendly AI right in one place and at one time, not every time everywhere. But it has to be gotten right on the first try, before someone else builds an AI to a lower standard.

I cannot perform a precise calculation based on a precisely confirmed theory, but my current opinion is that sharp jumps in intelligence are possible, likely, and constitute the dominant probability. This is not a domain in which I am willing to give narrow confidence intervals, and therefore a strategy must not fail catastrophically - must not leave us worse off than before - if a sharp jump in intelligence does not occur. But a much more serious problem is posed by strategies visualized for slow-growing AIs, which fail catastrophically if there is a first-mover effect. This is the more serious problem because:

- Faster-growing AI is the technically more difficult challenge.

- Like a truck driving over a bridge rated for trucks, an AI designed to remain Friendly in extremely difficult conditions should (presumably) remain Friendly in less difficult conditions. The reverse is not true.

- Fast jumps in intelligence are counterintuitive in terms of everyday social reality. The G-factor metaphor for AI is intuitive, appealing, reassuring, and conveniently implies fewer design constraints.

- My current guess is that the curve of intelligence increase does contain huge, sharp (potential) jumps.

My current strategic outlook therefore tends to focus on the difficult local scenario: the first AI must be Friendly. With that as a safeguard, if no fast jump in AI intelligence materializes, we can fall back on the strategy of making a majority of AIs Friendly. In either case, the technical effort that went into preparing for the extreme case of a first mover does not leave us worse off.

A scenario that would require an impossible (unanimous) strategy:

- A single AI can be powerful enough to destroy humanity despite the protective efforts of Friendly AIs.

- No AI is powerful enough to prevent human researchers from building one AI after another (or to find some other creative way of solving the problem).

It is good that this balance of abilities seems improbable a priori, because in such a scenario we are doomed. If you deal out cards from a deck one after another, you will eventually deal out the ace of clubs.

The same problem applies to the strategy of deliberately building AIs that choose not to increase their capabilities past a fixed limit. If capped AIs are not powerful enough to defeat uncapped AIs, or to prevent them from arising, then the capped AIs cancel out of the equation. We keep playing the game until we eventually turn up a superintelligence, whether it is the ace of hearts or the ace of clubs. Majoritarian strategies only work if it is not possible for a single defector to cause catastrophic damage. For AI, that possibility or impossibility is a property of the design space itself - it is no more subject to human decision than the speed of light or the gravitational constant.

11: AI and increased human intelligence. (AI versus human intelligence enhancement).

I do not find it plausible that Homo sapiens will continue to exist into the indefinite future - thousands or millions or billions of years - without any mind ever breaking through the upper bound of human intelligence. If so, there will come a time when humanity first faces the challenge of smarter-than-human intelligence. And if we win the first round of that challenge, humankind will be able to call on smarter-than-human intelligence in the rounds that follow.

Perhaps we would rather take some path to smarter-than-human intelligence other than AI - for example, enhance humans instead. To take an extreme case, suppose someone says: "The prospect of AI worries me. I would rather that, before any AI is developed, individual humans be scanned into computers, neuron by neuron, and then upgraded, slowly but surely, until they are super-smart; and that this be the ground on which humanity confronts the challenge of superintelligence."

Here we face two questions: is this scenario possible? And if so, is it desirable? (It is wiser to ask the questions in that order, for reasons of rational discipline: we should avoid getting emotionally attached to attractive options that are not actually options.)

Imagine a human scanned into a computer, neuron by neuron, as proposed by Moravec (1988). This necessarily implies computing capacity considerably greater than the computing power of the human brain. By hypothesis, the computer runs a detailed simulation of a biological human brain, executed with sufficient fidelity to avoid any detectable high-level effects from systematic low-level errors.

Every biological detail that in any way affects information processing must be simulated with enough precision that the overall flow of processing remains isomorphic to the original. To simulate the messy biological computer that is a human brain, we need considerably more useful computing power than is embodied in the messy human brain itself.

The most probable way we will acquire the ability to scan a brain neuron by neuron - with enough resolution to capture every cognitively relevant aspect of neural structure - is advanced molecular nanotechnology. (4)

(Footnote 4) Although Merkle (1989) suggests that non-revolutionary development of imaging technologies, such as electron microscopy or optical sectioning, might suffice for uploading a whole brain.

Molecular nanotechnology would probably allow the construction of a desktop computer with total processing power exceeding the aggregate brainpower of the entire human population (Bostrom, 1998; Moravec, 1999; Merkle and Drexler, 1996; Sandberg, 1999). Moreover, if the technology allows us to scan a brain with sufficient fidelity to execute the scan as code, it implies that for some years beforehand the technology existed to obtain extremely detailed pictures of processes in neural circuitry, and that, presumably, researchers did their best to understand them. Furthermore, to upgrade an upload - to transform the brain scan so as to increase the intelligence of the mind within - we must understand in fine detail the high-level functions of the brain and what useful contribution they make to intelligence.

Furthermore, humans are not designed to be improved, whether by outside neuroscientists or by recursive self-improvement from within. Natural selection did not build the human brain to be conveniently hackable by humans. All the complex machinery in the brain is adapted to operate within narrow parameters of brain design. Suppose you could make a human smarter, let alone superintelligent; would he remain sane? The human brain is very easy to unbalance; merely shifting the balance of neurotransmitters can trigger schizophrenia or other disorders. Deacon (1997) gives an excellent account of the evolution of the human brain, of how delicately the brain's elements are balanced, and of how this is reflected in modern brain dysfunctions. The human brain is not end-user-modifiable.

All this makes it rather implausible that the first human will be scanned into a computer and sanely upgraded before anyone anywhere first builds an AI. At the point where technology first becomes capable of uploading, that will require immensely more computing power, and far better cognitive science, than is required to build an AI. Building a Boeing 747 from scratch is not easy. But is it easier to:

  • start with the design of an existing biological bird,
  • and incrementally modify that design through a series of successive stages,
  • each stage independently viable,
  • so that the endpoint is a bird scaled up to the size of a 747,
  • which actually flies,
  • as fast as a 747,
  • and then carry out this series of transformations on an actual living bird,
  • without killing the bird or making it suffer unbearably?

I am not saying this could never be done. I am saying that it is easier to build the 747, and then to have the 747, metaphorically speaking, upgrade the bird. "Let's just scale up an existing bird to the size of a 747" is not a clever strategy for avoiding contact with the intimidatingly complex theoretical mysteries of aerodynamics. Maybe at the start all you know about flight is that a bird has the mysterious essence of flight, while the materials from which you must build the 747 are just lying there on the ground. But you cannot sculpt the mysterious essence of flight, even though it already resides in the bird, until flight has ceased to be a mysterious essence to you.

The above argument is offered as a deliberately extreme case. The general point is that we do not have total freedom to choose a path that looks nice and reassuring, or that would make a good story for a science fiction novel. We are constrained by which technologies are likely to precede which others. I am not against scanning human beings into computers and making them smarter, but it seems exceedingly unlikely that this will be the arena in which humanity first confronts the challenge of smarter-than-human intelligence. With various limited subsets of the technology and knowledge required to upload and upgrade humans, one could probably instead:
- upgrade biological brains in place (for example, by adding new neurons that usefully integrate themselves into existing processing);
- or productively interface computers with biological human brains;
- or productively interface human brains with one another;
- or construct an AI.

Furthermore, it is one thing to enhance an average person, while preserving his sanity, to IQ 140, and quite another to enhance a Nobel laureate to something beyond human. (Setting aside quibbles about IQ, or Nobel Prizes, as measures of pure intelligence; forgive me my metaphors.) Taking piracetam (or drinking caffeine) may or may not make at least some people smarter, but it will not make you substantially smarter than Einstein. It gives us no significant new abilities; we do not move on to the next level of the problem; we do not cross the upper bound of the intelligence available to us for dealing with global risks. From the standpoint of managing global risk, any intelligence-enhancement technology that does not produce a (sane, positively inclined) mind literally smarter than human raises the question of whether the same time and effort could be spent more productively finding exceptionally intelligent existing humans and setting them to work on the same problem.

Moreover, the farther you depart from the "natural" design bounds of the human brain - the ancestral condition to which the brain's individual components are adapted - the greater the danger of individual insanity. If augmented humans are substantially smarter than ordinary humans, that too is a global risk. How much damage could augmented evil humans do? How creative are they? The first question that comes to my mind is: "Creative enough to build their own recursively self-improving AI?"

Radical techniques for enhancing human intelligence raise their own safety questions. Again, I am not saying these problems are technically unsolvable; I am only pointing out that the problems exist. AI has debatable safety issues; so does human intelligence enhancement. Not everything that clanks is your enemy, and not everything that squishes is your friend. On the one hand, a nice human starts out with all the immense moral, ethical, and architectural complexity that describes what we mean by a "friendly" decision. On the other hand, an AI can be designed for stable recursive self-improvement and engineered with safety in mind: natural selection did not build the human brain with multiple rings of precautionary measures, conservative decision processes, and whole orders of magnitude of safety margin.

Human intelligence enhancement is a question in its own right, not a subtopic of AI, and there is no room in this chapter to discuss it in detail. It is worth mentioning that I considered both human intelligence enhancement and AI at the start of my career, and decided to devote my efforts to AI - primarily because I did not expect useful, beyond-human-level techniques for enhancing human intelligence to arrive in time to make a significant difference to the development of recursively self-improving AI. I would be glad to be proven wrong about this. But I do not think it is a viable strategy to deliberately choose not to work on Friendly AI while others work on human intelligence enhancement, in the hope that augmented humans will solve the Friendly AI problem better. I am not willing to embrace a strategy that fails catastrophically if human intelligence enhancement takes longer than building AI. (Or vice versa.) I fear that working with biology will simply take too long - too much inertia, too much fighting against bad design decisions already made by natural selection. I fear that regulatory agencies will not approve of experiments on humans. And even human geniuses take years to learn their art; the less time an augmented human has in which to learn, the harder it is to augment someone to that level.

I would be pleasantly surprised if augmented humans showed up and built a Friendly AI before anyone else could. But anyone who would like to see that outcome should probably work hard on speeding up intelligence-enhancement technologies; it would be difficult to convince me to slow down. If AI is naturally far more difficult than intelligence enhancement, no harm is done; if building a 747 is naturally easier than scaling a bird up to the size of a 747, then the delay could be fatal. There is a relatively small region of possibility within which deliberately not working on Friendly AI could help, and a large region within which it would be either irrelevant or dangerous. Even if human intelligence enhancement is possible, there are real, complicated safety questions; we would have to seriously ask whether we want Friendly AI to precede intelligence enhancement, or vice versa.

I do not assign strong confidence to the assertion that Friendly AI is easier than human augmentation, or that it is safer. There are many imaginable ways to enhance a human mind. Perhaps there is some technique that is easier and safer than AI, and yet powerful enough to make a difference to global risks. If so, I might switch my line of work. But I wished to point out some considerations that weigh against the unquestioned assumption that human intelligence enhancement is easier, safer, and powerful enough to play a prominent role.


12: Interaction between AI and other technologies. (Interactions of AI with other technologies).

Accelerating a desirable technology is a local strategy, while slowing down a dangerous technology is a difficult majoritarian strategy. Halting or relinquishing an undesirable technology tends to require an impossible unanimous strategy. I propose that we think not in terms of developing or not developing particular technologies, but in terms of our pragmatically available ability to accelerate or slow them down; and that we ask, within those limits, which technologies we would prefer to see developed before or after which others.

In nanotechnology, the commonly proposed goal is to develop defensive shields before offensive technology arrives. I worry a great deal about this, because a given level of offensive technology usually requires much less effort than a technology that can defend against it. Offense has outweighed defense through most of human history. Guns were made hundreds of years before bulletproof vests. Smallpox was used as a weapon of war before the invention of the smallpox vaccine. Today there is no defense against a nuclear explosion; nations are protected not by defenses that exceed offenses, but by a balance of offensive threats. Nanotechnology thus looks like an intrinsically difficult problem. So, should we prefer that nanotechnology precede the development of AI, or that AI precede the development of nanotechnology? Posed that way, it is something of a trick question. The answer has little to do with the intrinsic difficulty of nanotechnology as a global risk, or with the intrinsic difficulty of AI. Insofar as we worry about ordering, the question should rather be: "Will AI help us deal with nanotechnology? Will nanotechnology help us deal with AI?"

It looks to me as though a successful resolution of the AI problem would help us considerably in dealing with nanotechnology. I do not see how nanotechnology would make the development of Friendly AI any easier. If powerful nanocomputers make it easier to create AI without making it easier to solve the separate problem of Friendliness, that is a negative interaction between technologies. Therefore, other things being equal, I would greatly prefer that Friendly AI precede nanotechnology in the ordering of technological developments. If we cope with the challenge of AI, we can count on Friendly AI's help with nanotechnology. If we create nanotechnology and survive, we will still have to face the challenge of AI afterward.

Generally speaking, success with Friendly AI should help solve nearly any other problem. Therefore, if a technology makes AI neither easier nor harder, but carries some global risk of its own, we should prefer, other things being equal, to confront the challenge of AI first. Any technology that increases available computing power decreases the minimum theoretical sophistication needed to build an AI, while not helping with Friendliness at all, and I count it as negative on net. Moore's Law of Mad Science: every eighteen months, the minimum IQ needed to destroy the world drops by one point. Success in enhancing human intelligence would make Friendly AI easier, and would also help with other technologies. But human augmentation is not necessarily safer, or easier, than Friendly AI; nor does it lie within the realistic range of our abilities to reverse the natural ordering of human intelligence enhancement and Friendly AI, if one technology is naturally much easier than the other.


13: Progress on Friendly AI. (Making progress on Friendly AI.)

"We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer."
---- McCarthy, Minsky, Rochester, and Shannon (1955).
The proposal for the Dartmouth Summer Research Project on Artificial Intelligence is the first recorded use of the phrase "Artificial Intelligence". They had no prior experience to warn them that the problem was hard. I would call it an honest mistake that they said "a significant advance can be made," rather than "there is a small chance of a significant advance." The former is a specific claim about the difficulty of the problem and the time to solution, and that specificity only compounds the improbability. But if they had said "there is a small chance," I would have no objection. How could they know?

The Dartmouth proposal included, among others, the following topics: linguistic communication, linguistic reasoning, neural nets, abstraction, randomness and creativity, interaction with the environment, modeling the brain, originality, prediction, invention, discovery, and self-improvement.

(Footnote 5) This is usually, though not universally, true. The final chapter of the widely used textbook Artificial Intelligence: A Modern Approach (Russell and Norvig, 2003) includes a section on the ethics and risks of artificial intelligence, mentions I. J. Good's intelligence explosion and the Singularity, and calls for further research. But as of 2006 this attitude is the exception rather than the rule.

It seems to me now that an AI capable of language, abstract thought, creativity, interaction with the environment, originality, prediction, invention, discovery, and above all self-improvement is well past the point where it also needs to be Friendly. The Dartmouth proposal says nothing about building a positive / good / benevolent AI. Questions of safety are not mentioned even for the purpose of dismissing them. And this was in that innocent summer when human-level AI seemed just around the corner. The Dartmouth proposal was written in 1955, before the Asilomar conference on biotechnology, before thalidomide babies, before Chernobyl and September 11. If the idea of artificial intelligence were being proposed for the first time today, someone would feel obliged to ask what specifically was being done to manage the risks. I cannot say whether that is a good change in our culture or a bad one. I am not saying whether it produces good or bad science. But the point remains that if the Dartmouth proposal had been written fifty years later, one of its topics would have had to be safety.

At the time of this writing, in 2006, the AI research community still does not see Friendly AI as part of the problem. I wish I could cite a reference to this effect, but I cannot cite an absence of literature. Friendly AI is absent from the concept space, not merely unpopular or unfunded. You cannot even call Friendly AI a blank spot on the map, because there is no notion that something is missing. (5) If you have read popular or semi-technical books proposing how to build AI, such as Gödel, Escher, Bach (Hofstadter, 1979) or The Society of Mind (Minsky, 1986), you may recall that you did not see Friendly AI discussed as part of the challenge. Nor have I seen Friendly AI discussed as a technical problem in the technical literature. My attempted literature search turned up mostly brief, non-technical papers, unconnected to one another, with no major reference in common except Isaac Asimov's "Three Laws of Robotics" (Asimov, 1942). Bearing in mind that it is now 2006, why is it that not many AI researchers who

To be continued...


