Architects of Intelligence


  Of course, I don’t know how much patience the major car companies have. I do think everyone is committed to the idea that AI-driven cars are going to come, and of course the major car companies feel they must be there early or miss a major opportunity.

  MARTIN FORD: I usually tell people a 10-15-year time frame when they ask me about self-driving cars. Your estimate of five years seems quite optimistic.

  STUART J. RUSSELL: Yes, five years is optimistic. As I said, I think we’ll be lucky if we see driverless cars in five years, and it could well be longer. One thing that is clear, though, is that many of the early ideas of fairly simple architectures for driverless cars are now being abandoned, as we gain more experience.

  In the early versions of Google’s car, they had chip-based vision systems that were pretty good at detecting other vehicles, lane markers, obstacles, and pedestrians. Those vision systems passed that information along, effectively in a sort of logical form, and then the controller applied logical rules telling the car what to do. The problem was that every day, Google found themselves adding new rules. Perhaps they would go into a traffic circle—or a roundabout, as we call them in England—and there would be a little girl riding her bicycle the wrong way around the traffic circle. They didn’t have a rule for that circumstance, so then they had to add a new one, and so on, and so on. I think that there is probably no possibility that this type of architecture is ever going to work in the long run, because there are always more rules that should be encoded, and a missing rule can be a matter of life and death on the road.

  By contrast, we don’t play chess or Go by having a bunch of rules specific to one exact position or another—for instance, saying if the person’s king is here and their rook is there, and their queen is there, then make this move. That’s not how we write chess programs. We write chess programs by knowing the rules of chess and then examining the consequences of various possible actions.

  A self-driving car AI must deal with unexpected circumstances on the road in the same way, not through special rules. It should use this form of lookahead-based decision-making when it doesn’t have a ready-made policy for how to operate in the current circumstance. If an AI doesn’t have this approach as a fallback, then it’s going to fall through the cracks in some situations and fail to drive safely. That’s not good enough in the real world, of course.
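
  The contrast Russell draws between fixed rule lists and lookahead-based decision-making can be sketched in a few lines of code. This is a toy model: the integer state space, the actions, and the scoring function are invented purely for illustration and have nothing to do with any real driving system.

```python
# A minimal sketch of lookahead-based decision-making, as opposed to a
# hand-written list of special-case rules. States are integers, actions
# shift the state, and an evaluation function scores outcomes.

def step(state, action):
    """Toy world model: predict the next state for a given action."""
    return state + action

def score(state):
    """Toy evaluation: closer to the goal state 10 is better."""
    return -abs(10 - state)

def lookahead_value(state, depth, actions):
    """Best achievable score within `depth` further steps of search."""
    if depth == 0:
        return score(state)
    return max(lookahead_value(step(state, a), depth - 1, actions)
               for a in actions)

def choose_action(state, depth, actions):
    """Pick the action whose predicted consequences score best."""
    return max(actions,
               key=lambda a: lookahead_value(step(state, a), depth - 1, actions))

# From state 0, a 5-step lookahead prefers the action +2: examining
# consequences shows it is the only first move that can reach state 10.
best = choose_action(0, 5, [-1, 1, 2])
```

  The point of the sketch is that no situation-specific rule ("if a cyclist is going the wrong way around the roundabout, do X") appears anywhere; behavior falls out of the model plus the search.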

  MARTIN FORD: You’ve noted the limitations in current narrow or specialized AI technology. Let’s talk about the prospects for AGI, which promises to someday solve these problems. Can you explain exactly what Artificial General Intelligence is? What does AGI really mean, and what are the main hurdles we need to overcome before we can achieve AGI?

  STUART J. RUSSELL: Artificial General Intelligence is a recently coined term, and it really is just a reminder of our real goals in AI—a general-purpose intelligence much like our own. In that sense, AGI is actually what we’ve always called artificial intelligence. We’re just not finished yet, and we have not created AGI yet.

  The goal of AI has always been to create general-purpose intelligent machines. AGI is also a reminder that the “general-purpose” part of our AI goals has often been neglected in favor of more specific subtasks and application tasks. This is because it’s been easier so far to solve subtasks in the real world, such as playing chess. If we look again at AlphaZero for a moment, it generally works within the class of two-player deterministic fully-observable board games. However, it is not a general algorithm that can work across all classes of problems. AlphaZero can’t handle partial observability; it can’t handle unpredictability; and it assumes that the rules are known. AlphaZero can’t handle unknown physics, as it were.

  Now if we could gradually remove those limitations around AlphaZero, we’d eventually have an AI system that could learn to operate successfully in pretty much any circumstance. We could ask it to design a new high-speed watercraft, or to lay the table for dinner. We could ask it to figure out what’s wrong with our dog and it should be able to do that—perhaps even by reading everything about canine medicine that’s ever been known and using that information to figure out what’s wrong with our dog.

  This kind of capability is thought to reflect the generality of intelligence that humans exhibit. And in principle a human being, given enough time, could also do all of those things, and so very much more. That is the notion of generality that we have in mind when we talk about AGI: a truly general-purpose artificial intelligence.

  Of course, there may be other things that humans can’t do that an AGI will be able to do. We can’t multiply million-digit numbers in our heads, and computers can do that relatively easily. So, we assume that in fact, machines may be able to exhibit greater generality than humans do.
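
  The arithmetic point is easy to demonstrate. Exact multiplication of numbers with hundreds of thousands of digits is routine for a machine; the sketch below uses Python’s built-in arbitrary-precision integers, and the particular numbers are arbitrary stand-ins chosen so the result can be checked algebraically.

```python
# Two numbers with 100,001 digits each, chosen so the product has a
# simple closed form: (10^n + 3)(10^n + 7) = 10^(2n) + 10^(n+1) + 21.
n = 100_000
a = 10**n + 3
b = 10**n + 7

product = a * b  # exact, no overflow: Python ints are arbitrary precision
expected = 10**(2 * n) + 10**(n + 1) + 21
```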

  However, it’s also worth pointing out that it’s very unlikely that there will ever be a point where machines are comparable to human beings in the following sense. As soon as machines can read, then a machine can basically read all the books ever written; and no human can read even a tiny fraction of all the books that have ever been written. Therefore, once an AGI gets past kindergarten reading level, it will shoot beyond anything that any human being has ever done, and it will have a much bigger knowledge base than any human ever has.

  And so, in that sense and many other senses, what’s likely to happen is that machines will far exceed human capabilities along various important dimensions. There may be other dimensions along which they’re fairly stunted and so they’re not going to look like humans in that sense. This doesn’t mean that a comparison between humans and AGI machines is meaningless though: what will matter in the long run is our relationship with machines, and the ability of the AGI machine to operate in our world.

  There are dimensions of intelligence (for example, short-term memory) where humans are actually exceeded by apes; but nonetheless, there’s no doubt which species is dominant. If you are a gorilla or a chimpanzee, your future is entirely in the hands of humans. Despite our fairly pathetic short-term memories compared to those of the other apes, we are able to dominate them because of our decision-making capabilities in the real world.

  We will undoubtedly face this same issue when we create AGI: how to avoid the fate of the gorilla and the chimpanzee, and not cede control of our own future to that AGI.

  MARTIN FORD: That’s a scary question. Earlier, you talked about how conceptual breakthroughs in AI often run decades ahead of reality. Do you see any indications that the conceptual breakthroughs for creating AGI have already been made, or is AGI still far in the future?

  STUART J. RUSSELL: I do feel that many of the conceptual building blocks for AGI are already here, yes. We can start to explore this question by asking ourselves: “Why can’t deep learning systems be the basis for AGI? What’s wrong with them?”

  A lot of people might answer our question by saying: “Deep learning systems are fine, but we don’t know how to store knowledge, or how to do reasoning, or how to build more expressive kinds of models, because deep learning systems are just circuits, and circuits are not very expressive after all.”

  And for sure, it’s because circuits are not very expressive that no one thinks about writing payroll software using circuits. We instead use programming languages to create payroll software. Payroll software written using circuits would be billions of pages long and completely useless and inflexible. By comparison, programming languages are very expressive and very powerful. In fact, they are the most powerful things that can exist for expressing algorithmic processes.

  In fact, we already know how to represent knowledge and how to do reasoning: we have been developing computational logic for quite a long time now. Even before computers existed, people were thinking about algorithmic procedures for doing logical reasoning.

  And so, arguably, some of the conceptual building blocks for AGI have already been here for decades. We just haven’t figured out yet how to combine those with the very impressive learning capacities of deep learning.

  The human race has also already built a technology called probabilistic programming, which I will say does combine learning capabilities with the expressive power of logical languages and programming languages. Mathematically speaking, such a probabilistic programming system is a way of writing down probability models which can then be combined with evidence, using probabilistic inference to produce predictions.

  In my group we have a language called BLOG, which stands for Bayesian Logic. BLOG is a probabilistic modeling language, so you can write down what you know in the form of a BLOG model. You then combine that knowledge with data, and you run inference, which in turn makes predictions.

  A real-world example of such a system is the monitoring system for the nuclear test-ban treaty. The way it works is that we write down what we know about the geophysics of the earth, including the propagation of seismic signals through the earth, the detection of seismic signals, the presence of noise, the locations of detection stations, and so on. That’s the model—which is expressed in a formal language, along with all the uncertainties: for example, uncertainty in our ability to predict the speed of propagation of a signal through the earth. The data is the raw seismic information coming from the detection stations that are scattered around the world. Then there is the prediction: What seismic events took place today? Where did they take place? How deep were they? How big were they? And perhaps: Which ones are likely to be nuclear explosions? This system is an active monitoring system today for the test-ban treaty, and it seems to be working pretty well.
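
  The general pattern here, a probability model combined with evidence via inference to produce predictions, can be illustrated in miniature. The sketch below is plain Python rather than BLOG, and the prior and likelihoods are invented numbers, but the shape of the computation is the same: write down the model, supply observations, and infer a posterior.

```python
# A miniature Bayesian model in the spirit of the monitoring example:
# did a seismic event occur, given noisy detections at stations?
# All probabilities are made up for illustration.

# Prior: how likely is an event on a given day?
prior = {"event": 0.1, "no_event": 0.9}

# Likelihood: probability a station reports a detection,
# given whether an event actually occurred.
likelihood = {
    "event":    {"detect": 0.8,  "silence": 0.2},
    "no_event": {"detect": 0.05, "silence": 0.95},
}

def posterior(observations):
    """Exact Bayesian inference: P(hypothesis | observations)."""
    unnorm = {}
    for hypothesis, p in prior.items():
        for obs in observations:
            p *= likelihood[hypothesis][obs]
        unnorm[hypothesis] = p
    z = sum(unnorm.values())  # normalizing constant
    return {h: p / z for h, p in unnorm.items()}

# Two independent stations both report a detection: the posterior
# probability of an event rises well above the 10% prior.
post = posterior(["detect", "detect"])
```

  Real systems like the test-ban monitor work over vastly richer models (continuous signals, unknown numbers of events, physical propagation), but the model-plus-evidence-plus-inference structure is the same.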

  So, to summarize, I think that many of the conceptual building blocks needed for AGI or human-level intelligence are already here. But there are some missing pieces. One of them is a clear approach to how natural language can be understood to produce knowledge structures upon which reasoning processes can operate. The canonical example might be: How can an AGI read a chemistry textbook and then solve a bunch of chemistry exam problems—not multiple choice but real chemistry exam problems—and solve them for the right reasons, demonstrating the derivations and the arguments that produced the answers? And then, presumably if that’s done in a way that’s elegant and principled, the AGI should then be able to read a physics textbook and a biology textbook and a materials textbook, and so on.

  MARTIN FORD: Or we might imagine an AGI system acquiring knowledge from, say, a history book and then applying what it’s learned to a simulation of contemporary geopolitics, or something like that, where it’s really moving knowledge and applying it in an entirely different domain?

  STUART J. RUSSELL: Yes, I think that’s a good example because it relates to the ability of an AI system to then be able to manipulate the real world in a geopolitical sense or a financial sense.

  If, for example, the AI is advising a CEO on corporate strategy, it might be able to effectively outplay all the other companies by devising some amazing product marketing acquisition strategies, and so on.

  So, I’d say that the ability to understand language, and then to operate with the results of that understanding, is one important breakthrough for AGI that still needs to happen.

  Another AGI breakthrough still to happen is the ability to operate over long timescales. While AlphaZero is an amazingly good problem-solving system that can think 20, sometimes 30, steps into the future, that is still nothing compared to what the human brain does every moment. At the most primitive level, humans operate by sending motor control signals to our muscles, and just typing a paragraph of text involves several tens of millions of motor control commands. So those 20 or 30 steps would get an AGI only a few milliseconds into the future. As we talked about earlier, AlphaZero would be totally useless for planning the activity of a robot.

  MARTIN FORD: How do humans even solve this problem with so many calculations and decisions to be made as they navigate the world?

  STUART J. RUSSELL: The only way that humans and robots can operate in the real world is to operate at multiple scales of abstraction. We don’t plan our lives in terms of exactly which muscles we are going to actuate in exactly which order. We instead plan our lives in terms of “OK, this afternoon I’m going to try to write another chapter of my book,” and then: “It’s going to be about such and such.” Or things like, “Tomorrow I’m going to get on the plane and fly back to Paris.”

  Those are our abstract actions. And then as we start to plan them in more detail, we break them down into finer steps. That’s common sense for humans. We do this all the time, but we actually don’t understand very well how to have AI systems do this. In particular, we don’t understand yet how to have AI systems construct those high-level actions in the first place. Behavior is surely organized hierarchically into these layers of abstraction, but where does the hierarchy come from? How do we create it and then use it?
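
  A hand-built version of such a hierarchy is easy to sketch, in the spirit of hierarchical task network planning; the tasks and decompositions below are invented for illustration. The hard, unsolved problem Russell is pointing at is precisely that the table of decompositions here is written by hand, rather than constructed by the machine itself.

```python
# A minimal sketch of hierarchical task decomposition: abstract actions
# are refined into finer steps until only primitive actions remain.

# Hand-written hierarchy: each abstract task maps to its sub-steps.
hierarchy = {
    "fly_to_paris": ["travel_to_airport", "board_plane", "deplane"],
    "travel_to_airport": ["pack_bag", "call_taxi", "ride_taxi"],
}

def refine(task):
    """Recursively expand an abstract task into a flat list of primitives."""
    if task not in hierarchy:       # primitive action: no decomposition
        return [task]
    steps = []
    for subtask in hierarchy[task]:
        steps.extend(refine(subtask))
    return steps

plan = refine("fly_to_paris")
# plan == ["pack_bag", "call_taxi", "ride_taxi", "board_plane", "deplane"]
```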

  If we can solve this problem for AI, if machines can start to construct their own behavioral hierarchies that allow them to operate successfully in complex environments over long timescales, that will be a huge breakthrough for AGI that takes us a long way towards a human-level functionality in the real world.

  MARTIN FORD: What is your prediction for when we might achieve AGI?

  STUART J. RUSSELL: These kinds of breakthroughs have nothing to do with bigger datasets or faster machines, and so we can’t make any kind of quantitative prediction about when they’re going to occur.

  I always tell the story of what happened in nuclear physics. The consensus view as expressed by Ernest Rutherford on September 11th, 1933, was that it would never be possible to extract atomic energy from atoms. So, his prediction was “never”, but what turned out to be the case was that the next morning Leo Szilard read Rutherford’s speech, became annoyed by it, and invented a nuclear chain reaction mediated by neutrons! Rutherford’s prediction was “never” and the truth was about 16 hours later. In a similar way, it feels quite futile for me to make a quantitative prediction about when these breakthroughs in AGI will arrive, but Rutherford’s story is a good one.

  MARTIN FORD: Do you expect AGI to happen in your lifetime?

  STUART J. RUSSELL: When pressed, I will sometimes say yes, I expect AGI to happen in my children’s lifetime. Of course, that’s me hedging a bit because we may have some life extension technologies in place by then, so that could stretch it out quite a bit.

  But the fact that we understand enough about these breakthroughs to at least describe them, and that people certainly have inklings of what their solutions might be, suggests to me that we’re just waiting for a bit of inspiration.

  Furthermore, a lot of very smart people are working on these problems, probably more than ever in the history of the field, mainly because of Google, Facebook, Baidu, and so on. Enormous resources are being put into AI now. There’s also enormous student interest in AI because it’s so exciting right now.

  So, all those things lead one to believe that breakthroughs are likely to come at quite a high rate, and that they will be comparable in magnitude to the dozen or so conceptual breakthroughs that happened over the last 60 years of AI.

  So that is why most AI researchers have a feeling that AGI is something in the not-too-distant future. It’s not thousands of years in the future, and it’s probably not even hundreds of years in the future.

  MARTIN FORD: What do you think will happen when the first AGI is created?

  STUART J. RUSSELL: When it happens, it’s not going to be a single finishing line that we cross. It’s going to be along several dimensions. We’ll see machines exceeding human capacities, just as they have in arithmetic, and now chess, Go, and in video games. We’ll see various other dimensions of intelligence and classes of problems that fall, one after the other; and those will then have implications for what AI systems can do in the real world. AGI systems may, for example, have strategic reasoning tools that are superhuman, and we use those for military and corporate strategy, and so on. But those tools may precede the ability to read and understand complex text.

  An early AGI system, by itself, still won’t be able to learn everything about how the world works or be able to control that world.

  We’ll still need to provide a lot of the knowledge to those early AGI systems. These AGIs are not going to look like humans though, and they won’t have even roughly the same abilities across even roughly the same spectrum as humans. These AGI systems are going to be very spiky in different directions.

  MARTIN FORD: I want to talk more about the risks associated with AI and AGI. I know that’s an important focus of your recent work.

  Let’s start with the economic risks of AI, which is the thing that, of course, I’ve written about in my previous book, Rise of the Robots. A lot of people believe that we are on the leading edge of something on the scale of a new industrial revolution. Something that’s going to be totally transformative in terms of the job market, the economy and so forth. Where do you fall on that? Is that overhyped, or would you line up with that assertion?

  STUART J. RUSSELL: We’ve discussed how the timeline for breakthroughs in AI and AGI is hard to predict. Those are the breakthroughs that will enable an AI to do a lot of the jobs that humans do right now. It’s also quite hard to forecast the sequence in which employment categories will come under risk of machine replacement, or to put a timeline around that.

  However, what I see in a lot of the discussions and presentations from people talking about this is that there’s probably an over-estimate of what current AI technologies are able to do, and also an under-estimate of the difficulty of integrating what we know how to do into the existing, extremely complex functioning of corporations and governments, and so on.