Data for Good: Data Science at Columbia
May 31, 2019 - New York City
Jeannette M. Wing is Avanessians Director of the Data Science Institute and Professor of Computer Science at Columbia University. From 2013 to 2017, she was a Corporate Vice President of Microsoft Research. She is Adjunct Professor of Computer Science at Carnegie Mellon where she twice served as the Head of the Computer Science Department and had been on the faculty since 1985. From 2007-2010 she was the Assistant Director of the Computer and Information Science and Engineering Directorate at the National Science Foundation. She received her S.B., S.M., and Ph.D. degrees in Computer Science, all from the Massachusetts Institute of Technology.
MANUELA VELOSO: Thank you very much for coming to this distinguished lecture-- JP Morgan distinguished lecture on AI-- one more in our series, and a very special guest we have today. So Professor Jeannette Wing has many honors. In the invitation, you've seen all the honors she has. I'm not going to say all the boards in which she is-- all the honors that she has, but I will tell you one thing though that you should appreciate.
Professor Wing, she did not write in her biography in a little bio, but she actually introduced to the world the concept of computational thinking. She was a professor at Carnegie Mellon for many years. When I came to Carnegie Mellon, Professor Wing was actually on the faculty, and I was still a PhD student. And then we were colleagues for a long time. And during that time, there was this-- from an educational point of view, this kind of thinking that math, and reading, and liberal arts education, it was not going to cover everything that people needed to know.
And we were in this computer science department. And then Jeannette one day did come up with this concept of the world, and our students, and the outreach-- I mean, and the K through 12-- everybody should start thinking computationally. And I believe you gave many talks. And she really wrote great articles. So the homework number one today is to really Google computational thinking, and you are going to see the tremendous impact that such a concept has had, started by Jeannette.
So that's, I think, the thing that I wanted really to get across. And then Jeannette now is the head of-- the director of Data Science Center at-- in fact, of the inaugural director of the Data Science Center at Columbia University. She was the head of all research at Microsoft-- at the Microsoft Research, where in fact, I was invited to give a talk by Jeannette there. And we actually had a lot of interactions with Microsoft Research.
And before that also, it's important for you also to know that Professor Wing was extremely influential also by heading NSF-- an NSF division for funding efforts. And why is this so important? It's because she gained this perspective of everything that was going on, in terms of research in computer science, in AI, in systems, in programming languages and verification. So she has this very, very large spectrum of knowledge and now a focus on data science.
This is important for you to know. She's probably going to give a great talk on data for good, and introducing data science at Columbia, but at the end, if you have questions that cover this wide spectrum of Professor Wing, feel free to ask her because she's really a genius. Professor Wing, it's a pleasure to have you here as a friend and as a researcher here.
JEANNETTE WING: Well, that is some introduction. Dr. Veloso, is that what I call you?
MANUELA VELOSO: [INAUDIBLE]
JEANNETTE WING: Professor Veloso, OK. I've never heard Manuela call me Professor Wing. We do go back many years on-- she was still a graduate student at Carnegie Mellon, and I have watched her career just skyrocket. And now, she's at JP Morgan doing phenomenally, not surprisingly, and still coming up with really out-of-the-box thinking. So JP Morgan is really lucky to have her.
In fact, I would say, New York City is very lucky to have someone like Manuela here, because I have-- now that I'm in New York and at Columbia, I'm trying to promote the tech industry in New York. And to have someone like Manuela for me to point to and say, look, JP Morgan stole her from Carnegie Mellon, is a really important message. So I'm going to talk about data for good, data science at Columbia University. But as-- I'll just say Manuela-- as Manuela said, because of my experience in industry at Microsoft Research and in government at the National Science Foundation, and of course, in academia at Carnegie Mellon and here at Columbia, I'm more than happy to answer questions on almost anything. So please feel free to ask.
So what I wanted to start out doing was putting in context how I think of data. So I'm going to talk about the data lifecycle. It starts on the very left with the generation of data. Now, in science, we've been generating lots and lots of data for many, many years. If you think about large scientific instruments like the Large Hadron Collider, telescopes in Chile, neutrino detectors at the South Pole, these one-off very expensive scientific instruments have been generating and continue to generate volumes and volumes of data.
And so the sciences have been dealing with lots and lots of data all the time, and I think the finance industry is the same. What's new to the universe is, more recently, that we, people, are generating lots and lots of data. Of course, we've always done that, but through our digital devices and our interaction with the digital world, we are generating lots and lots of data. It's these, if you will, big companies that are collecting all this data about us, to the point that others know more about us than we know about ourselves.
So generation of data, then we collect the data. We don't always collect all the data we generate. We process the data. Under processing, I include encryption, compression, but also the less sexy things of data wrangling, and data cleaning, and so on. Then we actually store the data in some medium. Then we usually store it in a way that we can retrieve it quickly, access it quickly. So that's where all the database data management aspects of data come into play.
Then there's data analysis, and I think this is what is-- it's AI and machine learning that people usually identify with data science. It's this data analysis phase, which really is where the machine learning and the AI are coming to the fore. But it's not enough to spit out probabilities, or yeses or noes, or cats or dogs. One has to visualize these results, and that's where data visualization comes into play.
And it wasn't until I joined Columbia University and I talked to all my colleagues across the university-- in particular, the School of Journalism-- that I really came to appreciate that last step I put on my data lifecycle. The journalists call it storytelling. I call it interpretation. And that is, really, it's not enough to show a pie chart or a bar graph-- you really have to explain to the end user, what am I looking at? Tell a story about the data interpretation.
So I like to emphasize the privacy and ethical concerns throughout this data lifecycle. I think it's very important from even the start of what data do we collect to what data do we analyze. What answers do we provide the end user? So let me also share with you a very succinct definition of data science, which I think this audience will especially appreciate. Data science is the study of extracting value from data. And there are two important words in this definition. The most important word is value, and value is-- and I define it deliberately to be-- value is subject to the interpretation of the reader.
So value to a scientist is discovering new knowledge, but value to a company likely accrues to the bottom line. In fact, it's very likely calculable. And value to a policymaker is information so that the policymaker can make a decision about the local community. So the other important word here is extracting, because it takes a lot of work to get this value from the data. And now I want to share with you my three-part mission statement for the Data Science Institute.
The first is to advance the state of the art in data science. This is about pushing the frontiers of the field, about doing basic research, about doing basic long-term research, and inventing new techniques, and new discoveries, and new science. The second is transforming all fields, professions, and sectors through the application of data science. And this really speaks to the prevalence of AI, machine learning, data science that's affecting all fields, all sectors as we see it today. Everyone has data and data is everywhere. And data today is what feeds those very hungry machine learning algorithms. So with a lot of data, you can do a lot using these machine learning techniques.
And finally, ensure the responsible use of data to benefit society. This really speaks to both benefiting society using data-- and this is really tackling societal grand challenges like health care, energy, climate change, the UN sustainability goals, for instance-- but it's ensure the responsible use, which I inserted in my mission statement, that I really want to emphasize. With a lot of concern that we read about every day in the news, about biased algorithms, or biased models, or biased data that we feed the algorithms to produce biased models, I think it's very important that we technology people try to ensure that what we do will be non-discriminatory, be fair, and take into consideration these ethical concerns.
So I summarize this long-winded mission statement into my tagline data for good-- data to do good for society and also using data in a good manner. So what I wanted to do-- oh, let me just-- a few more facts about the Data Science Institute, and then I'll go through some research stories. So actually, at Columbia University, the Data Science Institute is a university-level and university-wide institute.
So we have over 350 faculty now from 12 different schools across the university. Every single profession is represented. Every single discipline is represented. Everything-- every single sector is represented, from arts and sciences, architecture, business, dentistry, all the engineering disciplines, public policy, journalism, law, medicine, nursing, public health, and social work. So this is partly reflecting the breadth of the university.
We have a few centers that are thematic. I'm not going to belabor this, but just to say that we have some themes that pop out. Like business and financial analytics, we have a center there. We have a center in health analytics, computational social science, and so on. We have a robust master's in data science program at Columbia University. And this is just to show you how we define a minimum bar of what it means to be a data scientist.
I share this with you because in industry today, across the country, across the world, there's a lot of confusion about what is a data scientist, and there are a lot of titles out there. And so I thought, well, let me set a bar for what makes a data scientist. So we have a highly selective program and a very rigorous program. So there are six required courses-- three in computer science, three in statistics, and then the capstone course, which is where an industry person or affiliate comes in, brings industry data to a team of students. They work on that data set, and answer real-world questions driven by the industry affiliate. And of course, everyone gets a job.
So that's our program. I should mention we just started a PhD specialization in data science at Columbia. So if you're already enrolled in certain PhD programs, you can take these additional courses and get it-- a specialization. I wanted to mention a couple of things going on in the education space. There's a program called the Collaboratory that's joint between the Data Science Institute and Columbia Entrepreneurship.
And we have now run this program for a few years, which requires that professors from two or more different disciplines come together, co-design, and co-teach a new course where one of those professors is computer science, or data science, or applied math. And it's really to bring computational and data science expertise to some other domain.
And the one I want to highlight is not just a course, but an eight-course curriculum that was designed by the Business School, along with computer science and data science, to the point where now, 50% of the MBAs coming out of Columbia have had some exposure to data science. So this is phenomenal, and the only limiting factor here is capacity. So if we could, it would probably be 100% eventually.
So what this says to me is, first of all, students are always smart. They see the future. They know what they need to learn while they're in school in order to be a good productive employee in any discipline later. We have a robust industry affiliates program. I'm double checking that JP Morgan is there. Yes it is. And we even this past year expanded it to international companies, like three from China-- Alibaba, Baidu, and [INAUDIBLE] should be up there-- and we have a company from Brazil.
Very recently, we created a center with IBM specific to blockchain and data transparency. This was actually quite a big deal for Columbia, and it was-- it's going strong. We have three tracks-- one on research, one on education, and one on innovation-- which is really an accelerator incubator track, so startups can get nurtured and then spawned out of this center.
So now, what I wanted to do in my remaining time is just to share with you a few research stories to show the kind of work that's going on in Columbia and data science. And I'll just do one on advancing the state of the art, and a few in terms of the other two parts of the statement. I wanted to emphasize that unique to Columbia University, data science-- the foundations of data science build on three pillars of strength-- computer science, of course, statistics, of course, but also operations research.
So at Columbia, we have a very strong OR department in the School of Engineering and a very strong OR group in our Business School. In fact, they all work together. And so from afar, it's really a great strength of Columbia. And OR, a lot of it is optimization, and there's just so many similar techniques and interest. And OR, I think, as a field, is moving more into machine learning, data science, and so on.
So the one story I wanted to share with you, in terms of advancing state of the art, has to do with causal inference. Of course, causal inference has been of interest to statisticians for decades. It is the bread and butter of economics, political science, and so on. And I think the machine learning community has deliberately and rightly been shy to ever infer causality and all the patterns they recognize. They say, it's just a correlation.
But still what we want, decision makers and policymakers especially want to know, does this cause that? Does smoking cause cancer? That's the canonical example. So what I want to share with you is some new results by [INAUDIBLE] Wang and David [INAUDIBLE] on multiple causal inference. And it turns out that the classical causal inference problem is a univariate problem, where you just want to know about a single cause having an effect.
But it turns out that the multiple causal inference problem is actually more of a prevalent problem. And it turns out also to be an easier problem to solve with weaker assumptions. So let me frame it in terms of an example. So pretend I'm a movie director, and I want to choose actors for the movie I'm going to produce. And I want to know how much money am I going to make. I want to predict how much money I'm going to make.
So what happens to movie revenue, if I place a certain actor in my movie? And what I have at my disposal is a little database that says, for this movie, for these actors, I made that amount of money. And then mathematically, in the statistics point of view of causality, we would frame this as solving this equation-- or solving this expression, estimating the potential outcomes of a set of actors in my movie [INAUDIBLE].
Or if you are a computer scientist and you learned causality in some course, or you read about it through Judea Pearl's book, The Book of Why, you might express it in terms of the [INAUDIBLE] notation. They're essentially equivalent. So understanding causality-- and this, by the way, is with multiple causes-- the problem with understanding causality is that there can be many possible confounding factors that can influence the outcome.
And I want to account for those confounding factors, otherwise, I will over count some factors and undercount others. So this problem has many applications, whether it's in genetics-- what genes cause a particular trait-- the people I choose on my sports team-- I want to know how many points I'm going to score-- or prices in a supermarket-- how much money is going to be spent depends on the prices in the supermarket.
So in classical causal inference-- by the way, for the movie example, let me just give you some examples of some confounders, and I think that that will motivate why it's a difficult problem. So if I were making an action movie, the genre of the movie is likely to affect the revenue. Action movies make more than artsy movies. And even knowing that I'm going to make an action movie will likely affect who I'm going to choose to be in my movie.
And then who I might choose in a movie might affect who else I might choose in the movie. And whether the movie is a sequel or not is another confounding factor. So today, when we do classical causal inference, in terms of single-cause inference, the approach is you think about all these confounders-- genre, sequel, other actors, and so on and so forth. That's a human task.
And then, assuming you have thought of all the confounders, then you can plug the numbers in and estimate the causal effects. That's the right-hand side of the equation. The problem is it's a big assumption and it's untestable. But we seem to be OK with that. But this is the way it is. So we make this assumption that we've been really smart, we thought of everything, and then we crank it out.
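The slides are not part of this transcript, so the equation being referred to cannot be reproduced exactly. As a hedged reconstruction, the classical adjustment usually takes the following standard form, where Y_i(a) is the potential revenue of movie i under cast a, A_i is the cast actually assigned, and W_i are the confounders the analyst has listed. The symbols Y, A, and i are assumptions chosen for illustration; W follows the speaker's later reference to "the Ws".

```latex
\mathbb{E}\left[\, Y_i(a) \,\right]
  \;=\;
\mathbb{E}_{W}\!\left[\; \mathbb{E}\left[\, Y_i \mid A_i = a,\; W_i \,\right] \;\right]
```

The equality holds only if W_i really does contain every confounder, which is the large, untestable assumption described here.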
And we get our numbers. We base everything on what we just cranked out, probably forgetting that we made that big assumption. So in the new approach, under the assumption that we have multiple causes, the idea is to construct what's called a deconfounder. And the beauty of this approach-- there are two advantages.
One is there's a weaker assumption, and the other is one can test the model of the deconfounder that you're constructing against a goodness function. So it's more constructive. So the basic idea is to fit a local latent variable model to the assigned causes. Those are the observables. So think of a factor model. And then infer the latent variable for each data point. That's what the z-hat sub i's are.
And it is a substitute confounder. And then instead of using the Ws from the previous slide, which are the confounders I thought of, use the Zs, which come out of the model that I've constructed. And then use the usual right-hand side construct for the substitute confounder in the causal inference. And the only assumption that we need to make is that there's no unobserved single cause confounder.
In the classical causal inference case, we had to assume we thought of all the unobserved confounders. So it's a weaker assumption. Moreover, once you construct this model, you can actually test it for goodness against some function that you define as good. And then there's a proof in the paper that shows it is an unbiased inference. So this is actually, I think, quite a move forward in the grand scheme of causal inference. So if at all you're interested in this, there's a paper you can read.
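The steps just described can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not the method from the paper: the data are synthetic, the factor model is a one-component PCA computed with an SVD, the outcome model is linear, and only one cause (the first actor) truly affects revenue. The effect is estimated for that one cause at a time, because a substitute confounder built as a linear function of all the causes would be collinear with them in a joint linear regression. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 20                       # movies, candidate actors (the causes)

# Hidden confounder (think "genre"); the estimator never sees it directly.
z = rng.normal(size=n)
causes = z[:, None] + rng.normal(size=(n, m))    # every cause depends on z

true_effect, confounder_effect = 1.0, 2.0        # assumed ground truth
revenue = (true_effect * causes[:, 0]
           + confounder_effect * z
           + rng.normal(size=n))

def ols(columns, y):
    """Least-squares coefficients, with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Naive estimate: regress revenue on the first cause, ignoring confounding.
naive = ols([causes[:, 0]], revenue)[1]

# Steps 1 and 2: fit a one-factor model to the causes and infer z-hat per movie.
centered = causes - causes.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
z_hat = centered @ vt[0]              # substitute confounder (sign is arbitrary)

# Step 3: use z-hat where a hand-picked confounder would normally go.
adjusted = ols([causes[:, 0], z_hat], revenue)[1]

print(f"true={true_effect:.2f}  naive={naive:.2f}  adjusted={adjusted:.2f}")
```

In this setup the naive estimate absorbs part of the hidden confounder's influence, while the estimate that adjusts for z-hat lands much closer to the assumed true effect. The goodness check on the factor model that the talk mentions is not implemented here.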
So let me go back to movies. What does this mean? Once we construct a deconfounder based on that little database that I showed you-- this is a snapshot of a James Bond movie. Sean Connery used to play the role of James Bond, 007. I don't know if any of you are old enough to remember Sean Connery. You probably remember Roger Moore.
But anyway, with the deconfounder, Sean Connery, the person who played James Bond, his value goes up, whereas unfortunately, those actors who played the lesser roles, M and Q, their values go down. What this means is that, without the deconfounder, Sean Connery's value was underestimated, and M and Q's actors' values were overestimated. So the deconfounder corrects for that.
And then once you have this model, just as with any causal model, you can do this counterfactual reasoning-- what if this, what if that? OK, so that's one story, in terms of advancing the state of the art. I think causal inference is still a very hot topic. And it's always been for statistics, but it's really rearing its head in the AI and computer science community now.
Now, what about transforming all fields, professions, and sectors through the application of data science? What I wanted to do is run through a lot of little stories to show you the breadth of what's going on at Columbia University. And I'm going to start with some science stories-- in particular, biology. So this is where the big data-- and there's not sophisticated machine learning going on here. It's just the big data problem where you're using this DNA sequencing-- in particular, of the microbiome-- around pancreatic cancer tumor cells.
And what the scientists discovered is that the microbiome around the pancreatic cancer tumor cells was counteracting the effect of the chemotherapy used to treat the tumor. So this is not good. It basically says that chemotherapy's ineffective. But the scientists went one step further to show that, if you inject the tumor cells with an antibiotic, that antibiotic would counteract the effect of the microbiome, therefore making the chemotherapy treatment effective.
So all of that was done through just lots and lots of data. I'm very impressed by the astronomy community because they have so much data, and they've been capturing so much data from all these large telescopes on Earth, flying around the universe, giving us images of the universe. And why I'm so impressed by the astronomers is that they'll throw anything at this data. They're very courageous.
And so this group of faculty at Columbia, along with some computer scientists and data scientists, used convolutional neural networks to look at weak gravitational lens images coming from large telescopes that are flying around. And what they showed is that they were able to estimate the parameters of the Lambda Cold Dark Matter Model, which is a model of the universe, far better than off-the-shelf statistical techniques.
Now, as a person who's witnessed the use of neural networks and deep learning over the past five years or so, and how it's exploded in its success and applications-- from image processing, to speech processing, to natural language translation, and everything-- it's exasperating to me that, yet again, we throw neural networks at this, and there's huge success. Because as a scientist, what's exasperating is we don't really know why these neural networks are so successful. But there you go.
For something completely different-- this is coming from our economics faculty-- they have been looking at online market-- labor markets-- in particular, Amazon Mechanical Turk, and other markets such as that. And what they've discovered using this technique called double machine learning is that these online labor markets do not behave as a regular free marketplace.
Rather, they behave like what's called a monopsony. So all of you probably know what monopsonies are, but I didn't when I read this paper. A monopoly is when you have one seller and multiple buyers. A monopsony is when you have one buyer and multiple sellers. And so these online labor markets actually behave more like that. And it's counterintuitive.
An example of a piece of data that they were-- a piece of information they were able to generate from all the data they collected is here, where they show that high-reward tasks do not get picked up more quickly than similar low-reward tasks. The idea is that, if it were a normal marketplace, you would go after the high-reward task because you get more money. But that's not how this marketplace-- this labor marketplace behaves.
For something more, and perhaps related to your world, is finance and reinforcement learning. This is a colleague of mine in the Operations Research Department who has been looking at robo-advising. So this is all very familiar to you. You have a lot of money. You might have a personal financial advisor. And over time, that personal financial advisor learns your investor preferences, what your risk aversion is to investing in a certain portfolio of instruments.
And so what he does-- and I just wanted to show one equation-- basically using standard reinforcement learning, over basically eight or nine iterations of this formula, you learn what the investor's preferences are. And what he also goes to show is that actually the combination of human and machine still outweighs either the human or the machine.
Now, again, for something completely different, we have some very modern history faculty who are using standard machine learning techniques, like topic modeling, sentiment analysis, and so on to look-- to paw through documents. So one set of documents that one of the history professors has accumulated is the largest set of declassified documents.
So he has downloaded everything that the federal government produces every year. And one example of what he's done with that is to look at-- if you will, looking at just the cablegrams that diplomats sent each other in the 1970s, he wanted to see if he could detect the anomalous events, which should be the interesting historical events.
So each black dot here represents one of those interesting events. And you can recognize some of them-- the evacuation of Saigon, the death of Mao Zedong, and so on. So this is really the History Department and the history faculty using machine learning, AI, and so on in doing their research.
It's actually quite interesting that the history faculty have said to me-- they're at this juncture in what to teach their next generation students, because there's all these techniques that you learn as a historian, but now all of a sudden there are these computational techniques, these machine learning techniques. Knowing that everything is now digitized, it's important for the history students to learn this material.
So now let me talk a little bit about data for good, responsible use of data. And I want to share with you an acronym that I already don't like anymore, but it reminds me of the principles that we need to subscribe to, in terms of using data for good-- so fairness, accountability, transparency, ethics, safety and security. And I call that FATES.
My sole contribution to this acronym is S for safety and security. Others have come up with [INAUDIBLE] and FATE and so on. But what I wanted to focus on is safety and security. So there's a system that actually, JP Morgan, thanks to Manuela's little program, is helping to fund-- some larger efforts that we are exploring at Columbia University on looking at deep learning and using formal methods and programming language techniques to better understand these deep learning systems.
In this particular work, DeepXplore, they're using two techniques inspired by software engineering and programming languages to look at deep neural networks. And the one technique that I think-- I find very easy to understand is this notion of neuron coverage. So we know from programming languages and writing computer programs, when you're testing a program, you want to-- there's a notion of code coverage.
So you want to test all the paths in your program to make sure that at least for those paths, you get the right answer. Inspired by that idea, why not use that idea for a notion of neuron coverage, where you want to tickle every node in your network and every edge in your network, and see what is germane, given the inputs to the output? So that's one idea.
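The analogy with code coverage can be made concrete with a small sketch. It assumes the common definition of neuron coverage as the fraction of neurons whose activation exceeds a threshold for at least one test input; it counts neurons only, not edges. The two-layer ReLU network, its weights, and the function name `neuron_coverage` are all made up for illustration and do not come from DeepXplore's code.

```python
import numpy as np

def neuron_coverage(weights, biases, inputs, threshold=0.0):
    """Fraction of neurons activated above `threshold` by at least one input."""
    covered = [np.zeros(b.shape, dtype=bool) for b in biases]
    for x in inputs:
        h = np.asarray(x, dtype=float)
        for layer, (W, b) in enumerate(zip(weights, biases)):
            h = np.maximum(W @ h + b, 0.0)        # ReLU layer
            covered[layer] |= h > threshold       # remember which neurons fired
    total = sum(c.size for c in covered)
    return sum(int(c.sum()) for c in covered) / total

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 2 output neurons.
weights = [np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]),
           np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])]
biases = [np.zeros(3), np.zeros(2)]

one_test = [[1.0, 0.0]]
three_tests = [[1.0, 0.0], [-1.0, -1.0], [0.0, 1.0]]

print(neuron_coverage(weights, biases, one_test))     # 2 of 5 neurons fire
print(neuron_coverage(weights, biases, three_tests))  # all 5 neurons fire
```

As with code coverage, a single test input leaves most of the network unexercised, and adding inputs that push activity into the untouched neurons raises the coverage figure.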
And what they've found, using another idea called differential testing, was they took off-the-shelf state-of-the-art DNNs used for image processing, like ImageNet and so on, and they've basically found ways-- they found flaws in these classifiers. So in particular, let me give you an example.
In the first case, on the left is an image for which the classifier-- and now, think of this classifier as being in your car or your self-driving car-- the camera of your car-- so this classifier looks at this image and correctly says, veer to the left. And that's fine. What this DeepXplore does is it finds natural perturbations to input images in such a way that you can fool the classifier to do the wrong thing.
So in this case, a natural perturbation is to darken the image just slightly, which is a natural event because we don't always drive in daylight. And once you do that, then this classifier will actually say, veer to the right. You hit the guardrail, you fall down the cliff, and you die. And so that's why they call it fatal errors.
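A minimal sketch of the differential-testing idea: run the same darkened image through more than one model and flag the first brightness level at which they disagree, without needing a labeled correct answer. Everything here is hypothetical, including the two tiny linear "steering" models, the four-pixel image, and the grid search over brightness. DeepXplore itself searches for such inputs with gradient-based optimization on real DNNs, which this sketch does not attempt.

```python
def steer(weights, bias, image):
    """Toy steering decision from a linear score over pixel brightness."""
    score = sum(w * p for w, p in zip(weights, image)) + bias
    return "left" if score > 0 else "right"

def darken(image, factor):
    """Scale every pixel's brightness; factor=1.0 leaves the image unchanged."""
    return [p * factor for p in image]

def first_disagreement(models, image, factors):
    """Return the first brightness factor at which the models disagree."""
    for factor in factors:
        decisions = {steer(w, b, darken(image, factor)) for w, b in models}
        if len(decisions) > 1:
            return factor
    return None

# Two hypothetical models that agree in daylight but differ in their bias term.
model_a = ([0.5, 0.5, -0.2, -0.2], 0.0)
model_b = ([0.5, 0.5, -0.2, -0.2], -0.25)

image = [0.9, 0.8, 0.3, 0.2]                    # made-up four-pixel road image
factors = [k / 10 for k in range(10, 0, -1)]    # 1.0, 0.9, ..., 0.1

flip_at = first_disagreement([model_a, model_b], image, factors)
print(flip_at)   # brightness factor where the two models first diverge
```

A disagreement only says that at least one model is wrong on that input; deciding which one still takes a human or an independent oracle.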
So another interest in data science, and computing more generally, and especially with EU GDPR, is privacy. And what this group did was combine the notion of differential privacy, which has been out there for over a decade now, with deep learning to-- rather than test a deep learning system to see if it will do the right thing, can we once and for all ensure that for all inputs or for a set of inputs, we will guarantee the classifier's robust to perturbations?
And so they use this notion of adding noise that we get from differential privacy to add a noise layer to the DNN. In particular, they found that adding the noise layer early on will give us this ability to prove once and for all that the output is robust to perturbations from the input. So I think this is very exciting work, and it really there again shows this combination of results from different parts of the field coming together.
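The mechanism of an early noise layer can be sketched as follows. This is only an illustration of the moving parts: Gaussian noise is added to the input before a toy linear layer, and the prediction is taken from scores averaged over many noise draws. The weights, noise scale, and function names are assumptions, and the sketch does not compute the formal robustness guarantee, which in the actual work is derived from the noise scale using the differential privacy analysis.

```python
import numpy as np

def noisy_scores(weights, x, sigma, rng):
    """One forward pass with a Gaussian noise layer applied to the input."""
    noisy_input = x + rng.normal(scale=sigma, size=x.shape)
    return weights @ noisy_input

def smoothed_prediction(weights, x, sigma=0.5, draws=2000, seed=0):
    """Predict from scores averaged over many independent noise draws."""
    rng = np.random.default_rng(seed)
    samples = [noisy_scores(weights, x, sigma, rng) for _ in range(draws)]
    mean_scores = np.mean(samples, axis=0)
    return int(np.argmax(mean_scores)), mean_scores

weights = np.eye(2)                      # hypothetical two-class linear model
clean = np.array([1.0, 0.0])
perturbed = np.array([0.9, 0.1])         # a small change to the input

label_clean, _ = smoothed_prediction(weights, clean)
label_perturbed, _ = smoothed_prediction(weights, perturbed)
print(label_clean, label_perturbed)
```

Because the decision depends on an average over noise rather than on one exact input, a small perturbation shifts the averaged scores only slightly, which is the intuition behind being able to bound how much the output can change.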
So now, let me close with some stories about tackling societal grand challenges. I wanted to share this one because it's a New York City happening. This is a huge grant that a combination of many universities, including Rutgers, Columbia, and NYU, received from the National Science Foundation to basically put in a one-square-mile area in Harlem-- so right next to Columbia University-- a test bed that will be advanced, in terms of the kinds of antennas put in-- a wireless test bed that people can play with.
And this is so that we can look at not just 5G protocols, but protocols beyond 5G, and not just the wireless protocols that we need, but the applications that will sit on top of that. So this is going to light up a lot of interesting research. And having it close to Columbia and working with the Harlem community has been really wonderful.
So in terms of climate science, as many of you probably know, Columbia University has an Earth Institute, which is a university-level, university-wide institute. And within the Earth Institute is Lamont-Doherty, which is probably the premier-- one of the premier climate science capitals in the world. And this work that I'm showing you here is actually an example of the kind of infrastructure needed for many scientists to share models, share data, share algorithms.
It's called Pangeo, another open-source platform, also partly funded by the National Science Foundation, but other agencies, as you can see. And it is already being used by climate scientists around the world to share their models, to share their data, to share their algorithms. And it's a really nice layered stack of software, and it's already being used to do meteorologic examples, hydrology.
And the example I wanted to show you is not being done by Pangeo, but this is a simulation done by NASA and JPL of the ocean currents flowing around the Earth. This is a simulation based on 2 petabytes of data. And 2 petabytes right now is not a lot, because the IPCC anticipates within the next 10 years of generating up to 100 petabytes. And there's no one system that can simulate all that data. And of course, when I look at that, I immediately want to zoom in and out. I want more the ocean-- more than the oceans. I want so much more, but this is state of the art.
OK, my last example is in the health care arena, and I wanted to use this example for a couple of reasons. First, Columbia is the coordinating center for this federated data set of electronic health records around the world. There are already 600 million patient records in this federated data set coming from 25 different countries, 80 different databases. What is phenomenal to me, as the IT person usually in the room, is that these records are all in the same format.
Yeah. But once you have this data set, as you all know, you can do things you would not be able to do in traditional medical science. So what I wanted to show you is a couple of examples of what results they were able to see from just looking at these patient records. No clinical trials-- just observational data.
So first, the other reason I wanted to show you this is to reinforce the point that data visualization is not enough. I have to tell you the stories behind these beautiful pictures. So first of all, they looked at three different diseases-- diabetes, hypertension, and depression, from left to right. Each circle of rings represents a single data set.
So at the top of the middle column is CUMC. That's Columbia University Medical Center. So if you are treated for hypertension, you're in there. Now, what they were looking at is, for each disease, for each patient, the sequence of drugs given to that patient to treat that person for that disease.
And so first, you're given a drug, and that would represent the inner circle. So for instance, if I pick on that hypertension one up in CUMC, that drug represented by the orange color is the first drug of treatment. If that works, fine. But if that doesn't work, then I'm given a second drug, and that drug is represented by the second circle-- second ring around that circle.
And if that doesn't work, I'm given a third drug. If that doesn't, I'm given a fourth drug, and so on. And by the way, I'm only showing you four rings for each circle. This goes on for tens of rings, in some cases. So the first interesting observation the scientists made by looking at just this data is that, if I collapse all the rings and circles for hypertension across all the data sets that they collected, 1/4 of patients treated for hypertension are treated uniquely.
That's pretty astounding. So that says, Manuela, if you have hypertension and you're in that 1/4, and you say to me, Dr. Jeannette, is there anyone else being treated in the world like me, the answer's no. This is just an observation. Of course, this observation then asks more questions, like why is this? What's really going on?
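A minimal sketch of how the "treated uniquely" figure could be computed from observational records, assuming each patient is reduced to an ordered sequence of drugs. The function name `unique_pathway_fraction`, the drug labels, and the four sample patients are hypothetical and are not the study's data or code.

```python
from collections import Counter


def unique_pathway_fraction(pathways):
    """Fraction of patients whose ordered drug sequence is shared by no one else."""
    if not pathways:
        return 0.0
    # Order matters: first-line drug, then second-line, and so on.
    counts = Counter(tuple(p) for p in pathways)
    unique = sum(1 for p in pathways if counts[tuple(p)] == 1)
    return unique / len(pathways)


# Hypothetical records: three patients stay on the first drug, one goes
# through three drugs in a sequence nobody else received.
sample = [
    ["drug_a"],
    ["drug_a"],
    ["drug_a"],
    ["drug_a", "drug_b", "drug_c"],
]
fraction = unique_pathway_fraction(sample)
```

With this made-up sample, one of the four patients has a sequence no one else shares, which mirrors the one-in-four observation only by construction.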
So that's one interesting result from just looking at the data. The second is the lower left-hand corner here. This is diabetes. If I were to show you all the rings and circles for all the data sets that are represented, they would pretty much look like the top two, where this first drug-- the chartreuse drug-- works pretty well.
But you'll notice it doesn't for the lower left-hand corner, and that lower left-hand corner represents a Japanese medical clinic. And it turns out that the Japanese are predisposed against that chartreuse drug. This was not known until looking at the observational data. And so of course, this also raises interesting questions of why, and so on.
And I asked my colleague, well, does this hold for the Chinese and the Koreans? And it doesn't. So again, some interesting new science to discover what's really going on. So I will close now. Data for good, just remember that of my talk, and I'll be thankful. Thank you.
MANUELA VELOSO: We have time for a couple of questions.
JEANNETTE WING: Don't be shy. Yes?
MANUELA VELOSO: [INAUDIBLE]
AUDIENCE: With the Data Science Institute, are there any upcoming programs or events that you're really looking forward to, or initiatives--
JEANNETTE WING: First of all, we have an annual event called Data Science Day. It's usually at the end of March or beginning of April. It's a huge event. This past year, we had over 800 registrants-- a very popular event, especially with our industry affiliates. I would absolutely encourage all of you, if you're interested, to look for that one-day event.
It's where we showcase a lot of the research going on across the entire university in data science. We have demos, and posters, and all sorts of cool things. So that's one event that, if there's just one event you were to attend, that would be it. But the other thing that we run routinely are smaller workshops or conferences that are more thematic. Most recently, we actually ran one on machine learning and finance. And we've done that one a few years in a row.
We also have done an event on business, and I can't remember the title, but it basically is business and journalism. And so we run a lot of these different sorts of things. If you're interested, I can make sure that you're on some mailing lists, and then you can hear about them, because we're continuously running lots of events. Uh-oh.
MANUELA VELOSO: So I have a very difficult question, and I'll take advantage of having you here to ask it. So with all your experience-- I forgot to say, Jeannette was actually the head of the Computer Science Department at Carnegie Mellon for many years-- the first woman, the first energetic [INAUDIBLE]. Anyway, with all your experience, suppose there is an oracle, someone that will answer one question the right way.
So in all the things, from a scientific point of view, from an organizational point of view, from a society point of view, what is the one question that takes your sleep at night, and you would like to have someone give the absolute truth [INAUDIBLE]? Jeannette--
JEANNETTE WING: Oh, Manuela.
MANUELA VELOSO: Jeannette, I'm not letting you go without you answering.
JEANNETTE WING: [INAUDIBLE]
MANUELA VELOSO: What were you expecting from me and you here?
JEANNETTE WING: You should have warned me.
MANUELA VELOSO: Jeannette--
JEANNETTE WING: Yeah?
MANUELA VELOSO: --what is that question? Don't tell me that it's like the traffic in New York or something. No. But what is the question--
JEANNETTE WING: No, what is that question?
MANUELA VELOSO: --that you really think? Is it an educational question? Is it what should people learn? What is the question? Should the machines be faster? What is the question? This oracle will give you the right answer, so you don't have to do research more about it. You are done. What is the question that you would like to be done with?
JEANNETTE WING: Why do humans behave the way they do?
MANUELA VELOSO: Very good question.
JEANNETTE WING: What do you think of that?
MANUELA VELOSO: I think it's a great question. In fact, the mystery are the humans indeed.
JEANNETTE WING: We are. We have this Neuroscience Institute at Columbia University. They study mind, brain, and behavior to show that it's not enough to understand how your neurons work-- it's all the layers on top. And then when I think about human behavior, that causes grief to computer scientists, because it's the presence of uncertainty that-- Manuela's poor little robots are always trying to avoid the humans because they might get in the way.
And then the cars-- the cars are worried about pedestrians and bicyclists, because we are what we are. And it's not just that. When I think of my friends in public health, public health schools continue to exist because, despite best practices and good advice on what to eat, exercise, and so on, we don't do that.
MANUELA VELOSO: There are many examples here also in the financial world [INAUDIBLE]
JEANNETTE WING: Yeah, you should relate it to the financial world. But I think that the finance world is being slowly overtaken by machines.
MANUELA VELOSO: It's humans, I'm telling you. It's 250,000 humans at JP Morgan.
JEANNETTE WING: Around the world.
MANUELA VELOSO: So any other more easier questions?
JEANNETTE WING: Easier questions please.
AUDIENCE: I'm not quite sure how this relates to anything, but recently, I was reading about China and the HIV-resistant babies created by CRISPR. And even in machine learning, China has so much data, rightly or wrongly collected. And then your title is Data for Good, but good is relative, so I don't know. Where do you see China versus the US in the next 5 or 10 years in machine learning?
JEANNETTE WING: That's an excellent question-- so timely, because I just returned from a trip to Shenzhen, which is just an amazing, amazing little corner-- big corner-- bigger and growing corner of China. Though the Chinese are, I think, right now unstoppable. In AI, they have humongous amounts of data. They don't have a lot of regulation to get in the way. The citizens of the country kind of-- they know they're monitored for everything, so privacy is-- and the big companies, like Alibaba, Baidu, Tencent, and so on, they're just going full steam ahead.
And even the business models, the economic policy models-- China's very flexible, very adaptable, very experimental. So China is definitely a place, if you want to think competitively, with respect to the US, to watch. And they're not going to wait around for any other country to get their act together or to behave collegially or anything.
AUDIENCE: I just want to say I really relate to your fascination about how humans behave. My question actually that I wanted to ask before you said that was, how do you see data science and machine learning advancing neuroscience research, and just understanding how the brain works and all the mysteries?
JEANNETTE WING: It's a great question. And I'm glad I actually have an answer to that because fortunately, the Zuckerman Mind Brain Behavior Institute asked me to give a keynote recently, and I said, OK, I better relate it to neuroscience. And so I did a little homework. So the first thing, of course, as many of you know, is that deep learning itself was inspired by the way the brain is wired.
And I remember-- so Manuela probably remembers this too-- remember Geoff Hinton? The three of us were colleagues in the '80s together, and we used to kind of just humor Geoff when he would talk about neural networks and making the brain-- making a machine that's wired like the brain. And then 25 years later, it takes off, and it is also unstoppable.
So what's interesting is, of course, it's still a metaphor. And the neuroscientists are benefiting from deep learning and so on just as all other fields are benefiting. It's basically image analysis, and Manuela can tell you about her latest and greatest there, as well. But in terms of medical science, neuroscience, and so on, a lot of it is image processing. So think EEGs and fMRIs, think about combining the two-- all the stuff that you can do with just images, they're going full out with that.
But what's interesting is the reverse now. So the neuroscientists are saying to us-- computer scientists-- well, your model of the brain-- and look, it was just a metaphor. It wasn't really supposed to be a real model of the brain. But they're saying, your model of the brain is too simplistic. The brain is really much more complicated. First of all, it's not just a bunch of homogeneous nodes that do sigmoids. The wirings, it's not just hierarchical.
We've got connections across the layers. We've got all kinds of connections. And all those nodes, they could be doing different functions. And I think there are combinations of neuroscientists and computer scientists working together now to say, OK, well, let's make these neural networks a little more complicated, make them look more like a real brain. Even though that was meant to be a metaphor. So there's where I think some exciting science is going to happen, in both directions. There was a question back there.
MANUELA VELOSO: [INAUDIBLE]
AUDIENCE: So is there any currently, or there was, any project related to financial inclusion? Like getting data from people and [INAUDIBLE] any credit to low-income people and get data from there. Because I saw that the institute is related to social good and social work, so I wanted to know if there is any project related to that, or which projects it's currently-- are you currently working on with that area?
JEANNETTE WING: Projects related to--
AUDIENCE: [INAUDIBLE] I'm referring to using data from people-- for example, where do they live, which are their incomes.
JEANNETTE WING: What our School of Social Work does routinely, as many other-- probably the School of Public Health and so on-- is they are still collecting data about people through the normal instrument, which is a survey. So there are lots of population surveys, and they're looking into issues of inequity usually. They're usually looking into issues of why is it-- or causality now-- why is it that children growing up in this poor neighborhood with these kinds of circumstances end up in one life path over another life?
Is there something we can intervene and change about that? There's very interesting work I thought I would share with you on that particular topic-- I didn't include in my talk-- on using natural language processing-- NLP-- on Twitter data, but specific to social justice. This is a social work faculty member working with Kathy McKeown-- Manuela knows her-- on looking at the tweets that Chicago gang members write to each other.
It turns out that the Chicago gang has its own language. And this is not surprising. Little subpopulations have their own tweet language. So first, the idea was, if we can understand what they're tweeting with each other, then maybe we can intervene before a violent act happens. So the idea is you're a Chicago gang, one of your gang members is killed. You start tweeting a feeling of sadness or depression versus aggression and some violence in nature. If I'm a social worker or a law enforcement person and I can see your tweets, and I detect that you're expressing a feeling of aggression, then I can intervene.
So that was a long story, and they were able to do this. Interestingly enough, from the computer science point of view, it's a brand new language. So it needed part-of-speech analysis, tagging, and annotation. But now it's out there. It's a public data set. And not that many people will speak in this language, but it's out there for you to look at. So there is very interesting work going on at the School of Social Work. But most of the data still is collected through surveys. He has a question.
AUDIENCE: I had a question which is sort of a corollary to the neuroscience and human behavior question. We are more and more interacting with somewhat intelligent or semi-intelligent systems. How do you see this impacting us as humans, and where we can use this for good, or for evil, potentially?
JEANNETTE WING: It's a great question. I actually would go even further back and say, before we even got to the fancy shmancy AI machine learning, how has the digital transformation affected our way of life, but also our way of learning, and our way of teaching? And I just witnessed this in the classroom. I'm sure Manuela did, as well. I saw generations of students grow up with technology to the point where it's a different kind of-- and I talk to my neuroscience friends about this because the mean would be, oh, they're wired differently because they learned-- they didn't learn to work in blocks of time to solve a particular problem, and continue-- but they kind of learn in this scattered way because all the information's on the web, and you can just go around, and maybe their brains are wired-- I think that we still don't know.
As recipients of such minds, I have hypotheses, but I don't really know. And I think it is an interesting question. How has digital technology changed the way in which we think and our brains are [INAUDIBLE]? But now, if you throw in the AI machine learning, when we get used to using an AI system as our accompaniment, do we give up something, or does it make room for more creativity on our part, that will always, I hope, elude the machine?
MANUELA VELOSO: [INAUDIBLE] I'm sorry, we've run out of time, but Professor Wing [INAUDIBLE]. Let's just thank her kindly.
JEANNETTE WING: Thank you.
Thank you, thank you.
"Siri, meet Siri."
March 11, 2019 - London
Michael Wooldridge is a Professor of Computer Science and Head of Department of Computer Science at the University of Oxford. He has been an AI researcher for more than 25 years, and has published more than 400 scientific articles on the subject. He is a Fellow of the Association for Computing Machinery (ACM), the Association for the Advancement of AI (AAAI), and the European Association for AI (EurAI). From 2014-2016, he was President of the European Association for AI, and from 2015-2017 he was President of the International Joint Conference on AI.
[00:00:00.99] SAMIK CHANDARANA: So I recognize a lot of people in this room. But for those of you that don't know me, I'm Samik Chandarana. I'm part of the team that looks after the applied ML and AI. And I'm obviously very lucky to have Manuela Veloso, who I'm joined at the hip with on the AI research side.
[00:00:15.99] So today, we have the fourth lecture in the series that Manuela has been organizing. When Manuela joined the firm, we were very excited because we wanted to make sure that on top of using her own mind, we actually delved into her network. And part of her network today, we're very lucky to welcome Professor Wooldridge from Oxford. Manuela will tell you a little bit about him, and then we get on and hear from the man himself.
[00:00:43.28] MANUELA VELOSO: Perfect. Thanks all for coming. And we are trying to put some more chairs. But also, welcome to everyone who is online. So it's a pleasure to be here and, actually, to have our first distinguished lecture hosted in London. So the other three were hosted in New York. We are planning on coming every other time, actually, to London and host it here. So thanks all for coming.
[00:01:10.21] It's a great pleasure for me to introduce without further ado the speaker today, who is Professor Michael Wooldridge. And I have known-- where's Mike?
[00:01:21.73] Oh, right here. I was expecting you to be there. I'm like, wait, he's up here. I've known Mike for, I believe, more than 20 years. We have been faculty in an area of AI which we call multi-agent systems, multi-agent learning. And Professor Wooldridge will tell you more about this.
[00:01:43.41] But we have been partners in crime in this particular kind of area of research. In fact we co-edited a book together, where we have our own book on AI. And Professor Wooldridge has a distinguished career of researching AI. He's the current head of the Department of Computer Science at Oxford University.
[00:02:07.53] And he was also President of the European Association for Artificial Intelligence, as well as the International Joint Conference on AI, what we call IJCAI, which has its own board. And Professor Wooldridge was also the president there.
[00:02:24.45] And then, just so you know, in academia, there are these honors which are associations for which we can be fellows. And the main associations of our areas are the ACM, which is this Association for Computing Machinery, AAAI, which is the Association for the Advancement of AI and the European, also, Association for the Advancement of AI. And Professor Wooldridge is a fellow of these three institutions. In addition, he has published more than 500 articles.
[00:02:59.85] And it's a great pleasure and, in fact, a great honor to welcome him to JP Morgan. And so please, we'll let Professor Wooldridge speak for as much as he wants-- hopefully, within time-- and then, we'll have a session of questions. So I would ask you, if you could hold the questions to the end of the lecture, that would be great, unless you have a real clarification question, and then you can ask. But otherwise, let's please welcome Professor Wooldridge and we are going to listen on Siri. Let me introduce you, Siri.
[00:03:40.47] MICHAEL WOOLDRIDGE: Thank you, Manuela. Thank you, colleagues of Manuela for doing me the honor of inviting me. I'm used to hanging out in Oxford, and when you first get to Oxford, there's all these grand buildings, these ancient medieval buildings. It's very intimidating. After a while, you get used to it. And you forget when people come in how intimidating it is. And then I got off the train this morning at Canary Wharf, and it's like, oh, my god.
[00:04:03.06] So I'm trying to cope with that. I'm very grateful to have the opportunity to speak. It's very exciting to see an organization like JP Morgan start to take AI so seriously. I think it's going to be incredibly exciting to see what Manuela achieves. Everybody in the field knows that Manuela has boundless energy, boundless determination, boundless enthusiasm, which she communicates to everybody around her. So I think you've made a great hire. And there are exciting times for AI at JP Morgan ahead. And I look forward to seeing what's happening there.
[00:04:35.34] So before I begin, a small confession. This morning, I left Oxford at about 7:15. And as I was leaving, my wife said, you're not wearing a tie. I don't normally wear a tie. Actually, this is me looking smart, I have to say. And she said, you're going to JP Morgan. Everybody is going to be wearing a tie. And I look around the audience, and I see maybe two-- two-- two people wearing. You must be the bosses.
[00:05:00.69] AUDIENCE: [INAUDIBLE]
[00:05:02.78] MICHAEL WOOLDRIDGE: OK. So let me get started. So this is a talk about artificial intelligence. But I have to begin with a confession. There is no deep learning in this talk whatsoever, OK? So sorry if that comes as a shock to some of you.
[00:05:18.34] But actually, the truth is, AI is a very broad church. And deep learning is kind of the poster boy or poster girl for the successes of AI at the moment. But it is one thread in a very rich tapestry of AI research, which is now delivering really impressive results. It isn't all just about deep learning.
[00:05:38.49] And in fact, if you just remember one message from what I'm talking about today, just take that one away, all right? It's not all about deep learning. There is an awful lot more going on than that.
[00:05:48.21] What I want to talk about particularly, and it is an area, my own field, which Manuela has also worked in, a field called multi-agent systems. I've never really done anything else. This is what I've been doing since around about 1989 when it was an idea, right, that we thought would come to fruition. And it's an idea, I believe, which now has come to fruition.
[00:06:09.34] And I'll give you a little bit of a flavor about how that happened. So OK. So I'm going to start off by motivating the idea of multi-agent systems, what are multi-agent systems.
[00:06:33.15] It worked a minute ago.
[00:06:34.54] AUDIENCE: [INAUDIBLE] AV. Oh. Sound?
[00:06:38.91] MICHAEL WOOLDRIDGE: OK, brilliant. Thank you. So I'm going to start off by motivating the idea of multi-agent systems and give you an idea of where this idea came from. And I'll show you a little video, which Manuela will have seen a million times before. Many of you will also have seen it but many of you won't.
[00:06:52.22] And what's remarkable about this video, which was made in the 1980s, is how clearly it anticipated a bunch of things which we're now seeing happening. So I'll show you that video. And that will lead me in to discuss the idea of multi-agent systems.
[00:07:07.49] And then, the bulk of my talk-- so this is the high-level motivational stuff about why we're doing what we're doing and why this is a natural thing to be doing and what we might want to be doing. In the middle part of the talk, there's some more technical material. And the more technical material is really to do with an issue which arises in multi-agent systems, which is that they are inherently unstable things.
[00:07:28.77] We build things which are inherently unstable. We need to find ways of understanding and managing their dynamics. And so the technical part of the talk is about understanding the equilibria of multi-agent systems and the tools that we've developed that enable you to do that.
[00:07:45.25] So I'm going to talk about two different approaches there. The first is an approach based on ideas in game theory. And game theory is an area of economics which has to do with strategic interaction. And I'll show you how you can apply game theoretic ideas to understand the dynamics of multi-agent systems, what behaviors those multi-agent systems might exhibit.
[00:08:05.98] And that's one approach, which has some advantages and disadvantages. I'll show an alternative approach very, very briefly, an approach based on agent-based simulation, where instead of trying to formally analyze the equilibria of the system at hand, what we try and do is just simulate the system in its entirety. And these are both ideas, complementary ideas-- they're not in competition because they deal with quite different types of systems-- but both ideas whose time has come.
[00:08:36.00] And then I'll wrap up with some conclusions. So let's start by motivating multi-agent systems. So I wrote a textbook on multi-agent systems. It's appeared in a couple of editions now. And the first edition I wrote in around about 2001.
[00:08:50.88] And the very first page of that first edition essentially had this slide on it. It said "what is the future of computing?" And the future of computing, I reckoned in the year 2000, was the following. The future of computing was to do with ubiquity, interconnection, delegation, human orientation, and intelligence.
[00:09:09.22] So what do I mean by those things? Sorry. I should have added, by the way, I think, looking back, I think I was absolutely right. I don't know very much in life. I don't know very much in life, but what I know is that, actually, this is how things turned out, and this is how things will turn out for the foreseeable future in computing.
[00:09:28.56] So let me explain what's going on. Ubiquity is just to do with Moore's law, right? Moore's law, the number of transistors that you can fit on a chip or whatever, basically, what it means is computer processing power just gets exponentially cheaper year on year. The devices, the computer processors that are used to drive computers get smaller, have lower power requirements, are more powerful year on year.
[00:09:52.53] And all that means is that you can put computers into places and in devices that you couldn't have imagined. So ubiquity just means putting computer processing power everywhere. And that's made possible by Moore's law.
[00:10:05.71] So a neat example of this-- a magazine, a home computer magazine, which costs 5 pounds to buy in the UK recently handed out on the front cover a Raspberry Pi computer-- free. You got it free with the magazine, a computer which, 10 years ago, would have had the power of a typical desktop computer of the day. So that's ubiquity. We can put computer processing power everywhere. Any device that we might care to build, we can augment it with computer processing power.
[00:10:32.79] And the second aspect of that is interconnection. These devices are connected, right? When I was studying computer science as an undergraduate in the 1980s, I did a course on networks. And the guy that was teaching this course told us, look, these networked systems, these distributed systems, they're really hard to build. But don't worry, because you'll probably never make one, right?
[00:10:53.61] I mean, no lecturer has ever been more wrong than that lecturer was on that day, right? Because we now realize-- and this was a change in the way that people thought about computing-- we now realize that, actually, networked systems, distributed systems and interconnection, communicating systems, these are the norm. These aren't the exception. And that's resulted in a fundamental shift in the way that people think about computing.
[00:11:20.05] OK. Then the third trend is a bit more subtle. It's towards delegation. So what do I mean by delegation? Well, some extreme examples of delegation are, again, if I go back to the mid 1980s, Airbus were talking about fly-by-wire aircraft getting rid of the human in the loop. And the aircraft onboard computers would actually have control and, in some circumstances, could overrule the instructions of human pilots.
[00:11:46.08] And a lot of people were outraged at this. They thought this was absolutely the end of civilization, that machines were taking over. Well, some things went wrong. But actually, the truth is, it ended up being a really good thing.
[00:11:57.60] What we're now seeing, the shift towards driverless cars, whether or not full, level five autonomy that you jump in the car and just state your destination, that's some time away. But nevertheless, there are all sorts of features-- smart cruise control features on cars-- which we are delegating the task of driving the car to.
[00:12:18.13] But those are extreme examples of a much wider phenomenon. We hand over control of ever more of our lives and, very relevant for JP Morgan, ever more of our businesses to computers. We are happy for them to make decisions on our behalf, OK? And that's a trend towards delegation.
[00:12:38.64] Human orientation-- what do I mean by that? When Alan Turing arrived in Manchester in the end of the 1940s to program the first computer-- and the first programmable computer, stored-program computer, I think was their chief claim to fame, the Manchester Baby-- he actually had a bunch of switches on the front of it.
[00:12:56.37] And you had to flip a switch this way to put a one in this particular memory location and then flip it the other way to put a zero there. To program that computer, you had to understand it right down at the level of-- and it wasn't even transistors then-- at the level of the valves that were in this machine. Nothing was hidden from you.
[00:13:13.95] Well, this was an incredibly unproductive way of programming. People were not very good at programming computers in that way. And if you learned to program on Manchester's machine, you couldn't have gone to Oxford and programmed Oxford's machine because they had a completely different architecture.
[00:13:27.60] The trend since then, firstly, with the arrival of high-level programming languages-- Fortran and COBOL in the 1950s, then on towards languages like ALGOL, Pascal and C-- meant that you could learn to program on one machine and transfer your skills to another. But the key point is what those languages present you with is ever more human-oriented ways of thinking about programming.
[00:13:52.72] Object-oriented programming, which is the state of the art-- I'm sure there are some programmers in the room, and you probably do object-oriented programming-- takes its name from the idea that it was inspired by the way that we interact with objects like this clicker in the real world. And it was supposed to reflect that. It's a human-oriented way of thinking about computing. And interfaces-- human-computer interfaces-- will get ever more human-oriented, and we'll see that in a moment.
[00:14:19.33] And then, the final one was intelligence. Now, intelligence, here, I mean two things. All I mean is, really, the scope of the things that we get computers to do for us continually expands. Year on year, computers can do a wider range of tasks than they could do previously.
[00:14:38.37] Now, there's a very weak sense of intelligence, which is just that, actually, it's just decision-making capability. But actually, what we've witnessed over the last decade is an explosion of intelligence in the AI sense, that now, we're seeing computers that are capable of a much richer range of tasks than we could have imagined when I wrote the first edition of this book.
[00:15:01.41] So I say the future of computing, with absolute certainty, I think, lies in that space. It is towards ever more ubiquity, ever more interconnection, ever more delegation, ever more human orientation, and ever more intelligence. Now, that's still a wide space, and that gives us a huge range of possibilities for where we might end up. But there are a number of other trends in computing, each of which picks up on different aspects of these trends.
[00:15:32.05] So if you look at the trend towards interconnection and intelligence, that takes you towards a thing called the Semantic Web. So after the World Wide Web was developed, the idea of the Semantic Web was putting intelligence on the web, so having smart web browsers so that, for example, if you did a search for weather in Canary Wharf, right, that your browser would be smart enough to realize that if it couldn't find a website which referred to weather in Canary Wharf, that weather in East London, or just weather in London, would be a reasonable proxy for that, involving some reasoning, the kind of common sense reasoning that you will do, but your web browser doesn't. And that's the Semantic Web. And that's been, historically, over the last 20 years, a big tradition in AI-- adding AI to the web.
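The fallback reasoning in the Canary Wharf example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any Semantic Web tool works: the `BROADER` hierarchy, the `FORECASTS` table, and the function `weather_for` are all hypothetical names invented for this sketch.

```python
# Hypothetical containment hierarchy: each place maps to the broader area containing it.
BROADER = {
    "Canary Wharf": "East London",
    "East London": "London",
    "London": None,
}

# Hypothetical set of places for which some website publishes a forecast.
FORECASTS = {"London": "light rain, 14C"}


def weather_for(place):
    """Return (matched_place, forecast), widening the search until a forecast is found."""
    current = place
    while current is not None:
        if current in FORECASTS:
            return current, FORECASTS[current]
        current = BROADER.get(current)  # fall back to the enclosing area
    return None, None


print(weather_for("Canary Wharf"))  # ('London', 'light rain, 14C')
```

The "common sense" here is nothing more than a hand-written table saying which area contains which; the hard part in practice is obtaining that knowledge at scale.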
[00:16:15.93] Peer-to-peer-- don't hear too much about it these days, but 15 years ago, it was all the thing. Peer-to-peer is just one aspect of this ubiquity and interconnection. Similarly, cloud computing, the Internet of Things.
[00:16:28.62] The Internet of Things is just the idea that all our devices-- our toaster, our fridge, our television-- are all connected to the web. It's all one, big interconnected mass. Right now, what you might want to do with that, I don't know. But the point is it's a really exciting potential.
[00:16:44.57] But where I want to go is just pick up on this last manifestation of these trends. And this is the trend towards what we'll call agents. And at this point, I'm going to show this video that I referred to. So this is an old video, OK? It's a video that came from Apple computers. And so you have to set the scene.
[00:17:04.93] This is the late 1980s. John Sculley was then CEO of Apple. He was the guy that evicted Steve Jobs, one of the famous business decisions of all time. They'd just released the Mac. The Mac was being a smash hit. But John Sculley was already worrying about what would come after the Mac.
[00:17:23.36] And the innovation on the Mac was the user interface. It was suddenly a human-oriented interface. It was an interface that people could use without specialist training about interfaces. And so to think about what would come next, they came up with this video, which is called Knowledge Navigator.
[00:17:44.86] [VIDEO PLAYBACK]
[00:17:45.86] [MUSIC PLAYING]
[00:18:11.76] MICHAEL WOOLDRIDGE: iPad.
[00:18:14.25] - You have three messages. Your graduate research team in Guatemala, just checking in, Robert Jordan, a second-semester junior requesting a second extension on his term paper, and your mother reminding you about your father's surprise birthday party next Sunday.
[00:18:31.55] Today, you have a faculty lunch at 12 o'clock. You need to take Kathy to the airport by 2:00. You have a lecture at 4:15 on deforestation in the Amazon rainforest.
[00:18:43.48] - Right. Let me see the lecture notes from last semester. No, that's not enough. I need to review more recent literature. Pull up all the new articles I haven't read yet.
[00:18:59.90] - Journal articles only?
[00:19:01.82] - Mm-hmm, fine.
[00:19:03.02] - Your friend, Jill Gilbert, has published an article about deforestation in the Amazon and its effects on rainfall in the sub-Sahara. It also covers drought's effect on food production in Africa and increasing imports of food.
[00:19:18.19] - Contact Jill.
[00:19:20.20] - I'm sorry. She's not available right now. I left a message that you had called.
[00:19:25.16] - OK. Let's see. There's an article about five years ago-- Dr. Flemson or something. He really disagreed with the direction of Jill's research.
[00:19:36.76] - John Fleming of Uppsala University. He published in the Journal of Earth Science of July 20 of 2006.
[00:19:44.58] - Yes, that's it. He was challenging Jill's projection in the amount of carbon dioxide being released to the atmosphere through deforestation. I'd like to recheck his figures.
[00:19:54.91] - Here is the rate of deforestation he predicted.
[00:19:57.58] - Mm-hmm. And what happened? Mm. He was really off. Get me the university research network. Show only universities with geography nodes. Show Brazil. Copy the last 30 years at this location at one-month intervals.
[00:20:31.71] - Excuse me. Jill Gilbert is calling back.
[00:20:34.89] - Great. Put her through.
[00:20:36.84] - Hi, Mike. What's up?
[00:20:38.19] - Jill, thanks for getting back to me. Well, I guess that new grant of yours hasn't dampened your literary abilities. Rumor has it that you've just put out the definitive article on deforestation.
[00:20:48.69] - Aha. Is this one of your typical last-minute panics for lecture material?
[00:20:54.18] - No, no, no, no, no. That's not until, um--
[00:20:58.41] - 4:15.
[00:21:01.50] - Well, it's about the effects that reducing the size of the Amazon rainforest can have outside of Brazil. I was wondering, um-- it's not really necessary, but--
[00:21:11.01] - Mm, yes?
[00:21:13.77] - It would be great if you were available to make a few comments-- nothing formal. After my talk, you would come up on the big screen, discuss your article, and then answer some questions from the class.
[00:21:24.24] - And bail you out again? Well, I think I could squeeze that in. You know, I have a simulation that shows the spread of the Sahara over the last 20 years. Here, let me show you.
[00:21:40.32] - Nice. Very nice. I've got some maps of the Amazon area during the same time. Let's put these together.
[00:21:58.62] - Great. I'd like to have a copy of that for myself.
[00:22:02.67] - What happens if we bring down the logging rate to 100,000 acres per year? Hmm. Interesting. I can definitely use this. Thanks for your time, Jill. I really appreciate it.
[00:22:23.90] - No problem. But next time I'm in Berkeley, you're buying the dinner.
[00:22:28.63] - Dinner, right.
[00:22:29.64] - See you at 4:15.
[00:22:31.20] - Bye-bye.
[00:22:33.31] [MUSIC PLAYING]
[00:22:36.66] - While you were busy, your mother called again to remind you to pick up the birthday cake.
[00:22:41.16] - Fine, fine, fine. Print this article before I go.
[00:22:44.97] - Now printing.
[00:22:46.70] - OK. I'm going to lunch now. If Kathy calls, tell her I'll be there at 2 o'clock. Also, find out if I can set up a meeting tomorrow morning with, um, Tom Lee.
[00:22:56.99] [END PLAYBACK]
[00:22:57.91] MICHAEL WOOLDRIDGE: OK. So he's a professor at Berkeley, apparently. They have a somewhat more relaxed life than I would imagine.
[00:23:07.48] So what's remarkable about our video is quite a number of things that are anticipated. So number one you saw, right? There was an iPad, a 1980s iPad, but clearly, it was a tablet computer. There were no tablet computers, and they were not on the horizon at the time. But that was clearly the way they thought it was going. It had a little selfie camera. I don't know if you-- well, actually, quite a big selfie camera. So they anticipated that.
[00:23:29.65] Other stuff that's interesting is-- and this is before the internet was a big thing-- they anticipated a web search or something like it, that they were already thinking about the devices that people would have in their homes being connected in that way. They picked up on the idea of visualization. Visualization is an area that's grown the way that he's visualizing that data and putting together those different data sources to be able to visualize it in neat ways. That's been a huge growth area over the last 20 years.
[00:24:01.66] But the thing that we picked up on, the thing that drove my community, is the idea of an agent. The thing that he was interacting with on the tablet screen was not a person. It was an animated piece of software. Actually, there's an interesting story there that this animated piece of software, clearly, the idea was that they wanted this to look as lifelike as possible.
[00:24:24.31] And actually, the received wisdom these days is if you're doing something like that, you really don't want it to look as lifelike as possible. Because you don't want to mislead people into thinking that they're talking to another human being. You've got to be explicit, you've got to explicitly show them that they're talking to a piece of software.
[00:24:42.05] So what my community picks up on is that notion of an agent. And what we saw is the idea that, instead of interacting like the 1984 Mac screen, where you went to Microsoft Word, and you went on a menu, and you dragged down that menu to select some item where everything happened because you made it happen, right, where the software was a passive servant, right-- it was only doing the things explicitly that you told it to-- that there was a shift, that instead, the software would become a cooperative assistant, something that was actively working with you, taking the initiative to work with you on the problems that you were working on, OK, so a fundamental shift away from this idea of software being something that you do stuff to, towards something that works with you in the same way that a good human personal assistant would work with you, OK? And that's exactly the metaphor that they had there for their agent.
[00:25:47.32] So that was the idea. That's the kind of vision that launched the agents community at the beginning of the 1990s was that vision. So just remember that video was made at a time when Ronald Reagan was president in the United States. Margaret Thatcher was prime minister here. Nigel Lawson was Chancellor of the Exchequer in the UK. That's how old it was. But actually, a lot of what they predicted was pretty much bang on the nail. It's an impressive vision.
[00:26:12.71] OK. So the first research on agents started in the late '80s and early '90s. And a lot of the thrust of the work at the time was about building specific applications like software that would help you read your email-- so something which would help you prioritize your email-- software that would help you browse the web-- anticipate which link, for example, you were going to follow next and proactively help you with the tasks that you're working on.
[00:26:40.30] But it took 20 years from that video before we actually saw the first commercial agents really start to take off. And the one that grabbed my attention at the time, because I knew some of the people involved, and so did Manuela, I think, was Siri. The people involved in Siri were working at Stanford Research Institute in the US. And where they came from is exactly this work on agents-- software agents-- that we were doing in the 1990s.
[00:27:07.69] And then, we've seen, of course, a flurry of others-- so Alexa on Amazon, Cortana on Microsoft, Bixby on Samsung. I've never actually seen that one. And there are a whole host of other software agents, OK? And those software agents are embodying exactly those ideas, that idea of human orientation, moving away from machine-oriented views towards human-oriented views, presenting you with an interface which you can understand and relate to through your experience of the everyday world.
[00:27:37.90] And the most important aspect of that in that video is communicating with just very natural language, right, communicating in English, which isn't nuanced, which isn't in some kind of strict subset of English. It's not some special artificial language. You're just talking as you would to a human assistant.
[00:27:58.24] OK. So why did this idea of agents actually take off? Well, it's no accident that Siri was released when the iPhone got sufficiently powerful that it could actually cope with the software. There's an awful lot of very smart AI under the hood on Siri. And understanding spoken language requires a lot of processor time, right? So it could only happen when we had sufficiently powerful computers. In other words, it was the ubiquity. It was Moore's law that took us there and made that feasible.
[00:28:35.22] So advances in AI also made competent voice understanding and speech understanding possible, OK? It couldn't have happened in the 1980s because we just didn't have the compute resources available. We didn't have the data sets that we now have available to train speech understanding programs and so on.
[00:28:56.49] But probably the most important is that the supercomputer that you all carry around with you in your pocket, right, the smartphone that you have in your pocket-- I mean, we call it a phone. That's the dumbest thing it does, right? That's the most trivial of things that it actually does.
[00:29:10.71] It's a supercomputer. It's equipped with incredibly powerful processors, massive amounts of memory. It knows where you are. It can hear its environment. It can sense movements. And it's connected to the internet. And it's the fact that it has all that stuff which has made these agents in the way that they envisaged back in the 1980s, has made them now possible.
[00:29:33.24] So the agent-based interface, whether or not it's realized in the way that Apple envisaged in that video, right, the agent-based interface, the idea that you interact with some software through that kind of human-oriented interaction, is just inevitable. It is the future of computing because there is no alternative, right? If you think about your smart home and, in the future, all homes will be smart homes, there isn't really any other feasible way to manage it other than through a kind of agent-based interface. It's got to happen, right?
[00:30:09.86] If you think about a sector like banking, right, where you download an app which you interact with to manage your accounts, where that's going, inevitably, is that that app is going to be more and more like an agent, somebody that's working with you to help you manage your accounts, to help you manage money, not just something which is doing something dumbly when you tell it to, but actually something which is actively helping you to manage your finances.
[00:30:39.00] Now, rich agent interfaces-- really rich agent interfaces-- are still some time away, OK? And by rich agent interfaces, I mean that they're still very limited in the kinds of language that they can understand. You don't have to dig very deep to understand the limitations of Siri, in particular, actually. But even the better ones, you don't have to dig very deep to understand their limitations.
[00:31:02.63] But what I want to now dig into is one aspect of this which has really been ignored. So if I say to Siri, Siri, book an appointment to see Manuela next week, why would Siri phone up Manuela herself? Why wouldn't my Siri just talk to her Siri to arrange this?
[00:31:21.89] That's what a PA would do, right? They wouldn't go straight to the boss. They would go to the other assistant. In other words, my field is concerned with what happens when these agents can talk to each other directly, OK? If I want to book a restaurant, why would my Siri phone up the restaurant?
[00:31:41.78] There was this famous example-- I forget whose software it was-- that did exactly this-- it might well have been Apple-- about phoning up a restaurant to book a table you may have seen in the news last October or so. But why would they do that? Why wouldn't they just go direct to the agent at the other site? It just makes perfect sense, all right?
[00:32:00.23] We were discussing over lunch that one of the frustrations in my life, and I'm sure many of yours, as well, is diary management. I spend crazy amounts of time juggling meetings and trying to find suitable things. Why don't we have agents that can do that?
[00:32:11.96] This is not, actually, AI rocket science. It ought to be feasible to have such things now. And there, why wouldn't my Siri just talk to your Siri to arrange this? That is the vision of multi-agent systems, right? Multi-agent systems are just systems with more than one agent.
[00:32:29.45] So if we're going to build multi-agent systems, what do they need to do? Well, your agents need to be able to talk to me, right? My Siri needs to be able to talk to me, but it also needs to be able to interact with other agents. And I don't just mean the plumbing, the pipes down which data is sent. I mean the richer social skills that we have.
[00:32:48.84] So for example, my Siri and Manuela's Siri need to be able to share knowledge. My Siri and my wife's Siri need to share skills and abilities and expertise. If I've acquired some expertise in something, I want my Siri to be able to share it with my wife.
[00:33:04.79] Actually, in neural nets, this is called transfer learning. It's very difficult to extract expertise out of one neural network and put it into another. It's a big research area at the moment.
[00:33:15.08] How can agents work together to solve problems, coordinate with other agents, or, really excitingly, negotiate? Just something as simple as booking a meeting is a process of negotiation. I have my preferences. I don't like meetings before 9:00 in the morning. I like to keep my lunchtimes free. But maybe you have different preferences. How are our agents going to reach agreement? They need to be able to negotiate with each other.
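To make the meeting-booking negotiation concrete, here is a minimal Python sketch of two agents alternating proposals. Everything in it is an assumption made for illustration: the slot names, the preference scores (0 meaning "unacceptable"), and the very simple protocol in `negotiate`, which is not taken from the talk or from any particular system.

```python
SLOTS = ["Mon 08:00", "Mon 10:00", "Mon 12:30", "Mon 15:00"]

# Hypothetical preferences: higher score = more preferred, 0 = unacceptable.
# "mine" dislikes early meetings and lunchtime meetings.
mine = {"Mon 08:00": 0, "Mon 10:00": 3, "Mon 12:30": 0, "Mon 15:00": 2}
yours = {"Mon 08:00": 3, "Mon 10:00": 0, "Mon 12:30": 2, "Mon 15:00": 2}


def negotiate(prefs_a, prefs_b, slots):
    """Agents take turns proposing their best remaining slot.

    A proposal is accepted as soon as the other side finds it acceptable
    (score > 0). Returns the agreed slot, or None if no agreement exists.
    """
    queue_a = sorted((s for s in slots if prefs_a[s] > 0), key=lambda s: -prefs_a[s])
    queue_b = sorted((s for s in slots if prefs_b[s] > 0), key=lambda s: -prefs_b[s])
    proposer, responder = (queue_a, prefs_b), (queue_b, prefs_a)
    while proposer[0] or responder[0]:
        queue, other_prefs = proposer
        if queue:
            offer = queue.pop(0)
            if other_prefs[offer] > 0:
                return offer
        proposer, responder = responder, proposer  # the other agent's turn
    return None


print(negotiate(mine, yours, SLOTS))  # Mon 15:00
```

In this run the first agent's favourite (10:00) and the second agent's favourite (08:00) are each rejected, and the agents settle on 15:00, which both find acceptable. A real protocol would also need guarantees about fairness and about neither side being exploited, which this sketch does not attempt.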
[00:33:41.91] All of these things have been big research areas over the last 20 years in multi-agent systems. And we're beginning to see the fruits of that research make it out into the real world. So just one example. If you've booked an Uber recently-- well, firstly, shame on you because they're not nice people-- but secondly, what happens when you book a ride, that process of allocating somebody to pick you up and do your transport, right, that process, that basic protocol, is a protocol called the contract net protocol.
[00:34:18.08] The process through which that happens is a protocol that was designed within the multi-agent systems community. And it has a ton of other applications out there right now, as well. It's probably the most implemented cooperative problem-solving process, allowing you to allocate tasks to people in a way that everybody is happy with.
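The shape of the contract net protocol (announce a task, collect bids, award the task to the best bidder) can be sketched as follows. This is a toy illustration only, not any ride-hailing company's dispatch code: the `Driver` class, the one-dimensional positions, and the distance-based bid are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    name: str
    position: float          # hypothetical 1-D location, in km along a road
    available: bool = True

    def bid(self, pickup):
        """Respond to a call for proposals with a cost estimate, or None to decline."""
        if not self.available:
            return None
        return abs(self.position - pickup)


def contract_net(drivers, pickup):
    """Manager side: announce the task, collect bids, award to the lowest bidder."""
    bids = [(d.bid(pickup), d) for d in drivers]               # announcement and bidding
    bids = [(cost, d) for cost, d in bids if cost is not None]  # drop those who declined
    if not bids:
        return None
    _cost, winner = min(bids, key=lambda pair: pair[0])         # evaluation
    winner.available = False                                    # award: the winner commits
    return winner


drivers = [Driver("A", 0.0), Driver("B", 4.0), Driver("C", 9.0, available=False)]
print(contract_net(drivers, pickup=8.0).name)  # B
```

Driver C is closest but declines because it is busy, so the task goes to B, the cheapest of the drivers that actually bid.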
[00:34:36.66] So all of these things are active areas of research. If I want my Siri to talk to your Siri, my Siri and your Siri need social skills, the same kind of social skills that we all have-- cooperation, coordination, negotiation, the ability to solve problems cooperatively, OK?
[00:34:57.32] So the debate about general AI, the grand dream about AI, has sort of kicked off again recently because of these advances. I'm not a big believer that that's going to happen. Well, I'm not a believer at all that that's going to happen any time soon.
[00:35:10.85] And nor do I envisage that an agent will have these skills in a very, very general sense. But for specific applications, like meeting booking, right, negotiation skills for meeting booking, protocols that will allow agents to book meetings-- taking into account everybody's preferences so that, for example, I can prove in a mathematical sense that my agent is not going to get ripped off, right, it's not going to end up with a bad deal, that we end up with an outcome which is fair to all of us-- these are all big areas in the multi-agent systems community that we've made a lot of progress with.
[00:35:47.32] So I want to emphasize multi-agent systems are used today. There was an article by a colleague of ours called Jim Hendler-- I don't know if you remember it, Manuela-- about 15 years ago. And his article was called "Where Are All the Agents?" And he said, well, we've been working on these agents for 10 years, but, actually, I don't see them.
[00:36:04.73] Well, I'd love to have Jim here today because, of course, firstly, we've all got an agent with us, right? You've all got a smartphone in your pocket. There are hundreds of millions of software agents out there in the real world and not just agents that interact with people. There are multi-agent systems.
[00:36:20.48] So high-frequency trading algorithms, in particular, are exactly that. These are algorithms to which we have delegated the task of doing trading. People are out of the loop-- completely out of the loop. And they couldn't be in the loop because the timescales on which high-frequency trading algorithms operate are way, way, way too small for people to be able to deal with in any kind of sense at all.
[00:36:44.76] But here's the thing-- so this is going to introduce me to the next part of the talk-- is that when you start to build systems like high-frequency trading algorithms, they start to get unpredictable. They start to have unpredictable dynamics. So here are a couple of examples of this.
[00:37:02.36] So the October 1987 market crash-- so the guys with ties on will remember that. So was it Black Monday or Wednesday? I forget which. Does anybody remember? It was Black something. And what happened? What led us to this October 1987 market crash?
[00:37:20.77] As with all of these things, there was no one cause. But actually, one of the big contributing factors was that the London Stock Exchange and International Stock Exchange had computerized just a couple of years before. I think they called it the Big Bang in London, right? It was when all the stock markets went computerized, and you went from handing people pieces of paper to actually doing trades electronically.
[00:37:43.24] And people built agents to do automated trading. And they gave agents rules like if share price goes down, sell. And you don't have to be an AI genius to see that if every agent has that kind of behavior, then a sudden event like a sudden sharp stock price fall for some reason creates a cascading feedback effect. And that's exactly what happened. It wasn't the only cause but generally accepted that that was one of the key causes of the October '87 stock market crash.
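The feedback effect described here is easy to reproduce in a toy simulation. The sketch below is an assumption-laden illustration, not a model of the 1987 market: each hypothetical agent sells once the price has fallen by more than its own threshold, and each sale pushes the price down by a fixed `impact`.

```python
def simulate_cascade(thresholds, shock, impact=1.0, start_price=100.0):
    """Return the price history after an initial shock.

    Each agent sells (once) when the fall from start_price exceeds its
    threshold; every sale lowers the price by `impact`.
    """
    price = start_price - shock
    holding = list(thresholds)            # agents that have not sold yet
    history = [start_price, price]
    while True:
        drop = start_price - price
        sellers = [t for t in holding if drop > t]
        if not sellers:
            return history
        holding = [t for t in holding if drop <= t]
        price -= impact * len(sellers)    # selling pressure moves the price further
        history.append(price)


print(simulate_cascade([1, 2, 3, 4, 5], shock=0.5))  # [100.0, 99.5]
print(simulate_cascade([1, 2, 3, 4, 5], shock=1.5))  # [100.0, 98.5, 97.5, 96.5, 95.5, 94.5, 93.5]
```

A shock of 0.5 triggers nobody, while a shock of 1.5 tips the first agent, whose sale tips the second, and so on until every agent has sold. No individual rule is unreasonable; the crash comes from the interaction.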
[00:38:14.95] More dramatically, recently-- 6 May 2010-- we were having a general election here. But over in the US, in the middle of the afternoon over a 30-minute period, the markets collapsed. And briefly, more than a trillion dollars was wiped off the Dow Jones Industrial Average. It was the largest one-day drop in the Dow Jones history.
[00:38:39.90] But it only lasted 30 minutes. The markets bounced back. They didn't quite regain, and it wobbled a bit. But actually, they regained their position. And the joke was, of course, if you were having a cup of coffee at the time, you would have missed the whole thing, right?
[00:38:53.64] This was happening on timescales that people simply couldn't comprehend. By the time they understood that something weird was going on, it was already starting to bounce back. So some very strange things happened. So Accenture were trading at a penny a share for a while-- a very, very brief period of time. And Apple shares, bizarrely, were trading at $100,000 each for a very brief period of time.
[00:39:14.71] The scary thing about this is that, of course, now, all these markets are connected, right? We're not operating in isolation. We're operating in a global marketplace. And a nightmare scenario was that you would have a flash crash at the end of the trading day.
[00:39:30.69] And if you hit the trough-- the bottom of the trough-- at the end of the trading day, OK-- you hit the bottom of the trough at the end of the trading day-- nobody knows whether or not this is a real collapse or just a freak phenomenon which is just going to rebound. And then, you've got contagion. It starts to spread over Asia and the rest of the world. And these are very real and very, very scary phenomena.
[00:39:52.64] So the next point in my talk is if we're going to build multi-agent systems-- and the problem is that people are frantically running ahead to build high-frequency trading algorithms, right, and trying to build them faster and using things like AI sentiment analysis on Twitter to drive the decisions that are being made-- then they are going to be prone to these unpredictable dynamics. We need to be able to understand them. We need to be able to manage them.
[00:40:19.03] And at the moment, management is just hitting a kill switch. It's unplugging the computers, right? That's how these things are managed at the moment. I mean, it's not all it is.
[00:40:27.16] So let me just briefly give you a feel for one of the two approaches that we look at. And the first approach that we look at is what's called formal equilibrium analysis. So this is relevant for systems where there are small numbers of agents. It doesn't work for big systems for various technical reasons. So the alternative technique that I'll talk about in a moment works for big systems.
[00:40:50.07] But for small systems where there are just a handful of interacting agents, what we can do is we can view this as an economic system and start to understand what its equilibria are and what kinds of behaviors the system would show in equilibrium. So to put it another way, what we do is we view a flash crash as a bug in the system, right?
[00:41:12.55] If our system that we have is exhibiting a flash crash or some other undesirable behavior, what we do is we treat it as a bug, and we say, how did this bug come about, and how can we fix that bug? And so the technology that we apply is exactly the technology that's been developed in computer science to deal with bugs. And the most important of these technologies is a technique called model checking.
[00:41:36.92] And so here, I've got a simple illustration of model checking. So the idea is what this thing is here is just a description of a system-- a little bit more technical. I said it was going to get a bit more technical but not too much.
[00:41:47.33] These are the possible states of the system. And then these arrows correspond to the actions that could be performed by the agents in the system. So if the system is currently in this state and some agent does this action, then the system transforms to this state.
[00:42:01.33] And what that gives us is this structure here, which we just call a graph structure. And this is just a model of the possible behaviors of my system. So it could be, for example, that some state down at the bottom here is a bad state, right? This is a flash crash state. And what we want to understand is, how does that flash crash state arise? How can we get to that?
[00:42:22.16] OK. So in model checking, what we do is we use a special language, a language called temporal logic, to express properties of the systems that we want to investigate. So here is a property written in a standard temporal logic. It just says if ever I receive a request, then eventually I send a response. That's what that says. You don't need to worry about the details.
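The slide itself is not reproduced in this transcript. Assuming it used linear temporal logic (LTL), one common temporal logic, the request/response property would typically be written as follows, where $\mathbf{G}$ reads "always", $\mathbf{F}$ reads "eventually", and the proposition names are placeholders:

```latex
\mathbf{G}\,\bigl(\mathit{request} \rightarrow \mathbf{F}\,\mathit{response}\bigr)
```

Read aloud: at every point along a trajectory, if a request is received at that point, then a response is sent at that point or at some later one.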
[00:42:42.32] And what the model checker does is it will check whether or not that property holds on some or all of these possible trajectories. So each path that you can take through that graph, right, corresponds to one possible trajectory of our system, right? And imagine that there is some flash crash trajectory where bad things are happening. So a classic example of a query would be something like, eventually, I reach a flash crash. And so what we're asking is, is there some computation in my system that will lead to that flash crash?
[00:43:15.80] So this is, again, a very big body of work. And colleagues of Manuela's at CMU won the Turing Award for their work in developing model checking technology because it's industrial strength. It really works, with all sorts of caveats. You can really use this to analyze systems.
[00:43:34.16] And many model checkers are now available. And really, the reason for that is that, actually, these model checkers are relatively easy to implement, OK? So the algorithmics of these things are really, really quite simple. And actually, you can end up with tools that really, really work if you want to do this.
[00:43:54.17] So the two basic model checking questions, then, are, is there some computation of a system on which my property-- like there is, eventually, a flash crash-- holds, or does that property hold on all computations of the system? Is it inevitable that I'm going to have a flash crash on all the possible trajectories of the system? So those are the two basic questions which are introduced in model checking.
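The two questions can be answered on a small explicit graph with a few lines of Python. This is a minimal sketch under stated assumptions: the three-state system in `TRANSITIONS` is invented for illustration, every state is assumed to have at least one successor, and production model checkers use far more sophisticated representations than an explicit dictionary.

```python
# Hypothetical transition system: state -> set of possible successor states.
TRANSITIONS = {
    "calm": {"calm", "volatile"},
    "volatile": {"calm", "crash"},
    "crash": {"crash"},
}


def exists_eventually(transitions, start, bad):
    """Is there SOME trajectory from `start` that reaches a state in `bad`?"""
    seen, frontier = {start}, [start]
    while frontier:                       # plain graph reachability
        state = frontier.pop()
        if state in bad:
            return True
        for nxt in transitions[state]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


def always_eventually(transitions, start, bad):
    """Do ALL trajectories from `start` reach a state in `bad`?

    Computed as a least fixed point: a state is 'inevitable' if it is bad,
    or if every one of its successors is already inevitable.
    """
    inevitable = set(bad)
    changed = True
    while changed:
        changed = False
        for state, successors in transitions.items():
            if state not in inevitable and successors and successors <= inevitable:
                inevitable.add(state)
                changed = True
    return start in inevitable


print(exists_eventually(TRANSITIONS, "calm", {"crash"}))  # True
print(always_eventually(TRANSITIONS, "calm", {"crash"}))  # False
```

In this toy system a crash is possible (calm, volatile, crash) but not inevitable, because the system can stay calm forever.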
[00:44:15.95] OK. So now, if we turn, instead, to multi-agent systems, the idea is that our agents now, each of them is trying to do the best it can for itself. My Siri is acting for me. Your Siri is acting for you.
[00:44:30.26] And what we now want to ask is, OK, under the assumption that our agents are acting rationally, doing the best that they can for us, then what properties, what trajectories, will our system take, OK? Assuming that your agent is doing the smartest thing it can in pursuit of what you want to do-- like meeting booking-- and mine is doing the smartest thing it can for me, what will happen?
[00:44:56.00] And to cut a long story short, that's the question that we ask in this work, OK? And that's the approach. The approach is what we call equilibrium checking. It's understanding the equilibria that a system can have.
[00:45:08.76] And the basic analytical concept that we use for this-- what is a rational choice-- is an idea from game theory called Nash equilibrium, named after John Forbes Nash Jr., who just died a couple of years ago. They made a film about him, A Beautiful Mind. The film is terrible, but the book on which it's based is much better.
[00:45:27.65] And he formulated this notion of rational choice in these strategic settings. And the notion of Nash equilibrium is extremely simple. It's the idea that we use in our work for analysis. Suppose all of us are busy. We all have to make a choice, right? You have to make a choice. You have to make a choice. You have to make a choice. We, all of us in this room, make a choice.
[00:45:46.24] It's a Nash equilibrium if, when we look around the room and see what everybody else has done, none of us regrets our choice. We don't wish we'd done something else, yeah? Given that you lot did all your bits, I'm OK with what I did. But similarly, given that we all did our bits, you're OK with what you did. That's a Nash equilibrium.
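The "nobody regrets their choice" test translates directly into code. The sketch below checks every pure-strategy profile of a small two-player game; the payoff numbers are hypothetical, chosen so that the game has both a good and a bad equilibrium.

```python
from itertools import product

ACTIONS = ["hold", "sell"]

# Hypothetical payoffs: PAYOFF[(a1, a2)] = (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("hold", "hold"): (3, 3),
    ("hold", "sell"): (0, 2),
    ("sell", "hold"): (2, 0),
    ("sell", "sell"): (1, 1),
}


def is_nash(profile):
    """True if no single player can do strictly better by changing only their own action."""
    for player in range(len(profile)):
        for alternative in ACTIONS:
            deviation = list(profile)
            deviation[player] = alternative
            if PAYOFF[tuple(deviation)][player] > PAYOFF[profile][player]:
                return False              # this player regrets their choice
    return True


equilibria = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]
print(equilibria)  # [('hold', 'hold'), ('sell', 'sell')]
```

Both "everyone holds" and "everyone sells" are equilibria here: if the other trader is selling, selling is your best reply, even though both traders would be better off holding.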
[00:46:04.81] And what we look at in our system is, suppose our agents make Nash equilibrium choices, then what trajectories will result? So the picture looks very similar. We've got our model of our system, as we did in model checking. We've got our claim, like there is eventually a flash crash. But now, we know what the preferences are of the agents in the system. We know what each of them is trying to accomplish. And the question is, can I get a flash crash under the assumption that we all make rational choices? Or is a flash crash inevitable under the assumption that we all make rational choices, yeah?
[00:46:45.84] So that's the work that we look at. And we have a tool that does this. It's available online. So the tool is called EVE for Equilibrium Verification Environment. It's available online at eve.cs.ox.ac.uk.
[00:47:04.69] And what you can do is you can describe a system using a high-level language called reactive modules. It's a programming language, so you should expect to see a programming language. And then, you specify the goals of each of the players. And those goals are specified as temporal logic properties, like the example that I talked about earlier.
[00:47:25.21] And what it will do is it will tell you what properties hold of that system under the assumption that all the component agents make Nash equilibrium choices. So that's what we mean by formal verification. So it's game theoretic verification.
[00:47:41.93] Because what it's doing is it's looking at the system from a game theoretic point of view. It's saying, you're going to do the best for yourself, I'm going to do the best for myself, then what will happen? We're going to make Nash equilibrium choices. What will happen in my system, OK?
[00:47:58.48] OK. So that's a formal approach to understanding equilibrium properties in a precise mathematical sense, right, in the precise mathematical sense of game theory. And it will tell us what properties will hold inevitably under the assumption of rational choice or could possibly happen.
[00:48:18.12] So the idea is, in this setting, the fact that something is possible in principle might not be relevant if it doesn't correspond to rational choices. All you're concerned about is what would happen under the assumption that our agents chose rationally, OK? So that's equilibrium analysis.
[00:48:39.32] OK. So this really only works for small systems. If you've got a handful of agents, it works. For technical reasons to do with game theory, if you've got large numbers of agents, which, of course, you do on the global markets, it doesn't really work. So what do we do instead with large systems?
[00:48:54.56] So the idea, instead, is we've got an alternative approach, which is called agent-based modeling. And to cut a long story short, what agent-based modeling does is it says you simulate the whole system. You build, literally, a model of the economy with all of the decision-makers in that economy modeled, and you model the interactions-- the buying and selling and lending behaviors, all the other stuff that you might want to do-- you model them directly, OK? And then, you run a simulation.
[00:49:25.63] And this is an old idea. But it's possible now, for familiar reasons. Why is it possible? Because we have large data sets that we can get, right? For example, in the finance sector, regulators require that banks and other groups make their data publicly available, or parts of their data, publicly available. And we can scrape that and use that in our simulations. And that's what we do.
[00:49:52.84] And we've got the compute resources to be able to simulate this at scale. So the kind of simulations that we do involve seven million decision-makers, right? And those decision-makers correspond to banks and individuals and so on, and we simulate that at scale.
[00:50:08.84] There are some challenges with this. So when you start doing agent-based simulation, it looks like a beautiful thing to do. Literally, you're modeling each of the decision-makers in the economy as an individual program that's making decisions about what to do. But actually, just getting to meaningful simulations, where it doesn't just wobble up and down crazily, never settling down, that's actually a challenge in itself.
[00:50:31.82] Once you've got simulations, you then discover that what you've done is you've plugged in what are called magic numbers. So to get anything sensible, I had to set this parameter to 13.3, but why, right? 13.3 is a magic number in the simulation. And this is a real problem. Because it just feels arbitrary. We don't want to have to do that.
[00:50:51.26] Calibration is a huge problem. So calibration means if your model tells you, this is going to happen, how do you know that $1 in the model actually corresponds to $1 in the real world? At the moment, that's the cutting edge of agent-based modeling, doing meaningful calibration. And predictably, the way that you do calibration at the moment, the state of the art technique, is to do lots of heavyweight machine learning on your model to try to understand what it's actually doing, and finally, whether you interpret the data that it's providing as quantitative or qualitative.
[00:51:26.00] So my colleague, Doyne Farmer in Oxford, uses the analogy of weather forecasting. If you go back to the 1940s, how did they do weather forecasting? They would look at the pressure over the United Kingdom, the weather patterns, and they would just go back through their records to find something similar and then look what happened the next day.
[00:51:43.85] And simulation of weather systems was widely regarded as something which was impossible for a long time. It eventually became possible when you could get the grain size of what you were modeling down to a sufficiently small area and you had the compute power available to be able to do these simulations at scale. And now, it works. And I think the claim is that we will be able to do similar things with agent-based modeling.
[00:52:09.44] But this is, it's simulation. It's Monte Carlo simulation, which means it involves random numbers, basically, right? You have to do lots of simulations and see the results that you get. And whether you interpret the results literally to give you quantitative data or qualitatively to say this trajectory could happen, you could get a flash crash under these circumstances, that's also at the cutting edge of the debate on agent-based modeling.
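As a rough illustration of the Monte Carlo style described above, here is a toy agent-based simulation in Python. Everything in it is an assumption made for the sketch: the herding rule, the 5% price-impact factor, the crash threshold, and the names `simulate_market` and `crash_frequency`. It is not the speaker's model and is far too crude to be calibrated against real data. It only shows the pattern of running many randomized simulations and counting how often a trajectory of interest appears.

```python
import random


def simulate_market(n_agents, n_steps, herding, rng):
    """One run of a toy market in which each agent buys (+1) or sells (-1).

    With probability `herding` an agent copies the direction of the previous
    price move (a crude stand-in for crowding); otherwise it picks at random.
    Returns the list of prices, starting from 100.
    """
    price, last_move = 100.0, 0.0
    prices = [price]
    for _ in range(n_steps):
        net_demand = 0
        for _ in range(n_agents):
            if last_move != 0 and rng.random() < herding:
                net_demand += 1 if last_move > 0 else -1
            else:
                net_demand += rng.choice((-1, 1))
        last_move = net_demand / n_agents   # always between -1 and 1
        price *= 1 + 0.05 * last_move       # assumed price-impact rule
        prices.append(price)
    return prices


def crash_frequency(n_runs, threshold, seed=0, **market_kwargs):
    """Fraction of runs in which the price falls below threshold * initial price."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    crashes = 0
    for _ in range(n_runs):
        prices = simulate_market(rng=rng, **market_kwargs)
        if min(prices) < threshold * prices[0]:
            crashes += 1
    return crashes / n_runs


for herding in (0.0, 0.9):
    freq = crash_frequency(n_runs=200, threshold=0.7, seed=1,
                           n_agents=50, n_steps=50, herding=herding)
    print(f"herding={herding}: crash frequency {freq:.2f}")
```

Read qualitatively, a result like "crashes show up when herding is strong" says that such a trajectory could happen under these conditions. Treating the printed frequency as a real-world probability would require the calibration step that the talk describes as unsolved.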
[00:52:35.63] OK. So I was, actually, originally planning this talk-- I was going to make this the center point of the talk. But then, I panicked because I'm not a finance person at all. And this work is only possible because we have somebody who works in the finance industry doing this. But this is a quote from his [INAUDIBLE].
[00:52:49.46] So for example, he's looking at the conditions that can give rise to flash crashes. And one of the things that he looks at, for example, is crowding. And this is where everybody is buying into a particular asset. And if that asset becomes distressed, then the concern is that that, then, propagates.
[00:53:05.18] And so the conventional wisdom is if everybody is buying into the same asset, this can be a bad thing because it leads to contagion and propagation of distress. But actually, he's discovered, for example, under some circumstances, this can be a good thing. So these are qualitative things that he's doing here. The next stage is to try to calibrate this.
[00:53:27.46] OK. So to wrap up. So the agent paradigm, it seems to me, it's a 30-year dream for AI, but it's now a reality. We all have agents with us, right? It's not necessarily the case that your Siri is talking to somebody else's Siri.
[00:53:40.14] I think that's an obvious thing, actually, to happen, right? So I genuinely believe that that will happen. And it won't happen in the sense of your Siri being generally intelligent. It will be in niche applications.
[00:53:53.11] The next step for the agent paradigm is to put agents in touch with other agents, for Siri to talk to Siri. But multi-agent systems have unpredictable dynamics. So we need to be careful about the systems that we build.
[00:54:04.71] And I've described in a fairly high-level way two different approaches to doing that. One is, for small systems, you can view these things as an economic system, a game in the sense of game theory, model it as a game, and then understand what its game theoretic equilibria-- its Nash equilibrium-- behaviors are. Do I get something bad happening in the Nash equilibrium?
[00:54:23.07] An alternative is agent-based simulation, OK? So the alternative is to directly model the system. And we can do that because we have compute resources at scale, and we have data at scale. OK.
[00:54:35.52] So I'm going to wrap up. Thank you for your attention. I've given you a tour of where agents came from, why it's a natural idea-- it took 20 years for it to become a reality, but it is a reality, and we've all got agents in our pockets now-- and where that might go.
[00:54:48.98] And that vision of the future of computing-- ubiquity, interconnection, intelligence, delegation, human orientation-- just seems to me to be inevitable. And the agents paradigm, it seems to me, is bang in the middle of that. OK. Thank you for your attention.
[00:55:11.62] MANUELA VELOSO: A couple of questions?
[00:55:20.18] AUDIENCE: Test. Thank you for your time, and thank you for your talk. I have two questions. But for the sake of time, I'm going to let you choose which one to answer, if that makes sense.
[00:55:30.56] MANUELA VELOSO: No. For the sake of time, you choose one, and just say one.
[00:55:32.92] AUDIENCE: OK.
[00:55:33.38] MICHAEL WOOLDRIDGE: Choose the easy one, please.
[00:55:35.19] AUDIENCE: All right, cool. One of the hot topics in artificial intelligence is the impact of automation in society. And I am curious as to what you think the impact of the evolution of multi-agent systems will have in automation and, subsequently, how we can think about educating our children, educating ourselves, and, ultimately, educating society.
[00:56:00.24] MICHAEL WOOLDRIDGE: So I think there's two slightly conflated-- I think the issue of automation is, and how multi-agent systems will impact that-- let me just take that one first. So a lot of my job, and I daresay your jobs, is full of fairly routine management of tasks-- processing paperwork, passing it on.
[00:56:17.54] There will be any number of workflows within an organization like this, as there is at the University of Oxford, to deal with processes which involve multiple people. And they're extremely tedious and time-consuming. The first thing that I could really see agents doing is simply automating the management of an awful lot of that so that you have agents managing, in our case, for example, when a student applies to us.
[00:56:40.52] You have an agent that manages that process, can remind me when I need to do things, can make sure that the paperwork gets to the next people, can flag up to the right people when things aren't processed in time, and so on. And that seems to me to be crying out for agent-based solutions. So I think there will be big applications there in that kind of scenario. And I think that will be an area where multi-agent systems has an awful lot to offer.
[00:57:07.91] I think the second part was to do with education, was it? Is that right?
[00:57:10.37] AUDIENCE: Yeah, how to fine-tune education to overcome automation.
[00:57:17.18] MICHAEL WOOLDRIDGE: Well, I think that's a bigger question about AI itself rather than multi-agent systems. And I think the answer to that one is that the skills that won't be easily automated are human skills. Doctors, for example, they're not going to be automated by X-ray reading machines. So we have software that can read X-rays and diagnose heart disease and so on very, very effectively.
[00:57:39.10] But that's a tiny sliver of what a doctor does. An awful lot of what a doctor does-- most of what a doctor does-- is to do with human skills that requires huge amounts of training which are not going to go away any time soon, although I'm put in mind of this news article that you may have seen over the last 24 hours of this patient that was told he was going to die by a telepresence robot. That wasn't AI, by the way. It was just a crass application of telepresence technology.
[00:58:04.98] MANUELA VELOSO: OK. So there's one more question.
[00:58:07.41] AUDIENCE: Just a quick question.
[00:58:08.38] MANUELA VELOSO: Just a second. Just a second.
[00:58:14.45] AUDIENCE: How would you suggest we should deal with biases, applying databases full of old bias and inefficiencies? So how, by delegating to agents, how do you deal with biases in the data?
[00:58:25.62] MICHAEL WOOLDRIDGE: OK. So wow. That's a huge question. So I think what's interesting about bias is that the algorithmic treatment of bias-- and it's not just AI, anything to do with an algorithm which has to make decisions, it's the same thing-- but the algorithmic treatment of bias is something we didn't really anticipate was going to be an issue. And it's just exploded over the last couple of years.
[00:58:47.85] So there, I think, we're just developing the science to understand what it means, for example, to be able to say in a precise sense, when is a data set biased? When is an algorithm unbiased or biased? We're just getting there. And people are frantically running ahead to try to understand those issues.
[00:59:04.46] And I'm pretty confident over the next 10 years, we will have a much richer understanding of that. What will be interesting is to see that experience fed back into undergraduate programs, for example, so that when we teach people about programming, we don't just teach them about programming. We teach them about those issues of bias.
[00:59:22.01] At the moment, it's a huge, great, difficult area. And there are no magic fixes for it. We're just at the beginning of a process to understand what those issues really are.
[00:59:32.79] MANUELA VELOSO: OK. We'll have two more questions. I see many hands up. OK. You have the mic. So then we select him for that time.
[00:59:41.59] AUDIENCE: Thank you very much. And thanks for the talk. The question you spoke [INAUDIBLE] about defining what's meant to be each agent's interest because originally, once we define what each agent is doing in its best interest, then you can define the Nash equilibrium. But
[00:59:55.48] in real-world applications, agents may have different interests, right? So what would be the two, three topics, or methods, that you think are the state of the art for inferring what is the reward function or the actual interest of each agent so we can model each agent and then go to multi-agent level?
[01:00:13.55] MICHAEL WOOLDRIDGE: OK. So the slightly technical answer, so I apologize for that, so the problem of I'm interacting with people, and I don't quite know what their preferences are-- they could be this sort of person, they could be this sort of person-- is a standard one in game theory. And there are standard tools.
[01:00:29.34] There's a variation of Nash equilibrium called Bayes Nash equilibrium, which deals with exactly that. We haven't done that in our work because it's an order of magnitude more complex. But nevertheless, it's a standard technique. And in principle, you can use that technique to understand this.
[01:00:43.51] You could then argue against the models that they use in-- what they do in Bayes Nash equilibrium is you've got a space of possible types. You could be this type or this type or this type of person-- in other words, have these preferences or these preferences or these preferences.
[01:00:56.44] And what you know is the probability that they're of this type and this type and this type. And you could immediately argue, well, actually, even that, actually, is asking quite a lot. On large-scale systems, that might not be an unreasonable thing to do. But for the reasons I've said, we don't look at large-scale systems using game theory.
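The ingredient described above, a probability for each possible type of the other party, can be illustrated with a small Python sketch. It computes only one player's best response to beliefs about the other player's type, not a full Bayes-Nash equilibrium, which would also require each type's behavior to be a best response in turn. The type names, the payoff numbers, and the helper `best_response_under_types` are hypothetical.

```python
def best_response_under_types(my_payoff, type_probs, type_actions, my_actions):
    """Pick the action with the highest expected payoff over the opponent's types.

    my_payoff[(mine, theirs)] is my payoff for a pair of actions.
    type_probs[t] is my belief that the opponent is of type t.
    type_actions[t] is the action a type-t opponent is assumed to take.
    """
    def expected(mine):
        return sum(prob * my_payoff[(mine, type_actions[t])]
                   for t, prob in type_probs.items())

    return max(my_actions, key=expected)


# Hypothetical setting: I can hold (0) or sell (1); so can the other trader.
MY_PAYOFF = {(0, 0): 2, (0, 1): -4, (1, 0): 1, (1, 1): -1}
TYPE_ACTIONS = {"cautious": 0, "aggressive": 1}  # assumed behavior per type

mostly_cautious = {"cautious": 0.8, "aggressive": 0.2}
evenly_split = {"cautious": 0.5, "aggressive": 0.5}

print(best_response_under_types(MY_PAYOFF, mostly_cautious, TYPE_ACTIONS, (0, 1)))
print(best_response_under_types(MY_PAYOFF, evenly_split, TYPE_ACTIONS, (0, 1)))
```

The best action changes as the beliefs change, which is the point made in the answer: the analysis is only as good as the type probabilities you are prepared to assume.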
[01:01:12.31] MANUELA VELOSO: We have [INAUDIBLE]. So one final question there.
[01:01:18.70] AUDIENCE: I have a question about agent-based modeling. So take the example of the flash crash. Statistics say this is something that will happen once every billion years or something. Do you think that there is some issues with the way we do agent-based modeling or the reliance on simulations to model whether things are likely to happen or not?
[01:01:40.03] MICHAEL WOOLDRIDGE: So is that an actual quote, the once every billion years? Because it seems a very silly quote, given that it's actually happened.
[01:01:46.39] AUDIENCE: Yeah, something like that. Usually, these events are super rare [INAUDIBLE] risks.
[01:01:49.96] MANUELA VELOSO: [INAUDIBLE]
[01:01:51.97] MICHAEL WOOLDRIDGE: Well, there have been smaller-scale flash crashes since then, right? There've been a number of them. So the scary thing about flash crashes, if they happen in the circumstances where, for example, like I say, at the end of the trading day, when you hit the trough when the markets closed, that's what's potentially very scary. And there could be others, right-- there could be other circumstances.
[01:02:13.00] So in our simulations, what we aim to do is, at the moment, we're just getting qualitative indications. Look, these are the kinds of conditions-- because these are hugely complicated events, it's not just one factor. Certainly, high-frequency trading, the flash crash couldn't have happened without high-frequency trading.
[01:02:32.02] So that's certainly a contributing factor, but by no means is it the only one. And actually, I'm not sure whether they got to the bottom of whether or not somebody had actually done something fraudulent in the flash crash. I'm sure some of you know the answer to that. But yes, so there's a huge range of things.
[01:02:44.71] But what we aim to do is be able to give you the kind of characteristics. Look, if your system has these properties, this is the kind of trajectory that it might exhibit under these circumstances-- so qualitative indicators, which is still useful for us. As I say, we're right on the frontier of going from that to be able to make-- if your leverage is this much and the crowding is this much, then the probability is this much. We're on the edge of being able to do that. We can't do that with confidence yet. That's a way off, I think.
[01:03:16.63] MANUELA VELOSO: OK. We'll have one final question. There was-- there in the back. Sorry. We also have in the back row [INAUDIBLE]. OK. Thank you.
[01:03:30.31] AUDIENCE: Hi, professor. When I hear you talk about the agents, and I see you show the video from Apple, I'm curious whether you see agents as a solution to the IO problem when dealing with computers. Do you see it as the next step towards, or the next trend towards, people interface with computers?
[01:04:00.12] Obviously, now, we see companies in America like CTRL-labs. And in their work, they're looking at gestures, reading the nerve signals to interact with computers-- one of the big revolutions in the industries.
[01:04:15.08] MANUELA VELOSO: [INAUDIBLE] a question.
[01:04:17.18] AUDIENCE: So yeah. The question is, do you see it as a solution to the IO problem, or is it a bigger application than that?
[01:04:23.11] MICHAEL WOOLDRIDGE: So yes, it's a problem. It's a solution to the human-computer interface problem, which, I think, is what you're saying. You're talking about the input/output problem, is that right? So I think it's a solution to the human-computer interface problem.
[01:04:36.09] The reason that so many people are working on it is because, at the moment, they don't see any alternative. So gesture-based interfaces are certainly going to have a role to play. I think they're not at the stage at the moment, even remotely, where they could be rolled out.
[01:04:48.99] And I think that it's hard, with a gesture-based interface, to think about booking a meeting with Manuela. It just seems easier to say that than it does to try and do something with gestures. Maybe somebody will come up with something innovative there. I don't know. But at the moment, I think gesture-based interfaces are a very, very niche area.
[01:05:09.63] Brain reading, I think, is-- again, we're nowhere near being able to do anything like book me a meeting with Manuela through brain reading. I think the state of the art there is one or zero, and possibly something a little bit more sophisticated but not much.
[01:05:26.68] So at the moment, I say the reason that people are chasing that up is because they just don't see any alternative, right, for an awful lot of these systems. If you're driving a car, how are you going to interface? You can't take your hands off the wheel and start typing. And you certainly can't do gestures, right? So it's the only alternative that's there.
[01:05:43.79] MANUELA VELOSO: Very good. So let's thank Professor Wooldridge again.
Power and Limits of Deep Learning
January 24, 2019 - New York City
Yann LeCun is a VP & Chief AI Scientist at Facebook, and Silver Professor of CS and Neural Science at NYU. Previously, Yann was the founding Director of Facebook AI Research and of the NYU Center for Data Science. He received a PhD in Computer Science from Université P&M Curie (Paris). After a postdoc at the University of Toronto, Yann joined AT&T Bell Labs, and became head of Image Processing Research at AT&T Labs in 1996. He joined NYU in 2003 and Facebook in 2013. Yann’s current interests include AI, machine learning, computer vision, mobile robotics, and computational neuroscience. He is a member of the National Academy of Engineering.
[00:00:00.30] MANUELA VELOSO: OK. Thank you very much for coming, and welcome to the JP Morgan distinguished lecture on AI. It's my great pleasure, my very great pleasure, to just say a couple of words to introduce you to our speaker today who is Professor Yann LeCun. So Yann LeCun currently has appointments both at NYU and Facebook.
[00:00:29.76] Yann is the Silver Professor of computer science and neuroscience at NYU, which he joined in 2003 after actually having been at Bell Labs having a group on image processing at Bell Labs, and also having done a postdoc at University of Toronto, and after his PhD, which was in Paris at Pierre and Marie Curie University. So Yann has this academic kind of life, and in about, I think in 2013, he took another hat which was to head and create the Facebook AI research at Facebook.
[00:01:15.69] He's currently a VP and chief AI scientist at Facebook, and he co-founded and directed, actually the FAIR, the Facebook AI Research. And in terms of research, we all know that his passion, his contributions have been at the deep learning level, machine vision, a lot of knowledge on mobile robotics and AI in general. So finally, you know we should also know that our speaker Yann LeCun is a member of the National Academy of Engineering, which is one of the highest honors for researchers in the United States. So please, let's welcome Professor Yann LeCun, and welcome him to JP Morgan.
[00:02:08.50] YANN LECUN: Thank you, Manuela. A real pleasure to be here. I rarely go above 23rd Street, really, in New York but--
[00:02:15.61] That gives me an excuse. I guess I didn't have to fly, which is good. I spent about half my career in industry and half in academia. Now, I have one foot in each. I guess it's one half of a foot in academia now because I'm spending most of my time at Facebook, but things changed. I basically started the first research lab at Facebook, FAIR. There's a bunch of research labs now at Facebook, but this was the first. And this was a bit of a cultural shock for Facebook, which was a very sort of engineering short-term oriented company that had to invent itself a new culture for research. And that's what I liked about it, the fact that we could start from scratch and basically establish this culture.
[00:03:04.83] So what we're doing at Facebook is open research. Facebook FAIR research is really outward focused, and all the research we do is published, and almost all the code that we write is open sourced. With that we hope to foster interest in problems that we think are interesting, and steer the community towards working on problems that we think are important. The thing here is not whether Facebook technology is ahead of Google's or Microsoft's or whatever, but more the fact that the products that we want to build are not possible today. We don't have the science, even the basic principles, to build the stuff we want to build. And so since we don't have a monopoly on good ideas, the best we can do is basically accelerate progress for the entire community, and that's one of our roles.
[00:03:53.96] Of course, we have a big impact on the company. In fact, a much bigger impact than Mark Zuckerberg thought we would have five years ago when FAIR was created, and today Facebook is entirely built around deep learning. If you take deep learning out of Facebook today, you get dust essentially. Not entirely, but you know--
[00:04:14.97] You know what I mean. OK. So most of machine learning today-- Machine learning has had a huge impact in various areas of business and society, and science, certainly, but most of the applications of machine learning today, basically, use supervised learning, one of the three main paradigms of learning. So there is supervised learning, there is reinforcement learning, which people have been talking about a lot in the last few years, and then there is another thing that's not very well defined called unsupervised learning or self supervised learning, which I'll talk about later.
[00:04:51.88] So supervised learning is this idea by which if you want to train a machine, for example, to classify images of cars from airplanes you show an image of a car, you run this through the machine, and the machine has adjustable knobs on it. And if it says car, you don't do anything to the knobs. If it doesn't say car, you adjust the knob so that the answer the machine produces gets closer to the answer you want. There is a desired answer that you give to the machine, and you can measure the discrepancy between the answer you want and the answer the machine produces.
[00:05:21.51] And then you show an image of an airplane, and you do the same. And by tweaking the parameters with thousands of examples, eventually, perhaps the knobs will converge to a configuration where all the cars are correctly classified, and all the airplanes are correctly classified. And the magic of this is that it may even work for airplanes and cars it's never seen before. That's the whole purpose of learning, which is that you learn the concept without having to memorize every example.
[00:05:50.61] So this type of learning, supervised learning works really well if you want to do speech recognition, that being speech to words, images to categories, face recognition, generating captions for photos, figuring out the topic of a text, translating from one language to another, that's all supervised learning.
[00:06:09.38] And that's the basic idea of this, or the tradition, starts with models from the late '50s, early '60s, the Perceptron and the Adaline, which interestingly at the time, were really sort of hardware devices. They were not programs on the computer. They were actually analog computers that were built. At the bottom here is the Perceptron. What you see here is Bernie Widrow reviving one of his old Adaline systems at Stanford.
[00:06:38.49] So that created the standard model of pattern recognition that really was prevalent until fairly recently, by which you take a raw signal and you feed it to what's called a Feature Extractor, which is hand engineered. It's designed by people to basically extract relevant information from the raw signal, and then you feed the result, the feature vector, to the classifier, something like an [INAUDIBLE] classifier or [INAUDIBLE] or the tree or whatever. There's a lot of techniques that people have come up with over the last 50 years, if not 60. More like 60 years, actually to do this. And that's kind of the standard way.
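The knob-adjusting procedure described above can be sketched in Python with the classic perceptron rule, used here as one simple instance of supervised learning rather than as the method behind any modern system. The two-number "image features", the labels, and the function names are made up for illustration.

```python
def predict(weights, bias, features):
    """Return 1 ("car") if the weighted sum is positive, else 0 ("airplane")."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, epochs=20, step=1.0):
    """Leave the knobs alone on correct answers, nudge them on mistakes."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, desired in zip(samples, labels):
            error = desired - predict(weights, bias, features)
            if error != 0:  # wrong answer: move the output toward the desired one
                weights = [w + step * error * x for w, x in zip(weights, features)]
                bias += step * error
    return weights, bias


# Hypothetical hand-made features, e.g. (wheel-likeness, wing-likeness).
CARS = [(1.0, 0.2), (0.9, 0.1), (1.2, 0.3)]
PLANES = [(0.3, 1.0), (0.2, 0.8), (0.4, 1.2)]
SAMPLES = CARS + PLANES
LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = train_perceptron(SAMPLES, LABELS)

# Examples that were never shown during training.
print(predict(weights, bias, (1.1, 0.25)))
print(predict(weights, bias, (0.25, 0.9)))
```

On this tiny, cleanly separable data set the trained knobs also classify the two unseen examples correctly, which is the generalization property described in the talk. Real image data is nowhere near this tidy.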
[00:07:20.33] And what deep learning changed is the idea that you can learn this feature extractor. So instead of having to engineer-- spend a lot of time, and expertise, and money on building those things for every new problem, you can basically train the feature extractor as part of the entire process. So basically you build a machine as a cascade of trainable modules, which you can call layers, and essentially all those modules are [INAUDIBLE] using supervised learning. And you hope that the machine will learn, not just to classify, but also to figure out what the relevant features are that need to be extracted at every layer for the system to do a good job.
[00:08:06.00] So the next question you can ask is, what do we put in those boxes? And the answer to this is not recent. It's the idea of artificial neural networks. So in an artificial neural net, essentially the layers are of two types. One type is just a linear operator. So imagine that the signal is represented as a vector, an array of numbers, the pixel values or the signal, whether it's audio or financial tensor or whatever.
[00:08:36.90] Represent this as a vector, multiply this by a matrix. So what a matrix does is that when you compute the product of this matrix by this vector, you're computing the dot product of the vector with every row in the matrix, and that's like computing a weighted sum of input features. It's actually represented here. So you have a bunch of-- Here's a vector. With components of this vector, you're computing a weighted sum where the weights are the coefficients in the matrix, and that gives you an output.
[00:09:05.30] And then there is another type of function here, which is a point-wise nonlinearity. So you take this vector and then apply the nonlinear function to every component of this vector independently. In this case, what's called a ReLU, which is really just a half-wave rectifier, OK?
[00:09:17.76] So the function is identity for positive arguments and equal to 0 for negative arguments. Very simple nonlinearity. And there are theorems that show that with only two layers of this-- so linear, nonlinear, linear-- you can approximate any function you want as close as you want, as long as the dimension of this vector is sufficiently large, possibly infinite. But-- and there are no real theoretical results on this. But what we know empirically and intuitively is that by stacking lots of layers of those, you can represent many functions very efficiently. So that's the whole motivation to deep learning, which is that by stacking multiple layers of alternating linear nonlinear operators, you can approximate a lot of useful functions very efficiently.
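The two building blocks described above, a matrix-vector product and a point-wise rectifier, fit in a few lines of plain Python. The weights below are chosen by hand for this sketch (nothing is learned) so that a linear, nonlinear, linear stack computes the absolute difference of its two inputs, a function that no single linear layer can represent.

```python
def linear(matrix, vector):
    """Matrix-vector product: one weighted sum (dot product) per matrix row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Point-wise rectifier: identity for positive entries, 0 otherwise."""
    return [max(0.0, v) for v in vector]


def two_layer_net(x, w1, w2):
    """Linear, then nonlinear, then linear."""
    return linear(w2, relu(linear(w1, x)))


# Hand-picked weights (an assumption for illustration, not learned values).
W1 = [[1.0, -1.0],
      [-1.0, 1.0]]
W2 = [[1.0, 1.0]]

print(two_layer_net([3.0, 5.0], W1, W2))
print(two_layer_net([5.0, 3.0], W1, W2))
```

Both calls give the same output because the hidden layer keeps whichever of `x0 - x1` and `x1 - x0` is positive and zeroes the other.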
[00:10:03.57] AUDIENCE: [INAUDIBLE] in theory, prove that [INAUDIBLE]?
[00:10:16.20] YANN LECUN: What do you mean by simple?
[00:10:17.07] AUDIENCE: Simple function is [INAUDIBLE].
[00:10:26.31] YANN LECUN: Yeah, yeah. So yeah, there are theorems from the late '80s that show that with just two layers-- with just one layer of nonlinearity, you can approximate any function you want, OK? But there is no limit on the dimension of the middle layer. So for the longest time, because of the limitations of computer power and because the data sets were small, those approaches-- the number of applications that we could apply these to was very limited. I mean, the basic techniques for this are from the late '80s, but the amount of problems for which we could have enough data to train the system was very small. We could use it for maybe handwriting recognition and speech recognition and maybe a few other applications, but that was kind of limited.
[00:11:10.38] People at the time, actually, in the '80s, were really interested in hardware implementations and this is coming back. So there is a whole industry now that has been restarted over the last three to five years on building special-purpose chips to run those neural nets efficiently, particularly for embedded devices. So probably within the next two years, every smartphone will have a neural net accelerator embedded in it. And within five years, it'll be in every car and shortly after that, basically every electronic device that you buy will have a neural net accelerator in it. And so your vacuum cleaner will have smart computer vision in it because it will have a $3 chip that does neural net acceleration.
[00:11:56.10] So how do we train those things? So we train them by-- it's basically large scale optimization. So the supervised learning. You measure the discrepancy between the answer the machine produces and the answer you want through some sort of objective function that basically just measures the distance or something like that of some kind. Then you can average this over a training set of samples of pairs of input and output, and the process by which you tune the parameters of the system is just gradient descent. So figure out in which direction to change all the knobs so that the objective function goes down, and then take a step in that direction. And then keep doing this until you reach some sort of minimum.
[00:12:39.04] So what people use is something called Stochastic Gradient Descent, where you estimate the gradient on the basis of a single sample or maybe a small batch of samples. So you show a simple example, figure out the error between those two things. Then compute the gradient of that function with respect to all the parameters, tweak the parameters, then go to the next sample. Stochastic because you get a noisy estimate of the gradient on the basis of a single sample or small batch.
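The loop described here can be sketched in a few lines of plain Python. This is only an illustration under assumed settings: the linear model, the squared-error objective, the learning rate and the toy data are all choices made for the example, not anything shown in the talk.

```python
import random

def sgd_linear_fit(samples, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with stochastic gradient descent, one sample at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)                 # visit the samples in a random order
        for x, y in data:
            err = (w * x + b) - y         # discrepancy between output and target
            # Gradient of 0.5 * err**2, estimated from this single sample only
            grad_w = err * x
            grad_b = err
            w -= lr * grad_w              # step in the negative gradient direction
            b -= lr * grad_b
    return w, b

# Toy, noise-free data generated from y = 3x + 1 (an assumption for the demo)
samples = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd_linear_fit(samples)
```

Each update uses the gradient from one sample, which is a noisy estimate of the full gradient, yet the parameters still settle near w = 3 and b = 1 on this toy problem.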
[00:13:05.14] So the next question you might ask is how do we compute this gradient? And that's where back-propagation comes in. And I'm sure many of you are familiar with this. Don't attempt to understand the formula. You don't need to.
[00:13:16.29] But it's basically the idea that to compute the gradient, which is really the sensitivity of the cost function with respect to all the coefficients in the system-- all the weights and all the matrices in the weighted sums-- you can compute all of those by doing a backward pass, which is basically just a practical application of the chain rule, OK? So you know that by tweaking a parameter in this block here, it will affect the output in a particular way. And you know how tweaking this output affects the overall cost. And so it's very easy, by using this back-propagation method, to compute all the terms in this gradient.
[00:13:57.40] So now for every parameter in the system, you have a quantity that indicates by how much the cost will increase or decrease if you tweak the parameter by some given delta. So that gives you the gradient. Take a step in the negative gradient direction. There's various tweaks to make this fast.
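As a rough illustration of the backward pass, here is a hand-written chain rule for a tiny two-weight network. The architecture (one tanh unit followed by one linear unit), the squared-error cost and all the numbers are assumptions made for the sketch. A finite-difference check is included to confirm the gradient.

```python
import math

def forward(w1, w2, x, target):
    h = math.tanh(w1 * x)               # hidden unit: weighted sum + nonlinearity
    y = w2 * h                          # output unit: weighted sum
    cost = 0.5 * (y - target) ** 2      # squared-error objective
    return h, y, cost

def backward(w1, w2, x, target):
    """Chain rule applied from the cost back to each weight."""
    h, y, _ = forward(w1, w2, x, target)
    dcost_dy = y - target               # how the cost changes with the output
    dcost_dw2 = dcost_dy * h            # y = w2 * h
    dcost_dh = dcost_dy * w2
    dcost_dw1 = dcost_dh * (1.0 - h * h) * x   # d tanh(u)/du = 1 - tanh(u)**2
    return dcost_dw1, dcost_dw2

def numeric_grad(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to check the backward pass."""
    g1 = (forward(w1 + eps, w2, x, target)[2] - forward(w1 - eps, w2, x, target)[2]) / (2 * eps)
    g2 = (forward(w1, w2 + eps, x, target)[2] - forward(w1, w2 - eps, x, target)[2]) / (2 * eps)
    return g1, g2

analytic = backward(0.5, -1.2, 0.8, 0.3)
numeric = numeric_grad(0.5, -1.2, 0.8, 0.3)
```

The two results should agree to several decimal places, which is the usual sanity check for a hand-derived gradient.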
[00:14:19.65] So what made deep learning possible and what makes it easy to use is you don't have to figure this out at all. With the modern deep learning frameworks, you basically build a network either by writing a program in Python or your favorite language, or by assembling blocks that have been pre-defined into a graph. And the system automatically knows how to compute the gradient. So you tell it how to compute the output and it keeps track of all the operations that are done during this computation, and then can trace back those operations so that, automatically, the gradient of whatever it is you're computing, with respect to all the parameters you have, will be computed. So that's a very simple concept, automatic differentiation. But that's really what makes deep learning so easy to deploy and use.
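To make the "keeps track of all the operations and traces them back" idea concrete, here is a toy reverse-mode automatic differentiation class in plain Python. It is a sketch of the general technique only. The `Value` class and its methods are hypothetical names for this example and are not the API of PyTorch or any other framework.

```python
class Value:
    """A scalar that records how it was computed so gradients can be traced back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # the Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Order the recorded graph so every node comes after its parents,
        # then apply the chain rule walking that order in reverse.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v._parents, v._local_grads):
                p.grad += g * v.grad

x = Value(2.0)
w = Value(-3.0)
b = Value(1.0)
out = w * x + b        # the "program" that computes the output
loss = out * out       # squared output, a stand-in objective for the demo
loss.backward()        # gradients for w, x and b are filled in automatically
```

The user only writes the forward computation. The gradients `w.grad`, `x.grad` and `b.grad` come from replaying the recorded operations backwards.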
[00:15:07.95] OK, now here is a problem. If you want to apply deep learning or neural nets, in the way I described them, to images, it's not really practical to view an image as a vector where the components of the vector are the pixels. Because if you have an image, let's say, 200 by 200 pixels, that's 40,000 pixels and if you multiply this by the matrix, that matrix would be 40,000 by something. It's going to be large, OK? Too large.
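The arithmetic behind "too large" is easy to spell out. The 200 by 200 image size is from the talk. The size of the next layer (1,000 units) is an assumption picked only to make the count concrete.

```python
height, width = 200, 200        # image size used in the talk
hidden_units = 1000             # assumed size of the next layer (not from the talk)

inputs = height * width                 # one input per pixel
dense_weights = inputs * hidden_units   # one weight per (pixel, unit) pair

print(inputs)          # 40000
print(dense_weights)   # 40000000
```

Under that assumption a single fully connected layer already needs 40 million weights, before any further layers are added.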
[00:15:40.68] So you have to figure out how to specialize the connection between the neurons or basically, how to build sparse matrices in such a way that the computation becomes practical. That's the idea of convolutional networks, which is something that my name is associated with. The inspiration for this goes back to classic work in neuroscience from the '60s. Actually, it's Nobel Prize winning work by Hubel and Wiesel on the architecture of the visual cortex. There were various people who tried to make computer models of this, but didn't have things like back-prop.
[00:16:11.88] So what's the idea behind convolutional net? You take an image and the operation you're going to do, the linear operation, is not going to be a full matrix, but it's going to be what's called a discrete convolution, which consists in taking a little patch of the image-- 5 by 5, in this case-- and then computing the weighted sum of those pixels with a set of 25 weights. And then putting the result in a corresponding pixel on the output, OK?
[00:16:38.15] And then you take that window, you shift it over a little bit by one pixel, and do the same. Compute the dot product or the weighted sum of the pixels by those coefficients and record the result next to it, OK? So by swiping this window over, you get an image of the output, which is a result of convolving this image by this so-called convolution kernel.
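The sliding-window operation described above can be written out directly. This sketch assumes no padding and a shift of one pixel, and like most neural net code it does not flip the kernel, so strictly speaking it computes a cross-correlation. The ramp image and the all-ones kernel are made up for the demo.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image one pixel at a time and record,
    at each position, the weighted sum of the patch under the kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# 7x7 image whose pixel value equals its column index (a horizontal ramp)
image = [[float(col) for col in range(7)] for _ in range(7)]
# 5x5 kernel: 25 weights, all set to 1 here; in a ConvNet they would be learned
kernel = [[1.0] * 5 for _ in range(5)]

feature_map = conv2d_valid(image, kernel)   # 3x3 output
```

The same 25 weights are reused at every position, which is why the number of free parameters does not grow with the image size.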
[00:17:00.38] So the number of free parameters here in your matrix is very small. It's only 25, in this case, OK? And the amount of computation is relatively small. And the advantage of using this kind of computation is in situations where the signal comes to you in the form of an array-- single or multidimensional array-- in such a way that the statistics are more or less stationary. And in situations also where neighboring values tend to be highly correlated, whereas far away values are not, or less. It's the case for financial time series, for example, that belongs to this, right? And people have been using convolutional nets on financial time series. Yes.
[00:17:39.01] AUDIENCE: [INAUDIBLE]
[00:17:40.17] YANN LECUN: Say it again?
[00:17:40.72] AUDIENCE: [INAUDIBLE] like charter spanning tree and things like that?
[00:17:44.67] YANN LECUN: So there are certain configurations of those coefficients that will produce edge detection, yes. But we're not going to hardwire those coefficients. They're going to be the result of learning. We're not going to build it by hand. We're just going to initialize them randomly and then train the entire thing end to end, supervised, to produce the right answer at the end on millions of examples, or thousands of examples. And then look at the result.
[00:18:10.97] OK, so that's the first layer. We're going to have multiple filters of this type. In this case here, four. So each of those-- you had four filters here, producing four so-called feature maps. And then there's a second type of operation here called pooling, which consists in taking the results of those filters over a small neighborhood and pooling the results, which means computing an average or a max of the values. And then sub-sampling the image. So sub-sampling means that this image is half the resolution of that image. The pixels are twice as big, if you want.
[00:18:46.97] The reason for this is to eliminate a little bit of position information about the location of features in the image. That's important if you want to have a system that is robust to small deformations of the input.
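A small sketch of pooling with sub-sampling shows why it discards a little position information. The 2 by 2 window, the use of max rather than average, and the toy feature maps are assumptions made for the example.

```python
def max_pool_2x2(feature_map):
    """Take the max over each non-overlapping 2x2 neighborhood,
    which halves the resolution in both directions."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(feature_map[i][j], feature_map[i][j + 1],
                           feature_map[i + 1][j], feature_map[i + 1][j + 1]))
        pooled.append(row)
    return pooled

# The same feature detected at two slightly different positions
map_a = [[1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
map_b = [[0, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

pooled_a = max_pool_2x2(map_a)
pooled_b = max_pool_2x2(map_b)
```

Both inputs pool to the same 2 by 2 map, so a one-pixel shift of the feature inside the window no longer changes what the next layer sees.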
[00:18:59.52] So this is a convolutional net in action. It's been trained to recognize handwritten digits. And it's showing the output here. And this is the input first layer. After pooling, third layer. Another layer of pooling, then yet another layer. By the time you get here, the representation is very distributed and abstract.
[00:19:18.08] But every unit here is essentially influenced by the entire input, OK? So the entire input and the representation of the input is this list of those values. You can get those things to recognize not just single characters, but multiple characters. And do simultaneous segmentation. This is really important because eventually, you'll want to use those systems with natural images. So this is sort of vintage, early '90s convolutional net, which was built when I was at Bell Labs.
[00:19:49.32] Eventually, at Bell Labs, we built a check reading system based on those convolutional nets and various other tricks. And by the-- it was deployed in the mid-'90s, and by the end of the '90s, it was reading somewhere between 10% and 20% of all checks in the US. So a big success. But by that time, the machine learning community had lost interest in neural nets. Nobody was working on neural nets, essentially, in the late '90s until the mid-2000s, roughly.
[00:20:17.19] I left industry in 2003 and joined NYU, as Manuela was mentioning. And I wanted to reignite the interest of the community for those methods because I knew they were working. And they had a reputation of being very finicky, right? We had our own framework for deep learning, but nobody was interested in this, so nobody was using our code. And neural nets had the reputation of being so hard to train that only I and a few people working with me were able to train them, which of course, was not true. It's just that people are lazy.
[00:20:56.71] I'm being facetious here. So around 2003, 2004, just when I joined NYU, I got together with my friends Yoshua Bengio at University of Montreal and Geoff Hinton at the University of Toronto, where I had done my post-doc many years before. And we decided to basically start a conspiracy to renew the interest of the community in neural nets. And we started with various algorithms that we thought would enable back-prop, perhaps, to train very, very deep networks. So not networks with just three or four or five layers, but networks, perhaps, with 20 layers or something like that.
[00:21:37.23] And we started working with unsupervised learning algorithms, which were only partially successful, but they were successful enough to get enough interest from people that the community started building itself. Around 2007, there was enough of a community that our paper started to get actually accepted at [INAUDIBLE]. Before that, we could never publish a paper in any conference on neural nets, essentially.
[00:22:00.18] And then we started getting really good results on standard benchmarks, but they were still kind of dismissed to some extent. That changed around 2009, 2010 in speech recognition, where the results were so much better that people started really switching to using neural nets. And then, around 2013 in computer vision, and the history of this is well-known.
[00:22:21.38] But in the meantime, starting in the mid-2000s-- I'm hearing myself now. I started working on robotics. It was something that, Manuela, I was very familiar with. A project that, actually, Tucker Balch was involved in as well. He's now at JP Morgan. He was at Georgia Tech at the time. Still at Georgia Tech. And this was a project to use machine learning to get robots to drive themselves in nature. [INAUDIBLE] roughly between 2004, 2005, 2009.
[00:22:55.86] So the idea here was to use a neural net basically to do what's called semantic segmentation, which means to label every pixel in an image with a category of the object it belongs to. So it's using a convolutional net, which sees a band around the horizon of the image. And it's trained to produce another image, which is this image that has essentially three categories.
[00:23:18.48] Here is something I can drive over. I'm going to label it green. Or is it something that is an obstacle? My video is not working for some reason. That's interesting. OK.
[00:23:32.36] Oh, here we go. Oh, that's fun. All right. Oh, wow. OK. All right, this one is working. So this is an example of semantic segmentation that took place a couple years later, around 2010 or 2009, where there were data sets with a few thousand images where people had painfully labeled every pixel with the category of the object it belongs to. So things like road and sidewalk and cars and pedestrians, trees, et cetera.
[00:24:10.41] So we trained this convolutional net to be applied to the entire image and it basically labels every pixel with a category. It makes mistakes. It labels this as desert. This is the middle of Washington Square Park.
[00:24:27.21] There is no beach I'm aware of. But at the time, it was state of the art. It was, in fact, quite a bit better than the state of the art at the time. This was 2010. It was also 50 times faster than the best runner-up competing technique. So we submitted a paper to CVPR, the big computer vision conference, at the end of 2010. Pretty sure that the paper was going to be accepted because it was faster and better than everything else people had done before. And it was rejected by all three reviewers, who said, basically, what the hell is a convolutional net? And we don't believe that a method we never heard of could do so well, so this could be wrong. Essentially, that's what the reviewers said. And it's funny because now you can't actually have a paper accepted at CVPR unless you use convolutional nets.
[00:25:23.88] Oops. That's not what I wanted you to do. Sorry about that. Bear with me for just a second. OK. So ConvNets are really useful for a lot of things today, for self-driving cars. Every self-driving car project has a convolutional net in it. And for all kinds of other things. I gave a talk in 2013 that gave some ideas to people at Mobileye, which now belongs to Intel. Also to NVIDIA. They're using convolutional nets for all their self-driving car projects.
[00:26:04.28] In fact, there is a self-driving car project taking place in the Holmdel building, which is the building where I used to work at Bell Labs by a group from NVIDIA. And the guy running this project at NVIDIA is actually a former colleague from Bell Labs who worked with us on this robotics project that Tucker was involved in.
[00:26:22.06] OK, so deep learning today. It was a revolution in 2013 in [INAUDIBLE] vision because our friends at University of Toronto, in Geoff Hinton's group, figured out how to implement convolutional nets on the GPU in a very efficient manner. They're one of the first ones to do this. It was done at Microsoft in the mid-2000s. But they applied this to ImageNet and managed to get results that were so much better than what people were doing before that. That really created a bit of a revolution.
[00:26:50.79] So this was the error rate on ImageNet that people were getting in 2011. And in 2012, with the so-called AlexNet project from Toronto, the error rate went down by a huge amount. And then, over the last few years, it went down to levels that are so low that now this benchmark is actually not interesting anymore. It's better than human performance on this particular data set.
[00:27:12.26] What we've seen simultaneously is an inflation in the number of layers in those networks. So the [INAUDIBLE] I showed you from the '90s earlier had seven layers. The one from 2013, one of the ones that worked best, had 20 layers. And now the best ones have anywhere between 50 and 150 layers.
[00:27:33.13] And Facebook uses those convolutional nets very widely for a lot of different things. And one of the most popular ones that's used in production is something called ResNet50. So ResNet is this particular architecture that is here, where there are layers of convolutions and pooling and nonlinearities, but there are also skip connections that allow the system to fail gracefully. If some layers don't learn appropriately, then they become transparent essentially. And so that is what enables us to train very, very deep networks. This is an idea that came from [INAUDIBLE], who was, at the time, at Microsoft Research Asia, who is now at Facebook.
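The "transparent layer" remark can be illustrated with a stripped-down residual block. This is a simplification made for the example: a real ResNet block uses convolutions and normalization, whereas here the learned part is a single matrix followed by a ReLU, and all names are hypothetical.

```python
def relu(vec):
    return [max(0.0, v) for v in vec]

def matvec(weights, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def residual_block(x, weights):
    """Output = input + learned transformation of the input.
    The `x +` term is the skip connection."""
    transformed = relu(matvec(weights, x))
    return [xi + ti for xi, ti in zip(x, transformed)]

x = [1.0, -2.0, 3.0]

# A block whose weights contribute nothing (all zeros) passes its input
# through unchanged, so a stack of such blocks does no harm.
zero_weights = [[0.0] * 3 for _ in range(3)]
passthrough = residual_block(x, zero_weights)

# A block with non-zero weights adds a correction on top of the input.
some_weights = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
adjusted = residual_block(x, some_weights)
```

With zero weights the block reduces to the identity, which is one way to read "they become transparent".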
[00:28:15.16] And so that's a graph that was put together by Alfredo Canziani, who's a postdoc with me at NYU. But he did this before he came. On the y-axis, you have the accuracy. On the x-axis, the number of operations, which is billions of operations that are necessary to compute one output. And what people have been trying to do in industry is bring everything down to this corner essentially, where for the minimum amount of computation, you have the best accuracy on ImageNet or similar things.
[00:28:44.90] And so ResNet50 is right here. There are better results now. And then the size of the bubble, if you want, is the memory footprint, the number of parameters that are necessary. There's a lot of work on optimizing running those networks on regular processors or specialized processors to save power. And the reason this is important is-- to give you an idea, Facebook users upload somewhere between 2 and 3 billion photos on Facebook every day. And this is just on the blue site. I'm not talking about Instagram or anything like that, just Facebook.
[00:29:22.37] Every single one of those three billion photos goes through six convolutional nets, roughly-- half a dozen-- within two seconds of being uploaded. And those do things like essentially represent the image into a feature vector that can be used for all kinds of things-- retrieval search, indexing, feature vector for other purpose-- generic feature vectors, if you want. And one that does face recognition and face detection. Another one that generates captions, that describes the images for the visually impaired. And there is a couple that basically detect objectionable content-- nudity, violence, things like that.
[00:30:08.10] So the advantage of deep learning is that the system basically spontaneously learns to represent images in a hierarchical way from low-level features like edges, to parts of objects and motifs and things like that. One trend over the last few years is the use of weakly-supervised learning, or semi-supervised learning. This is weakly-supervised learning.
[00:30:32.03] So this is an experiment that was done at Facebook with one of the applied computer vision groups which consisted in taking 3 and 1/2 billion images from Instagram and then training a convolutional net to predict the hashtags that people tag images with. So they decided on about 17,000 different hashtags that would correspond to physical concepts, if you want. And then run 3.5 billion images through the convolutional net, asking it to predict which of those 17,000 hashtags is present.
[00:31:09.09] Then you take this network, chop off the last layer that predicts the hashtags, and just use the second-to-last layer as a feature vector, which is an input to a classifier that you train on other tasks, like say ImageNet. And you can actually beat the record on ImageNet this way, OK? So until fairly recently-- it was actually beaten by another team at Facebook-- but until fairly recently, the record on ImageNet was held by this system, which is trained on a different task than the one you actually finally trained it on.
[00:31:37.76] So that points towards something that is going to become more and more important in the future, which is this idea that you pre-train with lots of data in a relatively task-independent way. And then you use a relatively small amount of data to actually train your system to solve the task you want to solve. I'll come back to this afterwards.
[00:31:59.04] So a lot of progress over the last few years in computer vision using convolutional nets with-- I'm not going to go into the details of how this is built, but you can get results like this, where every object in an image is outlined and identified. This is called instance segmentation and it can detect wine glasses and backpacks and count sheep. And it's optimized-- my videos aren't running for some reason.
[00:32:32.52] It's optimized to the point that you can run those things in real time on smartphones. So this is unfortunately a video that you can't see. And I'm not sure why you can't see it. Wow. It disappears. Which is a person essentially being tracked-- [INAUDIBLE] is being tracked on a smartphone in real time at something like 10 frames per second. So a lot of work has gone into those optimizations to run on small platforms and on iPhone, also, you have acceleration libraries.
[00:33:08.90] This is all open source, so if you want to play with computer vision, the latest systems, you can just download this. This is using the PyTorch framework, which also was developed at Facebook. And there is similar things for tracking body poses densely.
[00:33:24.27] ConvNets are used for all kinds of stuff-- in medical imaging. It's actually one of the hottest topics now in radiology, which is how you use deep learning for analyzing medical images. This is a project which I'm not involved in, but it's colleagues at NYU at the medical school and in the computer science department who have been developing those architectures for analyzing MRI images of hips and getting really good results with this. So this is a really hot topic now and it's probably going to have a big effect on radiology in the future.
[00:33:58.54] OK, so I don't want to do a laundry list of [INAUDIBLE] convolutional nets. This is one that was also developed at Facebook for translation. It's a little complicated to explain here, with a so-called gated convolutional net. But basically, the input is a sequence of words and the output is also a sequence of words in a different language [INAUDIBLE] translation. And that goes through convolutions that include something called attention circuits. And there is some sort of module in the middle that tries to match the-- warp the sequence to the-- so that words appear in the right place in the output sequence.
[00:34:39.74] This had the record on some data set for a short time. It's since been overrun. And you can use it for sound generation or for sequence generation here. So this is generating synthetic sounds by specifying what type of sound you want. This is a project that was done at Facebook in Paris.
[00:35:04.20] And interesting projects in unsupervised learning for translation. So this is a project also that was done in Paris, partly in New York, where you feed a so-called unsupervised embedding system. So you can learn vector representations for words in a language by figuring out in which context they appear. This is a very classic technical [INAUDIBLE]. This doesn't use [INAUDIBLE]. This uses something different, but it's very similar. So with this technique, completely unsupervised manner, you give a big corpus of text in one language to a system and it figures out the vector representation for each word in such a way that similar vectors correspond to similar words essentially, depending which context they appear in.
[00:35:42.20] You do this for several languages, and then you ask the question, is there a simple mapping that will take the cloud of points corresponding to all the vectors in one language and transform it into the cloud of points of another language? And if you can find such a mapping, there is some chance that you'll find a mapping between the two languages, and this actually works. And so what this allows you to do is basically build a translation system from one language to another without having any parallel text for those two languages, which is dumbfounding to me, but it works, OK?
[00:36:15.33] It doesn't give you record-breaking results if you had data, but it's amazing. It's very important for Facebook because people use thousands of different languages on Facebook. In fact, we just open sourced something which is actually not this particular project. This project is open source too, but where we provide embeddings for words and sentences in various languages-- in 92 different languages, actually. That's all open source.
[00:36:44.84] Oh, that's nice. OK. All right. OK. Question answering. I'm going to skip this. OK, so a lot of applications of deep learning and convolutional nets. A whole new set of potential applications that are trying to pop up, which are enabled by a new type of neural net, which instead of being applied to basically multidimensional array data-- things like images or things like that-- you can now apply neural nets to graph data. So data that comes to you in the form of a graph with values on it. Function on a graph, if you want. And the graph doesn't need to be static.
[00:37:33.92] I'm going to point you to a review paper that I'm a distant co-author on. Geometric Deep Learning: Going Beyond Euclidean Data. So this is the idea of how you can define things like convolutional nets and things like this on data that is not an array, but basically a function on a graph. And the cool thing about this is that you can apply this to social networks, regulatory networks, networks of financial instruments, let's say, 3D shapes, functional networks in biology, things like that.
[00:38:02.03] There's essentially three types. The classical ConvNets where the input is known. It's a function on a grid, if you want. Like an image, for example, you could think of as a function on a grid. Things where the graph is fixed. For example, the graph of interactions between different areas of the brain. But the function on the graph is not fixed. And so you'd like to apply convolutional nets to domains of this type. How do you define a convolution on such a funny graph?
[00:38:32.57] And then there are applications where the graph changes for every new data point, right? So for example, the data point could be a molecule, and a molecule is best represented as a graph. Can we run a neural net on the graph? And the answer is yes. And I think this whole area opens an entire Pandora's box of new applications of neural nets that are [INAUDIBLE] unforeseen, and so I think it's really cool.
[00:38:57.80] Last year, I co-organized a workshop at IPAM-- the Institute for Pure and Applied Mathematics-- at UCLA on new techniques in deep learning and there were a lot of talks about this. So if you want to learn about this, that's a good way to get started.
[00:39:12.99] OK, now, there's been a lot of excitement about reinforcement learning, particularly deep reinforcement learning, in the last few years. Everybody has heard of AlphaGo and things like that. And reinforcement learning works really well for things like games. So if you want to train a machine to play Doom or play Go or play chess-- StarCraft not so much yet-- reinforcement learning works really well.
[00:39:37.84] So reinforcement learning is the idea that you don't tell the machine the correct answer. You only tell the machine whether it did good or bad, right? So you let the machine produce an answer, in this case, an action or an action sequence. And then you tell it you won or you lost, or you did good, you gain points, or you didn't.
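The "you only tell the machine whether it did good or bad" setting can be sketched with the simplest possible case, a two-action bandit with an epsilon-greedy learner. This is far simpler than the game-playing systems mentioned in the talk. The win probabilities, epsilon and step count are assumptions chosen for the demo.

```python
import random

def run_bandit(win_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn which action wins more often from win/lose feedback alone.
    The learner is never told which action is the correct one."""
    rng = random.Random(seed)
    estimates = [0.0] * len(win_probs)   # running average reward per action
    counts = [0] * len(win_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(win_probs))          # explore
        else:
            action = max(range(len(win_probs)), key=lambda a: estimates[a])  # exploit
        # The environment only reports "you won" (1) or "you lost" (0)
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical environment: action 0 wins 20% of the time, action 1 wins 80%
estimates, counts = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

Even in this toy case the learner needs many trials to separate the two actions, which hints at the sample-efficiency problem discussed next.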
[00:39:55.34] So it works amazingly well except that it requires many, many, many interactions with the environment. So a few people were thinking that-- so it works really well for Go, for example. So this is a Go system that is actually produced at Facebook, which is similar to AlphaGo and AlphaZero, which works at superhuman level and everything. We're working also on a similar project for StarCraft where we train a StarCraft agent to win the battle.
[00:40:27.36] The big problem that I was just mentioning is that reinforcement learning is very inefficient, in terms of samples. So this is a figure from a recent paper from DeepMind where they measure as a function of the number of millions of frames that the system sees for playing an Atari game. So this is a classic Atari game from the 1980s. Using the best algorithms, it takes roughly 7 million frames to reach a performance that humans will reach in a few minutes. And that corresponds to something like 100 hours of play if you translate this into real time. So these systems are much, much slower than humans-- or animals, for that matter-- at learning new skills.
[00:41:15.74] And that's why they are not really practical for real world applications for which there are no gigantic amounts of interactions that are accessible. So if you want to use reinforcement learning to train a car to drive itself, it's basically not going to work in its current form. The machine will have to drive off a cliff several thousand times before it figures out how not to do it, OK?
[00:41:42.65] Now, how is it that humans are able to learn to drive a car in about 20 hours of training without crashing? It's kind of amazing. This would require hundreds of thousands, if not millions, of hours of training to get a car to drive itself. You could do this in simulation, but simulations are not accurate and people are working on how you can transfer from simulation environment to the real world.
[00:42:07.79] Yeah, this is just in passing, a list of major open source projects that Facebook research has put out. So PyTorch is the environment we use for deep learning. FAISS is a very fast similarity search library for nearest neighbor. It's very useful. This is used everywhere within Facebook. This stands for dialogue. There's this reinforcement learning framework for Go. OpenGo is the system I just mentioned. FastText for natural language understanding. FairSeq for sequence processing, things like translation and things like that. There's a whole bunch of projects coming up. You can get them all from this GitHub here, github.com/facebookresearch.
[00:42:48.47] OK, so obviously, you know, we can't get our machines to learn as fast as humans, so we're missing something really essential here to get to real AI. And in my opinion, we're missing three things. One thing we're missing is the ability of learning machines to reason. So right now, all the applications I've shown you are for perception. And for perception, deep learning works amazingly well. It can learn to represent the perceptual world very well.
[00:43:10.48] But learning to reason, that's more difficult. There are a lot of ideas. Some work on it, but I don't think we have the answer to that. The second problem is learning models of the world. So the reason, perhaps, that we are able to drive a car with 20 hours of training without crashing is that we can predict the effect of our actions. We can predict what's going to happen in the world before it happens. The whole front part of our brain basically is a prediction engine.
[00:43:43.09] And our machines don't really have the ability of basically predicting. Not that they don't have it, we can train them to predict in certain ways, but there are technical difficulties which I'll come to in a minute. And the last thing, which I'm not going to talk about, is the ability to learn not just hierarchical representations of the perceptual world, but hierarchical representations of the action world. When we decide to go from here to Atlanta, we have to decompose that task into sub-tasks, all the way down to millisecond-by-millisecond control of our muscles. And so it has this sort of hierarchical representation of action sequences. And we don't really know how to do this automatically with machine learning today. But I'm not going to talk about this.
[00:44:31.68] So it's a big problem because we can have all those cool things that we can build with deep learning, but we can't have those things, which is what we really want. We'd like to have machines with common sense, you know, a dialogue system that we can talk to. And that doesn't have a very narrow set of things it can do for us, right, like playing music and giving us the weather and the traffic. We'd like machines to help us in our daily lives the way a human assistant would.
[00:45:00.60] So we want to build things like intelligent personal assistants. We won't have that until we have machines that have some level of common sense. We'd like to have household robots that are agile and dexterous. We don't have that. We don't have robots that are nearly as agile and have nearly as much common sense as a house cat, with all their superhuman performance in Go and everything. So that's what we need to think of. What's the next step?
[00:45:31.04] OK, so with that reasoning, there is an avenue which is interesting because it might lead to a new way of doing computer science, essentially, which is the idea of differentiable programming. And it's the idea that when you build a deep learning system, you don't actually build a graph of modules anymore in frameworks like PyTorch. You just write a program. The purpose of this program is just to compute the output of your neural net. And every call of functions in this program is like a module that you can differentiate.
[00:46:01.72] And so essentially, it's a new way of writing software where when you write the program, the function of each instruction is not entirely specified until you train the program to actually do the right thing from examples, OK? So it's like a weakly-specified program, essentially. So it's called differentiable programming because it's the idea that you write programs. So essentially a neural net architecture is really a program. It's like an algorithm whose function is not completely finalized until you train it.
[00:46:32.13] And there's lots of really interesting stuff that can be viewed in this context. For example, the idea of memory-augmented neural nets. So the idea that you have a neural net and you attach to it something that works like an associative memory that it then can use as a working memory to do things like reasoning, long chains of reasoning. Or maybe it can use the memory to store factual knowledge, like knowledge bases of relationships between objects, objects and relations, things like that. There's quite a bit of work on this. Again, I don't think we have the complete answer, but it's interesting.
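The talk does not say how such a memory is read, so the following is only a sketch under one common assumption: the network emits a query vector, the memory compares it to stored keys with a dot product, and the result is a softmax-weighted blend of the stored values. The function name `read_memory` and the tiny key and value tables are hypothetical.

```python
import math


def read_memory(query, keys, values):
    """Return a softmax-weighted blend of the stored values."""
    # Similarity of the query to every stored key (dot product).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Blend the values using those weights.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Two stored facts, each a (key, value) pair; the numbers are illustrative.
keys = [[10.0, 0.0], [0.0, 10.0]]
values = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

# A query close to the first key recalls mostly the first value.
recalled = read_memory([1.0, 0.0], keys, values)
```

Because every step is a smooth function of the query, a read like this can sit inside a network and be trained along with it.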
[00:47:17.62] Here's another example. This is an interesting project where you'd like a system to be able to answer questions like that here. So you show it an image of this type and you tell it there is a shiny object that is right of the gray metallic cylinder. Does it have the same size as the large rubber sphere? And for us to answer that question, we have to configure a visual system to basically detect the shiny objects and the gray metallic cylinder. We have the strategy. We detect the gray metallic cylinder, and then we look for objects nearby that are shiny, and then we compare sizes, right?
[00:47:53.94] And so the idea behind this project, which is at Facebook in Menlo Park, is you have a neural net that reads the sentence. And what it does is that it generates another neural net whose only purpose is to answer that particular question from an image. So the modules here are dynamically wired, if you want, depending on the input. So it's one of those examples of dynamic neural net whose structure is data dependent-- that's the essence of differentiable programming. Software 2.0, some people have called it this way.
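To make the "dynamically wired" point concrete, here is a purely symbolic sketch, not the Facebook system: hand-written Python functions stand in for the neural modules, a keyword check stands in for the network that reads the sentence, and the scene is a hypothetical list of labeled objects in which metal counts as shiny.

```python
def find(objects, **attrs):
    """Module: keep the objects whose attributes match."""
    return [o for o in objects if all(o[k] == v for k, v in attrs.items())]


def right_of(objects, anchor):
    """Module: keep the objects located to the right of the anchor."""
    return [o for o in objects if o["x"] > anchor["x"]]


def same_size(a, b):
    """Module: compare the sizes of two objects."""
    return a["size"] == b["size"]


def build_program(question):
    """Stand-in for the sentence reader: return the wiring for this question."""
    if "right of" in question and "same size" in question:
        def program(scene):
            cylinder = find(scene, shape="cylinder", color="gray",
                            material="metal")[0]
            shiny = find(right_of(scene, cylinder), material="metal")[0]
            sphere = find(scene, shape="sphere", material="rubber",
                          size="large")[0]
            return same_size(shiny, sphere)
        return program
    raise ValueError("question type not covered by this sketch")


# Hypothetical scene; x is the horizontal position of each object.
scene = [
    {"shape": "cylinder", "color": "gray", "material": "metal",
     "size": "small", "x": 2},
    {"shape": "cube", "color": "gold", "material": "metal",
     "size": "large", "x": 5},
    {"shape": "sphere", "color": "red", "material": "rubber",
     "size": "large", "x": 0},
]

question = ("There is a shiny object that is right of the gray metallic "
            "cylinder. Does it have the same size as the large rubber sphere?")
answer = build_program(question)(scene)
```

A different question would produce a different composition of the same modules, which is what makes the structure data dependent.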
[00:48:25.04] So that's the-- so PyTorch was really designed from the start with this idea that you could have dynamic neural nets. Not quite the case with TensorFlow, which is the Google framework. But TensorFlow is kind of catching up. They're trying to do the same thing.
[00:48:42.30] OK, so how do humans and animals learn? Babies, in the first few days and weeks of life, months of life, they learn an amazing amount of background knowledge about the world just by observation. Babies are kind of helpless. Their actions are very limited. But they can observe and they learn a lot by observing.
[00:49:00.98] So if you play a trick on a baby before the age of six months or so, you show a baby-- put a toy on a platform and push the toy off, there's a trick that makes it such that the toy doesn't fall. Before six months, the baby doesn't pay attention to this. They are sure that's how the world works. No problem.
[00:49:22.09] After eight or nine months, you show this scenario to a baby and she goes like this. Because in the meantime, she's learned that an object is not supposed to float in the air. It's supposed to fall if it's not supported. So she's learned the concept of gravity in between, intuitive physics and things like that, inertia.
[00:49:45.82] In fact, there was this chart that was put together by Emmanuel Dupoux, who is a cognitive neuroscientist in Paris who spends part of his time at Facebook, which is when babies learn basic concepts of this type. You know, gravity, inertia happens around seven months or so-- seven or eight months. And object permanence is an important one that pops up very early. The difference between animate and inanimate objects also appears quite early.
[00:50:10.48] So we learn those things just by observation. It's not in a task-dependent way. And this is what allows us to predict what's going to happen in the world. We have a very good model of the world that we learn since we're born just by observation. And we're not the only ones. Animals also have good models of the world. Here is a baby orangutan here. He was being played a magic trick. There was an object in this cup. The object was removed, but he didn't see it. Now the cup is empty. And he is rolling on the floor laughing.
[00:50:39.73] His model of the world was violated, and so it causes you to do one of two or three things when your model of the world is violated. You laugh, you get scared, because maybe something dangerous is going to happen that you didn't predict. In any case, you pay attention, OK?
[00:50:57.70] All right, so I think the way to approach that problem is through what I call self-supervised learning. And it's basically the idea that for a system to be able to learn from raw data just by observation, what you're going to do is you're going to feed a piece of data to the system-- let's say a video clip, OK? And you're going to tell the system pretend you know a piece of this input and pretend you don't know this, and try to predict this piece that you pretend you don't know. And then I'm going to show you this piece and you can correct your internal parameters to make the prediction that actually occurred.
[00:51:32.57] OK, so for example, I show you a piece of a video clip and I ask you to predict how the clip is going to continue, the next few frames in the video clip. And then I show you the frames and [INAUDIBLE] predict. But it's not just predicting the future. It could be predicting the past. It could be predicting the top from the bottom, whatever, the piece of the input.
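The recipe above can be sketched in a few lines of Python. This is a toy under stated assumptions: each "frame" is a single number that decays over time, the predictor has one parameter, and names such as `make_clip` and `train_predictor` are hypothetical. No labels are used; the hidden next frame is the only training signal.

```python
import random


def make_clip(rng, length=6, decay=0.8):
    """A toy 'video': each frame is one number that decays over time."""
    frame = rng.uniform(1.0, 2.0)
    clip = []
    for _ in range(length):
        clip.append(frame)
        frame *= decay
    return clip


def train_predictor(clips, lr=0.05, epochs=50):
    a = 0.0  # single parameter: predicted_next = a * current
    for _ in range(epochs):
        for clip in clips:
            for t in range(len(clip) - 1):
                visible, hidden = clip[t], clip[t + 1]  # pretend hidden is unknown
                prediction = a * visible
                error = prediction - hidden             # now reveal the hidden frame
                a -= lr * 2 * error * visible           # gradient step on squared error
    return a


rng = random.Random(0)
clips = [make_clip(rng) for _ in range(20)]
learned_decay = train_predictor(clips)
```

The predictor recovers the decay rate that generated the clips, which here plays the role of a very small piece of "how the world works" learned by observation alone.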
[00:51:53.65] So there's really those three types of learning. The reinforcement learning, where the feedback to the machine is very weak informationally. It's just one scalar value that tells the machine whether it did good or bad once in a while. Supervised learning, you give more information to the machine. You tell it what the correct answer is. But still not very strong because all that data has to be curated by humans and so it's limited in the amount.
[00:52:14.49] And then there is this self-supervised predictive learning idea where the amount of data that machine is asked to predict and the amount of data that's given to train is absolutely enormous. With just an hour of video, it's a ridiculously large amount of data that you're asking the machine to predict. Every future frame from every past frame, for example.
[00:52:36.14] So Geoff Hinton made this argument many years ago that if you have a very large learning system, like, say, a brain, that has 10 to the 14 parameters, free parameters, the synaptic connections, you need a lot of data to constrain the system to learn anything useful. And that's pretty much the only way, predicting everything from everything else, essentially. We're not going to do this with supervised learning or reinforcement learning.
[00:53:00.87] That led me to this certainly obnoxious analogy here that if the stuff we can learn, or intelligence, is a cake, the bulk of the cake is not supervised learning. Almost everything we learn, we learn just in self-supervised fashion. We learn a little bit with supervised learning. We learn a tiny amount through reinforcement learning, so that would be the cherry on the cake. People working on reinforcement learning get a little upset when I show this, but it's become a bit of a meme now in the machine learning community.
[00:53:29.40] OK, this is not-- this doesn't mean reinforcement learning is not interesting. It's necessary. This is a Black Forest cake and Black Forest cake has to have a cherry. It actually has cherries inside even. But it's really not where we learn most of our knowledge. So with things like-- image inpainting, for example, is an example of self-supervised learning, and people are working on this in computer vision. So the next revolution in AI is not going to be supervised, that's for sure.
[00:54:03.98] OK, so let's say we want to build predictive models of the world. So it's a very classical thing in optimal control, and I'm sure some of you may have a background in this kind of stuff. There's a system you want to control. Optimal control people call it the plant. And you have an objective you want to minimize, or maximize, in your case.
[00:54:27.01] And you can run your simulator forward and then figure out an optimal sequence of commands that will optimize your objective, given your predictive model, OK? That's the classical thing in optimal control. In fact, that should be a classical thing in the architecture of an intelligent system. The intelligent system should have a way of kind of predicting what's going to happen before it happens to avoid doing stupid things like running off a cliff, right? And we don't run off cliffs even if we don't know how to drive because-- mostly-- because we have this ability to predict the consequence of our actions. So we need this forward simulator in an intelligent agent, as well as other components I'm not going to talk about.
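A minimal sketch of "run the simulator forward and pick the best sequence of commands" follows. Everything in it is an assumption made for illustration: a one-dimensional plant, a hand-written `model`, a cost with a goal at position 3 and a cliff at position 5, and exhaustive search standing in for the optimizers used in real optimal control.

```python
from itertools import product


def model(position, action):
    """Predictive model of the plant: where a move would take us."""
    return position + action


def cost(position, goal=3, cliff=5):
    """Distance from the goal, with a large penalty for going off the cliff."""
    if position >= cliff:
        return 1000.0
    return abs(goal - position)


def plan(start, horizon=4, actions=(-1, 0, 1, 2)):
    """Simulate every action sequence and keep the cheapest one."""
    best_sequence, best_cost = None, float("inf")
    for sequence in product(actions, repeat=horizon):
        position, total = start, 0.0
        for action in sequence:
            position = model(position, action)  # imagined, not executed
            total += cost(position)
        if total < best_cost:
            best_sequence, best_cost = sequence, total
    return best_sequence, best_cost


best_sequence, best_cost = plan(start=0)
```

The agent never has to step off the cliff to learn that it is costly; the consequence is evaluated inside the model before any action is taken.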
[00:55:11.34] So how do we learn predictive models of the world? We can observe the state of the world, at least partially, through observation, and we can train a function to predict where the next state is going to be. And then we're going to observe where the next state is going to be and we just train our system in a supervised manner to do this.
[00:55:24.50] So this is something that some of my colleagues at Facebook have tried to do a few years ago, where you have those kind of scenarios where you put a stack of cubes and you let the physics operate. And the cubes fall. And the predictions you get-- this is predictions produced by a convolutional net. The predictions you get are blurry because the system cannot exactly predict what's going to happen. There is an uncertainty about what's going to happen to this tower.
[00:55:49.10] And so you get those blurry predictions. If I take a pen and I put it on the table and I let it go, you can predict that it's going to fall, but you probably can't predict in which direction it's going to fall. And so that's a problem because we have to be able to get machines to learn in the presence of large uncertainties.
[00:56:12.22] So there is the pen example. And the only way we can do this is through models that have latent variables. So basically, we observe the past, the [INAUDIBLE] where I put the pen on the table. And we're going to make a prediction.
[00:56:25.03] What we'd like is to make multiple predictions depending on the circumstances. And so what we're going to need is a set of extra variables-- latent variables-- that we cannot observe. And when we vary those variables of this vector, it makes the prediction vary among all the possible predictions that may occur, OK? That's called a latent variable model.
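The pen example can be reduced to a few lines to show why a deterministic predictor goes blurry and a latent variable does not. This is a sketch with assumptions: the pen falls left (-1.0) or right (+1.0) with equal chance, and the latent-variable predictor `predict` is written by hand rather than learned.

```python
import random

rng = random.Random(0)

# Observed outcomes: the pen falls left (-1.0) or right (+1.0).
outcomes = [rng.choice([-1.0, 1.0]) for _ in range(1000)]

# A deterministic predictor trained with squared error can output only one
# number, and the single number that minimizes squared error is the mean.
blurry_prediction = sum(outcomes) / len(outcomes)


def predict(z):
    """Latent-variable predictor: z selects which plausible future to output."""
    return 1.0 if z > 0.5 else -1.0


# Varying the latent variable sweeps over the set of plausible outcomes.
samples = [predict(rng.random()) for _ in range(1000)]
```

The mean is close to zero, a pen that falls nowhere in particular, which is the numerical analogue of a blurry frame. Each sample from the latent-variable predictor is instead a sharp, plausible outcome.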
[00:56:51.88] And a good example of how to do this is adversarial training. So adversarial training says I'm going to sample this latent variable randomly. And now what I need to train this predictor is something that tells me whether my prediction is on this set of plausible futures or whether it's outside, OK? But of course, I don't have any characterization of the set of possible futures, so I'm going to train a second neural net to tell me whether I'm on this manifold or outside, OK? That's called a discriminator in the context of adversarial training.
[00:57:26.53] And you can think of it as a trainable loss function. A trainable objective function, basically. The objective function tells you how far you are from this manifold, and there's a gradient of it that points you towards the manifold. So that's the idea with adversarial training.
[00:57:40.96] Let's say you want to do video prediction. You show the system a piece of video. And of course, in your data set, you know what the video is going to do. That's the real data. But then you run this through your generator, which from a source of random vectors, tries to predict what's going to happen. And initially, it's not trained properly. It's going to make a bad, blurry prediction or something like that.
[00:58:00.79] So now you train your discriminator-- your function that tells you whether you are [INAUDIBLE] or not of data. You train it to produce low values for this and high values for that. So that's kind of a representation of what this discriminator is doing. And what it's going to try to do is, for the green points that come from here that are not on the manifold of data, it's going to try to push up the output here. And for the real ones, it's going to try to push them down. Those are the blue spheres. And so this function is going to take that shape.
[00:58:32.53] And then what you're going to do is use the gradient of that function with respect to its input to train this generator to produce images that this guy can't tell are fake, OK? So now what you have is an objective function in the discriminator that can tell the generator, you are on the manifold or outside. You can use the gradient of that by propagating through the generator to train it to do the right thing. And eventually, it makes-- gives some predictions.
[00:58:55.34] People have used this-- I mean, this kind of technique now have taken over the field, basically. A lot of people are working on this for all kinds of stuff, generating synthetic images. This is work from a few years ago. These are fake faces. This is work from NVIDIA in Finland, and they've trained a system to transform a bunch of random numbers into a face image. They trained it on a database of photos of celebrities, and by the end, you feed a bunch of random numbers and out comes the image of a face. And these are synthetic faces. At high resolution, you can't tell they're fake. But none of those people exist.
[00:59:41.46] At Facebook, we've been working on similar techniques to do things like generating fashion elements. So we-- it's in France, so you know--
[00:59:54.82] So we got a big data set from a very famous designer house, and trained one of the generating networks on this. And these are examples of generations of-- and this is not textures that humans would come up with, essentially.
[01:00:09.92] OK, I'm going to talk a little bit about video prediction. So video prediction is interesting because, in the context of self-driving cars, for example, you'd like a self-driving car to be able to predict what the cars around it are doing, right? I realize I'm out of time, so let me--
[01:00:31.18] MANUELA VELOSO: Yes.
[01:00:35.90] YANN LECUN: And this is a project that we've done with people at NVIDIA who are trying to predict what cars around us are going to do, and then use this predictive model to learn a good driving policy. So basically, we feed the system with a few frames of what the environment of cars looks like around us. And we train the system to predict what the cars around us are going to do. And we use data that comes from an overhead camera for this.
[01:01:02.04] And so these are examples of predictions. So this is if you have a completely deterministic system that doesn't have any latent variable and basically, it makes those blurry predictions. And these are predictions that occur if you have a system with latent variables in it, and I don't have time to go into the details of how it's built.
[01:01:19.34] And then you can build a system to basically train the system to run a driving policy. So you start from a real state. You run your predictive model forward. You can compute the costs, which is how close you are or far you are from other cars, whether you are in the lane or not. And by back-propagation, you can learn a policy network that learns to produce an action that will minimize the probability of collisions over the thing. And if you do this, it doesn't work. But if you add a term in the cost that indicates how certain the system is of its prediction, then it works.
[01:01:49.87] And so I'm just going to end with the cube video here. So these are-- the blue car is driving itself, basically, and the white point indicates whether the car is accelerating, decelerating, turning. The other cars are real cars around it that are just recorded. And so our own car is invisible to them. So it's like we're driving on a highway, but nobody sees us, right? So we can get squeezed in between two cars, basically, and there's nothing we can do. But this thing learns how to merge on a highway and things like that.
[01:02:23.71] OK, so I'm going to end here. Just to remind you, there are interesting areas of research in deep learning in things like graph-structured data, reasoning, self-supervised learning, learning hierarchical representations of control space. We need more theory. And maybe there is a new type of computer science emerging through differentiable programming. Thank you very much.
[01:03:01.15] MANUELA VELOSO: We have time for a couple of questions. Yes.
[01:03:05.46] AUDIENCE: Thank you. Really cool talk.
[01:03:08.11] YANN LECUN: Thank you.
[01:03:09.06] AUDIENCE: When you spoke of deep learning, you mentioned how it's important to alternate the linear and the nonlinear layers, and [INAUDIBLE] stack up the nonlinear layers in the middle, for example.
[01:03:25.42] YANN LECUN: Well, there's no point stacking two linear layers because the product of two matrices is a matrix, right? So it's equivalent to a single one. And the role of the linear layer is to mix up-- to detect configurations of their inputs that matter, basically. The nonlinear functions are point-wise, and so they don't combine things, right? They're just point-wise functions usually. I mean, you can imagine other types of nonlinear functions, but more often, they are-- so there's no point stacking them either because they don't actually combine stuff.
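The matrix argument in this answer can be checked directly. The small integer matrices below are arbitrary illustrative values, and `matmul`, `matvec`, and `relu` are helper names introduced only for this sketch.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(w, x):
    """Apply a linear layer (matrix w) to a vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu(v):
    """Point-wise nonlinearity: each output depends on one input only."""
    return [max(0, t) for t in v]


W1 = [[1, -2], [3, 1]]
W2 = [[2, 1], [-1, 1]]
x = [1, 2]

# Two stacked linear layers...
two_linear = matvec(W2, matvec(W1, x))
# ...equal one linear layer whose matrix is the product W2 W1.
one_linear = matvec(matmul(W2, W1), x)
# With a nonlinearity in between, the layers no longer collapse.
with_relu = matvec(W2, relu(matvec(W1, x)))
```

Here `two_linear` and `one_linear` are both `[-1, 8]`, while `with_relu` is `[5, 5]`, so only the alternating arrangement computes something a single matrix cannot.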
[01:03:59.22] MANUELA VELOSO: Very good.
[01:03:59.50] AUDIENCE: Thank you.
[01:04:00.70] MANUELA VELOSO: Yes.
[01:04:02.16] AUDIENCE: I had a question. I had a chance to listen to your talk last year and [INAUDIBLE]. Hi, thank you for the talk. I had the chance to listen to your talk last year as well at a different venue. And in the promising areas for research and your cake slide, you had unsupervised learning instead of self-supervised learning.
[01:04:17.07] YANN LECUN: Right.
[01:04:17.47] AUDIENCE: I was wondering if you see self-- because I'm seeing the term for the first time now, when [INAUDIBLE] obviously look up after this lecture. I was wondering if you're defining this as something under unsupervised learning or something separate, or semi-supervised? Just want to hear your thoughts on that.
[01:04:31.38] YANN LECUN: So unsupervised learning is a loaded term. Whenever you say this, people think you're doing exploratory data analysis and clustering and PCA and things like this. And so I kind of moved away from that. Self-supervised learning, you could think of it as kind of a special case, where it's this idea that you have a piece of data and you're trying to predict part of the data that you pretend you don't know from a part that you know, OK? And it's a particular way of doing unsupervised learning, if you think-- if you want to think of it this way. But I think it's the most promising one.
[01:04:54.15] MANUELA VELOSO: And thank you very much for a wonderful talk.
AI in 2020: How we see it at CMU
November 26, 2018 - New York City
Andrew W. Moore
Andrew W. Moore is the Dean of the School of Computer Science, Carnegie Mellon University. Andrew is a distinguished computer scientist with expertise in machine learning and robotics. His research interests broadly encompass the field of “big data”—applying statistical methods and mathematical formulas to massive quantities of information, ranging from Web searches to astronomy and medical records in order to identify patterns and extract information. His past research has also included improving the ability of robots and other automated systems to sense the world around them and respond appropriately.
Previously in 2006, Andrew was a founding director and VP of engineering at Google. In January 2019, Andrew returned to Google to lead its Cloud AI efforts.
[00:00:00.00] MANUELA VELOSO: Welcome to the distinguished lecture series on AI. And it's a great pleasure to introduce Andrew Moore. And Andrew may not need any introduction, but I really want to tell you specifically to point of Andrew. Andrew was born in Bournemouth.
[00:00:24.07] Do you guys know where Bournemouth is? It is a little town-- or a big town-- by the south east by the ocean in the UK, and actually also from our point of view from JP Morgan, it is a place where there is a big group of technologists-- in particular, also [INAUDIBLE] scientists and machine learning. So I'm going to be visiting in December. I couldn't believe that Andrew was from that place that I did not know existed before I actually learned about these data scientists in JP Morgan.
[00:01:01.62] But Andrew actually has a distinguished career in AI and machine learning. He did his PhD-- well, his undergrad in math and computer science at Cambridge. And then he worked at HP for some time. And then he actually has a PhD from Cambridge. And he then moved to the United States. And he was for three years a postdoc at MIT working on robotics and machine learning well before any reinforcement learning was fashionable or even like, known.
[00:01:37.26] Andrew pioneered this field of reinforcement learning. And in fact, a survey on reinforcement learning co-authored by Andrew is the most cited reference in reinforcement learning ever. It's a perfect reading to learn about reinforcement learning. And if you did not read it yet, I strongly encourage you to take that as your bed night reading tonight. That's homework number one. And that's called, really a survey in reinforcement learning, I believe. And I have it actually in my office in case you want a hard copy.
[00:02:16.74] And then Andrew, actually, after these three years as postdoc at MIT joined the faculty at Carnegie Mellon University in 1993. And that's when I was fortunate enough to meet Andrew. And in fact, Andrew and I were part of the AI faculty at Carnegie Mellon since always. And we co-taught courses on AI really since '93-- many years.
[00:02:47.01] And then Andrew in 2006 took a leave to go to Google and create the Google office in Pittsburgh where if I am correct-- I forget if this is correct or not-- Andrew grew it to a gigantic kind of like, machinery of engineers in data science from which maybe he always had 99% recruiting ability at Google. And that was tremendous.
[00:03:17.82] In 2014, we were lucky enough to have Andrew join Carnegie Mellon as the Dean of the School of Computer Science. And again, that gave me a chance to interact with Andrew as a faculty and also as head of the machine learning department, I was fortunate to have my dean be Andrew Moore. And it is a great pleasure.
[00:03:39.93] And I think you are going to learn a lot about AI and how CMU sees AI, actually through Andrew's eyes. It's not very common to have a dean of such a big school like School of Computer Science at Carnegie Mellon who cares about research and the teaching and education and connections with the outside in the area of artificial intelligence. So with great pleasure, let's hear Andrew Moore tell us about how we see it-- this AI in 2020-- at Carnegie Mellon University. Thank you, Andrew.
[00:04:14.54] ANDREW MOORE: Thank you. I'm really excited to be here today. Thank you so much for the invitation to Manuela and the whole team. First, let me encourage questions or comments throughout the talk. I love it when folks want to jump in.
[00:04:34.69] I'm also going to learn from this talk as well, which is Manuela and I have worked in many industries, many verticals together. But I, at least, am not expert at all about the important aspects of AI in the whole world of financial technology. And so I would love folks to jump in and tell me, you know what? That might all be very well for if you're using AI for patrolling jungles but it is useless for the financial world. And so that's a little aggressive. But if you want to jump in with any comments or questions like that, that would be totally welcome.
[00:05:15.46] All right. So what I want to remind folks of is the distinction between artificial intelligence and machine learning. And right now, I and-- I would say many of the faculty and students at Carnegie Mellon-- are passionate that it's an important distinction. It's actually one that Manuela brought to Carnegie Mellon's department of machine learning. And it is very much about the future.
[00:05:48.99] So to tell you about the distinction between machine learning and artificial intelligence, I'm going to give you a quick recap of what happened at the start. So in the 1960s when AI was getting started, the heroes of early AI such as Allen Newell, Herb Simon, John McCarthy were really looking at the question of what is human level intelligence? What does it comprise of?
[00:06:18.13] And they broke it down into three pieces-- perception, deciding what you want to do, and then acting. And there was this big discussion about the loop that an autonomous or intelligent system goes through-- we go through in our lives-- where we're constantly seeing stuff, making a decision what we want to do, acting to change the world, observe the results of that. And we keep going around this control loop.
[00:06:47.43] And this was the version of artificial intelligence that I and, I think Manuela, were both sort of born into in the late 80s, early 90s when we were so excited about a future where we can have autonomous decision makers. Probably the peak of these early days of AI was in '97 when computers had superhuman performance at the game of chess. What was interesting about that is it was kind of the peak. And after that, there was a little bit of a sort of plateauing or decline in the use of artificial intelligence.
[00:07:35.97] And the reason for that decline is this thing about decide is really difficult, except in a few special cases such as the game of chess. The reason is, as you all know-- and I know many of you here are very advanced technologists so this is not news to you-- all that happens in an implementation of decide is you iterate among possible actions-- or if you're a slightly more advanced and you're doing planning, possible sequences of actions-- and for every single one of them, you look at the outcome. And then you do an arg max. And whichever is the best one, you choose. And you keep going around like that.
[00:08:19.30] But in order to figure out what the best one is, for each of your possible actions you've got to know what you expect or what the probability distribution is over what will happen next. It's easy in chess because you can write down the rules of chess in a very short program. In general though, that was the really difficult part for artificial intelligence.
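The decide step described here, iterate over actions, look at the predicted outcomes, take the arg max, can be sketched as follows. The action names, probabilities, and payoffs are made-up illustrative numbers, and `expected_value` and `decide` are hypothetical helper names.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs for one action."""
    return sum(p * payoff for p, payoff in outcomes)


def decide(model):
    """Iterate over the possible actions and take the arg max."""
    return max(model, key=lambda action: expected_value(model[action]))


# For each action, a probability distribution over what happens next.
model = {
    "advance": [(0.5, 10.0), (0.5, -8.0)],
    "hold": [(1.0, 2.0)],
    "retreat": [(0.9, 0.0), (0.1, 1.0)],
}

choice = decide(model)
```

The loop itself is trivial; the hard part, as the talk goes on to say, is obtaining the `model` table. In chess it follows from the rules, while in most other domains it has to come from somewhere else.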
[00:08:43.50] You heard people talk about rule based systems where folks were coming up with sets of methodologies for allowing robots and computers and other scenarios other than games to make decisions. And you could do it for circuit board layout. To some extent, you could do it for scheduling. Not in a very meaningful way were you able to do it for trading.
[00:09:07.62] And in general, the problem was we couldn't find a way to have software engineers write down the predictive rules. And there's no point in arg maxing over a thing which is not going to be accurate. So that led us to the sudden growth of an area of artificial intelligence, which had been a little bit quiet in the 1990s. Which is, can we get rid of the idea of humans writing the predictions of what happens next and instead have computers observing historical data predict what happens next?
[00:09:53.80] There were a few huge successes there. One of them was at Google where it turned out that all the companies which had tried to do search engine results by coming up with a set of rules for how to turn a query from a user into a useful thing to show were being unsuccessful because it was so hard. But if you've got enough clicks and enough data, you could build a predictive model of what the likelihood is that someone wants to find something useful in an answer to a question. And that's what led to the massive growth of Google.
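As a rough sketch of replacing hand-written rules with a model learned from clicks, and not a description of how Google's ranking actually works, the code below estimates a smoothed click probability for each (query, result) pair from a log and ranks candidates by it. The log entries, the smoothing prior, and the function names are all assumptions.

```python
from collections import defaultdict


def fit_click_model(log, prior_clicks=1.0, prior_views=2.0):
    """Estimate P(click | query, result) from (query, result, clicked) records."""
    views = defaultdict(float)
    clicks = defaultdict(float)
    for query, result, clicked in log:
        views[(query, result)] += 1
        clicks[(query, result)] += clicked

    def probability(query, result):
        key = (query, result)
        # The prior keeps rarely shown results from getting extreme estimates.
        return (clicks[key] + prior_clicks) / (views[key] + prior_views)

    return probability


def rank(query, candidates, probability):
    """Order candidate results by predicted click probability, best first."""
    return sorted(candidates, key=lambda r: probability(query, r), reverse=True)


# Hypothetical click log: (query, result shown, 1 if clicked else 0).
log = [
    ("jaguar", "car-site", 1),
    ("jaguar", "car-site", 0),
    ("jaguar", "zoo-site", 1),
    ("jaguar", "zoo-site", 1),
    ("jaguar", "zoo-site", 1),
]

probability = fit_click_model(log)
ranking = rank("jaguar", ["car-site", "zoo-site"], probability)
```

No rule about jaguars was written by an engineer; the ordering comes entirely from the historical data.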
[00:10:29.86] Subsequently, similar things were used for all kinds of recommendation systems and then starts to jump into other verticals such as weather prediction, some aspects of physics, many medical decision systems. And so during the mid 2000s to about now, there's been an explosion of people successfully using machine learning to predict what happens next. And we're all excited about that. That means that this big bug which stopped us in 1997 seems to have been overcome to some extent.
[00:11:11.89] What we at Carnegie Mellon-- and in fact, it's through a lineage of geniuses of AI starting with Herb Simon and Allen Newell, followed up by subsequent folks such as Jaime Carbonell who was one of the forefathers of natural language processing and Tom Mitchell who is one of the people who introduced the idea of computational machine learning, and then frankly going up to folks like Manuela Veloso-- Carnegie Mellon has always held on to the idea that we have got to be good at perception, decision, and action to build autonomous systems. And now that machine learning is able to help us with decision, we really want to go and work once again on the full loop.
[00:12:06.57] So this picture-- which you're going to see a lot of in this talk-- is a way to think about the field of artificial intelligence. And it also helps me personally structure my thinking when it comes to what is the right curriculum for creating a person who is an expert in AI, not just a person who can use AI but people who can invent new parts of AI.
[00:12:35.42] It's also, over and over again in my career, it's been the diagram that helps me set up a large AI project. Because this is a decomposition of the engineering that you need to do for an AI system, which kind of helps you actually see I'm going to need a lead for machine learning. I'll need a lead for perception. Probably, I'll need a lead for decision making. And then at the very top, I'm going to need people who can work out exactly what the consequences are of acting, either discussing stuff with humans or acting autonomously.
[00:13:18.69] So I want to now walk over this diagram, giving a bit of flavor of the key technologies involved. And we'll begin with sensors. The interesting thing, those of you who own a Tesla-- actually, as everyone here lives in New York, you don't own cars-- but for those remote folks, especially our Bournemouthians if there's any Bournemouth people in the audience who are able to drive, the interesting thing right now is that because computer vision, which has itself really grown through the use of machine learning, is now a commodity, we're doing things like detecting and counting vehicles. It's basically something that you can download open source or buy from a software vendor.
[00:14:14.38] We can do things like this. This is a very nice example of a system in about a mile radius around Carnegie Mellon where all the traffic lights can see how many cars are waiting at which intersection in which directions. And the traffic lights can be a little smarter about how they time themselves on the basis of this. And they can talk to their neighbors so that between them, they can actually start to strike deals. Like OK, I'm going to give you these six cars as long as you can promise that by the time they get to you, you'll be ready for them to take them further. And so that's the nice sort of thing we can do now that perception is a commodity.
[00:14:58.01] Another example of where perception is really changing things is it's getting so much better every year at the moment in almost a frightening way. So Marios Savvides in the ECE department at Carnegie Mellon is an example of someone who has solved a problem. It's actually an interesting problem of the control of the pan and tilt of a camera so that if you walk into a room-- if you walked into the back of this room-- and Marios's camera was here, the camera in a fifth of a second would pan and be able to zoom in on your eye to get this quality of image and then do identification and in fact, actually some other diagnostics based on eyes to do with emotional state and other information within in total a quarter of a second.
[00:15:56.97] So that's this year. Next year, we may perhaps with some additional technology I'm discussing later on be able to look at all the eyes of someone in an audience or track all the faces of someone in the audience. And so don't for a moment think that we're kind of plateaued in perception. We can go a long way forward.
[00:16:22.74] And as many of you are probably realizing, this gives us the opportunity to create an awful Orwellian world. It can be really useful. Certainly the idea of supermarkets where you need to walk into the supermarket and take out a piece of plastic to show your credentials for purchasing something is immediately old fashioned. And as we've seen with Amazon Go stores, it is now perfectly reasonable to create a retail experience where you just walk in, pick stuff up, walk out, and the payment without you having to produce anything can happen behind the scenes-- lots of conveniences, lots of consequences and potential dangerous things.
[00:17:12.64] I want to mention another very important thing that's going on. And I think this is extremely relevant for folks in your organization who are looking at edge computing of various kinds or computing that's forced to be extremely close to the sources of data. And that is how do we get to do extremely low latency massive processing of information?
[00:17:45.05] One of my dreams, which we may have time to talk about later, is a world where after a disaster you have-- such as a flood or a terrorist event or a fire-- you have autonomous fixed wing aircraft in the air at about 3,000 feet able to watch an entire city to immediately see who's in danger, who's doing what, where people might be trapped, where's a dangerous location.
[00:18:14.41] If you want to do that, you would need to put tens of thousands of cameras on a single plane. At the moment you can't do that. First, even if the cameras are just an ounce each, the plane will plummet from the weight. That's solvable. Second, you could not afford to have a network link getting real time data from 10,000 cameras to ground anyway. So you end up wanting to put the compute on an aircraft. And it needs to be very, very low power.
[00:18:52.03] So a big push in the world of artificial intelligence right now is something that I don't think you would ever normally have seen discussed in an AI lecture or an AI textbook. It is how we can make these neural networks and arg maxes-- the big search algorithms-- run at about 1,000th of the power that they run at right now, on a very tiny piece of equipment.
[00:19:19.49] What's interesting to me and what might be a topic of follow up discussion is those of us who are working with aerospace, massive scale surveillance and satellites, we're doing this work for that use case. What's interesting to me and what I'm sure is interesting here is what would we do in the financial industry if we've got down to 1,000th of the power requirements for doing things like neural network inference? And would that actually have a big impact on the business, especially as that technology we're going to develop anyway because we need it so much in aerospace?
[00:19:58.78] I'm just going to--
[00:20:00.00] [MUSIC - BRUNO MARS, "UPTOWN FUNK"]
[00:20:02.18] All right. I apologize for the music, but I want to give an example of the use case. This is a couple of grad students in the Robotics Institute who were able to come up with a deep learning system for detecting all the elbows in an image, all the knees in an image, all the pelvises in an image, and be able to run that in real time on GPUs inside regular cameras on regular cell phones.
[00:20:32.90] And what they're able to do now is, those of you who have used Microsoft Kinect, which can track one human body, they can now track whole crowds of people. It's an example of the kind of direction that we're thinking of moving. Unfortunately at the moment, this technology is not quite real time for crowds of thousands of people. And so that's the next thing.
[00:20:57.98] Those students and their advisor, they're now looking primarily at these hardware questions so that they can run this on massive groups of people, again, to help out with situations of public emergency or to do various studies. What do we know at the moment about whether, as an infectious disease is spreading in a city, we can see something in the actual gaits and walking movements of the population coming into work on any given day, to sort of notice small changes in behavior?
[00:21:35.60] The other place the power is really important-- this one was power and algorithms-- is in the three dimensional understanding of the world. When you are building rescue robots or things like this, you need to be able to go into environments which have never been mapped before and have the robot figure out how to map them. These days, it seems like such a 2015 thing to imagine taking the data off your robot, going to a data center to compute that stuff, and then send it back to the robot. Now with both hardware technology and algorithms, we can do this sort of thing in real time.
[00:22:21.62] I'm going to keep moving now as we cover other areas of machine learning. Because I think I've given a sense of this explosion in uses of machine learning. I want to mention one thing which I need everyone to be aware of because it is going to be a big change in the early 2020s for all of us. This is the first piece of work I know of for doing neural network inference at about a ten thousandth of the power that it uses now.
[00:22:59.64] This will be critical for self-driving cars. Many of you probably know this but at the moment, the computing that you need for a self-driving car raises the temperature of the car by about 10 degrees because it's so compute intensive. So you actually have to put in extra air conditioning in the vehicle to make it reasonable for humans. That's one of the many reasons why we want this.
[00:23:26.73] These guys have now built something which is a camera you can wear on your clothes. It is powered entirely by your body movement. And it can do face recognition, currently very slowly. It takes about 10 minutes to do recognition from a single frame. But that power requirement is amazing. As that starts to get commoditized, we can have cameras doing compute which are just sold in Walmart on sticky label strips, to place all around your house if you want to for some reason.
[00:24:03.94] This, again, might be relevant to you guys. For the purposes of preventing anyone from disrupting the entire global communications system, it is now possible to imagine putting up billions of very small computers into orbit instead of having all your compute sitting in a few cans floating around the world that are extremely vulnerable.
[00:24:33.95] And here is another thing which has been really interesting for us is voice.
[00:24:52.69] From that piece of sound, one of the faculty at Carnegie Mellon, Rita Singh, was able to deduce this much about the person who was uttering the sound. And why was she able to do this? It turns out that a lot of the frequencies in your voice are very much characterized by the size and shape and flexibility of your trachea. From your trachea, you can actually deduce a whole bunch of things about you including height, age, and weight as well as ethnicity.
[00:25:32.35] And one of the things I really liked was she was able to figure out that he's sitting on a metal chair on a concrete floor. Again, it's not actually science fiction. It turns out that mayday is about 100 megabytes of data. And the frequency analysis and the careful looking at various forms of echoes means that this is quite solvable. In this particular case, it was used to successfully apprehend a serial false alarm caller who had been sending out lifeboats in serious storms for unnecessary reasons.
[00:26:18.23] Now here's where I want to be really clear about something. We at CMU are very excited about machine learning. We formed the world's first machine learning department. But-- I don't know if I'm speaking for Manuela as well as myself-- it is not enough to just do machine learning. The real art comes in the decision making systems you put on top of machine learning. Things like exactly how you do this arg max over possible actions.
[00:26:56.01] And it's a message that I cannot be too clear about. Do not get obsessed with neural networks or deep networks as the sole component of an AI solution. The number of things where you actually just need to do a one pass prediction of something is minor compared with the number of use cases where this is wrapped up in a bigger control system, where a decision you make on this time step is going to have repercussions for the next few seconds-- in some cases the next few minutes and in some cases the next few years. So you still have to be doing pretty sensitive optimization over what you're predicting.
[00:27:40.98] Here's one part of this: safety of machine learning based systems is still a black art. And it has serious consequences. This is an image of a system developed by the Army called TARDEC. In 2010, it first came online. It's something where a series of trucks going in a dangerous area-- Afghanistan, Iraq, and so forth-- can have one human driver in the front. And then the rest of the convoy just follows each other.
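A minimal sketch of the "arg max over possible actions" idea, written in Python under loud assumptions: the predictor, the action names, the utilities, and the action costs below are all hypothetical placeholders, not anything shown in the talk. It only covers a single time step; the multi-step repercussions described above would need a planner or controller on top of this.

```python
# Hypothetical toy example: every number and name here is made up.

def predict_outcome_probs(action):
    """Stand-in for a learned model that returns P(outcome | action)."""
    table = {
        "brake":      {"safe": 0.99, "collision": 0.01},
        "swerve":     {"safe": 0.90, "collision": 0.10},
        "keep_speed": {"safe": 0.70, "collision": 0.30},
    }
    return table[action]

# Assumed utilities of outcomes and costs of taking each action.
UTILITY = {"safe": 1.0, "collision": -100.0}
ACTION_COST = {"brake": 0.5, "swerve": 0.2, "keep_speed": 0.0}

def expected_utility(action):
    """Combine the model's predictions with utilities and action costs."""
    probs = predict_outcome_probs(action)
    return sum(p * UTILITY[o] for o, p in probs.items()) - ACTION_COST[action]

def choose_action(actions):
    """The decision layer: arg max of expected utility over actions."""
    return max(actions, key=expected_utility)

best = choose_action(["brake", "swerve", "keep_speed"])
```

The prediction alone does not say what to do; the choice only appears once utilities and costs are layered on top of it.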
[00:28:18.39] It is still not being deployed. Why is it not being deployed? Because absolutely correctly, this is a safety critical system. You cannot risk it causing an inadvertent death of a civilian or a member of the armed forces. So you have to be able to prove its safety. Eight years later, we still haven't managed to prove its safety.
[00:28:44.97] Because it is using cameras and LIDAR and other sensors, that's going into machine learning systems which create these models of what the world is doing, which goes into control systems. And unlike advertising, which I was doing at Google, or many other aspects of the economy, you cannot make mistakes. So a big growth area for us is systems which can make proofs about the behavior of an overall machine learning system.
[00:29:17.95] There are two warring factions in the technology world as to how to do this. And like a good dean, I am encouraging both sides of the war. These folks-- Mike Wagner and Philip Koopman-- are using statistical machine learning methods. They are actually adversarial AIs who watch an autonomous system practicing. And the AI figures out from all the data what is the most difficult thing they could ask the system to do. So it's constantly searching for ways to break the system. It's a technology used in security at the moment, but is now far more automated. And with their help, we've now moved TARDEC several years forward in being able to rapidly discover weaknesses in the overall machine learning system.
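To give a flavor of "constantly searching for ways to break the system", here is a deliberately crude Python sketch. It uses plain random search over scenario parameters, which is far simpler than the learned adversaries described above and is not a representation of Wagner and Koopman's method. The system under test, its parameters, and the 50 metre budget are all hypothetical.

```python
import random

def system_under_test(speed, friction):
    """Toy stand-in for an autonomous system.

    Returns the stopping margin in metres; a negative margin is a failure.
    """
    stopping_distance = speed ** 2 / (2 * 9.81 * friction)
    return 50.0 - stopping_distance

def find_hardest_case(trials=2000, seed=0):
    """Sample scenarios and keep the one with the smallest safety margin."""
    rng = random.Random(seed)
    worst = None
    for _ in range(trials):
        case = (rng.uniform(0, 30), rng.uniform(0.2, 1.0))  # (speed, friction)
        margin = system_under_test(*case)
        if worst is None or margin < worst[0]:
            worst = (margin, case)
    return worst

worst_margin, (worst_speed, worst_friction) = find_hardest_case()
```

A real adversarial tester would steer its search using what it has learned from earlier failures rather than sampling blindly.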
[00:30:19.75] The other side of this battle-- I'm going to stand on this side-- are the theorem provers. This, I love because I always used to love mathematical logic. These are the folks who say forget the statistical testing or having an automated agent trying to break it. You need a mathematical proof of correctness of an automated system. The surprising thing is that in some very real situations, we can now get proofs of correctness.
[00:30:53.76] A good example has been the new FAA anti-collision system for aircraft over the United States. That is a new autonomous system powered by-- it doesn't learn online, but it has a model which it learned from data. And then actually, very intelligent aerospace control surface stuff to take over if two aircraft are actually about to collide in some way.
[00:31:27.32] When André Platzer applied his theorem proving to prove the safety of this system, he discovered hundreds of thousands of edge cases where the system would actually not follow its specifications and would unnecessarily fail to manage a potential collision. And so in his case, the use of theorem proving has been able to let us iterate to the point where we are not seeing the edge cases where there's a disaster in the system. And so this was a perfect example of theorem proving.
[00:31:59.07] The problem with theorem proving is that you end up being a little conservative. You have to-- for instance, if you are putting bounds on the shape of a jet engine, you cannot computationally afford to do anything other than call it a big sphere. And so you're conservative. But obviously in safety engineering, that's the way you want to go.
[00:32:24.70] Another important area which we're betting on is game theoretic reasoning. Here, over and over again-- and I personally had a big disaster in one of my previous jobs based on not accounting for game theory-- when you are using a machine learning system to predict what's going to happen next as the effective action and then you use that to make your choices subsequently, if someone else knows you're using that methodology there are a whole bunch of vulnerabilities for how they can exploit this.
[00:33:01.30] I had an anti-spam system. This was something for detecting if an ad being served on Google's search engine was likely to be a scam of the form of a what we called an unbelievable claim. So unbelievable claim ads, which you've probably all met, usually have a caption-- something like, what the government doesn't want you to know or doctors hate this trick. And not surprisingly, we built good detectors which tried to figure out if either the landing page or the copy on the ad were in this form.
[00:33:50.76] And it worked really well. We had a several factors reduction in the amount of spam ads that got through as a result. And it got better and better. And then it sort of remained at a sort of stable state. And so we went on to other things to improve. Then about nine months later, it suddenly started to fail miserably. And a whole slew of different spam ads-- not just one, but many, many different spam ads-- were getting through the system.
[00:34:23.21] And as we did a post-mortem, we figured out that someone on the other side had figured out the form of statistical model we were using for doing the spam prediction. And they had successfully sent us a whole bunch of training examples, which meant that our model became uninterested in a certain critical piece of information.
[00:34:49.95] And so they were adversarially gaming the fact that we were using machine learning to make our model continue to perform really well, but actually become unaware that it was creating a vulnerability. And so they were able to attack. It was annoyingly on Thanksgiving a few years ago, so lots of us had to scramble on our Thanksgiving because we had not put a game theoretic assumption into the machine learning system.
[00:35:20.94] So this is another area where you will see places like CMU aggressively hiring at the moment because so many parts of the economy cannot get away with this thing called a certainty equivalence model where you continually learn from data assuming that the data is independently distributed. It's not just that we're looking for outliers. We're looking for adversaries.
[00:35:48.63] Another really nice example of this kind of AI is one of our faculty members who in the spring has come up with a system for an AI to do redistricting, so the kind of problems where we're all worried about gerrymandering. And the interesting thing about this AI is the problem that this faculty member gave himself was how can we have a third party, if you like, do the redistricting which both the Democrats and Republicans will trust?
[00:36:29.23] And thinking about it, he came up with a game where both players can play the game actually as selfishly as they like to come up with the set of districts being drawn in a region and you have a guarantee that the results of the game will be something which an optimal algorithm would have also decided to use based on no concerns about fairness or bias. It's an amazing discovery. It follows a protocol which a few of you may have heard of called the pie cutting problem-- question of how a group of seven people can cut up a pie into seven slices in a way that none of them feels that they've been hard done by.
[00:37:23.72] If you want to look at something even more interesting than the reinforcement learning survey that Manuela mentioned, look up the pie cutting problem on Wikipedia. It's really, really cute, the algorithm that it comes up with. The fun thing is now it's being used really usefully in important areas.
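For a feel of how such protocols work, here is a small Python sketch of the simplest fair-division scheme, two-player "I cut, you choose". This is not the redistricting protocol described above, only the textbook two-person case; the discretized pie and the preference numbers are hypothetical.

```python
def value(prefs, start, end):
    """Value a player assigns to the slices start..end-1."""
    return sum(prefs[start:end])

def cut_point(cutter_prefs):
    """The cutter splits the pie as evenly as possible in their own eyes."""
    n = len(cutter_prefs)
    half = sum(cutter_prefs) / 2
    return min(range(n + 1), key=lambda k: abs(value(cutter_prefs, 0, k) - half))

def cut_and_choose(cutter_prefs, chooser_prefs):
    """Return the (start, end) piece each player receives."""
    k = cut_point(cutter_prefs)
    left, right = (0, k), (k, len(cutter_prefs))
    # The chooser selfishly takes whichever piece they value more.
    if value(chooser_prefs, *left) >= value(chooser_prefs, *right):
        return {"chooser": left, "cutter": right}
    return {"chooser": right, "cutter": left}

# Hypothetical preferences over six thin slices of the pie.
cutter = [1, 1, 1, 1, 1, 1]    # indifferent across the pie
chooser = [0, 0, 1, 1, 2, 2]   # prefers the right-hand side
shares = cut_and_choose(cutter, chooser)
```

Both players act selfishly, yet the chooser always gets at least half by their own valuation, and the cutter gets half by theirs whenever an even cut exists on the discretized pie.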
[00:37:43.49] Oh yes, and another really good example of this-- I mentioned earlier-- patrolling jungles. One of our faculty members is very interested in reducing poaching in game parks, primarily African game parks. She used machine learning-- very similar story to the one I told you about my own use. She used machine learning to predict which areas of the jungle were most likely to experience poaching on any given day according to the migration patterns, the weather, and other information so that rather than deploy the understaffed forest rangers group over the whole jungle, they could really be focused on the places with high likelihood of poaching.
[00:38:28.34] This was successful for a short while. For a while, they were able to really cause a dip in the amount of poaching that went on. But before long, things went back up. And you won't be surprised to know what was happening was that the poachers were basically choosing the second most likely spot for poaching instead of the first most likely spot and in general, were just trying to think one step ahead of the algorithm.
[00:38:54.77] With game theoretic reasoning, Fei Fang instead solved this problem using the same kind of technologies that have been used in game theory throughout the 20th century, including the kind of game theory used in building up Cold War protocols, to form a randomized strategy where patrols now are actually non-deterministic. But they're much more efficient than if they randomly patrolled the whole forest, which almost never hits anyway. They can be maximally inconvenient to the poachers. And they have found what is essentially a Nash equilibrium for what to do.
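The idea of a randomized patrol can be illustrated with a toy zero-sum model in Python. Everything here is an assumption made for illustration, not the deployed system: each site has a made-up value to the poacher, a patrolled site yields nothing, and the ranger has one patrol per day. The defender then picks coverage probabilities that minimize the poacher's best expected payoff, found here by bisection.

```python
def patrol_coverage(site_values, resources=1.0, iters=100):
    """Minimax coverage probabilities for a toy zero-sum patrol game.

    The poacher attacks the site maximizing (1 - coverage) * value. We
    search for the lowest payoff level t the defender can hold them to
    with total coverage equal to `resources`. Assumes positive values
    and resources smaller than the number of sites.
    """
    lo, hi = 0.0, max(site_values)
    for _ in range(iters):
        t = (lo + hi) / 2
        needed = sum(max(0.0, 1 - t / v) for v in site_values)
        if needed > resources:
            lo = t   # cannot afford to push the poacher's payoff this low
        else:
            hi = t
    coverage = [max(0.0, 1 - hi / v) for v in site_values]
    return coverage, hi

# Hypothetical values of three sites to a poacher.
coverage, poacher_payoff = patrol_coverage([10.0, 5.0, 2.0])
```

With these numbers the ranger patrols the first site about two thirds of the time and the second about one third, and ignores the third; the poacher is then indifferent between the first two sites and does worse at the third, so switching to the "second most likely spot" no longer helps.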
[00:39:37.31] Again, emphasizing the big theme of you've got to use classic-- or maybe to be discovered-- decision methods on top of your machine learning model. Don't just use the machine learning model naively. That will work fine for little games that us academics play with in our labs. It will not work fine with the real world.
[00:40:05.57] I'm going to move to the top of the stack. There's another area where we're hiring and another area where-- again, I don't want to put words in Manuela's mouth-- but we feel as a CMU faculty it would be irresponsible not to train AI practitioners. When you're about to take one of these systems and put it into the world, you have to care about what it actually means to the people who are operating with it. What are the not just immediate effects, but what are the second order effects of having AI assistants or AI advice in your life?
[00:40:44.21] The place that we all think of initially are the big AI chat bots. Whoever created these slides, why the heck didn't they put Google Home on this slide? It was actually me, but-- at the time I created the slides, these were the top three chat bots. So this is an area where when you ask a question like, where can I get a good cappuccino near here-- underneath, that machine learning system is all in place both for your speech and for predicting how satisfied you will be with particular actions. There are search algorithms. Certainly in New York if you ask that question, you're going to be looking at 10,000 possible answers, evaluating each of them before you give your response to the user.
[00:41:40.34] But there's more to it than that. There is also the question of, how am I going to make it so that the user trusts me in future, that I don't risk giving a 1 in 20 of my answers being crazy, wrong answers? Because if I do that, then the human is not going to be able to take advantage of me in the future because I'll lose their trust. So you end up making decisions based on what is known as risk averse reasoning when you're giving advice from a chat bot.
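One simple way to picture risk averse reasoning is to penalize the chance of a crazy, wrong answer rather than just maximizing average satisfaction. The Python sketch below does that under stated assumptions: the candidate answers, the satisfaction scores, the threshold, and the penalty weight are all hypothetical.

```python
from statistics import mean

def risk_averse_score(outcomes, bad_threshold=0.2, penalty=5.0):
    """Average satisfaction minus a penalty on the rate of terrible outcomes."""
    p_bad = sum(o < bad_threshold for o in outcomes) / len(outcomes)
    return mean(outcomes) - penalty * p_bad

# Hypothetical predicted satisfaction samples for two candidate answers.
candidates = {
    "cafe_a": [0.9, 0.95, 1.0, 0.0, 0.95],   # usually great, sometimes awful
    "cafe_b": [0.7, 0.75, 0.7, 0.8, 0.75],   # reliably good
}

risk_neutral_pick = max(candidates, key=lambda k: mean(candidates[k]))
risk_averse_pick = max(candidates, key=lambda k: risk_averse_score(candidates[k]))
```

The risk-neutral rule prefers the answer with the higher average even though it is occasionally terrible; the risk-averse rule trades a little average quality for not losing the user's trust.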
[00:42:10.52] Again in the interest of time, I will move forward on the discussion of a concept called the open knowledge network.
[00:42:19.07] [INTERPOSING VOICES]
[00:42:22.61] Sorry about the noise. I don't know if we can mute the noise.
[00:42:27.35] Here are some other really nice areas of human AI in action. There are really interesting things we can do now with figuring out how people with what some might call impairments such as missing a limb can operate as effectively or more effectively than other folks. This is an example of a game which trains you to do rock climbing by projecting test paths for you to climb with. And it cleverly adapts its level of difficulty so that it can train you from being a beginner up to being a high quality player.
[00:43:13.23] This is a system which allows you to talk to a physician in a way where your emotional state is being recorded during the discussion. And what this is being able to do for this researcher, LP Morency, is he has been able to effectively help physicians ask questions which get to the root of a problem when humans are nervous about talking about a particular subject or going into detail. This has been effective. It has now been shown in a bunch of blind tests to come up with better diagnostic prediction.
[00:43:56.75] This question overall of assessing human emotional state is another area which I don't quite know where it's going. But it is a huge growth area in the same way that computer vision was a huge growth area five years ago or natural language processing was a huge growth area 10 years ago. I don't know how it applies in the world of fintech. But the question of being able to get some notion of stress and these things called involuntary microexpressions from users is both Orwellian and really exciting.
[00:44:34.94] Another piece of LP's work has actually been in predicting whether a patient who is under treatment for depression-- predicting whether that specific treatment is going to be successful or not. And he has been able to show that he can predict this before the physician who's treating the patient or the patient themselves, based on analysis of eye movements and microexpressions. So very interesting new world that we're entering.
[00:45:13.20] Many other examples-- I want to keep moving though-- autonomy. So this is an example of an autonomous system produced by some faculty somewhere. I was trying to remember. I don't know who they were. Actually, it was Manuela Veloso for those of you who don't know. To me, it's the perfect example of putting the whole of perceive, decide, act into place. In fact, I'm going to play it again. You're seeing here a group of robots which are cooperatively and in real time planning how to get that ball into the goal. And you see here how the plan is actually executing where these things successfully-- even given uncertainty-- and it was hard to see that that actually went into the goal.
[00:46:03.23] That is an autonomous system. It is the least understood of all the parts of the AI stack. In fact despite the pioneering work here, I am unsatisfied with the work of the academic community or of the commercial community here, in that at the moment if you want to build an autonomous system, you're going to do it artisanally. There is no strong theory to support it. Let me show you some other examples of autonomous systems.
[00:46:35.09] Built with Leidos-- a big defense contractor-- we have a semi-autonomous battleship. It usually has 300 crew. This one has 12 crew. It is currently patrolling the Pacific Ocean. It's a submarine hunter. It does not have weapons on board. That's important to mention. But it does autonomously obey the rules of navigation when other ships or coastal activities are in sight. This has not yet been validated. You can't validate it with a single experiment like this, but there is a real concern that navigation errors are actually kind of serious for the Navy right now. There's so much information that human decision makers are not able to process at all. And so you are seeing ridiculous things like ship collisions.
[00:47:25.49] So that's another highly important autonomous system. Again, it would be against the rules of warfare. And most of us in the world of artificial intelligence would not agree to do it if we were applying this to armed platforms. But for unarmed platforms, this is another place where you want to build in autonomy.
[00:47:49.82] Another really good example is in cyber defense. And here is an example of a rare use of cyber technology. This was a challenge created by DARPA where you're actually using both offensive as well as defensive network security. And this was a challenge with about 16 racks of computers. This was one physical rack. And there were 16 of them arranged in a circle networked together where after the start button was pressed, the systems had to try to break into each other and take over each other. And it really was the last machine standing.
[00:48:38.09] The system from Carnegie Mellon won. It was created by Professor David Brumley. It was the only one which was actually using game theory as well as machine learning and such. What it was doing at a very high level was it had a known set of exploits that it could use. But it was also able to watch for incoming exploits used against it, which it could then redeploy against third parties within the game. And it's a very interesting hidden state game where you hoard the exploits to use at the right time trying not to get them out into the wild until you know you really need to use them. And this won.
[00:49:24.03] The very interesting thing about it is that in Def Con 2017-- which is a human controlled cyber warfare simulation-- this machine placed 17th. So there were still 16 human teams which could beat it. But there were over 200 teams entering. And it was really interesting that we have an autonomous system placing in the top 20. So another area where autonomy is important.
[00:49:58.46] And the thing which I'm not satisfied about is if I want to build an autonomous system like one of these, I can't really train up new engineers with the skills to do it. I have to search around for one of the 100 or maybe 200 people in the world I know who have experience building autonomous systems to put this thing together. So academia or the commercial world needs to come up with an engineering discipline around autonomy instead of it being an artisanal thing. With autonomy, we're back where we were with data management before the relational database was invented. And something similar to that I believe needs to happen.
[00:50:47.21] Here's a faculty member who really inspires me and really scares me. Her name is Claire [INAUDIBLE]. She has had an important insight-- she's not the only one with this insight-- and now, there's a rapidly growing academic community around it. Those of you who are in software development or manage software development systems, you know you use big change management systems-- probably not Git in the commercial world. Maybe you're using Perforce or something like that.
[00:51:18.73] As your organization grows, you get lots and lots of data about all the changes to the software which have happened. And it's captured by your source control system. And you get patches where someone has changed the line of code, say, preventing a fencepost error or patching security problem 37. What Claire and a few others are doing is using machine learning over these big repositories to predict in advance which pieces of code will eventually get patched and for what reason.
[00:51:58.07] The thing that Claire is also doing, which makes her really stand out, is where it gets a little thrilling, shall we say. She's predicting the fix. So she's trying to predict how a really experienced programmer would actually improve that system. And I want to be really clear. What I'm about to say is science fiction. We have no idea if this is actually going to happen. But if things go well, my hope is that by the middle of the 2020s, we just have autonomous daemons running over code, constantly improving it autonomously instead of relying on the messy wetware of human brains to improve the software. So that is a big, grand challenge. I think it could make the world a much better place-- certainly deal with security patches and efficiency mistakes much more effectively.
[00:52:57.07] All right, I want to mention one more point before I wrap up. I'm going to go back to this example. And I've already shown it to you. But you know, you can see it in full scale. This is an AI system which watches you. It gives you tasks of where your hands are meant to get placed. And then it watches you. And it gives you increasingly difficult or less difficult problems. A perfectly fine AI instantiation, not much different from some of the others you've seen here.
[00:53:36.38] The reason I think this is so important is in 2013, this would have been a PhD thesis. In 2017, this was built by a team of six students. Only one of them could program before the class where this was their end project. And by using open source libraries and Cloud for the big data management and so forth, they were able to create this prototype system within the time of their course.
[00:54:13.90] That's really exciting to me because before I went to Carnegie Mellon, I thought that the big problem for the world is that we have one or two orders of magnitude-- too few people who are qualified to build AI systems. Now I believe-- although it has yet to be shown-- that we can educate them relatively rapidly. The reason we're able to move much faster than I had expected is the components you have seen on the stack-- with the exception of autonomy-- we can encapsulate the performance and behavior of those components in a way where you do not have to be an expert in how every single one of those components is implemented. You can glue them together if you know what they're for and how the glue works.
[00:55:02.98] So that is my overall summary of where we see AI at the moment-- what we're scared about and what we're excited about. And again, the big call to action here is I've been in the situation where I began to think that machine learning was the only important thing in AI. And over and over again, I've learned that that is wrong. And so places like Carnegie Mellon-- although we continue to invest in machine learning-- to be really relevant in the world of artificial intelligence, we're also investing in the parts below machine learning and above machine learning on the stack. And that's it. Thank you.
[00:55:51.78] MANUELA VELOSO: [INAUDIBLE]
[00:56:00.54] AUDIENCE: Hello. First of all, thank you for the wonderful talk.
[00:56:04.71] ANDREW MOORE: Thank you.
[00:56:05.62] AUDIENCE: I have a question. When you mention proof-- systems like military caravans-- I read somewhere that Google, which is of course advertising search-- believes in testing their systems so that they are 99% correct because it's hard to get to that 100%. But for the kind of the high risk systems you mentioned, do you believe there are no black swans and it can be proven 100%?
[00:56:34.94] ANDREW MOORE: So I would never trust a 100% proof, even for a safety critical system-- but not for the machine learning reason that you're describing. It is absolutely true that I would never allow any product or system to be released which was safety critical and based solely on machine learning. The standard design pattern is-- we were just talking about this at breakfast with some of your team here-- the design pattern is the machine learning system can make recommendations. And then there's a hard wired hand built control system around it, which looks at those recommendations. And using a much, much simpler system-- usually based on linear control rules or hard constraints-- can actually decide what to do. And so the proof work is to show that no matter what the machine learning system predicts, the overall system is safe.
[00:57:37.82] Now for many robots-- robots, you can say, worst case scenario, we'll just stop. Of course, you can't for things which are in flight or which have momentum, nor for things like big production lines where it may take an hour or two to shut down the lines. So in those cases, this proof method is really difficult. But yeah, I want to just disentangle these. Because really, it would be unethical in my opinion to put the direct output of a neural network into a safety critical system. Great question.
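The design pattern described in this answer, a learned model that only recommends and a simple hand-built layer that decides, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the speed-control setting, the `Limits` values, and the `safe_command` function are all hypothetical and do not come from the talk.

```python
import math
from dataclasses import dataclass


@dataclass
class Limits:
    """Hard constraints enforced by the hand-built layer (hypothetical values)."""
    max_speed: float = 2.0      # metres per second
    max_step: float = 0.5       # largest speed change allowed per control step
    stop_distance: float = 1.0  # stop if an obstacle is closer than this (metres)


def safe_command(recommended_speed, current_speed, obstacle_distance, limits):
    """Return the speed actually sent to the actuators.

    The learned model's recommendation is treated as untrusted input.
    Whatever it proposes, the returned speed lies in [0, max_speed].
    """
    # Worst-case fallback: stop when something is too close.
    if obstacle_distance < limits.stop_distance:
        return 0.0
    # Reject non-numeric recommendations such as NaN or infinity.
    if not math.isfinite(recommended_speed):
        recommended_speed = 0.0
    # Clamp the recommendation into the allowed speed range.
    target = min(max(recommended_speed, 0.0), limits.max_speed)
    # Limit how much the speed may change in one step.
    low = current_speed - limits.max_step
    high = current_speed + limits.max_step
    stepped = min(max(target, low), high)
    # Final clamp so the range guarantee holds for any current_speed.
    return min(max(stepped, 0.0), limits.max_speed)


limits = Limits()
wild = safe_command(100.0, 1.0, 10.0, limits)     # model asks for far too much
blocked = safe_command(1.2, 1.0, 0.5, limits)     # obstacle inside stop distance
```

Because the wrapper is only a few comparisons, the property "output is always within the limits" can be checked by inspection, which is the kind of argument the answer above relies on. As the answer notes, a plain stop is not a valid fallback for systems in flight or with momentum.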
[00:58:17.66] AUDIENCE: Hi, I actually have a couple of questions. Let's see if we have time for that. So one is this last story that you told about just a few novices making this system in 12 weeks or whatever, combining it out of essentially built boxes of AI-- would the next logical step be actually having AI system combine those boxes? So that's one question.
[00:58:43.58] And the next question was about earlier in the talk, you were talking about power management and how important it is. What's a comparison between power used by one of those decision making systems and say, human brain? And are you looking at utilizing some biological components to augment that?
[00:59:06.40] ANDREW MOORE: Very good, those are good questions because they've both got pretty short answers. On the brain power issue, human brains are still at least-- I can say with confidence-- at least three orders of magnitude more power efficient. I think it may be more like six. So there's something going on in our brains which is completely beyond any technology that we know about.
[00:59:30.52] The first question, of using AIs to strap together these modules. I actually have not thought about it on that level. It's actually well worth trying. And I bet Manuela is going to strangle me for what I'm about to say. I think it's too difficult. The reason is that that high level of system design-- we're using aspects of intelligence, such as analogical reasoning and sort of higher first order logical reasoning, which we humans can do well, but we AI researchers have not managed to come up with a satisfactory answer for. Manuela, do you agree 100% or only 98%?
[01:00:14.88] MANUELA VELOSO: I agree 10%.
[01:00:16.36] ANDREW MOORE: OK. So you've got two answers there.
[01:00:19.64] AUDIENCE: In the case of [INAUDIBLE]?
[01:00:22.76] ANDREW MOORE: Yes, exactly-- something which I'm extremely pessimistic about.
[01:00:26.90] MANUELA VELOSO: [INAUDIBLE]
[01:00:29.30] ANDREW MOORE: That's right.
[01:00:32.18] AUDIENCE: Hello. I have a question regarding the hardware. Do you think that we will need some kind of significant hardware breakthrough before we can truly move forward with the software of AI? Or do you think that the current silicon transistors are enough, even though Moore's Law is reaching its limits?
[01:00:53.71] ANDREW MOORE: I do think that we're actually in a place of technology where if there were no further innovations on hardware, we could spend the next 15 years coming up with really powerful algorithms based on the current set of hardware. But as I described earlier, things which are-- for example-- managing a whole city during an emergency, we do not have the bandwidth or the compute at the moment under reasonable power assumptions to be able to do it. So that one seems really important. Also, the complexity of what we humans do-- I don't know how we're going to implement that. But my gut tells me that we'll be using millions of times greater computational power than we're currently using.
[01:01:39.08] MANUELA VELOSO: Last question.
[01:01:46.67] AUDIENCE: I guess my question is almost a little game theory, in a way. Because you mentioned about reducing the battery output needed for these systems. Isn't battery food to a computer? And if we get to a point where they can be almost infinitely powered, isn't that a risk in and of itself? I mean, won't we also-- as the battery becomes more efficient-- also have to restrain the battery to make sure that we have a way to rein in AI? Because that's the only thing we have to keep them from doing anything, that they need power from us, right?
[01:02:19.59] ANDREW MOORE: That's a good point. So I do think there's an issue there. Before I'm worried about the algorithms turning against us, I've got to worry about nation state adversaries building evil AIs. That is a very real issue. And if-- just in case there's a public aspect to this talk, I'm not going to go into specifics. But you could probably all imagine or try to imagine what you would do if you were evil and actually said, I'm going to use AI to disable an economy of a country, maybe with certain kinds of robots.
[01:03:00.21] It is absolutely the case that-- in my opinion-- after the first one of these things has happened that there will be much more regulation on the use of GPUs, TPUs, or other bits of hardware involved. I'm not sure if it's power that's the issue. But misuse of AI by humans is going to happen. It's going to be disastrous. And it will happen probably several decades before even beginning to worry about the AIs doing it.
[01:03:26.63] MANUELA VELOSO: Thank you very much. [INAUDIBLE]
[01:03:28.92] ANDREW MOORE: Thank you.
Impact of AI and Robotics: Advances and Aspirations
October 23, 2018 - New York City
Daniela Rus is the Director of Computer Science and Artificial Intelligence Laboratory, and Andrew and Erna Viterbi Professor in the Department of Electrical Engineering and Computer Science at MIT.
[00:00:00.84] MANUELA VELOSO: I've known Daniela for many years. She did her PhD at Cornell University. And then she went to Dartmouth, where she pursued her research on distributed robotics, created a great lab. I visited a couple of times in Dartmouth. And then she moved to MIT where she was a professor and then became the head of CSAIL, the famous CSAIL. And Daniela's been the head for seven years. And she has made a lot of contributions both in research and also as the head of CSAIL.
[00:00:35.64] It is a great pleasure to have her here with us. And she's going to tell us about the impact of AI and robotics advancements and aspirations. And I welcome also all the people that are watching on the webcast. And let's enjoy this talk. Thanks a lot, Daniela.
[00:00:56.79] DANIELA RUS: Thank you so much, Manuela for this very generous introduction. I am so delighted and honored to be the first speaker in your distinguished lecture series. So today, I want to tell you about some of the extraordinary things we are experiencing these days in both academia, business, and industry. So think about it. Today, doctors can connect with patients, and teachers can connect with students that are thousands of miles away.
[00:01:26.19] We have robots that help with packing in factories. We have 3D printing that creates customized goods. And we have networked sensors that monitor facilities. We are surrounded by a world of opportunities. And these opportunities will only get bigger as we imagine the impact of the latest and greatest AI and robotics technologies.
[00:01:46.53] So picture this. Picture a world where routine tasks are taken off your plates. Garbage bins take themselves out. And automated infrastructure ensures that they disappear. And food gets delivered to your doorsteps. Fresh fruit and produce gets delivered by drones. And intelligent assistants, whether embodied or not, help you optimize all aspects of your life to ensure that you live well and you work effectively.
[00:02:17.61] How are we going to live in this future? And what are the supports for taking us to this future? Now AI provides support for cognitive tasks by providing autonomy at rest and support for physical tasks by providing autonomy in motion. And I would like to talk about these two topics for the rest of the talk.
[00:02:42.72] Together in the future, these technologies have the potential to eliminate traffic accidents, to better monitor, diagnose, and treat disease, to keep your information private and safe, to ensure that people connect and communicate instantaneously, no matter what language they speak, in general, to take care of the routine tasks and leave people to focus on critical thinking, strategic planning, the kind of thing all of you guys like to do.
[00:03:13.29] Now, progress in these areas is enabled by three interconnected fields. Robotics puts computation in motion and gives machines the ability to move. Artificial intelligence gives machines intelligence that enables machines to see, to hear, and even to communicate like humans. And machine learning cuts across AI and robotics and aims to learn from data and make predictions.
[00:03:40.22] Now, progress in AI is enabled by a convergence of three things, algorithmic advances, very important. People often talk about the explosion of data and the increase in the power of computing. But without the algorithms, we would not have anything. So these three pillars, algorithmic advances, data, and computing, are enabling this huge progress that we're seeing today in practically any field that has data. So all companies, all industries that have data can benefit.
[00:04:15.34] And the benefits are due to machine learning, which refers to a process that starts with a body of data and then aims to learn a rule or make a prediction about future use of the data. And medicine is a great example of a field that can benefit. Today, machines can read more radiology scans in a day than a doctor will see in a lifetime. Think about that.
[00:04:39.42] Now, in a recent study, doctors and machines were given scans of lymph node cells, and they were asked to label them, cancer or not cancer. The humans made 3.5% error as compared to the machine's 7.5% error. But working together, the doctors and the machines achieved 0.5% error, which if you think about it, it's a significant reduction, a significant percentage reduction in the error. And it's extraordinary.
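To make the "percentage reduction" concrete, the three error rates quoted above can be compared directly. This is only arithmetic on the figures from the talk; the helper name `relative_reduction` is an assumption made for the example.

```python
def relative_reduction(before, after):
    """Fraction by which an error rate fell, relative to where it started."""
    return (before - after) / before


human_error = 0.035    # doctors alone
machine_error = 0.075  # machine alone
team_error = 0.005     # doctors and machine together

vs_human = relative_reduction(human_error, team_error)
vs_machine = relative_reduction(machine_error, team_error)

print(f"relative to doctors alone: {vs_human:.1%}")
print(f"relative to machine alone: {vs_machine:.1%}")
```

On these figures the combined error is roughly 86% lower than the doctors' error and roughly 93% lower than the machine's error.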
[00:05:07.84] Now, these techniques are currently employed by the most advanced treatment centers in the world. But imagine a day when all doctors have access to these techniques, doctors in rural areas, or doctors that are overwhelmed with work and don't have time to stay on top of the latest and greatest clinical trials can offer their patients the most advanced results by taking advantage of AI solutions that will, in principle, bring the most relevant information to the patient for the doctor to make the decision.
[00:05:42.91] Now, machines and people working together can do so much more in finance. I like to think about a machine as kind of an intern running around and doing errands for you. And when the intern finds a good pattern or something interesting, that pattern is brought up to the human to act on, to make a decision about. And so thinking about using these techniques to improve the conversation between people, to improve how you organize a portfolio, or to even make a prediction of what might happen to the markets are extraordinary possibilities. And if you bring blockchain into the whole system, then you ensure that all the processes done by the machine can be checked, can be trusted.
[00:06:36.70] One more example, machines and people working together form better lawyers. So word processing, internet, and email have already revolutionized how we draft documents, how we look up information, how we exchange information. And the next wave of technologies, which is getting really, really good, is natural language understanding. And with natural language understanding, we are able to get machines to read and remember and interpret entire libraries of documents. And so think about how useful that is for lawyers who need to know stuff about thousands and thousands of documents that are all very large.
[00:07:15.49] And so again, the idea is that the machine could bring the right information at the right time. And yet machines are not able to be lawyers, because they cannot write compelling briefs. They cannot counsel clients. And they cannot persuade judges. But they might be able to support in predicting the decision that a lawyer might take.
[00:07:39.83] So my last example is about traffic. And this is actually from my research group. We built a new algorithm that matches supply and demand. And this can be applied to any field where this is a problem. We applied it to the traffic in New York City. And we have shown that, with our algorithm, we can reduce the number of taxis required to meet the 400,000-plus taxi ride requests a day to 3,000 vehicles. So think about getting rid of 11,000 taxis that are constantly roaming.
[00:08:16.83] So the taxis are not like my car. My car drives to my parking garage, stays there for 10, 11 hours, and drives back. But taxis provide constant movement in the city.
[00:08:26.45] And the catch is that these 3,000 cars, if you're taking a ride-- let's say I'm going from here to the airport, to LaGuardia, if somebody is at the street corner and is also going to LaGuardia, I have to make room for that person in the car. But I only get to share with up to three people, four people total.
[00:08:50.87] So think about the benefits to the city in terms of traffic, in terms of lower pollution, and in terms of noise. And all this introduces only less than three minutes delay in arriving at your destination. And this does not take into account the improved traffic you get by removing 11,000 cars from the roads.
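The sharing rule described above, riders heading the same way share a car with at most four people in total, can be illustrated with a toy grouping routine. This is not the research group's algorithm, which the talk does not specify; it is a deliberately simple greedy sketch, and the zone names and the `pool_requests` function are hypothetical.

```python
from collections import defaultdict

MAX_PASSENGERS = 4  # "four people total" per vehicle, as stated in the talk


def pool_requests(requests):
    """Greedily group ride requests that share a pickup zone and destination.

    `requests` is a list of (rider, pickup_zone, destination) tuples.
    Returns a list of trips, each a list of at most MAX_PASSENGERS riders.
    """
    by_route = defaultdict(list)
    for rider, pickup_zone, destination in requests:
        by_route[(pickup_zone, destination)].append(rider)

    trips = []
    for riders in by_route.values():
        # Split each route's riders into car-sized groups.
        for start in range(0, len(riders), MAX_PASSENGERS):
            trips.append(riders[start:start + MAX_PASSENGERS])
    return trips


requests = [
    ("r1", "midtown", "LaGuardia"),
    ("r2", "midtown", "LaGuardia"),
    ("r3", "midtown", "JFK"),
    ("r4", "midtown", "LaGuardia"),
    ("r5", "midtown", "LaGuardia"),
    ("r6", "midtown", "LaGuardia"),
]
trips = pool_requests(requests)
print(len(requests), "requests served by", len(trips), "trips")
```

A real matcher would also have to weigh detours, waiting time, and where vehicles currently are, which is where the "less than three minutes delay" figure in the talk comes from; none of that is modeled here.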
[00:09:13.91] So all of these examples are extraordinary. And many of the advances you hear about in machine learning today are due to a technique called deep neural networks. And in deep neural networks, you have very large computer networks, usually with millions of nodes. And millions of manually labeled data items are presented to the network to figure out the weights of the nodes inside the network.
[00:09:42.34] And so for instance, in the case of a network that processes images, a person might label this picture as beach, palm tree, and sea. And this is so that when another similar image gets presented to the network, the network says, ah, this is a picture of a beach.
[00:10:00.38] So this process with images works in two steps. Given the photograph, the network has to find out which pixels go with what objects. This is called image segmentation. And we have very good algorithms to do that automatically. Now, once you have segmented the image, you have to add labels to the objects recognized in the image. And this is actually very challenging.
[00:10:27.69] So for instance, in this case, you might want to say you have a building, sky, and car. And you can do the same thing with more images and even more images. And when it comes to labeling images, this is how it's done. So when it comes to labeling images, lots and lots of people sit across the world and manually say this is a car, this is a building, this is a road.
[00:10:54.15] So when we think about machine learning, we have to consider what it means for the machine to learn. So for instance, when we say that the network has learned that this is a picture of a beach, what this means is that the pixels that form this picture look the same as the pixels in other images that a human being said this is a beach. The system has no idea what the beach represents. It doesn't know what we do with it. Do we eat it? Do we drink it? Do we play on it? What do we do with a beach? How does the beach feel? What is the purpose of the beach? How much does it weigh? None of this is part of what the system learns.
[00:11:37.19] And it's important to remember this, because we tend to anthropomorphize machines. And when we say the machine has learned, we tend to imagine what a human might have learned at the same time. So it's good to keep in mind these issues.
[00:11:52.22] Furthermore, neural networks are not perfect in how they produce their results. So have a look at these two pictures. They're two pictures of two dogs. Do these dogs look the same to you? Yeah, they look the same to me too. But in fact, they're slightly different. The second picture was obtained by injecting a little bit of error. And you can see the noise that was added to the picture. We can't detect that with the naked eye. And yet, this little error is enough to trick the network from saying this is a dog to saying that the second picture is an ostrich. And if I play around with the shape of the error, I could get the second picture to be a car or a chair or anything I want. So while machine learning is making great progress, it's really important to keep perspective on what it can and cannot do for us.
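The effect described above, a change too small to see that still flips the label, can be reproduced on a toy model. The sketch below assumes a hypothetical linear classifier over 1,000 "pixels" rather than a deep network; every number in it is invented for illustration. Each pixel is nudged by a small amount in whichever direction lowers the score, and because the nudges all push the same way they add up.

```python
def score(weights, bias, pixels):
    """Linear classifier score: positive means 'dog', otherwise 'ostrich'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def label(weights, bias, pixels):
    return "dog" if score(weights, bias, pixels) > 0 else "ostrich"


def perturb(weights, pixels, epsilon):
    """Shift every pixel by at most epsilon in the direction that lowers the score."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]


# Hypothetical toy model and image: alternating small weights and pixel values.
n = 1000
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(n)]
bias = 0.0
pixels = [0.52 if i % 2 == 0 else 0.48 for i in range(n)]

original = label(weights, bias, pixels)
adversarial_pixels = perturb(weights, pixels, epsilon=0.03)
fooled = label(weights, bias, adversarial_pixels)
largest_change = max(abs(a - b) for a, b in zip(adversarial_pixels, pixels))

print(original, "->", fooled, "with no pixel changed by more than", round(largest_change, 3))
```

For a deep network the direction of the nudge is taken from the gradient of the network's output with respect to the input image instead of from fixed weights, but the mechanism is the same: many tiny coordinated changes.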
[00:12:49.17] Furthermore, have a look at this video. So this is a video of a child, an 18-month-old child that's watching this scene for the very first time. It has never seen this scene. And look at what this child does. The child has figured out that the adult needed help and has figured out an action to support the adult with help. Our machines are not able to do anything close to this. But this is an important and interesting grand challenge for the future of artificial intelligence and computation. Can we understand what is going on inside the brain of this 18 months old and create machines that are engineered for similar levels of cognition, for similar cooperative behavior?
[00:13:44.97] So machine learning is changing the world. And it's enabling so many extraordinary applications. But there are significant challenges. Fields that have data, that have lots and lots of data benefit much more than fields that don't. So obtaining massive data sets is a challenge.
[00:14:02.09] The image data sets contain tens of millions of images. In fact, image recognition took off about five years ago when ImageNet reached about 10 million images. Today it's much higher.
[00:14:21.27] Data labeling is also a challenge, because these tens of millions of data points have to be manually labeled. And then it's important to know that the results that come out of machine learning systems are not easily explainable nor generalizable.
[00:14:36.39] Finally, your answer is only as good as the data you have to train your network. And so if your data has bias in it, then your results, the results of the network will have bias. And we have to keep in mind what it means to learn.
[00:14:52.02] A few more challenges. Most of today's machine learning solutions are one-off solutions. And machine learning is usually done by experts only. So the program that plays Alpha Go with you is not going to be able to play chess or poker with you, whereas a human would very naturally switch from one to the other.
[00:15:10.05] Also just crunching the data does not mean you have knowledge. And making complex calculations does not mean you have autonomy. And so we have to keep in mind where are the benefits. And we have to think, what problems should we work on in order to expand the capabilities of these techniques.
[00:15:33.12] Still, we have a lot of opportunities with the technology we have today, as we have seen in the area of medicine, of transportation, of finance, of law. And these opportunities are around personalization and customization, around using natural language to interpret knowledge, to interpret what is in our libraries, and to bring that information at our fingertips. And altogether, with machine learning, we can really increase the quality and the efficiency of the time we spend doing various tasks, mostly low-level tasks, routine tasks. But this is all about machines and people working together.
[00:16:19.21] So let me switch gears and say a few words about autonomy in motion. And as I think about this, I observe that our world has been so changed by computation. Just try to imagine a day in your life without your smartphone, without the web, and everything they enable, no social media, no online shopping, no email and texting, no digital media. It's incredible to think about what this might be like. But I will tell you that 20 years ago, which I remember, we didn't have any of this.
[00:16:58.97] So in a world so changed by computation that's helping us with all these different tasks, what might it look like with robots helping us with physical work? So I believe that autonomous driving is absolutely going to ensure that there will be no road fatalities. And it will give our parents and grandparents much higher quality of life in their retirements. And it will give all of us the ability to go anywhere, anytime. It is not a matter of if. It is a matter of when, in my opinion.
[00:17:31.74] So how much work should we be able to offload to machines? So imagine driving home from work knowing that your car has the intelligence to keep you safe and the smarts to make that ride fun. Let's say you have to pick up supplies for dinner. Your car pulls over at the nearest grocery store where you hand the dinner menu to a robot at the door. This robot connects with your home. And the home figures out what items you're missing. And a few minutes later, a box gets presented to you by another robot. And when you get home, you hand the box to your kitchen robot. And you might even let your children help with cooking, because even though they make a mess, your home cleaning robot will clean up the mess.
[00:18:17.06] Now, I know what you might be thinking. You might be thinking that this sounds like one of those cartoons about the future that never comes to pass. But that future, in my opinion, is not that far off. Today, robots have become our partners in domestic and industrial settings. They work side by side with doctors in hospitals. They work side by side with workers on the factory floors. They mow our lawns. They vacuum our pools. They even milk our cows. And in the future, they will do so much more for us.
[00:18:50.78] But in order to get to this future, we really need to build the science of autonomy. We need to improve our robots. And that means making robots that are much more capable of figuring things out in the world, making the whole process of creating robots faster, and making the process of interaction between robots and people much more intuitive.
[00:19:14.69] Now, the first two are very important, because each machine is made of a body and a brain. And for any task, the machine has to have the body that is capable of that task and the brain that is capable of controlling the body to do the task. A robot that rolls on wheels is not going to be able to fly, nor is it going to be able to climb stairs. So you see, the body and the brain together are very important in thinking about machines.
[00:19:43.91] So let me spend the rest of the next segment talking about some examples of making better brains, better bodies, and better user interactions. Let me start with brains. And I will say that today, robots have a limited ability to figure things out. Most of their interactions are fairly carefully specified. And in some cases, you have limited adaptation. But robots do not have a general ability of figuring out what is happening around them. And I would like to use the example of self-driving cars to illustrate this point.
[00:20:23.78] This is the recipe for making a self-driving car. It's the recipe that's used by all the companies and most research groups around the world. So autonomous driving usually works in a closed environment. In the first phase, the robot drives on every road in the environment, makes a map. And that map is then used to plan paths and execute the paths when ride requests happen.
[00:20:51.06] So here's how you might take your own car and turn it into a robot, if you like. You start with your car. You add sensors, usually laser scanners and cameras. And then you'll write some code. You'll write some code to make maps, to identify obstacles in the maps, and to label what they are. And you also write code to figure out where the new obstacles that were not there when the map was made are located.
[00:21:20.36] And then these maps can also be used in real time to help the robot figure out where it is, because using sensors, the robot makes a profile of the road at the current moment and compares that with a map. This allows the robot to figure out where it is. And then planning and control enables the robot to figure out how to go from one location to another and how to execute that path.
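Two steps of this recipe, comparing the current sensor profile against the map to find out where the robot is, and planning a path from one location to another, can be sketched on toy data. This is a minimal illustration under stated assumptions: the one-dimensional road profile, the small occupancy grid, and the function names `localize` and `plan_path` are hypothetical, and real systems work with laser scans and far richer maps.

```python
from collections import deque


def localize(road_map, scan):
    """Return the map index whose window best matches the current sensor scan."""
    best_index, best_cost = 0, float("inf")
    for i in range(len(road_map) - len(scan) + 1):
        # Sum of squared differences between the scan and this map window.
        cost = sum((road_map[i + k] - scan[k]) ** 2 for k in range(len(scan)))
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


def plan_path(grid, start, goal):
    """Breadth-first search on a grid; cells equal to 1 are obstacles."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parents to recover the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None  # no route exists


road_map = [0.0, 0.1, 0.9, 0.4, 0.2, 0.8, 0.7, 0.1]  # profile recorded while mapping
scan = [0.41, 0.19, 0.82]                            # noisy reading taken just now
position = localize(road_map, scan)

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = plan_path(grid, (0, 0), (2, 0))
```

Here the scan matches the map best at index 3, and the planner routes around the blocked middle row. The sketch also shows why the recipe depends on the map: on a road that was never mapped, `localize` has nothing to compare against.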
[00:21:47.18] So this is the recipe. And by the way, this is one of our most recent cars. At MIT, we have a whole suite of vehicles for autonomous driving, ranging from wheelchairs to golf carts and cars. And they were all done with the recipe I showed you.
[00:22:04.54] Now, here is the problem. So generally, the sensor pipeline for autonomous driving has a bunch of steps. And the steps in the middle are usually very carefully and manually fine-tuned. For every instance, we will usually put code, or we specify the parameters that are needed in order to cope with that instance. And this necessarily limits what the robot is able to do. And in particular, it is very hard to deal with nighttime driving. It is very hard to deal with situations where we do not have maps for the road. And it is very difficult to deal with rainy or snowy weather. Rain and snow is actually a deeper problem, because the sensors we have today do not work well in rain and snow.
[00:22:53.84] So what can we do? Well, here comes machine learning where, instead of manually configuring everything that goes in the planning and reasoning system of a car and trying to anticipate all possible road situations that would come to the car, what we can do is we can try to learn how to drive by watching how a human drives. And in our research, we are asking whether it is possible to learn how to steer the car by looking at single images of the road and watching what a human driver has done in those instances.
[00:23:32.02] So the answer is yes, it is possible to make driving more flexible. And here is our car that's taking a first ride on a country road after being trained by human drivers driving in Cambridge, which is very urban as compared to this road. And so in red and blue, you see what the car should be doing and what the car thinks it should be doing. And you can see that the red and blue arrows are very close together for the shorter distance. But they get further apart as you go further out. But eventually they converge.
[00:24:10.66] And to be honest, I'm really delighted by this first drive of our car. If I think about my own first drive, it was not this smooth. So I'm very encouraged by the possibility of using machine learning to move away from having to manually code all the parameters that go in the reasoning engine of a robot.
[00:24:36.49] Now, we can pose similar questions with respect to other robot tasks. So a similar approach was done to teach this robot how to do this task. And I might ask you how many robots does it take to screw in a light bulb. And in this case, the answer is one, but it needs machine learning, and it needs soft hands. And I will tell you in a second what those soft hands are. If you look at these robot hands, they're very compliant. They look a lot like the human hands, unlike today's industrial manipulators, which are very rigid and hard.
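The idea of learning to steer from what a human driver did can be shown in its simplest form as fitting a function to demonstrations. The sketch below is not the system from the talk, which learns from road images; it assumes a single hand-picked feature (lane offset) stands in for the image, and the demonstration values and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical demonstrations: lane offset seen in one image (metres, positive
# means the car sits right of centre) and the steering angle the human chose
# at that moment (degrees, negative means steer left).
lane_offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
human_steering = [8.0, 4.0, 0.0, -4.0, -8.0]

slope, intercept = fit_line(lane_offsets, human_steering)


def predicted_steering(offset):
    """Steering angle the fitted model would choose for a new observation."""
    return slope * offset + intercept


print(round(predicted_steering(0.1), 2))
```

Nothing about the road is coded by hand here: the mapping from observation to steering comes entirely from the recorded human choices, which is the shift away from manually tuned parameters that this part of the talk describes.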
[00:25:16.13] So the other thing to notice is that the robot grasps the bottom using one type of grasp and the bulb using a different one. So how should the robot learn how to do that? This is a very important aspect of grasping and manipulation. And it turns out that a very similar approach to the one we have developed for the cars can be used to learn how to grasp objects. And this can be used to teach a robot the approach direction. So should I go this way or this way or this way? And also the pose of the gripper when the robot grasps.
[00:25:55.70] And what's kind of exciting about grasping, especially with soft fingers, is that soft fingers wrap around the object they are grasping. So the soft fingers do not need to know accurate models of what it is that they're going for. And also with soft fingers, we don't need to know exactly the location of the object. So we can do much more compliant, much more error-tolerant behaviors.
[00:26:22.60] Furthermore, what matters mostly is the aspect ratio of the object we are trying to grasp. So it turns out that for this particular approach, grasping, we do not need millions of examples to train the robot. We just need the examples of the critical aspect ratios that result in different approach directions and in different poses. And if you specify that, any object that fits in the enclosing box of that aspect ratio is in some sense considered trained for that particular task.
[00:26:56.98] So here it is. You can see the robot apply using different approach directions and using different grasp poses. And this was all trained using not 10 million examples but using a very small data set. But again, the data sets represented classes of different possibilities for grasping. So you see, it's very exciting to think about bringing machine learning into robot control systems, because through machine learning, we are really advancing and moving away from manually coding all the parameters that go in the brain.
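The point that a handful of aspect-ratio classes can stand in for millions of examples can be illustrated with a nearest-class lookup. This sketch assumes a hypothetical table of three trained classes; the ratios, approach directions, and pose names are invented, and the talk does not describe how the actual system represents them.

```python
# Hypothetical trained examples: one representative aspect ratio per class,
# each paired with the approach direction and gripper pose that worked for it.
TRAINED_GRASPS = [
    {"aspect_ratio": 1.0, "approach": "top", "pose": "wrap"},
    {"aspect_ratio": 3.0, "approach": "side", "pose": "pinch"},
    {"aspect_ratio": 8.0, "approach": "side", "pose": "wrap"},
]


def aspect_ratio(width, depth, height):
    """Longest side of the enclosing box divided by its shortest side."""
    sides = sorted((width, depth, height))
    return sides[-1] / sides[0]


def choose_grasp(width, depth, height):
    """Pick the trained grasp whose aspect ratio is closest to the object's."""
    ratio = aspect_ratio(width, depth, height)
    return min(TRAINED_GRASPS, key=lambda g: abs(g["aspect_ratio"] - ratio))


cube = choose_grasp(6, 6, 6)       # ratio 1.0
bottle = choose_grasp(3, 3, 10)    # ratio about 3.3
rod = choose_grasp(2, 2, 20)       # ratio 10.0
print(cube["approach"], bottle["pose"], rod["pose"])
```

Any new object is handled by the class of its enclosing box, so an object never seen in training still gets an approach direction and a pose.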
[00:27:43.42] And I should tell you that in fact, Manuela has been quite the pioneer in this. She figured this out way before the rest of the world. So Manuela has been telling us to do this for many, many years. So I'm very excited and optimistic about the possibilities of making the robots much more capable of figuring things out in their surrounding environments.
[00:28:12.58] Now, let me say a few things about the robot bodies, because as I said earlier, a robot really needs to have a body capable of the task that the robot has to do. And right now, I would say that designing new robots is kind of the way things were with programming before we invented the compiler. So every robot is designed from scratch. Every robot is designed in a very bottom-up way. And we have mostly spent the last 60 years of industrial robotics thinking about robot bodies that are either inspired by the human form, so humanoids or robot arms, or robots on wheels.
[00:28:56.14] And occasionally, we had some inspiration from nature. But nature is so much broader. There are so many more things we can keep in mind in terms of robot bodies and shapes.
[00:29:08.32] And so a question is, can we speed up how we design and fabricate new robots? Can we imagine a future where anybody can make a robot? So let's take Alice, for example. And let's say we want to give Alice the ability to automate tasks in her home. And Alice works. So let's say Alice wants a robot to play with her cat while she's at work.
[00:29:31.03] Well, to do so in this near future, Alice could head to a new type of store called 24 hour robot manufacturing where, equipped with an intuitive design, Alice could figure out the shape of a machine she wants for the robot. And once she settles on the design, the store could make the robot overnight for a very low cost. And now the cat has a playmate.
[00:29:57.01] So how crazy is this idea? Can you imagine a future where we could specify the function of the robot? Let's say I want a robot to play chess with me. And from this natural language specification, can we imagine using natural language understanding to parse a specification, to identify what behaviors are needed for playing chess? We have to be able to pick up a piece, to move it from here to there, to not knock off other pieces. And then, using databases of available mechanisms, synthesize a device that is able to do exactly the behaviors that are needed for the task.
[00:30:37.09] And then we'd like to do something very simple like printing to create the robot. And so here is the robot that plays chess with you. Now, this robot was not developed starting with a natural language specification. But many parts of making this robot were in fact automated. And if you're interested in how it was done, I can tell you later in great detail.
[00:31:03.49] But here's the general idea. If you have a body shape, and you want to create that body shape, you can use off the shelf technologies to turn a photo into a mesh and then to take that mesh and unfold it, either by slicing it or by an origami unfolding process to the point where you end up with a flat representation of the face that can be folded into the 3D object. And with that, you can then create a compilation system that starts with a picture and creates an actual robot that turns that picture to life, like this bunny.
[00:31:47.62] And I believe that with this approach, we can stretch our minds to think about all sorts of objects becoming roboticized. Imagine if we can awaken many of the objects in our surrounding world and turn them into a type of robot, into a machine that can exert work for you.
[00:32:10.33] So for instance, we could ask ourselves what might the Sydney Opera House look like if it were a robot. Can we have sound? Can we increase the sound? And this is what the Sydney Opera House would be if it were a robot.
[00:32:25.72] [OPERA MUSIC]
[00:32:28.22] [SINGING IN ITALIAN]
[00:33:06.17] So can we get the Sydney Opera House to make itself? Well, we haven't quite gotten to that point. But here is a robot that starts as a piece of plastic that was designed like in the case of the robot bunny. And when exposed to heat, this robot can grow into a fully fledged three-dimensional object that can move around and can do all sorts of interesting things.
[00:33:34.83] Now, we can make these robots at any scale. But it turns out that even at the small scale that you see in this picture, which is centimeter scale, these robots can do interesting things for us. We're looking to use these kinds of robots, which we call origami robots, to enable a better future for medicine, to enable the creation of mini-surgeons that will provide surgeries without incisions, without pain, without physical infections.
[00:34:12.44] And the particular surgical task we looked at is whether we can use such robots to remove button batteries that people accidentally ingest. And the reason button batteries are dangerous is that the acid in the batteries pierces a hole in the stomach very quickly. Like in half an hour, you'll get the battery to be fully submerged. So you want those out today. You have to have surgery for that.
[00:34:42.26] But now, imagine taking that little origami robot, compressing it, and surrounding it in ice, so putting it inside a pill-shaped ice. The patient could swallow the pill. When the robot gets in the stomach, the ice could melt. The robot could deploy. And then with a little magnet embedded in its body, the robot could go and remove the foreign object using feedback from an fMRI-like machine.
[00:35:15.56] So here is the robot. And here's how it goes. And it pulls the battery. And now the whole thing can be eliminated through the digestive system. And later on, the robot could be sent back in the stomach. And now the robot could serve as a patch, or it could have medicine in its body. And it could be directed to the location of the wound to deliver the medicine in a very precise way. Think about doing that with cancers where, right now today, you expose your whole body to the cancer medicine.
[00:35:51.08] So it's kind of exciting to also think about the fact that these robots are actually made out of food. So we think of robots as being made out of plastic or hard plastics or metals. This robot is made out of sausage casing. And it's digestible. Nevertheless, it's still a robot. So there are so many possibilities for using these technologies to enable a future for medicine that does not require incisions for every procedure that does not lead to the risk of infections and does not give physical pain.
[00:36:30.28] Now, I want to spend the last part about robots looking at machine interactions. And when we talk about machine interactions, we usually talk about interactions between machines. And in fact, Manuela is quite an expert on this. She has been the reigning champion of RoboCup for many, many years, where her machines worked together to win against other machines. I'm going to let her present next time on this topic. Today I will focus on intuitive interactions between machines and people.
[00:37:06.01] And I want to go back to my autonomous driving example. And during Q&A, we can talk about where we are, really, with autonomous driving, because we are not ready for level five autonomy. But we are ready for some applications in level four autonomy.
[00:37:21.76] But another interesting application of all of that know-how, all of those algorithms is in aiding visually impaired people and blind people to experience the world in ways that are unprecedented. There are tens of millions, hundreds of millions of visually impaired people around the world. And today, the most they have is the white walking stick that gives them one point of information about the world.
[00:37:47.41] Close your eyes, and try to imagine what it would be like to enter into this auditorium and find a seat and determine whether it is empty or not using a white walking stick. Well, in today's age of technology where we count our steps and we send machines to other planets, we should be able to do better than the walking sticks.
[00:38:10.42] And here's my idea. So you take the technology of a self-driving car, and you map it onto wearables. So you take the laser scanners, and you make them into laser belts that are backed by vibrating motors. Then you take the cameras and you make them into cool necklaces. And then you sew all the computers and everything else inside the clothes. And with this, you have essentially the hardware of a system that can look at the world, map the obstacles in the world, and give you a vibration when you're close to an obstacle.
[00:38:38.95] So let's say I'm close to this wall. This side of my belt will vibrate. And also, through a Braille belt buckle, the system could also talk to me. It could tell me words. So with this technology, you could then walk down Fifth Avenue and describe the fabulous window displays. Or you could alert the person to obstacles. Or you could say, hey, your friend Alice is walking by, or there's a cat next to you.
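The belt behavior described here, a vibration that signals a nearby obstacle on the matching side, can be sketched in a few lines. This is only an illustrative sketch, not the system shown in the talk: the function name `vibration_levels`, the 2-meter cutoff, and the linear ramp are all assumptions made for the example.

```python
from typing import List


def vibration_levels(distances_m: List[float],
                     max_range_m: float = 2.0) -> List[float]:
    """Map one range reading per belt motor to a vibration strength in [0, 1].

    Assumption for this sketch: each motor sits next to one range sensor,
    and strength grows linearly as the obstacle gets closer.
    """
    levels = []
    for d in distances_m:
        # Clamp the reading so negative or very distant values stay in range.
        d = min(max(d, 0.0), max_range_m)
        # 0.0 at max_range_m or beyond, 1.0 at contact.
        levels.append(1.0 - d / max_range_m)
    return levels


# Hypothetical readings for motors on the left, front, and right of the belt.
readings = [0.5, 2.0, 3.5]
print(vibration_levels(readings))  # [0.75, 0.0, 0.0]
```

With a wall half a meter to the left, only the left motor vibrates, which matches the "this side of my belt will vibrate" description.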
[00:39:09.04] And so this kind of solution is really close to being used, to being deployed. And here's an example of an implementation on our user at MIT, where the system aids the user to walk down hallways, to go down steps without bumping into any obstacles. The system is able to guide the user to a bench where he could sit there and wait for the ducks and feed the ducks. This is an incredible increase of quality of life. And so you see, with AI and robotics, we really have the opportunity to make the world better for many people.
[00:39:57.07] With AI and robots, we can also begin to dream about the future of manufacturing, where robots are no longer separated from people on the factory floor. And here we show an example where our human is collaborating with a robot. And using sensors that monitor muscle activity, the robot essentially adapts to what a human is doing. And so by stiffening and relaxing the muscles, the human can communicate to the robot to stiffen or relax its own grip. So in this case, the muscles are stiff. The robot is stiff. In the previous case, the robot was quite limp.
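One way to picture the muscle-to-grip coupling described here is a mapping from a smoothed muscle-activity level to a stiffness setting. The sketch below is an assumption-laden illustration, not the method used in the demonstration: the names `emg_envelope` and `grip_stiffness`, the rest and maximum activity levels, and the stiffness range are all hypothetical values chosen for the example.

```python
from typing import List


def emg_envelope(samples: List[float], window: int = 4) -> float:
    """Mean absolute value of the most recent `window` muscle-sensor samples."""
    recent = samples[-window:]
    if not recent:
        raise ValueError("need at least one sample")
    return sum(abs(s) for s in recent) / len(recent)


def grip_stiffness(samples: List[float],
                   rest_level: float = 0.1,
                   max_level: float = 1.0,
                   k_min: float = 50.0,
                   k_max: float = 500.0) -> float:
    """Map muscle activity to a grip stiffness between k_min and k_max.

    All thresholds and stiffness values are hypothetical; a real system
    would calibrate them per user.
    """
    activation = (emg_envelope(samples) - rest_level) / (max_level - rest_level)
    # Clamp to [0, 1] so noise below rest or above max stays bounded.
    activation = min(max(activation, 0.0), 1.0)
    return k_min + activation * (k_max - k_min)


relaxed = [0.05, -0.05, 0.05, -0.05]
tense = [1.0, -1.0, 1.0, -1.0]
print(grip_stiffness(relaxed))  # 50.0  (limp grip)
print(grip_stiffness(tense))    # 500.0 (stiff grip)
```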
[00:40:37.37] So we can begin to think about interesting sensors that we could put on our bodies. And I suppose the ultimate question is can we go directly from our head to the machine. And the answer is no. We cannot do that in general today. The sensors we have available for that are EEG caps that are quite sparse, 48 sensors distributed on a cap. And they measure electric activity in your brain. And most of the time, what you get from these sensors is really very complex and cannot be decoded.
[00:41:15.74] But it turns out that there is one signal that, with machine learning, we can detect quite accurately. And this is a signal we all make. It's not trained. It doesn't matter what language we think in. You know what the signal is? OK, this is the "you are wrong" signal. It is called the error-related potential. It turns out the "you are wrong" signal is something very strong and profound that we all feel. And it's a localized signal. And it has a unique profile. And it can be detected.
[00:41:48.15] So with the you are wrong signal, we can watch machines. And we can tell whether they're performing well or not. And so to demonstrate that this is a possibility, we have--
[00:42:00.73] [RHYTHMIC DRUMMING]
[00:42:01.48] Could you put the sound down?
[00:42:04.43] So we can watch robots and correct their mistakes in real time. These signals can be detected in 100 milliseconds. And here's a user that's watching a robot. The task of the robot is to sort paint cans in a bin labeled paint and wire spooling in a bin labeled wire. The robot is presented these objects in a random order and then randomly goes one way or another. And so here, the robot went to wire and then moved right back, because the human said that is wrong. You see? So it went, and the correction was done instantaneously. Here it went directly to paint, and that was correct.
[00:42:42.86] Now here's the wire. It goes mistakenly to the paint box. And the robot gets directed by the person to shift over.
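The correction loop in this demonstration can be sketched as follows: the robot guesses a bin at random, and if an error signal is detected it switches to the other bin. This is a minimal sketch under stated assumptions, not the actual system. The function names are hypothetical, and `ideal_observer` is a stand-in that fires exactly when the choice is wrong, whereas a real EEG classifier would be noisy.

```python
import random
from typing import Callable

BINS = ("paint", "wire")


def ideal_observer(true_label: str, choice: str) -> bool:
    """Stand-in for the EEG classifier: reports an error exactly when the
    robot's choice is wrong (an idealization for this sketch)."""
    return choice != true_label


def sort_object(true_label: str,
                error_detected: Callable[[str, str], bool],
                rng: random.Random) -> str:
    """Pick a bin at random, then switch if the observer signals an error."""
    choice = rng.choice(BINS)
    if error_detected(true_label, choice):
        # With only two bins, "you are wrong" identifies the right one.
        choice = BINS[1 - BINS.index(choice)]
    return choice


rng = random.Random(0)
for item in ["paint", "wire", "wire", "paint"]:
    print(item, "->", sort_object(item, ideal_observer, rng))
```

The binary setup is what makes a single "wrong" signal sufficient; with more than two bins, the signal would only rule out one option.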
[00:42:54.11] So it is really exciting to think about a future where machines adapt to us rather than the other way around. Today's machines do not adapt to us. We actually have to code them. But imagine advancing our know-how, our technologies to the point where we reach that place. It's not quite there yet.
[00:43:18.25] OK, so we talked about AI and its possibilities. We talked about robots and their possibilities. I want to address an elephant that's always in the room when I talk about these technologies, which is jobs. And usually when I tell people what I do, I get one of two reactions: either people start cracking jokes about Skynet and ask me when the robots will take over their jobs, or people ask when their cars will become self-driving. So I believe that, of course, our cars will be self-driving too. I'm very excited about the technology.
[00:43:54.68] But we have to understand the fears of the first group. We have to understand how to provide alternatives for how to see things differently. And this starts with understanding that AI and robotics are tools. They are tools that are not inherently good or bad. They are tools. And they do what we choose to do with them.
[00:44:15.17] And today they can mostly do routine low-level things. So if you think about four classes of jobs in terms of how much cognition and how much manual unskilled labor there is, I will tell you that the jobs at the top and at the bottom are too hard for machines today. It is much easier to send a robot to Mars than it is to get that robot to clear a tabletop. And likewise, robots are not going to make the decisions in finance or in medicine or in any other field. But even the jobs in the middle, like accountancy, where there's a lot of routine activity, even those jobs have critical aspects that cannot be done by machines. So for instance, the accountant has to meet with you and discuss with you. And that the machine cannot do.
[00:45:05.24] So it is better to think about what we can automate in terms of the tasks that we do than in terms of professions that disappear. And here you see an excellent study from McKinsey that shows a variety of different jobs and the amount of time people spend managing others, applying expertise, doing stakeholder interactions, unpredictable physical work, data collection, processing of data, predictable physical work.
[00:45:32.09] It turns out that the tasks that can be automated today with our level of technology are the data tasks and the predictable physical work tasks. And I'm actually quite excited when I think about the future, because we can spend a lot of time analyzing these graphs and thinking about what might go away. But it's very difficult for us to imagine what will come back.
[00:45:55.62] So for instance, in the 20th century, agricultural employment dropped from 40% to 2%. But nobody predicted the growth in the service jobs. And similarly, when the airline industry took off, the jobs in the airline industry increased, and the jobs in the train industry decreased.
[00:46:21.75] So I would like to say that as we advance technology, we create new jobs. And we have no idea what those jobs might be. In fact, I'm sure most of you can remember 10 years ago. I remember very well 10 years ago. There were no smartphones. There was no social media. There was no cloud. So all of these are sectors that are employing a lot of people today. These jobs did not exist 10 years ago.
[00:46:50.43] So getting there, though, requires that we think about whether we train our kids with the right skills. And we have an education problem, both short-term and long-term. Long-term, if we start teaching computing-- in fact, I like computational thinking and computational making. If we do this from early on, then we will end up with a graduating class in 10 years' time where everyone can participate in the IT economy.
[00:47:19.92] In the meantime, we can take lessons from companies like BitSource, which is a startup in Kentucky, a very successful startup. It's been training coal miners to become data miners. And it's one of the great successes in Kentucky.
[00:47:36.72] So I'd like to end by a few reflections on our future with robots and machines. This is the first industrial robot. It is called a Unimate. It was introduced in 1961. By 2020, we will get to 31 million industrial robots, so from one to 31 million. These industrial robots are masterpieces of engineering. They can do so much more than people do. And yet they remain isolated from people on the assembly line, because they are large and dangerous to be around.
[00:48:09.08] But in fact in nature, organisms in nature are soft and compliant and much more dexterous. And so look at this example of an octopus bending and twisting to escape through a narrow hole. Or look at this gentle elephant that is able to pick up the banana from the child. And yet, the elephant can use the trunk to fend off a competitor.
[00:48:37.81] We can begin to think about new materials and new approaches for making robots in a way that enables them to be safer to be around, in a way that is much more inspired by how nature forms its organisms. And I wanted to show this example of one of my beloved robots called Sophie. And here's Sophie swimming side by side with robots in the natural world. And if I didn't tell you that this was a robot, maybe you could indulge me and say, oh, yeah, it looks like a real fish.
[00:49:13.30] So with the development of soft materials, machines and materials are getting closer together with machines getting softer, more like materials, and materials getting much more intelligent, more like machines. And so this raises an interesting question. What is a robot? What kind of materials do we use in order to make robots?
[00:49:36.58] And so I would like to propose that we should expand our view of what a robot is so that the next 60 years will usher in adaptive soft machines that could work side by side with humans and could give us pervasive support of machines and form diversity. And here's what this future might look like. So imagine waking up enabled by your personal assistant that figures out the optimal time when you wake up and helps you organize the outfit you want to wear and also what you might need for work.
[00:50:15.22] On your way to work, you walk by beautiful stores that display your own image with the latest and greatest fashions. And when you walk inside the store, your body gets scanned, and you get bespoke shoes and bespoke clothing done right away. And materials that have sensors and computation embedded in them and even the ability to reprogram what they look like might even allow you to match your outfit to your friend's outfit you might come across in the store. So here they're matching the outfits.
[00:50:55.04] And at work, intelligent rooms might notice that you're stressing out over a meeting and adjust the temperature accordingly. And intuitive design interfaces might allow you to connect with your colleagues present virtually far away to design, let's say, the next flying car. And this flying car could be connected to the rest of the infrastructure and to your home to become the ultimate assistant in managing your transportation and your chores.
[00:51:35.65] So for instance, in this case, Alice's mother gets the task to pick up some plants. And these plants are for Alice's grandmother who's planting a garden. Your packages could arrive delivered by robots. The garbage bins could take themselves out. The bikes could have adaptive wheels that give you just the right level of support when you want to exercise. And while the robot digs in the garden, you can stop to have a nice conversation with your grandmother.
[00:52:12.28] And when the end of the day is near, after a good day, it is time for a bedtime story, which allows you to enter the story and interact with a dragon. So these advances in the science and engineering of autonomy can create an extraordinary future for all of us, with machines taking on increasingly more difficult tasks, machines allowing us to focus on what we find exciting and interesting and also, frankly, each other. So I'm happy to take questions. Thank you very much.
[00:53:13.75] SUBJECT 1: Oh, thanks. Yeah, so the opera house was amazing, obviously. I was wondering if you could talk about some of the ethical considerations around the future of robotics and AI.
[00:53:25.03] DANIELA RUS: Yeah, it's very important. And it's something that a lot of people in the field have begun to think about. It's kind of interesting to think about where we are with computer science. Physicists had to face the consequences of their science. Chemists had to face the consequences of their science. This is the time for computer science to look at all the good but also to think about consequences. And this is in privacy. This is in security. This is in ensuring that all these tools are used for the greater good.
[00:53:56.66] So I would tell you that I don't have a crisp answer. But I can tell you that Harvard has started an amazing, inspiring ethics program in their classes. Every computer science course at Harvard has an ethics module associated with it.
[00:54:12.34] At MIT, we recently launched the College of Computing, where ethics and applying AI for the benefit of society is very important. Stanford just did the same on Friday.
[00:54:26.65] So there is a lot of desire in the community to get that question right. Now, getting that question right requires academics, requires industry leaders, business leaders, and policymakers to come together. And I will tell you that from my point of view, when I think about deployment of AI technologies in the world, I really believe that it is important to put in place what it takes to ensure the consumer confidence in the use of those systems.
[00:54:58.60] And so what does that mean? Well, you want to ensure fairness. So you can think of fairness as an attribute that cuts across all industries. But maybe each industry will have a different way of measuring fairness with respect to manufacturing or finance or transportation. Or additionally, maybe we want the systems to be able to explain themselves.
[00:55:21.01] So in finance, if I did not get a loan, why not? Or in medicine, if I got a diagnosis, why? Or in transportation, if there was an accident, what caused that accident? So I think that we have to identify a certain set of attributes that are generic, that cut across many industries. And then each industry could take those attributes and work out the checks and balances with respect to those attributes.
[00:55:46.33] And some of the ideas are fairness, explanation, accountability, data provenance, eliminating bias, et cetera. This is all ongoing research. And I'm sure that with so many minds focused on these problems, we will get some good solutions.
[00:56:08.09] SUBJECT 1: Thank you.
[00:56:18.04] SUBJECT 2: I know a barrier to entry, a lot of times when you talk about robotics, is the programming element of it. How far away do you think we are from maybe a vocal programming language so that people can just talk their instructions or train through vocal elements instead of actual hardcore programming?
[00:56:32.68] DANIELA RUS: Yeah, so natural language understanding has advanced significantly. And there are a number of researchers who are advancing the use of natural language to guide robots or to task robots. And some things are easy. So if I want to give my robot very simple tasks, like go to the door, go out the door, we can do this already.
[00:56:56.29] But there are other things that are much more challenging. So for instance, in our autonomous driving project, we really want to get the robots to get better situational awareness, the robot cars that is. And you can imagine the human talking with a car and describing the scene. Oh, we're going past the church, and we're down Madison Avenue. So you can imagine that there is a lot of information associated with speech, with the description of a scene. But right now, we do not have a good connection between the visual stream of the robot and the speech part.
[00:57:35.08] And so thinking about more complex interactions where you actually do want to connect the context into the robots' action is much more difficult. So you can talk with a robot. You're going to have context in your own mind. If the robot needs to look at the world with its sensors to figure out that context, then it gets challenging.
[00:58:01.53] SUBJECT 3: Your work is tremendously integrative. You're cutting across materials science, mechanical engineering, electrical engineering, computer science, all sorts of stuff. Is that unique to your laboratory? Is there a broader shift in how knowledge is being organized at MIT and other universities? And if so, are there any implications for industry and our kind of organization?
[00:58:26.54] DANIELA RUS: The discipline of robotics and of artificial intelligence is in itself interdisciplinary. And in fact at MIT, when we announced the new College of Computing, we made a special effort to say that this new structure will advance core computing. And also, we'll look at how computing-- and by computing, we also mean AI and robotics-- how this connects to other disciplines so that we can support scientific advancement in other disciplines but also so that we can enable new applications.
[00:59:02.19] So at MIT, when we think about AI, robotics, computing, we think usually about two broad categories. There's the core of the field, and there's the bridge to other fields. And I personally believe that, at this point in time, it is so extraordinary to think about bringing different fields together and bringing techniques from different fields that will enable the creation of completely new disciplines of study.
[00:59:30.48] So think about journalism and computation. Or we just launched at MIT a new degree in computational urban science. This is the idea of training students simultaneously in the issues that are related to urban development but also computing. So I think that there is a lot of opportunity for the future to bring knowledge from different things together, invent new things that would not be possible to do by one subgroup or the other.
[01:00:05.81] INTERVIEWER 2: We have one question on the webcast as well.
[01:00:09.55] SUBJECT 4: Thanks a lot for the presentation. My question is more in terms of how we can use the existing solutions that you have built for autonomous driving or the way you have mentioned about how you used the autonomous driving to use and lift or grab objects. How can these existing solutions be used in different disciplines, as you've mentioned? What is the right kind of thought process that one should have in order to understand that is the solution over here, probably we can think about it and put this in a new problem at a different place?
[01:00:37.31] DANIELA RUS: Well, ideally, we would have a compiler that will say, for every problem, this is what the domain specialists do, and this is what the computing specialists do. Right now, we don't have that.
[01:00:46.64] And so we start with connecting and collaborating. And so that's at the level of people. But it's really exciting to think about machines and people as a heterogeneous group where they become, in some sense, super powerful, because people help the machines, and machines help the people.
[01:01:06.73] Now, there is a simpler question under your question, which is can robots share experiences and learn from each other. And through cloud computing, there are a number of research projects around the globe where people are looking exactly at that problem. People are looking at how they can aggregate what different robots in different parts of the world have learned about different tasks and creating, somehow, super activities that aggregate the union of everything that was learned. Again, a very cutting edge research topic. Awesome.