The dangers of AI are very real, from self-generated essays and adversarial attacks on self-driving cars to closed-loop algorithms where companies don’t want you to know how their systems work. Alberto Todeschini from Berkeley shares the truths behind much of today’s AI research.
This is HumAIn, a weekly podcast focused on bridging the gap between humans and machines in this age of acceleration. My name is David, and on this podcast I interview experts in sociology and psychology, artificial intelligence researchers, and leaders at consumer-facing companies to help audiences better understand AI and its many capabilities. If you like the show, remember to subscribe and leave a review.
Welcome back to the HumAIn Podcast. Today I have with us, from Berkeley and the I School in California,
Alberto Todeschini, thanks for being with us today.
Alberto Todeschini
Thanks for having me.
David Yakobovitch
We were just chatting before we started the episode about how both of us travel a lot. I recently got back from Monterrey, Mexico, where there’s a lot of innovation happening in the data science industry, and you’ve had a lot of opportunity to work in data science and AI as well, doing a lot of cool things right now. What’s new in your world?
Alberto Todeschini
I just came back from over a year abroad, mostly in Asia, split between Hong Kong and Singapore, and it’s been very interesting to see different approaches to data science and AI and different products, and then to come back to the outskirts of Silicon Valley. It’s been a fascinating look at cultural differences, at cultural attitudes about privacy and concerns that exist in some locations but not so much in others. It’s been really fascinating and quite enriching.
David Yakobovitch
Now privacy, if we go into that topic, that’s a fun one to talk about, because I know that in the United States privacy is quite different than it is in Europe or in Asia, or at least that’s how it seems. What was your experience of privacy in Hong Kong and Singapore?
Alberto Todeschini
I also happen to be European, so I have one foot in Europe, one in the US, and one in Asia, and it’s just very interesting. It’s a bit like the difference in attitudes toward freedom of speech in Europe and in the US, where freedom of speech is protected a lot more in the US, whereas in Europe, for historical reasons, partly the cultural heritage of World War II and propaganda, we value freedom of speech, I’m tempted to say less, but let’s say differently. It’s something similar with privacy.
So I see a lot of cultural differences, and I see all of us failing to understand that what is fine for us may not be fine for someone else, and contrariwise. I see people judging with, say, average US expectations of privacy, judging how privacy is understood and taken care of in some Asian countries, for instance, but we are using the wrong metrics. We are looking at our own standards and asking whether it jibes with our standards. And then you travel and you talk to people, and you say, well, sure, these AI algorithms learn ever more about us.
You can learn about people’s psychological makeup, what their mood is, and things like that. But people who live, for instance, in Hong Kong or places like that may say, well, we live in a very crowded place and we need to make sure it’s nice and orderly, and it’s perfectly fine if certain data is checked by the government or whatnot, because it’s a net positive. So it’s been enlightening, and I would suggest that anyone who wants to take privacy seriously on a global level needs to talk to people from around the world and actually understand them, understand that their standards may be different, understand that, say, population density, economic growth, and other factors are very important. So read and listen.
David Yakobovitch
As someone who was originally born in Florida and has lived in New York for much of my adult life, I see a certain way of being, a certain attitude that Americans have toward culture and toward driving business decisions. But you mentioned very valid points: things can look so different to Singapore nationals, to Hong Kong nationals, or to people from Shenzhen working in Hong Kong, and vice versa.
There’s so much diversity there, and that diversity is fascinating, because when we talk, technically speaking, about translating results for business, here we are in 2019, where AI has taken the world by storm, especially in business, but everyone’s trying to figure out how to merge it into their business model and products. And one of the challenges of merging AI into business is interpretability, right?
We’re now seeing companies attempting to interpret, to better understand results, especially if they don’t have full-time data scientists on their team. What’s your take on interpretability for companies?
Alberto Todeschini
I’ve just interviewed ten very senior executives for an executive education program we’re building with Berkeley and launching in late June, and interpretability came up multiple times. Technically speaking, it’s just difficult: for some of the algorithms it’s very hard to get them to output anything that we can interpret. It’s especially important in finance and in healthcare, and also with autonomous vehicles. Finance and healthcare are pretty obvious. If, for instance, you work for a hedge fund and you go to investors and say, ‘Hey, I need 400 million US dollars that I want to invest,’ they’ll say, well, that’s great, but how does your algorithm work? And if you can’t interpret and explain that to your investors, you may have a hard time finding someone who signs those checks. Similarly in healthcare, if you’re diagnosing things, but eventually also if you’re choosing treatments, because in healthcare there are a lot of constraints given by costs.
A hospital can only spend so much on its patients, and as we move from simple diagnostics to actually optimizing, say, how a whole hospital is run, including expenses, this is literally a matter of life and death: someone will get more treatment and someone else will get less treatment because of costs.
So naturally, interpretability and explainability in contexts like these are very desirable. The same goes for autonomous driving: there have been a few cases of Teslas slamming into concrete structures or fire trucks, I don’t remember exactly which, but either way it’s a matter of life and death. It would be nice to know what happened. Can you interpret it? Can you explain it? Which feature caused it? In other domains, I hear requests for interpretability rather less.
One of the interesting things, as I said, is that for some of the algorithms it’s just very difficult to get them to output anything that can be reasonably interpreted. And as you mentioned, it’s also a matter of time and budget. Can you interpret it? Sure, give me someone with an appropriate PhD, six months, and a budget, and I’ll get you whatever you want, versus can you just look at it for a couple of hours? So there’s a matter of cost. And another trade-off is that if interpretability really is important to you, and you need that interpretability to be instantaneous or pretty quick, rather than a six-month post-mortem on some neural network, then you may have to trade off some performance.
Your top-performing algorithm may be less interpretable than another one, so you may choose the second one: in terms of whatever metric you’re tracking, accuracy or whatnot, you may choose something that works a little less well but is a lot easier to interpret.
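As a purely illustrative aside, that trade-off is easy to see with scikit-learn on any tabular dataset (the dataset here is just a convenient stand-in, and which model wins on raw accuracy will vary; the point is that only the second model can be explained feature by feature in seconds):

```python
# Sketch: the accuracy-vs-interpretability trade-off on a small public dataset.
# Assumes scikit-learn is installed; the dataset is just a convenient stand-in.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# A strong but relatively opaque model...
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
# ...versus a model whose coefficients can be read directly.
glass_box = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
glass_box.fit(X_train, y_train)

print("gradient boosting accuracy:  ", black_box.score(X_test, y_test))
print("logistic regression accuracy:", glass_box.score(X_test, y_test))

# The interpretable model can be explained feature by feature:
coefs = glass_box.named_steps["logisticregression"].coef_[0]
top = sorted(zip(data.feature_names, coefs), key=lambda t: -abs(t[1]))[:5]
for name, coef in top:
    print(f"{name}: {coef:+.2f}")  # sign and size show how each feature pushes the prediction
```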
David Yakobovitch
You’ve had the opportunity to see a lot of projects, I’m sure, from these different executives and organizations globally, in addition to the students you’ve worked with at the University of California, Berkeley. I’m also an educator; I teach a lot more in the bootcamp space, but we also do capstones, and capstones are quite essential. You mentioned these different industries, like finance, healthcare, and autonomous vehicles. If someone wanted to get involved in data science or AI today and wanted to do a capstone in these industries, would you have any recommendations, like ‘this is a great project to get started’ or ‘here are some best practices to get your foot in the door’?
Alberto Todeschini
For best practices, I’ll assume a certain technical proficiency, which doesn’t have to be extraordinary; not all of us need to develop novel self-driving technology. So assuming there’s some level of proficiency and curiosity, a method that works is similar to what we’re taught with regard to startups.
Basically, the idea is the lean methodology that gets used by all those startups. Here, around the corner from Silicon Valley, it seems like half the people working in cafes are CEOs of startups. So what does that mean? One thing I always tell the students is that you must have domain knowledge in whatever you are working with. For instance, we had a fall detection system being built. What do you do today? There’s a piece of hardware
that you can carry around your neck if you’re elderly, for instance, and if you fall, you press the red button and emergency services are called. Well, it turns out that it’s far from a perfect solution and it often isn’t used. What if you’ve lost consciousness, or you’re in the shower or in the bathroom, where a lot of accidents happen?
So think about other solutions: how about we just put a microphone and an accelerometer on a table? It looks like one of those Google or Amazon Alexa devices, and with the accelerometer plus the sound we can train an algorithm to figure out falls. But clearly you must understand the problem very well if you want to build something like this.
So what we did with the students was talk to gerontologists. It turns out there are whole conferences of people who have spent their entire careers figuring out falls. Why? Because falls are the number one cause of injury for the elderly. So this is what I tell them: you need to get out of your basement or wherever you are, you need to talk to domain specialists, because there are no shortcuts. At one point we worked on a project where marine biologists identify certain malign seaweeds, and again, there’s no shortcut: my students had very minimal knowledge of these toxic seaweeds, so they worked with Dr. Song, a marine biologist who has been studying them for 30 years.
So absolutely, you need to understand your users, you need to understand your stakeholders, and that entails a lot of long meetings. Some of us just want to write nice code, but you have to get out of the building, at least figuratively, and talk to experts, otherwise it’s difficult to solve real problems. Experts and users may or may not be the same people. So a general rule for building AI products is: do people actually have the problem you’re trying to solve, yes or no? Who are the experts in this area? What can you learn from them? And because AI feels like magic in 2019, all these people are very happy and delighted to work with us. We’ll see when the magic wears off, but for now, everyone wants to work with us.
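As a rough illustration of the fall-detection idea described above (an accelerometer plus a microphone feeding a classifier), a first sketch might look like the following; the features, data, and model are invented for illustration and are not the actual Berkeley capstone:

```python
# Sketch of a fall detector that fuses accelerometer and microphone features.
# The features and training data here are made up for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def extract_features(accel_window: np.ndarray, audio_window: np.ndarray) -> np.ndarray:
    """Turn raw sensor windows into a small feature vector."""
    accel_mag = np.linalg.norm(accel_window, axis=1)   # |acceleration| per sample
    return np.array([
        accel_mag.max(),             # impact spike
        accel_mag.std(),             # how erratic the motion was
        np.mean(audio_window ** 2),  # loudness proxy (a thud is loud)
        np.abs(audio_window).max(),  # peak audio amplitude
    ])

# Pretend we already have labeled sensor windows: 1 = fall, 0 = everyday activity.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 200)
X = np.vstack([
    extract_features(
        rng.normal(0, 3 if label else 1, (50, 3)),    # falls are "noisier" in this toy data
        rng.normal(0, 0.5 if label else 0.1, 800),
    )
    for label in labels
])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```

The harder part, as the conversation makes clear, is not this classifier but knowing from domain experts which falls matter, what the sensors actually capture, and what the acceptable false-alarm rate is.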
David Yakobovitch
This is so fascinating about interpretability, right? You’re talking about the business case, and if students are working on projects for case competitions at business school or hackathons in industry, often you’re getting out there, talking to insiders and to stakeholders to understand more about the projects and what solutions are possible. But what’s super interesting is that not all problems can be solved in real time, and one big problem we can revisit from your previous remarks, with Tesla, and not necessarily isolated to Tesla but to self-driving technology more broadly, is the challenge of cars running into roadblocks on highways, into fire trucks, and even into stickers on stop signs.
There was actually a hack earlier in 2019 where some researchers, just to test it, put some QR-code stickers on certain yield signs, and the Tesla vehicle got confused and just stopped, or didn’t move, or something of that nature. These are beginning to become more common, and I know you’ve done some work in this space; you’ve talked about it with some companies and in your lectures. A lot of these attacks are going to become more common, and I’d love to learn more about them and have our audience hear about them too.
Alberto Todeschini
It’s a really fascinating topic. It’s very beautiful: the math is beautiful, the techniques are beautiful, and I’m absolutely certain it’s going to bite us, I have absolutely no doubt whatsoever. So these adversarial attacks, adversarial machine learning, there are several examples. We’ve been doing it for a few decades, it’s nothing new, except that it has really exploded recently.
There are three full streams of publications: one from a few people out of Berkeley, papers out of Google with Ian Goodfellow, who is a very well-known AI researcher, and a few other people out of MIT, and obviously other places too, but those are the three most important sources of the recent techniques. So how does it work? Basically, let’s say I built an app that recognizes fruit, so I can go out and find wild blueberries and wild raspberries and whatnot, and it tells me whether the fruit I’m taking a photo of is edible or not; we don’t want to eat the wrong berries out in the forest.
As simple as that. So we can figure out weaknesses in these classification systems and we can attack them, and there are different types of attacks. One is untargeted, meaning I just want the classifier to misfire, I don’t care how: if it sees wild blueberries, it may return anything, it could be bananas, it could be rocks, whatever, depending on how we are training it, but in any case I just want it to misclassify. That’s the untargeted kind. Then there are targeted attacks, which are potentially more vicious. You mentioned one: it’s as simple as a stop sign being read specifically as, say, a speed limit sign or a bumps-ahead sign or whatever.
That’s targeted: I want a right turn being read as a left turn. It’s a lot scarier than an untargeted attack. One of the problems here is that, as you said, there have been attacks that are single-pixel attacks. Literally: imagine you have these cameras on your car, on your drone, on your phone; changing just one pixel is enough to make a classifier misbehave. Then there are the ones you mentioned. How do they work? I put a sticker on a stop sign, and the sticker looks innocent enough; it could be the name of a hip hop band or something like that.
Humans are not going to think that the sticker is trying to make the car slam into something; it just looks like a piece of advertising or something like that. But it’s crafted, it’s optimized to push the data across the decision boundary of the classifier, so that you go from one class to another class. We won’t get too technical, but that’s the point. Then there’s another type of attack I find very interesting: a kind of parasitic computing, where you figure out how to use someone else’s compute resources. That was also very fascinating.
Say a company has a service where I upload pictures and it tells me what’s in the pictures, something like that, and there are plenty of services doing this. With these parasitic attacks, I can effectively reprogram that image classification system to do some computation for me. This is probably less practical, but it’s still very interesting.
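For readers who want to see the mechanics of the perturbation attacks described above, here is a minimal sketch of the fast gradient sign method in PyTorch; the model, image, and label are placeholders for illustration, not any production system:

```python
# Sketch of an untargeted fast-gradient-sign attack (Goodfellow et al., 2014).
# The classifier, image, and label are stand-ins for illustration.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()   # stand-in image classifier
image = torch.rand(1, 3, 224, 224)                # pretend this is a fruit photo
label = torch.tensor([948])                       # e.g. ImageNet class "Granny Smith"

epsilon = 0.03                                    # max per-pixel change (barely visible)

image.requires_grad_(True)
loss = F.cross_entropy(model(image), label)
loss.backward()

# Untargeted: step *up* the loss gradient so the true class becomes less likely.
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(image).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial).argmax(dim=1).item())

# A targeted variant would instead step *down* the gradient of the loss
# computed against the attacker's desired class (e.g. "speed limit sign").
```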
Now, it all sounds very exotic, it all sounds far-fetched, and who cares if, instead of blueberries, it returns raspberries. But the most worrying thing, one of the most worrying things, is that we are moving into a world with hundreds of millions, billions of gadgets that do machine learning, that do this classification or whatever, right on the gadget, and security practices for gadgets, once they’re sold, tend to be very shoddy. Maybe the company went bust, but the gadget is still working.
So you end up possibly with tens of millions of identical gadgets with machine learning capabilities, in houses, in hospitals, on army bases, wherever, and I can buy that stuff and have years to figure out how to hack it; I’m not in a rush. And 99% of these companies are not going to be as sophisticated as Tesla or Google and Android. So there you go: suddenly you have this huge number of pieces of cheap technology out there that attackers have years to learn how to exploit, and these techniques are evolving very rapidly.
I’m pretty sure there’s a mismatch between the so-called white hats and black hats, and the black hats are probably, or at least could be, further ahead; they could be working on this as we speak, certainly, I have no doubt. So we just need to pay attention. It’s absolutely obvious this will be a problem, maybe a big problem, and we may eventually, perhaps soon, find that it costs lives. What do I mean? This is not just about trying to scam a bank out of money and things like that. It could be one of those nasty pranks or whatever, but it could very conceivably result in death.
David Yakobovitch
Do you think there’s a way we can protect those gadgets, then? The reason I ask is that, to me, the most likely answer is maybe a new startup in the future, the Symantec of the future, with its own API or SDK running behind the scenes, constantly monitoring. I wonder if that’s the reality we’re going to start living with as AI moves onto the edge, onto all devices, in this IoT space where every device has a sensor and soon we’re going to have thousands of sensors in every home.
Alberto Todeschini
Yes, and it’s not just because of these adversarial attacks; it’s also because we’re very quickly moving into a world where we’re not going to be able to tell truth from fiction, as simple as that. There’s a family of techniques that go under the name generative adversarial networks, dating from about 2014. They were very interesting from the beginning, but the performance was kind of a party trick at first, and then they started working, and then they started working better and better and better, and now they work awfully well for certain applications, like the generation of, say, new faces.
This is problematic because we can use them to copy and paste someone’s appearance onto someone else’s body in a video. There was a paper from Berkeley, from 2018, or maybe 2017, called ‘Everybody Dance Now,’ for instance, where you can literally copy and paste my appearance onto someone else. It makes it look like I performed those moves. When they built it, they were training on dance videos, hip hop and things like that, which is harmless enough.
But you could take a politician or the CEO of some company and put out videos that are much more compromising than just showing your average president dancing hip hop. And there was another study, I don’t remember where it was published, where they were showing these generated images, and even though the images were not perfect, the reactions from the people they were shown to were telling.
The reaction wasn’t ‘that’s fake’; it was ‘that’s strange’ or ‘that’s funny.’ The images were already good enough that most people would think, well, they must be real, just a bit odd. The same techniques can also be used to clone people’s voices, so that I’m phoning you and I sound like your mother, or vice versa, and I’m asking for bank details, or I sound like your kids, your significant other, your brother, your sister, and you’re not going to be able to tell, without extra technology on the phone, whether it’s me or a hacker speaking. So you’re absolutely right, there will be startups in this space. We had a student project to detect some of these images, these deepfakes, where the product was doing exactly that, and even over a single semester the field was moving so quickly that it was very difficult to keep up, because every few weeks there was a new technique.
So is the defense going to live on the server, such that Google never serves these fake images? Is it going to be on your phone? Is it going to be on your laptop? We’ll have to see. It will need to be updated continuously, because this stuff is just moving so fast. It’s cat and mouse, but they’re both moving at light speed.
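For the curious, the core generator-versus-discriminator loop behind GANs can be sketched in a few dozen lines of PyTorch; the toy networks and random "real data" below are purely illustrative and nothing like the models used for face or video synthesis:

```python
# Minimal GAN sketch: a generator learns to fool a discriminator.
# Toy-sized networks and placeholder "real data" for illustration only.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),  # logit: real vs. generated
)

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_data = torch.randn(1000, data_dim)  # stand-in for a dataset of real images/faces

for step in range(2000):
    real = real_data[torch.randint(0, 1000, (32,))]
    fake = generator(torch.randn(32, latent_dim))

    # Discriminator step: label real samples 1 and generated samples 0.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator call fakes "real".
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```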
David Yakobovitch
We have other examples of this going back to the early 2000s. There was JibJab, and we would send greeting cards, and it was very funny, because you knew it wasn’t really you, just you in a cartoon, and even Disney had done that in their parks. But now a great case of a deepfake, from October 2018: the actor Jordan Peele was in a BuzzFeed video basically spoofing Obama, the former president of the United States. They took the audio and the lips and the voice and put words in his mouth that Obama would never say, and it was quite comical, but it was also quite scary, because you can imagine, if that were done with a current sitting president or leader, what issues it could cause. There needs to be a way to authenticate and secure all these protocols. That’s going to be a big challenge, and I almost wonder if GANs should never have been released to the world.
Alberto Todeschini
Should we even be building them? We’re building them regardless. The argument comes up often, even from OpenAI, that large and well-funded nonprofit with funds from Elon Musk and Sam Altman. A few months ago they built this natural language processing model with which you can create essentially infinite new text, and it sounds pretty good, actually good enough to pass for text written by humans. And famously, OpenAI didn’t release the full model; they released a smaller model that didn’t work as well, but not the full one.
There was a concern: what are we doing now that we can cheaply create infinite text? We can spam Twitter, which is half bots anyway, we can spam Facebook and everything else with infinite new text, meaning you don’t need to hire 200 people in some location to write up fake news; you can generate infinite new text and confuse people about everything. So again, we are just not going to be able to trust our unaided human senses with anything that comes to us digitally: not the audio, not the video, not the stills, not the movies, and not the text either.
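As a concrete aside, the smaller, publicly released GPT-2 model can be sampled in a few lines with the Hugging Face transformers library; a sketch (my own illustration; the prompt and decoding settings are arbitrary):

```python
# Sketch: sampling text from the publicly released small GPT-2 model.
# Requires the `transformers` library; the prompt is arbitrary.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Scientists announced today that"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_length=80,
    do_sample=True,                     # sample instead of greedy decoding
    top_p=0.9,                          # nucleus sampling: varied but coherent text
    temperature=0.8,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)

for i, seq in enumerate(outputs):
    print(f"--- sample {i} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```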
What are we going to do? If you spot this after the fact, how useful is it? Say a video goes viral with important ramifications in politics or for the finances of a country or whatnot, politics especially. The video is out, people see it, the impression is made, and then someone at some obscure, or semi-obscure, or famous university says, by the way, we discovered that it’s fake. The damage is done. So what are we going to do? I have no answer here. It’s a thought, something I want our audience to think about: what are we going to do?
David Yakobovitch
As I’m pondering this right now, some sort of live authentication technique would be needed, almost like the multi-factor authentication you have when signing into cloud environments. It would be that you start a Skype session or a FaceTime session, then you instantly verify your fingerprint or something live, and then you both receive something like a trusted badge, so you know this is a non-deepfake call, a non-deepfake video, and you’re in a secure environment.
Alberto Todeschini
We’ll see what the mechanism is going to look like, but for now some of these techniques are very easy for a computer to spot with statistical analysis; in many cases the computer will immediately tell you, yes, it’s a fake. Not so for humans: humans will already be confused or thoroughly and completely sold on the truth of something, whereas for many of these attacks, the statistics of the fakes are such that a computer can spot them easily. But that’s not going to last very long. This space is moving so quickly that it’s difficult to keep track of what’s happening, and that’s just the stuff
that’s happening out in the open, in very well-funded private AI labs. But there are also well-funded actors in parts of the world that have already allegedly engaged in trying to modify the outcomes of elections, in the United States and also with Brexit, and there was news this week that the same was happening with elections in South Africa. These people have the hackers, they have the budgets, they have the determination. So again, what are we going to do?
David Yakobovitch
Well, the best we can do is to understand and learn what’s out there, and the more we’re educated, the better we can hopefully recognize, or help others recognize, what’s going on with these different sources of data. And one underlying theme you’ve been sharing, Alberto, is that with these generative adversarial networks, the data is not just video, not just photos, not just audio. It’s all of that and more; in fact, it’s multimodal, right? That’s a term you’ve also done research with. Multimodality is actually quite cool when it’s used for good, so what are some of the things you’ve discovered with it?
Alberto Todeschini
This is one of the most interesting things. Maybe it’s easier for insiders to understand why I’m so excited about it, so let’s make it clear so everyone can appreciate it. Almost all of the machine learning and AI systems we’ve built over the last decades work on a single modality, a single type of data. For instance, we’ve been doing so-called sentiment analysis for decades: there’s a piece of news, there’s customer feedback or whatnot, and my algorithm tells me whether this person was happy or not.
That’s the most basic way to do it. Now we can do more sophisticated things, we can do it along different aspects. For instance, you own a chain of hotels, or perhaps an airline, and from written feedback, from text, you want to know how people feel about the catering on the flight, the cleanliness, promptness, helpfulness. So there are different aspects, but this is still only one modality; it’s all from text. But there was a paper, I think from Korea, last year, where they were using people’s voices as well: the actual sound of the voice as well as the content, what was said, and together they could do the sentiment analysis much better than with either one individually.
Think about it: I could type something and it sounds like I’m kind of irritated; however, maybe it’s a joke, maybe it’s irony, which in my case often falls flat, maybe it’s dry humor, and who knows how many misunderstood emails and texts are out there. However, if you’re also hearing me, and even more so if you’re seeing me, you may be able to tell that I’m smiling, that it’s a joke. So multimodality is increasingly allowing us to feed different types of data into a single system, so that the whole overall system works better. I can feed in images, so you look at the shape of my eyes.
You look at whether I’m smiling or not, and at the sound: sometimes you can hear a voice that sounds a little angry, a little anxious, or happy and relaxed, as well as the actual content, the transcript of what I’m saying. So this is very cool. Another example is in medical imaging. Think about the different images you see in a hospital.
There’s the X-ray, there’s the MRI, there’s the Doppler machine, and others. They’re completely different, and each one requires specialization to work with properly, but it’s also very difficult for a computer to figure out how to reconstruct some part of the shape or functionality of the brain from completely different images, even though it’s exactly the same brain. With multimodality, we’re getting better and better
at extracting complementary information from these different types of images, such that the final result is better quality than was possible previously. For working data scientists especially, it’s going to be a lot easier to feed different modalities of data into our algorithms, and users will just get richer products and richer experiences.
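As an illustration of the fusion idea described above, here is a toy PyTorch model that concatenates a text representation with audio features before a shared sentiment classifier; all the dimensions, inputs, and class labels are invented placeholders:

```python
# Sketch of multimodal sentiment analysis via late fusion:
# a text encoder and an audio encoder feed one shared classifier head.
# All sizes and inputs are illustrative placeholders.
import torch
import torch.nn as nn

class MultimodalSentiment(nn.Module):
    def __init__(self, vocab_size=10_000, audio_dim=40, n_classes=3):
        super().__init__()
        self.text_encoder = nn.Sequential(
            nn.EmbeddingBag(vocab_size, 128),   # bag-of-words-style text embedding
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.audio_encoder = nn.Sequential(     # e.g. 40 acoustic features per clip
            nn.Linear(audio_dim, 64), nn.ReLU(),
        )
        self.classifier = nn.Linear(64 + 64, n_classes)  # fused representation

    def forward(self, token_ids, audio_features):
        fused = torch.cat(
            [self.text_encoder(token_ids), self.audio_encoder(audio_features)],
            dim=1,
        )
        return self.classifier(fused)  # logits: e.g. negative / neutral / positive

model = MultimodalSentiment()
tokens = torch.randint(0, 10_000, (8, 20))   # batch of 8 "utterances", 20 tokens each
audio = torch.randn(8, 40)                   # matching acoustic features per utterance
print(model(tokens, audio).shape)            # torch.Size([8, 3])
```

The design choice here is the simplest possible one, concatenating per-modality representations before a shared head; the research Alberto alludes to goes much further, but the payoff is the same: the fused model can resolve ambiguities that text alone cannot.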
David Yakobovitch
I’ve noticed that even with our own families, if we’re talking about healthcare, we’ve all been around hospitals and different offices, and just in the past couple of years I’ve noticed, at the ones I’ve visited, that there’s a lot of automation technology, but there’s also now information such that the doctor is able to see what happened in my last visit, or certain analytics showing that my blood pressure is 20% higher or lower than it was before, handling these calculations for the physicians without them having to do it manually. And whether it’s one piece of data or multiple pieces of data, this multimodality you’re explaining sounds like it’s creating augmented human intelligence. It’s creating an experience where data science and machine-driven learning can augment the user experience and ultimately, hopefully, create better solutions for society.
Alberto Todeschini
Certainly, it’s an obvious and unavoidable step that these systems will augment human capabilities. It’s a bit like building an exoskeleton, this kind of mechanical skeleton around you that gives you extra strength, such that you can climb high mountains or carry wounded soldiers. Similarly, we’ll have our cognitive abilities augmented.
I don’t think we can foresee for which kinds of tasks the augmented human, human plus algorithm, will be better, and for how long. For instance, if I play chess, to begin with I become a better chess player because I’m learning with a computer, but what if the computer alone becomes better than the human plus the computer, and we’re just slowing the computer down? Similarly for AI in corporations and so forth: we talk about augmentation in the workforce a lot, and that has certainly happened.
It will continue to happen, but eventually it may just be that humans are kind of a drag. It sounds awful, but we’ll see where that goes. One of the most fascinating things in this area, and something I’m asked about pretty regularly, is creativity. With these algorithms we can increase creativity; for instance, you can get better brushes, so to speak.
Some of my students built Deep Lyrics, a project that wrote song lyrics, basically, and it was already quite good. Sometimes it misfired spectacularly and hilariously, but some of the lyrics were actually pretty good, certainly better than I would write them. So you can seed the generation of lyrics with the theme you want, then change them and polish them. That’s augmented creativity. It’s also interesting that there have been papers on so-called creative adversarial networks and others where the final output, shown to humans, appeared as creative as things created by humans. And that’s the final proof of whether it’s creativity: does it look creative to humans?
I find it very interesting, because creativity seems like one of the most intimately human qualities and one of the things a computer should not be able to do. Then, hang on a second, a computer can be genuinely artistic and creative and paint and write lyrics. And when I read the papers, I think, well, that was a clever math trick. But does it really matter, if the human finds the painting nice, or the lyrics nice, or the music nice and creative and moving? You get that experience when you see a good painting or hear a good song; it doesn’t matter where it comes from.
So we’ll see if creativity is the last thing to go, or to be augmented, and we’ll see how long after that, whether it’s version 1.0, 1.1, or 2.0, the computer will be more creative than us.
David Yakobovitch
Well, only time will tell, and with this whole topic we’ve covered on today’s show around adversarial networks, it seems there can be a lot of positives and a lot to be aware of, and hopefully we’ll strike the right balance.
There will be a combination of startups and regulations and new inventions coming out, hopefully all for the better, hopefully with more laughs than negative emotions, and this is a super exciting time to be alive and to actually see the deployment of the AI that’s been in training for so many years now. So thanks so much for sharing all your insights.
Alberto, I really appreciate you being here on the HumAIn show.
Alberto Todeschini
Thank you so much.
David Yakobovitch
That’s it for this episode of HumAIn. I’m David Yakobovitch, and if you enjoyed the show, don’t forget to click subscribe on Apple Podcasts or wherever you’re listening to this. Thanks so much for listening, and I’ll talk to you in the next one.