David Yakobovitch

Welcome to our newest season of HumAIn podcast in 2021. HumAIn is your first look at the startups and industry titans that are leading and disrupting ML and AI data science, developer tools, and technical education. I am your host, David Yakobovitch, and this is HumAIn. If you like this episode, remember to subscribe and leave a review. Now on to our show.

David Yakobovitch

Welcome, listeners, back to the HumAIn podcast. Today, we’re talking about language, the language that we speak and you listen to here on the show. You hear from very popular apps, like Clubhouse, that are going audio only. So, today we actually have a very exciting founder, Vasco Pedro¹, who’s the CEO and co-founder of Unbabel², an exciting startup split between San Francisco and Lisbon, Portugal. They’re helping with the future of translation for language. Vasco, thanks so much for joining us on HumAIn.

Vasco Pedro

It is a pleasure to be here. Thank you for having me.

David Yakobovitch

2020 and 2021 have been hyper growth years for your startup because, as we know, as we moved into a digital-only, and digital-first world, language has become even more important to manage online. Can you start sharing with our listeners here a little bit about your company, the history, and why is now such an important time for language?

Vasco Pedro

Certainly. So, translating a language has been a task that humans have been doing since the beginning, really. And it’s largely an unsolved task. Unbabel really comes to life, we started in 2013. We hit it at the right moment where artificial intelligence was starting to be useful for human translators, where if you tried to do Unbabel three or four years before, and you took the output of a machine translation and gave it to a human translator, the first thing that they did was erase all of it. And we kind of came to this problem at a point where these two worlds now had enough overlap that you could really leverage artificial intelligence to start enhancing productivity of humans. And so that’s how we started thinking about it, as well.

Until now, translation has always been a service, a human service. How is AI going to transform this and really take it to the next level, both in terms of speed, scalability and even quality and reliability? And so, when we started, the first thing we did was saying, we need to create a new version of the translation service that blends artificial intelligence and humans in a number of different varieties to provide just this very simple, straightforward API for translation. That was the original idea.

And my background and my co-founder’s are both in NLP. So, I did a PhD at Carnegie Mellon in Natural Language Processing and João, my co-founder and CTO, did his PhD in machine translation at UPenn. And so we had been thinking about this for a few years, on problems related to language. And when we started this, it made sense. Machine translation is not there. It’s not going to be there really for the next few years, at least, and it needs a very heavy amount of human intervention to be usable.

And especially, we started thinking at the time that rather than going on a consumer side, like Google Translate and other machine translation engines, we wanted to enable enterprises, because we felt that there was a very big need for an enterprise grade solution that would enable scalability of different language processes. And our perspective evolved since then. We started with this idea of replacing the typical translation service with something better.

But since then, we understood that, actually, translation is part of a bigger problem within companies. Translation is a solution for a problem. But the problem really comes from a basic fact, which is, every human pretty much speaks one language, some humans speak two, very few speak more than two. And this idea that we’re all going to speak the same language is proven false over and over again. And so, if that is the case, then we have this problem, which is, you will always have humans speaking different languages.

And as the world becomes more digital, physical barriers become less of an issue. Then, companies are pressured earlier to be able to serve multiple markets. And as you expand to multiple markets, you face the fact that people in that market will speak a different language and you need to be able to serve them.

And there’s this weird phenomenon going on. Not weird. It’s just a phenomenon going on. In my perspective, it potentially aligns along the same reasons why we’re seeing a lot of polarization on opinions, which is, in language, when it comes to the internet, not so long ago, the vast majority of content was in English. And so everybody was sharing the same content. You’d go out and you had the same sources of truth. You’d have the same sources of information.

But, actually, you look online right now and English accounts for about 20% of content. And it’s actually decreasing and is expected to bottom out at 10% because the vast majority of the content that is created now is through social networks, and people do it in their native language. And so you look at different countries and what you see is users and consumers are really used to consuming information in their own language, whether it’s news, whether it is what’s going on.

And with that comes the expectation that they should be able to consume services and goods in their native language from beginning to end. And so on one hand, it’s easier to access the global market. On the other hand, there is an expectation of a hyper localization and personalization of service according to that market.

This creates an even bigger pressure for companies to be able to function in multiple languages. And as we’re thinking about this, we’re realizing this happens across all of the use cases inside a company.

The typical use case of translation was in localization, and localization is typically applied more to marketing and product. The problems of how do I get people to know about me and how do I get people to use my product, but actually the same problems extend throughout all of the organization.

How do I support my customers once I have them? How do I sell my product? How do I actually communicate internally as a team and really leverage talent globally? And the more we thought about that, the more we realized that it’s actually more than translation. It’s really about language operations.

It’s how do I enable, how do I use LangOps as a way to empower different aspects of the business to function at scale across languages. And translation is a big part of that. And that’s where we started. But you start seeing that there are a lot of other aspects that play a role in enabling language operations to function and affect all areas of the business.

And so, after all of this evolution, our goal is to build the language operations platform that enables every enterprise to seamlessly scale across languages. And a big part of that is the full stack that we’ve built on translation and different components of AI, quality estimation or anonymization, or the actual interfaces for humans to translate and all the different components. But the outcome of it is how do businesses leverage that to, then, scale seamlessly across languages?

David Yakobovitch

Your story, Vasco, really resonates with me because prior to the founding of Unbabel, I was working at a company in Gainesville, Florida, called University Transcription Services, where I was one of those transcriptionists that was dictating audio from lawyers and doctors. And that was helping me pay my way through college. So I got to see firsthand how, prior to machine translation, that required very fast keyboard shortcuts and macros and dictation and adjusting in a manual process, and how that’s evolved.

As you mentioned today, there’s a myriad of services that are helping out. In fact, for the HumAIn podcast, we’re working on expanding into the Spanish vertical with Spanish channels. And so that’s a very interesting project we’ve been going through discovering how much can we automate, how much still requires the human element to augment the translation. So it’s very fascinating.

My experience, and hearing from you here on the show brings a big question about the future of jobs, the threat of AI, what’s happening as these jobs that I did a decade ago in college, no longer exist. People are not manually doing the full translation. Of course you have behemoth companies like TransPerfect that are very focused in the enterprise space, but they’re being disrupted by technologies like the ones that Unbabel is building. And so, where do you see that going with new opportunities being created for our future workforce?

Vasco Pedro

That’s a great question. So our thesis is really that AI will have the biggest impact in areas that are highly commoditized and require a lot of human effort. Those are the low-hanging fruit. And certainly, translation and transcription were among them. And that’s why we’re seeing such impact. Because it is commoditized in the sense that a lot of humans can acquire the knowledge and the skillset to do translation and to do transcription.

And there is a fairly straightforward, and there’s a part that was particularly hard. And translation is a good example, because it expands and applies to other areas. In general, what we’re seeing is that, overall, AI is not replacing humans, it is augmenting humans. And it’s enabling humans to be more productive as a tool, so far. Until someone comes up with fully AI, which I don’t think we’re anywhere close.

Different AIs will be tools to empower and increase the productivity of humans. And one example is Unbabel. So a translator outside of Unbabel typically would do about 300 to 400 words an hour, the old traditional way. In Unbabel, now he does between 2,500 and 3,000 words an hour. You can look at this from a value-creation proposition, which is, maybe before, if you were doing transcription, you’d have to do all of these manual things.

And now you have a lot of different AI tools that maybe give you the first pass. Maybe give you something that is very close to finish, but your job becomes more of an editor of that, rather than the initial creator. And if you look at it from that perspective, it means that you can be responsible for a much higher amount of transcribed words which will still need some sort of human verification and correction and addition and tweaks, et cetera.

But you can now say, before, I would be able to produce X. Now I’m able to produce 5X, 10X, 20X. And as that grows, you can leverage some of that to transform the value that you’re creating into a better paying job, and also into the welfare it’s creating for the actual human component.

So, you will need a smaller amount of human effort per unit, but that human effort overall would be more valuable, because it translates into a higher value. And we’re seeing that in Unbabel, as well. So this interplay between humans and AI has already had a lot of impact and will continue to do so, but it’s also not a zero, it’s not a binary thing. It’s not, I used to do this thing and now it’s done in a different way. It’s a gradual process that also enables humans to adapt to it, and to really leverage it, to then generate more value. And to show what I mean by this, I’m going to give an example.

So, if you take translation, translating something like a chat or an email or a document, a legal document, or an ad, they’re very different things. And they require different amounts of human effort. So you could say that for chat, for example, if you look inside of Unbabel, when our customers use us to enable chat communication, the process that we have is that humans are still involved, but they’re involved pretty much in terms of continuously training and retraining models.

So they’re generating the high quality data that models need to continuously learn, to then be able to do it in production. And it’s not perfect. But it’s enough to enable that communication. And so that’s one extreme of the spectrum. But if you look at emails, just the fact that you now have multiple sentences, while the average translation is done on a sentence-by-sentence level, means that as you accumulate, the probability of errors increases. Especially inconsistencies across different sentences, and a few other specific types of errors. That in itself means that 70% of the emails that we translate require human correction afterwards.

If you look at something like a legal document, a hundred percent of the documents require human correction. And not only that, but you have much more of the traditional translation review. So, a lot of times you have two humans looking at that. If you look at something like an ad, something that has a very strong creative component, you’re now talking about transcreation. And that realm, for the most part, is still very much human. So you have this spectrum where AI is helping from the low-hanging fruit, but we’re still far off from being able to really enable that across the entire set of channels.

And so, you have this varying amount of human effort required on different types. And this is the same that is happening across really all AI components that I see, even things like autonomous driving. You now are able to deal with a lot of the base cases on a highway, a Tesla can drive on a highway, but as soon as you get to a different situation, you might need a lot of human oversight, because AI won’t be able to cope with it. And this blend is really the key. I don’t see, unless you’re talking about very basic repetitive tasks, I see the real value is in this interaction of being able to give the boring task to AI and to let the human do the higher cognitive load function type of tasks.

Obviously, this is a complicated topic, because even though I don’t know pretty much anyone that likes to be doing rote tasks, as a human, you talk about a cashier in a supermarket, it’s no one’s dream job. But if you just now said, this is all replaced by AI, this would create a lot of pressure and stress in society. So, we know that the jobs that AIs are not so much replacing, but enabling humans to go up the cognitive ladder, are typically jobs humans don’t want to do anyway, but they’re required in terms of the economy. And so, that’s the blend that AI plus humans can really help.

David Yakobovitch

And it’s so fascinating to think about this blend because we’ve seen the last few years becoming conversational first societies. Content is no longer being streamed and consumed only as video, but we’ve gone audio first.

And this is a global phenomenon. We’ve seen it in Korea first hand with Spoon Radio. We’ve seen in the United States with Clubhouse and its expansion. How now there’s a lot of offline and online audio, both async and sync available that includes access to global audiences. And to make that available, there will be different technologies to help improve that customer experience. And it sounds like the work that your company Unbabel does is assisting all elements of those audios.

Vasco Pedro

Yes. So, when I was talking about language operations and the different use cases within the enterprise, we started by focusing on customer service and the drive behind that was a number of things.

One, the conversational interaction is particularly suited for enabling AI to have a large impact. And so, we wanted to start with an area that the stuff we’re doing would have the biggest impact. But also, behind that, there’s this sense of almost the inequality of customer service, depending on language.

If you happen to speak English, typically, you have a much better customer service experience and a much better customer experience in general than if you don’t. And something as simple as if I call British Airways on the English line, it’s open 24/7. If I try to call them on the Portuguese line, it’s open 9 to 5, five days a week.

And that’s a small difference, but it means that you don’t actually have access to the same level of customer service just because of your language. And so, this idea that how do we enable a seamless and consistent excellent customer experience throughout all the languages is something that can have a lot of impact.

And what we’re doing is empowering agents to be language agnostic. So until now, the way that you dealt with customer service in multiple languages was fundamentally by hiring people that speak that language. And this is really hard to do at scale because it’s very hard to maximize the usage and to be able to suddenly start having a bunch of different teams. And you have to take into account each team’s resources compared to the needs that you have, which is a logistical nightmare.

And a lot of times there’s a lot of turnover. And there’s a lot of issues with it. And what we did was we said, let’s actually enable this through direct integrations with CRMs. So Zendesk and Salesforce and Dynamics and other CRMs that companies use, and focus for now on the non-voice. So we’re still focused on text, chat and email, but in a way that I, as a customer service agent, don’t have to really care about the language you’re talking.

You, as an agent, focus on being an amazing customer service agent and really understanding your product and providing that level of customer service. And we act, we sit in between to make sure that that communication happens at a high quality human level on both ways, both from the customer to the customer service agent and vice versa.

This is interesting because in a way this is a small change, but it creates a lot of very interesting impact in companies in the way they actually tackle customer service. Until we came along, it was very expensive to do multilingual customer support, which is why only large corporations really did it. But there is no reason for any company to not do it now. Because you can have a team anywhere and you plug in Unbabel and your team can handle any language seamlessly. And so, that changes also the way that companies think about structuring the customer service operations. It enables them to do 24/7 support in all languages, much more easily.

And then one thing that we saw, which was very interesting is, it also enables them to maximize the effectiveness of their workforce, which typically leads to a faster time to first response, higher completion rate, which leads to higher customer satisfaction. So what we’re saying here is, Hey, actually doing multilingual customer support is much more cost-effective and it creates more customer satisfaction and a much stronger relationship with your customer because you’re doing it in their native language. So there’s all of these benefits. And once you compare the core KPIs, it becomes a no-brainer, which is particularly exciting for us.

David Yakobovitch

It sounds more likely to meet the SLA, have a lower total cost of ownership, help all the enterprises, and have success there. And when we’re thinking about everything on customer experience, with these tickets, as we went into a digital-first world, everything was digital support.

I recall many companies that said we’re no longer doing phone support because all of our staff have gone into a distributed model and that’s provided on the need for agility with customer service. As you mentioned with the different languages, thinking multi-language support. I remember the days when I would travel, and as I was seeking customer experience firsthand, it would be very challenging to communicate in another language. And recently, I spent time in Taipei and was able to use one of these apps.

The one you mentioned earlier, Google Translate, and actually conduct a conversation, very surprisingly, and move through that conversation in the voice medium. But that’s not only voice. It’s non-voice, using other apps to translate images from language. There’s so much opportunity here. And it sounds like at Unbabel you’re working on a lot of these problems. You have different products, you have your portal, you have Comet, you have Maya. Can you share with us a little bit about your products that are helping to improve this multi language or LangOps support for both voice and non voice?

Vasco Pedro

Certainly. And the three things you mentioned are part of our solution. So, Unbabel is a platform and solution for language operations that relies on multiple things. So the portal is really the product that LangOps teams use to implement, manage and scale the translation layer. This is powered by the underlying platform, which is the actual bit that does the translation, where you would set up pipelines. And that’s where a lot of the AI and human work combine to provide fast, scalable, robust and high quality translations.

Comet is actually something that we’ve launched recently, which is a new framework to evaluate the quality of machine translation. So typically what was happening pre neural networks was that the state of the art was something called BLEU for analyzing machine translation.

But once you had deep learning coming, and most of the models now are neural network based that just wasn’t sufficient, your ability to understand the quality of a model was very insufficient. And so we felt that we needed to come up with a better way of doing this. And we released it to the world.

We did it open source because we want to be able to have a much faster way of understanding whenever we train a new model to be able to do it in real time. Is this better or not? And should I actually be deploying this in a particular use case?

And then, Maya is an initiative that we’re investing in to take that to the next level in terms of agents support, where it’s no longer just about translation, but it’s providing all of the support tools for an agent, from a language perspective, to enable them to do a much more effective work.

A lot of times, my experience has been that there’s this simplification of the expectation where we say, Oh, it’s machine translation plus human. But AI is actually present in Unbabel in a lot of different ways. So Comet is one of them, as it’s also a neural network based framework to evaluate machine translation.

Quality estimation is something where we’re one of the pioneers, where you’re trying to also train a neural network to try to make a determination in real time, whether that particular output, that translation, is good enough or if you need humans. We have an incredibly sophisticated anonymization pipeline.

The actual tools, the CAT tools for our translators are AI powered. Unbabel is an AI first company. And so we always look at a problem and imagine it, what does it mean to be AI first? What’s the experience? For whether it’s someone inside a company that is managing their translation layer and deploying translation across different use cases, or whether it’s a translator, what does it mean to be AI first, how can AI really empower you and augment your capabilities in the job that you’re doing?

David Yakobovitch

And in the technology that is sweeping the globe, what we know for certain here in 2021 is that voice and non-voice, they are the new inputs for all our devices. Back in 2019, I actually spoke with Dan O’Connell, one of the board members for Dialpad, as they were bringing in voice AI technology. And it was fascinating to see the conversation rising with voice AI conferences and everything between both voice and non-voice.

But beyond that, what we’ve seen there to where we’ve gone today it’s still just the beginning. And as we’re continuing to experience the ramifications and the growth changes as a digital-first society, as a result of the pandemic language becomes ever more important. Because, as you mentioned here today, Vasco, we’re not in person, we’re not able to use our body language in communication, have live interpreters to engage in meetings and settings.

It’s all being digital-first. And whether that’s voice or non-voice, that provides a lot of new opportunities for language operations. I wanted to hear about your thoughts on the language barrier, this language barrier that we’re seeing in a digital divide or an in-person divide that we can unify. How do we create more accessibility and mindshare?

Vasco Pedro

I feel that, as you mentioned, the digital-first world that we’re accelerating into, and despite all the very, really bad things that the pandemic brought, that’s probably the silver lining in terms of accelerating into the future, highlighting the need for that, for the ability to overcome language challenges. So for example, in this pandemic Logitech was a huge success case for us. And they are typical in the sense that they saw the dramatic increase in the need for their customer service, which happened. There were COVID winners and COVID losers, but a lot of digital companies were COVID winners in the sense of access to products.

And suddenly you had way more users requiring customer service, requiring support in some way or another, because with Logitech, with cameras and a bunch of other stuff that obviously everybody needs right now, that was a big issue. And when you think about where we are in terms of being digital-first, it means that if they needed to hire a lot of agents across the globe to support all the markets, that would have been a logistical nightmare. But using Unbabel, they had a 300% increase in tickets that they were able to solve by hiring in one location and then not having to worry about language.

And so, I love Logitech. Obviously, I love all our customers. But Logitech has been an incredible success case for us. It also really highlights what’s happening, which is this stress that external situations create and the need to overcome them in a very quick way. Typically, that is when it really enables companies to supercharge their change management processes and to take on solutions that maybe before they were going to do it in a certain speed. And now they can say, you know what, this really solves our problem that we’re experiencing right now. Let’s just go ahead with this. And we had a number of customers for which this was definitely true throughout the pandemic.

And what we’re seeing now more than ever is a lot of companies went remote, a few of them, it’s unclear how many yet, but there’s certainly a good enough number that say, you know what? This is going to be the new normal, we’re going to continue to be remote. Even the ones that are going back are going to some sort of hybrid model.

I haven’t heard any company saying we’re going back to full as it was before. Every company’s at best saying it’s going to be a hybrid model of some sort where there’s more flexibility and this highlights that the ability to access talent throughout the globe is changing.

Because you no longer need to be physically in one place to be able to be part of some company that is digital-first, which makes sense. If I have a digital product that can be sold and experienced and used anywhere in the globe, I end up enforcing artificial physical barriers just because I happen to be in one location. And now, more and more, that’s not the case. And once that happens, the issue with language becomes even more of an issue, and I, by no means, think that we have solved all of it.

We were talking about this before, it’s very clear that even in Unbabel, which is a company that’s focused on eliminating language barriers, everyone that we hire needs to speak English, because otherwise we can’t really communicate yet at the level that we need to. So it’s clear that those barriers are there. Now, I do think that, and you mentioned quite well, that voice and text are now the interface for a lot of the stuff that we do.

And if you take an example, something like VR, which is also starting to explode that even takes it to the next level. You’re now really being able to overcome physical barriers, but still have some sort of pseudo physical presence. And so the glaring barrier becomes language. If your appearance and location are not an issue for communication, then, really, the language that you use becomes the number one barrier for it.

And we’re going to see a lot of exciting things in the future. You mentioned something like Clubhouse; the language barrier is still very apparent in Clubhouse. You should be able to have people from all over being able to communicate seamlessly. We’re not nearly there. But it’s becoming very obvious that we need to. And I am particularly excited about the next 10 years, because humans are particularly bad at predicting the future. We always overestimate or underestimate, it’s very canonical. Where’s my hoverboard?

We thought by now we’d have hoverboards and flying cars. We certainly don’t, but we’re more advanced in other areas. And so, it’s more of a matter of necessity that drives invention. And the huge highlight and glaring pressure on overcoming language issues is going to help us to find solutions. And those solutions are not going to be AI only. No. We’re not AI only, it’s just not there yet. But it’s going to vastly accelerate whatever we can do in it. So that’s particularly exciting for me.

David Yakobovitch

And so looking forward at Vision 2030 beyond the pandemic, Unbabel being the world’s translation layer, what are some of the trends and highlights you’re seeing that the industry players and companies should be thinking about with language? I asked this because we see some of the big tech companies, Facebook and Apple, talking about AR and VR and new devices that could become the new normal by 2030. So what should we start planning and thinking about so we can be a part of this new digital age?

Vasco Pedro

If you look at the next two, three years, we’re going to see a resurgence of voice. So, two years ago, we thought everything was going to be conversational, but more on chat-like interfaces.

Conversational is still going to be, as you mentioned, the interface, but it’s going to expand more beyond text into voice. And we’re seeing that through devices like Alexa and Google Home, et cetera. If you look 10 years from now, you’re going to see that AR and VR really take off. And enable a lot of use cases, business use cases that just make a lot of sense.

Something as simple as this podcast, for example. If we had actually met in person for every podcast you do, it would probably be really hard. What was pioneered in media is going to migrate into a lot of business use cases because we were forced to do it. So I do think that business travel is going to take a little bit longer to come back. And a lot of it is never going to come back because there’s a lot of transactional meetings that we can do.

And as VR and AR really come to life, the ability to have a seamless, in-person meeting through AR glasses, that’s one of the killer use cases. And so, in 10 years, we’ll definitely have that.

If you look beyond 10 years, another thing that is very much in the beginning, but that I am particularly excited about, is a brain to computer interface. And we’re seeing that with Neuralink. Facebook also has a neurophotonics project. And we’re just in the beginning of the real hybrid connection between AI and humans at the brain level.

It’s going to take 10 years to evolve the interface to a point where you can start really texting language with it. But when it comes to language, what I see in 15 years from now is, a Neuralink implant or whatever implant you’re using will provide those capabilities. And so that will be native and you’ll be able to use it in whatever form, whether it’s AR, VR or in real life. And so that’s going to be particularly exciting.

David Yakobovitch

And so that’s exciting for technology because this means you and I are getting to experience a new world, maybe a little bit of this Ready Player One that we’ve been seeing in the movie theaters, with Pixar and Disney coming to life. Taking it home to something more practical, to today, what are some of the calls to action and next steps you have for our listeners on the show?

Vasco Pedro

So number one, if you’re a consumer, don’t settle for bad customer service, just because they don’t speak English. If you’re a company, there is no reason anymore for you to not provide seamless customer service in every language. Especially if you are focused on non-voice, there’s really no reason. And so, if you want to figure out how to do that in an effective way, how to scale your customer service across multiple markets, you should be talking to us.

That’s an easy win right there. Make sure you can talk to your customers in their own language because that’s more and more of the expectation and the benefits that it creates for your business, that it creates for you and the world are very tangible and achievable. So that’s an immediate step. You can do that. It wasn’t really available until recently. So that would be my call to action.

David Yakobovitch

Fantastic. Vasco Pedro, the CEO and co-founder of Unbabel. Thank you for joining us on HumAIn.

Vasco Pedro

My pleasure. Thank you very much.

David Yakobovitch

Thank you for listening to this episode of the HumAIn podcast. Did the episode measure up to your thoughts on ML and AI data science, developer tools and technical education?

Share your thoughts with me at humainpodcast.com/contact. Remember to share this episode with a friend, subscribe and leave a review. And listen for more episodes of HumAIn.

Works Cited

¹Vasco Pedro

Companies Cited

²Unbabel