June 17, 2024
Cohere CEO Aidan Gomez sees AI’s pathway to profitability

Today, I’m talking with Aidan Gomez, the CEO and co-founder of Cohere. Notably, Aidan used to work at Google, where he was one of the authors of a paper called “Attention is all you need” that described transformers and really kicked off the LLM revolution in AI.

Cohere is one of the buzziest AI startups around right now, but its focus is a little different than many of the others. Unlike, say, OpenAI, it’s not making consumer products at all. Instead, Cohere is focused on the enterprise market and making AI products for big companies.

Aidan and I talked a lot about that difference and how it potentially gives Cohere a much clearer path to profitability than some of its competitors. Computing power is expensive, especially in AI, but you’ll hear Aidan explain that the way Cohere is structured gives his company an advantage because it doesn’t have to spend quite as much money to build its models.

One interesting thing you’ll also hear Aidan talk about is the benefit of competition in the enterprise space. A lot of the tech industry is very highly concentrated, with only a handful of options for various services. Regular Decoder listeners have heard us talk about this a lot before, especially in AI. If you want GPUs to power your AI models, you’re probably buying something from Nvidia — ideally a big stack of Nvidia H100s, if you can even get any.

But Aidan points out that his enterprise customers are both risk averse and price sensitive: they want Cohere to be operating in a competitive landscape because they can then secure better deals instead of being locked into a single provider. So Cohere has had to be competitive from the beginning, which Aidan says has made the company thrive.

Aidan and I also talked a lot about what AI can and can’t do. We agreed that it’s definitely not “there” yet. It’s not ready, whatever you think the future might hold. Even if you’re training an AI on a limited, specific, deep set of data, like contract law, you still need a human in the loop. But he sees a time when AI will eventually surpass human knowledge in fields like medicine. If you know anything about me, you know I am very skeptical of that idea.

And then there’s the really big tension you’ll hear us get into all the way through this episode: up until recently, computers have been deterministic. If you give computers a certain input, you usually know exactly what output you’re going to get. It’s predictable. There’s a logic to it. But if we all start talking to computers with human language and getting human language back… well, human language is messy. And that makes the entire process of knowing what to put in and what exactly we’re going to get out of our computers different than it has been before. I really wanted to know if Aidan thinks LLMs, as they exist today, can bear the weight of all of our expectations for AI given that messiness.

Okay, Aidan Gomez, CEO of Cohere. Here we go.

This transcript has been lightly edited for length and clarity.

Aidan Gomez, you are the co-founder and CEO of Cohere. Welcome to Decoder.

Thank you. I’m excited to be here.

I’m excited to talk to you. It feels like Cohere has a very different approach to AI; you have a very different approach to AI. I want to talk about all of that and the competitive landscape. I’m dying to know if you think it’s a bubble.

But I want to start with a very big question: you are one of the eight co-authors on the paper that started this all. “Attention is all you need.” That’s the paper that described transformers at Google. That’s the T in “GPT.” I always ask this question of people who have been on the journey — like when I think about music documentaries, there are the kids in the garage playing their instruments, and then they’re in the stadium, and no one ever talks about act two.

You were in the garage, right? You’re writing this paper; you’re developing transformers. When did you know this technology would be the basis for everything that’s come in the modern AI boom?

I think it was not clear to me — certainly while we were doing the work, it felt like a regular research project. It felt like we were making good progress on translation, which is what we built the transformer for, but that was a pretty well-understood, well-known problem. We already had Google Translate; we wanted to make it a little bit better. We improved the accuracy by a few percent by creating this architecture, and I thought it was done. That was the contribution. We improved translation a little bit. It happened later that we started to see the community pick up the architecture and start to apply it to way more stuff than we had ever contemplated when building it.

I think it took about a year for the community to take notice. First, it was published, it went into an academic conference, and then we just started to see this snowball effect where everyone started adapting it to new use cases. It wasn’t just for translation. It started being used for all of these other NLP, or natural language processing, applications. Then we saw it applied toward language modeling and language representation. That was really the spark where things started to change.

This is a very familiar kind of abstract process for any new technology product: people develop a new technology for a purpose, a lot of people get their hands on it, the purpose changes, the use cases expand beyond what the inventors ever thought of, and now the next version of the technology gets tailored to what the users are doing.

Tell me about that. I want to talk about Cohere and the actual company you’re building, but that turn with transformers and LLMs and what people think they can do now: the gap between what the technology can do and what people want it to do feels like it’s actually widening.

I’m just wondering, since you were there at the beginning, how did you feel about that first turn, and do you think we’re getting beyond what the technology can do?

I like that description, the idea that the gap is widening because it’s inspired so many people. I think the expectations are increasing dramatically, and it’s funny that it works that way. The technology has improved massively, and its utility has changed dramatically.

There’s no way, seven years ago when we created the transformer, any of us thought we’d be here. It happened much, much faster than anticipated. But that being said, that just raises the bar in terms of what people expect. It’s a language model and language is the intellectual interface that we use, so it’s very easy to personify the technology. You expect from the tech what you expect from a human. I think that’s reasonable. It’s behaving in ways that are genuinely intelligent. All of us who are working on this technology project of realizing language models and bringing AI into reality, we’re all pushing for the same thing, and our expectations have risen.

I like that characterization that the bar for AI has risen. Over the past seven years, there have been so many naysayers of AI: “Oh, it’s not going to continue getting better”; “Oh, the methods that we’re using, this architecture that we’re using, it’s not the right one,” etc.

And [detractors] would set bars saying, “Well, it can’t do this.” But then, fast-forward three months, and the model can do that. And they say, “Okay, well, it can do that, but it can’t do…”

That goalpost-moving process has just kept going for seven years. We’ve just kept beating expectations and surpassing what we thought was possible with the technology.

That being said, there’s a long way to go. As you point out, I think there are still flaws to the technology. One of the things I’m nervous about is that because the technology is so similar to what it feels like to interact with a human, people overestimate it or trust it more than they should. They put it into deployment scenarios that it’s not ready for.

That brings me to one of my core questions that I think I’m going to start asking everybody who works in AI. You mentioned intelligence, you mentioned the capabilities, you said the word “reasoning,” I think. Do you think language is the same as intelligence here? Or do you think they’re evolving in the technology on different paths — that we’re getting better and more capable of having computers use language, and then intelligence is increasing at a different rate or maybe plateauing?

I don’t think that intelligence is the same thing as language. I think in order to understand language, you need a high degree of intelligence. There’s a question as to whether these models understand language or whether they’re just parroting it back to us.

This is the other very famous paper at Google: the stochastic parrots paper. It caused a lot of controversy. The claim of that paper is that these [models] are just repeating words back at us, and there isn’t some deeper intelligence. And actually, by repeating things back to us, they will express the bias that the things are trained on.

That’s what intelligence gets you over, right? You can learn a lot of things and your intelligence will help you transcend the things that you’ve learned. Again, you were there at the beginning. Is that how you see it — that the models can transcend their training? Or will they always be limited by that?

I would argue humans do a lot of parroting and have a lot of biases. To a large extent, the intelligent systems that we do know exist — humans — we do a lot of this. There’s that saying that we’re the average of the 10 books we read or the 10 people closest to us. We model ourselves off of what we’ve seen in the world.

At the same time, humans are genuinely creative. We do stuff that we’ve never seen before. We go beyond the training data. I think that’s what people mean when they say intelligence, that you’re able to discover new truths.

That’s more than just parroting back what you’ve already seen. I think that these models don’t just parrot back what they’ve seen. I think that they’re able to extrapolate beyond what we’ve shown them, to recognize patterns in the data and apply those patterns to new inputs that they’ve never seen before. Definitively, at this stage, we can say we’re past the stochastic parrot hypothesis.

Is that an emergent behavior of these models that has surprised you? Is that something you thought about when you were working on transformers at the beginning? You said it’s been a journey over seven years. When did that realization hit for you?

There were a few moments very early on. At Google, we started training language models with transformers. We just started playing around with it, and it wasn’t the same sort of language model that you interact with today. It was just trained on Wikipedia, so the model could only write Wikipedia articles.

That might have been the most useful version of all of this in the end. [Laughs]

Yeah, maybe. [Laughs] But it was a much simpler version of a language model, and it was a shock to see it because, at that stage back then, computers could barely string a sentence together properly. Nothing they wrote made sense. There were spelling mistakes. It was just a lot of noise.

And then, suddenly one day, we kind of woke up, sampled from the model, and it was writing entire documents as fluently as a human. That just came as this huge shock to me. It was a moment of awe with the technology, and that’s just repeated again and again.

I keep having these moments where, yeah, you are nervous that this thing is just a stochastic parrot. Maybe it’ll never be able to reach the utility that we want it to reach because there’s some sort of fundamental bottleneck there. We can’t make the thing smarter. We can’t push it beyond a particular capability.

Every time we improve these models, it breaks through these thresholds. At this point, I think that that breakthrough is going to continue. Anything that we want these models to be able to do, given enough time, given enough resources, we’ll be able to deliver. It’s important to remember that we’re not at that end state already. There are very obvious applications where the tech isn’t ready. We shouldn’t be letting these models prescribe drugs to people without human oversight [for example]. One day it might be ready. At some point, you might have a model that has read all of humanity’s knowledge about medicine, and you’re actually going to trust it more than you trust a human doctor who’s only been able to, given the limited time that humans have, read a subset. I view that as a very possible future. Today, in the reality that exists, I really hope that no one is taking medical advice from these models and that a human is still in the loop. You have to be conscious of the limitations that exist.

That’s very much what I mean when I say the gap is widening, and I think that brings us to Cohere. I wanted to start with what I think of as act two, because act two traditionally gets so little attention: “I built a thing and then I turned it into a business, and that was hard for seven years.” I feel like it gets so little attention, but now it’s easier to understand what you’re trying to do at Cohere. Cohere is very enterprise-focused. Can you describe the company?

We build models and we make them available for enterprises. We’re not trying to do something like a ChatGPT competitor. What we’re trying to build is a platform that lets enterprises adopt this technology. We’re really pushing on two fronts. The first is: okay, we just got to the state where computers can understand language. They can speak to us now. That should mean that pretty much every computational system, every single product that we’ve built, we can refactor it to have that interface and to allow humans to interact with it through their language. We want to help industry adopt this tech and implement language as an interface into all of their products. That’s the first one. It’s very external-facing for these companies.

The second one is internally facing, and it’s productivity. I think it’s becoming clear that we’re entering into a new Industrial Revolution that, instead of taking physical labor off the backs of humanity, is focused on taking intellectual labor. These models are smart. They can do complicated work that requires reasoning, deep understanding, access to a lot of data and information, which is what a lot of humans do today in work. We can take that labor, and we can put it on these models and make organizations dramatically more productive. Those are the two things that we’re trying to accomplish.

One of the things about using language to talk to computers and having computers speak to you in language, famously, is that human language is prone to misunderstandings. Most of history’s great stories involve some deep misunderstanding in human language. It’s nondeterministic in that way. The way we use language is really fuzzy.

Programming computers is historically very deterministic. It’s very predictable. How do you think philosophically about bridging that gap? We’re going to sell you a product that makes the interface to your business a little fuzzier, a little messier, perhaps a little more prone to misunderstanding, but it’ll be more comfortable.

How do you think about that gap as you go into market with a product like this?

The way that you program with this technology, it’s nondeterministic. It’s stochastic. It’s probabilities. There’s literally a chance that it could say anything. There’s some probability that it will say something completely absurd.

I think our job, as technology builders, is to introduce good tools for controllability so that probability is one in many, many trillion — so in practice, you never observe it. That being said, I think that businesses are used to stochastic entities and conducting their business using that because we have humans. We have salespeople and marketers, so I think we’re very used to that. The world is robust to having that present. We’re robust to noise and error and mistakes. Hopefully you can trust every salesperson, right? Hopefully they never mislead or overclaim, but in reality, they do mislead and overclaim sometimes. So when you’re being pitched to by a salesperson, you apply appropriate bounds around what they’re saying. “I’m not going to completely take whatever you say as gospel.”

I think that the world is actually super robust to having systems like these play a part. It might seem scary at first because it’s like, “Oh, well, computer programs are completely deterministic. I know exactly what they’re going to output when I put in this input.” But that’s actually unusual. That’s weird in our world. It’s super weird to have truly deterministic systems. That’s a new thing, and we’re actually getting back to something that’s much more natural.
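Aidan’s point about probabilities can be made concrete with a toy sketch. Everything here is illustrative, not Cohere’s stack: greedy decoding is fully deterministic, while temperature sampling gives every token, even an absurd one, some nonzero probability.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(tokens, logits):
    """Deterministic decoding: the same input always yields the same token."""
    return tokens[logits.index(max(logits))]

def sample(tokens, logits, temperature=1.0):
    """Stochastic decoding: every token has some nonzero probability."""
    return random.choices(tokens, weights=softmax(logits, temperature), k=1)[0]

tokens = ["the", "a", "zebra"]
logits = [5.0, 3.0, -4.0]

print(greedy(tokens, logits))       # always "the"
print(sample(tokens, logits, 0.7))  # usually "the", occasionally "a", very rarely "zebra"
```

Lowering the temperature is one of the “tools for controllability” in this picture: it pushes the absurd token’s probability toward the one-in-many-trillions range without ever making it exactly zero.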

When I look at a jailbreak prompt for one of these chatbots, you can see the leading prompt, which typically says something like, “You are a chatbot. Don’t say these things. Make sure you answer in this way. Here’s some stuff that’s completely out of bounds for you.” Those get leaked all the time, and I find them fascinating to read. They’re often very long.

My first thought every time is that this is an absolutely bananas way to program a computer. You’re going to talk to it like a somewhat irresponsible teenager and say, “This is your role,” and hopefully it follows it. And maybe there’s a one in a trillion chance it won’t follow it and it’ll say something crazy, but there’s still a one in a trillion chance that even after all of these instructions are given to a computer, it’ll still go completely off the rails. I think the internet community delights in making these chatbots go off the rails.

You’re selling enterprise software. You’re going into big companies and saying, “Here are our models. We can control them, so that reduces the possibility of chaos, but we want you to reinvent your business with these tools because they will make some things better. It will make your productivity higher. It’ll make your customers happier.” Are you sensing a gap there?

That’s the big cultural reset that I think about. Computers are deterministic. We’ve built modernity around the very deterministic nature of computers; you know what outputs you’ll get versus what inputs. And now you have to say to a bunch of businesses, “Spend money. Risk your business on a new way of thinking about computers.”

It’s a big change. Is that working? Are you seeing excitement around that? Are you seeing pushback? What’s the response?

That goes back to what I was saying about knowing where to deploy the technology and what it’s ready for, what it’s reliable enough for. There are places where we don’t want to put this technology today because it’s not robust enough. I’m lucky in that, because Cohere is an enterprise company, we work really closely with our customers. It’s not like we just throw it out there and hope they succeed. We’re very involved in the process and helping them think through where they deploy it and what change they’re trying to drive. There’s no one who’s giving access to their bank account to these models to manage their money, I hope.

There are places where, yeah, you want determinism. You want extremely high confidence guardrails. You’re not going to just put a model there and let it decide what it wants to do. In the vast majority of use cases and applications, it’s actually about augmenting humans. So you have a human employee who is trying to get some work done and they’re going to use this thing as a tool to basically make themselves faster, more effective, more efficient, more accurate. It’s augmenting them, but they’re still in the loop. They’re still checking that work. They’re still making sure that the model is producing something that’s sensible. At the end of the day, they’re accountable for the decisions that they make and what they do with that tool as part of their job.

I think what you’re pointing to is what happens in those applications where a human is completely out of the loop and we’re really offloading the full job onto these models. That’s a ways away. I think that you’re right. We need to have much more trust and controllability and the ability to set up those guardrails so that they behave more deterministically.

You pointed to the prompting of these models and how it’s funny that the way you actually control them is by talking to them.

It’s like a stern lecture. It is crazy to me every time I look at one.

I think that it’s somewhat magical: the fact that you can actually control the behavior of these things effectively using that method. But beyond that, beyond just prompting and talking to this thing, you can set up controls and guardrails outside of the model. You can have models watching this model and intervening and blocking it from doing certain actions in certain cases. I think what we need to start changing is our conception of, is this one model? It’s one AI, which we’re just handing control over to. What if it messes up? What if everything goes wrong?

In reality, it’s going to be much larger systems that include observation systems that are deterministic and check for patterns of failure. If the model does this and this, it’s gone off the rails. Shut it down. That’s a completely deterministic check. And then you’ll have other models, which can observe and sort of give feedback to the model to prevent it from taking actions if it’s going astray.
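The “completely deterministic check” wrapped around a stochastic model can be sketched like this. The model callable, the blocked patterns, and the refusal message are all hypothetical, just to show the shape of the wrapper.

```python
import re

# Hypothetical failure patterns a deployment might refuse to pass through.
BLOCKED_PATTERNS = [
    re.compile(r"\b(prescribe|dosage)\b", re.IGNORECASE),  # e.g. no drug advice
]

def guarded_generate(model, prompt):
    """Run a stochastic model, then apply a deterministic check to its output.

    `model` is any callable mapping prompt -> text. If the output matches a
    known failure pattern, the wrapper refuses instead of passing it along.
    """
    output = model(prompt)
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(output):
            return "[blocked: output matched a disallowed pattern]"
    return output

# Stand-in for a real model call.
def fake_model(prompt):
    return "I would prescribe 20mg of..."

print(guarded_generate(fake_model, "What should I take?"))
# prints "[blocked: output matched a disallowed pattern]"
```

In the larger systems Aidan describes, the checker itself could also be another model giving feedback; the point is that the outer layer, not the model being watched, decides what reaches the user.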

The programming paradigm, or the technology paradigm, started off as what you’re describing, which is, there’s a model, and you’re going to apply it to some use case. It’s just the model and the use case. It’s shifting toward bigger systems with much more complexity and components, and it’s less like there’s an AI that you’re applying to go do work for you, and it’s actually a sophisticated piece of software that you’re deploying to go do work for you.

Cohere right now has two models: Cohere Command and Cohere Embed. You’re obviously working on those models. You’re training them, developing them, applying them to customers. How much of the company is spending its time on this other thing you’re describing — building the deterministic control systems, figuring out how to chain models together to provide more predictability?

I can speak to the enterprise world, and there, enterprises are super risk averse. They’re always looking for opportunities, but they’re extremely risk averse. That’s the first thing that they’re thinking about. Pretty much every initial conversation I have with a customer is about what you’re asking — that’s the first thing that comes to a person’s mind. Can I use the system reliably? We need to show them, well, let’s look at the specific use case that you’re pursuing. Maybe it is assisting lawyers with contract drafting, which is something that we do with a company called Borderless.

In that case, you need a human in the loop. There’s no way you’re going to send out contracts that are completely synthetically generated with no oversight. We come in and we try to help guide and educate in terms of the types of systems that you can build for oversight, whether it’s humans in the loop or more automated systems to help de-risk things. With consumers, it’s a little bit different, but for enterprises, the very first question we’ll get from a board or a C-suite at the company is going to be related to risk and protecting against it.

To apply that to Cohere and how you’re developing your products: how is Cohere structured? Is that reflected in how the company is structured?

I think so. We have safety teams internally that are focused on making our models more controllable, less biased, and at the same time, on the go-to-market side, because this technology is new, that project is an education campaign. It’s getting people familiar with what this technology is.

It’s a paradigm shift in terms of how you build software and technology. Like we were saying, it’s stochastic. To educate people about that, we build stuff like the LLMU, which is like the LLM university, where we teach people what the pitfalls might be with the tech and how to protect against those. For us, our structure is focused on helping the market get familiar with the technology and its limitations while they’re adopting it.

How many people are at Cohere?

It’s always shocking to say, but we’re about 350 people at the moment, which is insane to me.

It’s only insane because you’re the founder.

It was like yesterday, it was just Nick [Frosst], Ivan [Zhang], and I in this tiny little… basically a closet. I don’t know how many square meters it was but, you know, single digits. We had a company offsite a few weeks back, and it was hundreds of people building this thing alongside you. You do ask yourself, how did we get here? How did all of this happen? It’s really fun.

Of those 350 people, what’s the split? How many are engineering? How many are sales? Enterprise companies need a lot of post-sales support. What’s the split there?

The overwhelming majority are engineers. Very recently, the go-to-market team has exploded. I think the market is just going into production now with this technology. It’s starting to actually hit the hands of employees, of customers, of users.

Last year was sort of the year of the POC, or the proof of concept. Everyone became aware of the technology. We’ve been working on this for nearly five years now, but it was only really in 2023 that the general public noticed it, started to use it, and fell in love with the technology. That led to enterprises noticing. Enterprises are made of people, too; they’re hearing about this, they’re using this, they’re thinking of how they can adopt the technology. They got excited about it, and they spun up these tests, these POCs, to try to build a deeper understanding of and familiarity with the tech.

Those POCs, the initial cohort of them, are complete now, and people like the stuff that they’ve built. Now, it’s a project of taking those predeployment tests and actually getting them into production in a scalable way. The majority of our focus is scalability in production.

Is that scalability as in, “Okay, we can add five more customers without a massive incremental cost”? Is it scalability in compute? Is it scalability in how fast you’re designing the solutions for people? Or is it everything?

It’s all of the above. As a lot of people may have heard, the tech is expensive to build and expensive to run. We’re talking hundreds of billions, trillions, of tunable parameters inside just a single one of these models, so it requires a lot of memory to store these things. It requires tons of compute to run them. In a POC, you have like five users, so scalability doesn’t matter. The cost is kind of irrelevant. You just want to build a proof of what is possible. But then, if you like what you’ve built and you’re going to push this thing into production, you go to your finance office and you say, “Okay, here’s what it costs for five users. We’d like to put it in front of all 10 million.”

The numbers don’t compute. It’s not economically viable. For Cohere, we’ve been focused not on making the largest possible model but, instead, on making the model that the market can actually consume and that is actually useful for enterprises.

That’s doing what you say, which is focusing on compression, speed, and scalability, on ensuring that we can actually build a technology that the market can consume. Because, over the past few years, a lot of this stuff has been a research project without large-scale deployment, the concerns around scalability hadn’t yet emerged. But we knew that for enterprises, which are very cost-sensitive, very economically driven entities, if they can’t make the numbers work in terms of return on investment, they don’t adopt it. It’s very simple. So we’ve been focused on building a category of technology that is actually the right size for the market.
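Some rough arithmetic makes the “lots of memory, lots of compute” point concrete. At 16-bit precision, each parameter takes two bytes, so just holding the weights scales linearly with model size; the parameter counts below are illustrative round numbers, not Cohere’s actual models.

```python
# Back-of-the-envelope memory to hold model weights in fp16 (2 bytes per parameter).
# Model sizes here are illustrative round numbers, not any vendor's real models.
def weight_memory_gb(num_params, bytes_per_param=2):
    return num_params * bytes_per_param / 1e9

for name, params in [("7B", 7e9), ("70B", 70e9), ("1T", 1e12)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB just for weights")
# 7B: ~14 GB, 70B: ~140 GB, 1T: ~2000 GB -- before activations, KV cache, or batching
```

This is why a model sized for five POC users and a model sized for 10 million production users are very different economic propositions: serving cost scales with both parameter count and traffic.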

You obviously started all of this work at Google. Google has an infinite amount of resources. Google also has massive operational scale. Its ability to optimize and bring down the cost curve of new technologies like this is very high, given Google’s infrastructure and reach. What made you want to go and do this on your own without its scale?

Nick was also at Google. We were both working for Geoff Hinton in Toronto. He was the guy who created neural networks, the technology that underpins all of this, that underpins LLMs. It underpins pretty much every AI that you interact with on a daily basis.

We loved it there, but I think what was missing was a product ambition and a velocity that we felt was necessary for us to execute. So we had to start Cohere. Google was a great place to do research, and I think it has some of the smartest people in AI on the face of the planet. But for us, the world needed something new. The world needed Cohere and the ability to adopt this technology from an organization that wasn’t tied to any one cloud, any one hyperscaler. Something that’s very important to enterprises is optionality. If you’re a CTO at a large retailer, you’re probably spending half a billion dollars, a billion dollars, on one of the cloud providers for your compute.

In order to get a good deal, you need to be able to plausibly flip between providers. Otherwise, they’re just going to squeeze you ad infinitum and rip you off. You need to be able to flip. You hate buying proprietary technology that’s only available on one stack. You really want to preserve your optionality to flip between them. That’s what Cohere allows for. Because we’re independent, because we haven’t gotten locked into one of these big clouds, we’re able to offer that to the market, which is super important.

Let me ask you the Decoder question. We’ve talked a lot about the journey to get here, the challenges you need to solve. You’re a founder. You’ve got 350 people now. How do you make decisions? What’s your framework for making decisions?

What’s my framework… [Laughs] I flip a coin.

I think I’m lucky in that I’m surrounded by people who are way smarter than me. I’m just surrounded by them. Everyone at Cohere is better than me at the thing that they do. I have this luxury of being able to go ask people for advice, whether it’s the board of Cohere, or the executive team of Cohere, or the [individual contributors], the people who are actually doing the real work. I can ask for advice and their takes, and I can be an aggregation point. When there are ties, then it comes down to me. Usually, it’s just going with my intuition about what’s right. But fortunately, I don’t have to make a lot of decisions because I have way smarter people that surround me.

There are some big decisions you do have to make. You just, for example, announced two models in April, Command R and one called Rerank 3. Models are costly to train. They’re costly to develop. You’ve got to rebuild your technology around the new models and their capabilities. Those are big calls.

It feels like every AI company is racing to develop the next generation of models. How are you thinking about that investment over time? You talked a lot about the cost of a proof of concept versus an operationalized thing. New models are the most expensive of them all. How are you thinking about those costs?

It’s really, really expensive. [Laughs]

Can you give us a number?

I don’t know if I can give a specific number, but I can say, like, order of magnitude. In order to do what we do, you need to spend hundreds of millions of dollars a year. That’s what it costs. We think that we’re hyper capital-efficient. We’re extremely capital-efficient. We’re not trying to build models that are too big for market, that are kind of superficial. We’re trying to build stuff that market can actually consume. Because of that, it’s cheaper for us, and we can focus our capital. There are folks out there spending many, many billions of dollars a year to build their models.

That’s a huge consideration for us. We’re lucky in that we’re small, relatively speaking, so our strategy lends itself toward more capital efficiency and actually building the technology that market needs as opposed to building prospective research projects. We focus on actual tech that the market can consume. But like you say, it’s hugely expensive, and the way that we solve that is a) raising money, getting the capital to actually pay for the work that we need to do, and then b) choosing to focus on our technology. So instead of trying to do everything, instead of trying to nail every single potential application of the technology, we focus on the patterns or use cases that we think are going to be dominant or are dominant already in how people use it.

One example of that is RAG, retrieval augmented generation. It’s this idea that these models are trained on the internet. They have a lot of knowledge about public facts and that type of thing. But if you’re an enterprise, you want it to know about you. You want it to know about your enterprise, your proprietary information. What RAG lets you do is sit your model down next to your private databases or stores of your knowledge and connect the two. That pattern, that’s something that is ubiquitous. Anyone who’s adopting this technology, they want it to have access to their internal information and knowledge. We focused on getting extremely good at that pattern.

We’re fortunate. We have the guy who invented RAG, Patrick Lewis, leading that effort at Cohere. Because we’re able to carve away a lot of the space of potential applications, it lets us be dramatically more efficient in what we want to do and what we want to build with these models. That’ll continue into the future, but that’s still a multi-hundred million dollar a year project. It’s very, very capital-intensive.
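The RAG pattern Gomez describes can be sketched end to end. This is a toy illustration rather than Cohere’s implementation: the bag-of-words `embed` function is only a stand-in for a real embedding model, and the composed prompt would go to a generation model.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a real system would call an embedding model.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    # Rank the enterprise's private documents against the query.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Ground the model by pasting retrieved snippets into the prompt.
    context = "\n".join("- " + d for d in retrieve(query, documents))
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + query

docs = [
    "Q3 revenue grew 12 percent year over year.",
    "The cafeteria menu changes every Monday.",
    "Q3 operating costs fell 4 percent.",
]
prompt = build_prompt("How did revenue change in Q3?", docs)
print(prompt)
```

The shape is what matters: embed, retrieve, then generate from a grounded prompt, so the model answers from the company’s own data rather than only what it absorbed from the public internet.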

I said I wanted to ask you if this was a bubble. I’ll start with Cohere specifically, but then I want to talk about the industry in general. So it’s multiple hundreds of millions of dollars a year just to run the company, to run the compute. That’s before you’ve paid a salary. And the AI salaries are pretty high, so that’s another bunch of money you have to pay. You have to pay for office space. You have to buy laptops. There’s a whole bunch of stuff. But just the compute is hundreds of millions of dollars a year. That’s the run rate on just the compute. Do you see a path to revenue that justifies that amount of pure run rate in compute?

Absolutely. We wouldn’t be building it if we didn’t.

I think your competitors are like, it’ll come. There’s a lot of wishful thinking. I’m starting with that question with you because you have started an enterprise business. I’m assuming you see a much clearer path. But in the industry, I see a lot of wishful thinking that it’ll just arrive down the road.

So what is Cohere’s path specifically?

Like I said, we’re dramatically more capital-efficient. We might spend 20 percent of what some of our competitors spend on compute. But what we build is very, very good at the stuff that market actually wants. We can chop off 80 percent of the expense and deliver something that is just as compelling to market. That’s a core piece of our strategy of how we’re going to do this. Of course, if we didn’t see a business that was many, many billions in revenue, we wouldn’t be building this.

What’s the path to billions in revenue? What’s the timeline?

I don’t know how much I can disclose. It’s closer than you’d think. There’s a lot of spend that’s being activated in market. Certainly already there are billions being spent on this technology in the enterprise market today. A lot of that goes to the compute as opposed to the models. But there is a lot of spending happening in AI.

Like I was saying, last year was very much a POC phase, and POC spend is about 3–5 percent of what a production workload looks like. But now those production workloads are coming online. This technology is hitting products that interact with tens or hundreds of millions of people. It’s really becoming ubiquitous. So I think it’s close. It’s a matter of a few years.

It’s typical for a technology adoption cycle. Enterprises are slow. They tend to be slow to adopt. They’re very sticky. Once they’ve adopted something, it’s there in perpetuity. But it takes them a while to build confidence and actually make the decision to commit and adopt the technology. It’s only been about a year and a half since people woke up to the tech, but in that year and a half, we’re now starting to see really serious adoption and serious production workloads.

Enterprise technology is very sticky. It will never go away. The first thing that comes to mind is Microsoft Office, which will never go away. The foundation of their enterprise strategy is Office 365. Microsoft is a huge investor in OpenAI. They’ve got models of their own. They’re the big competitor for you. They’re the ones in market selling Azure to enterprise. They’re a hyperscaler. They’ll give you deals. They’ll integrate it directly so you can talk to Excel. The pitch that I’ve heard many times from Microsoft folks is that you have people in the field who need to wait for an analyst to respond to them, but now, they can just talk to the data directly and get the answer they need and be on their way. That’s very compelling.

I think it requires a lot of cultural change inside some of these enterprises to let those sorts of things happen. You’re obviously the challenger. You’re the startup. Microsoft is 300,000 people. You’re 350 people. How are you winning business from Microsoft?

They’re a competitor in some respects, but they’re also a partner and a channel for us. When we released Command R and Command R Plus, our new models, they were first available on Azure. I definitely view them as a partner in bringing this technology to enterprise, and I think that Microsoft views us as a partner as well. I think they want to create an ecosystem powered by a bunch of different models. I’m sure they’ll have their own in there. They’ll have OpenAI’s, they’ll have ours, and it’ll be an ecosystem as opposed to only proprietary Microsoft tech. Look at the story in databases — there, you have fantastic companies like Databricks and Snowflake, which are independent. They’re not subsidiaries of Amazon or Google or Microsoft. They’re independent companies, and the reason they’ve done so well is because they have an incredible product vision. The product that they’re building is genuinely the best option for customers. But also the fact that they’re independent is crucial to their success.

I was describing how CTOs don’t want to get locked into one proprietary software stack because it’s such a pain and a strategic risk to their ability to negotiate. I think the same is going to be true here, and it’s even more important with AI, where these models become an extension of your data. They are the value of your data. The value of your data is that you’ll be able to power an AI model that drives value for you. The data in itself is not inherently valuable. Because we’re independent, folks like Microsoft, Azure, AWS, and GCP want us to exist, and they have to support us, because otherwise the market is going to reject them.

If they don’t, the market is going to insist on independent options that let them flip between clouds. So they kind of have to support our models. That’s just what the market wants. I don’t feel like they’re exclusively a competitor. I view them as a partner to bring this technology to market.

One thing that’s interesting about this conversation, and one of the reasons I was excited to talk with you, is that you are so focused on enterprise. There’s a certainty to what you’re saying. You’ve identified a bunch of customers with some needs. They’ve articulated their needs. They have money to spend. You can identify how much money it is. You can build your business around that money. You keep talking about the market. You can spend your budget on technology appropriately for the amount of money that’s available in the market.

When I ask if it’s a bubble, what I’m really talking about is the consumer side. There are these big consumer AI companies that are building big consumer products. Their idea is people pay 20 bucks a month to talk to a model like this, and those companies are spending more money on training than you are. They’re spending more money per year on compute than you are. They are the leading-edge companies.

I’m talking about Google and OpenAI, obviously, but then there’s a whole ecosystem of companies that are paying OpenAI and Google a margin to run on top of their models to go sell a consumer product at a lower rate. That does not feel sustainable to me. Do you have that same worry about the rest of the industry? Because that’s what’s powering a lot of the attention and interest and inspiration, but it doesn’t seem sustainable.

I think those folks who are building on top of OpenAI and Google should be building on top of Cohere. We’ll be a better partner.

[Laughs] I laid that one out for you.

You’re right to identify that the technology provider’s focus might conflict with its users’. You might find yourself in situations where — I don’t want to name names, but let’s say there’s a consumer startup that’s trying to build an AI application for the world, and it’s building on top of one of my competitors who is also building a consumer AI product. There’s a conflict inherent there, and you might see one of my competitors steal or rip off the ideas of that startup.

That’s why I think Cohere needs to exist. You need to have folks like us, who are focused on building a platform, to enable others to go create those applications — and that are really invested in their success, free of any conflicts or competitive nature.

That’s why I think we’re a really good partner: we’re focused, and we let our users succeed without trying to compete or play in the same space. We just build a platform that you can use to adopt the technology. That’s our whole business.

Do you think it’s a bubble when you look around the industry?

I don’t. I really don’t. I don’t know how much you use LLMs day to day. I use them constantly, like multiple times an hour, so how could it be a bubble?

I think maybe the utility is there in some cases, but the economics might not be there. That’s how I would think about it being a bubble.

I’ll give you an example. You’ve talked a lot about the dangers of overhyping AI, even in this conversation, but you’ve talked about it publicly elsewhere. You’ve talked about how you’ve got two ways to fund your compute: you can get customers and grow the business, or you can raise money.

I look at how some of your competitors raise money, and it’s by saying things like, “We’re going to build AGI on the back of LLMs” and “We actually need to pause development so we can catch up because we might destroy the world with this technology.”

That stuff, to me, seems pretty bubbly. Like, “We need to raise a lot of money so we can continue training the next frontier model before we’ve built a business that can even support the compute of the existing model.” But it doesn’t seem like you’re that worried about it. Do you think that’s going to even itself out?

I don’t know what to say, aside from I very much agree that is a precarious setup. The reality is, for folks like Google and Microsoft, they can spend billions of dollars. They can spend tens of billions of dollars on this, and it’s fine. It doesn’t really matter. It’s a rounding error. For startups taking that strategy, you need to become a subsidiary of one of those big tech companies that prints money or do some very, very poor business building in order to do that.

That’s not what Cohere is pursuing. I agree with you to a large extent. I think that’s a bad strategy. I think ours, the focus on actually delivering what market can consume and building the products and the technology that are the right size and fit for our customers, is what you need to do. That’s how you build a business. That’s how all successful businesses were built. We don’t want to get too far out over our skis. We don’t want to be spending so much money that it’s hard to see a path toward profitability. Cohere’s focus is very much on building a self-sustaining, independent business, so we’re forced to actually think about this stuff and steer the company in a direction that supports that.

You’ve called the idea that AI represents existential risk — I believe the word you’ve used is “absurd,” and you’ve said it’s a distraction. Why do you think it’s absurd, and what do you think the real risks are?

I think the real risks are the ones that we spoke about: overeager deployment of the technology too early; people trusting it too much in scenarios where, frankly, they shouldn’t. I’m super empathetic to the public’s interest in the doomsday or Terminator scenarios. I’m interested in those scenarios because I’ve watched sci-fi and it always goes badly. We’ve been told those stories for decades and decades. It’s a very salient narrative. It really captures the imagination. It’s super exciting and fun to think about, but it’s not reality. It’s not our reality. As someone who’s technical and quite close to the technology itself, I don’t see us heading in a direction that supports the stories that are being told in the media and, often, by companies that are building the tech.

I really wish that our focus was on two things. One is the risks that are here today, like overeager deployment, deploying them in scenarios without human oversight, those sorts of discussions. When I talk to regulators, when I talk to folks in government, that’s the stuff they actually care about. It’s not doomsday scenarios. Is this going to hurt the general public if the financial industry adopts it in this way or the medical industry adopts it in this way? They’re quite practical and actually grounded in the reality of the technology.

The other thing that I would really love to see a conversation about is the opportunity, the positive side. We spend so much time on the negatives and fear and doom and gloom. I really wish someone was just talking about what we could do with the technology or what we want to do because, as much as it’s important to steer away from the potential negative paths or bad applications, I also want to hear the public’s opinion and public discourse about the opportunities. What good could we do?

I think one example is in medicine. Apparently, doctors spend 40 percent of their time taking notes. This is in between patient visits — you have your interaction with the patient, you then go off, go to your computer, and you say, “So and so came in. They had this. I remember from a few weeks ago when they came in, it looked like this. We should check this the next time they come in. I prescribed this drug.” They spend a lot of time typing up these notes in between the interactions with patients. Forty percent of their time, apparently. We could attach passive listening mics that just go from patient meeting to patient meeting with them, transcribe the conversations, and pre-populate that. So instead of having to write this whole thing from scratch, they read through it and they say, “No, I didn’t say that, I said this and add that.” And it becomes an editing process. We bring that 40 percent down to 20 percent. Overnight, we have 25 percent more doctor hours. I think that’s incredible. That’s a huge good for the world. We haven’t paid to train doctors. We haven’t added more doctors in school. They have 25 percent more time just by adopting technology.

I want to find more ideas like that. What application should Cohere be prioritizing? What do we need to get good at? What should we solve to drive the good in the world that we want to see? There are no headlines about that. No one is talking about it, and I really wish we were having that conversation.

As somebody who writes headlines, I think, one, there aren’t enough examples of that yet to say it’s real, which I think is something people are very skeptical of. Two, I hear that story and I think, “Oh, boy, a bunch of private equity owners of urgent care clinics just put 25 percent more patients into the doctor’s schedule.”

What I hear from our audience, for example, is that they feel like right now the AI companies are taking a lot without giving enough in return. That’s a real challenge. That’s been expressed mostly in the creative industries; we see that anger directed at the creative generative AI companies.

You’re obviously in enterprise. You don’t feel it, but do you see that — that you’ve trained a bunch of models, you should know where the data comes from, and then the people who made the original work that you’re training on probably want to get compensated for it?

Oh yeah, totally. I’m very empathetic to that.

Do you compensate the sources you train from?

We pay for data. We pay a lot for data. There are a bunch of different sources of data. There’s stuff that we scrape from the web, and when we do that, we try to abide by people’s preferences. If they express “we don’t want you to collect our data,” we abide by that. We look at robots.txt when we’re scraping. For code, we look at the licenses that are associated with it. We filter out data where people have said clearly “don’t scrape this data” or “don’t use this code.” If someone emails us and says, “Hey, I think that you scraped X, Y, and Z, can you remove it?” we will of course remove that, and all future models won’t include that data. We don’t want to be training on stuff that people don’t want us training on, full stop. I’m very, very empathetic to creators, and I really want to support them and build tools to help make them more productive and help them with their ideation and creative process. That’s the impact that I want to have, and I really want to respect their content.
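Cohere’s actual filtering pipeline isn’t public, but the robots.txt side of what Gomez describes can be sketched with Python’s standard library. The robots.txt content and crawler name here are hypothetical, and the rules are parsed in memory so the example runs offline (`RobotFileParser` can also fetch a live file via `set_url` and `read`):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt a site might publish to opt out of scraping.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed_urls(urls, agent="example-crawler"):
    # Keep only the URLs this site's robots.txt permits the agent to fetch.
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return [u for u in urls if parser.can_fetch(agent, u)]

urls = [
    "https://example.com/blog/post-1",
    "https://example.com/private/internal-doc",
]
print(allowed_urls(urls))  # only the public blog post survives
```

License checks for code would be a separate filter over repository metadata; robots.txt only covers the crawl-permission side of the pipeline.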

The flip side of it is: those same creators are watching the platforms they publish on get overrun with AI content, and they don’t like it. There’s a little bit of a competitive aspect there. That’s one of the dangers you’ve talked about. There’s a straightforward misinformation danger on social platforms that doesn’t seem to be well mitigated yet. Do you have ideas on how you might mitigate AI-generated misinformation?

One of the things that scares me a lot is that the democratic world is vulnerable to influence and manipulation in general. Take out AI. Democratic processes are [still] very vulnerable to manipulation. We started off the podcast saying that people are the average of the last 50 posts they’ve seen or whatever. You’re very influenced by what you perceive to be consensus. If you look out into the world on social media and everyone seems to agree on X, then you’re like, “Okay, I guess X is right. I trust the world. I trust consensus.”

I think democracy is vulnerable and it’s something that needs to be very vigorously protected. You can ask the question, how does AI influence that? What AI enables is much more scalable manipulation of public discourse. You can spin up a million accounts, and you can create a million fake people that project one idea and present a false consensus to the people consuming that content. Now, that sounds really scary. That’s terrifying. That’s a huge threat.

I think it’s actually very, very preventable. Social media platforms, they’re the new town square. In the old town square, you knew that the person standing on their soapbox was probably a voting citizen alongside you, and so you cared a lot about what they said. In the digital town square, everyone is much more skeptical of the stuff they see. You don’t just take it for granted. We also have methods of confirming humanity. Human verification on social media platforms is a thing, and we need to support it much more thoroughly so that people can see, is this account verified? Is it actually a person on the other side?

What happens when humans start using AI to generate lies at scale? Like me posting an AI-generated image of a political event that didn’t happen is just as damaging, if people believe it, as thousands of robots doing it.

When you can have a single entity creating many different voices saying the same thing to present consensus, you can stop that by preventing fake accounts: confirm that there’s a verified human behind each account, so you know it’s another person on the other side. That stops the scaling to millions of fake accounts.

On the other side, what you’re describing is fake media. There’s already fake media. There’s Photoshop. We’ve had this tech for a while. I think it becomes easier to create fake media, and there’s a notion of media verification, but you also, you’re going to trust different sources differently. If it’s your friend posting it, who you know in the real world, you trust that a lot. If it’s some random account, you don’t necessarily believe everything that they claim. If it’s coming from a government agency, you’re going to trust it differently. If it’s coming from media, depending on the source, you’re going to trust it differently.

We know how to assign appropriate levels of trust to different sources. It’s definitely a concern, but it’s one that is addressable. Humans are already very aware that other humans lie.

I want to ask you one last question. It’s the one I’ve been thinking about the most, and it brings us back to where we started.

We’re putting a lot of weight on these models — business weight, cultural weight, inspirational weight. We want our computers to do these things, and the underlying technology is these LLMs. Can they take that weight? Can they withstand the burden of our expectations? That’s the thing that is not yet clear to me.

There’s a reason Cohere is doing it in a targeted way, but then you just look broadly, and there’s a lot of weight being put on LLMs to get us to this next place in computing. You were there at the beginning. I’m wondering if you think the LLMs can actually take the weight and pressure that’s being put on them.

I think we’ll be perpetually dissatisfied with the technology. If you and I chat in two years, we’re going to be disappointed that the models aren’t inventing new materials fast enough to get us whatever, whatever. I think that we will always be disappointed and want more because that’s just part of human nature. I think the technology will, at each stage, impress us and rise to the occasion and surpass our previous expectations of it, but there’s no point at which people are going to be like, “We’re done, we’re good.”

I’m not asking if it’s done. I’m saying, do you see, as the technology develops, that it can withstand the pressure of our expectations? That it has the capability, or at least the potential capability, to actually build the things that people are expecting to build?

I absolutely think it will. There was a period of time when everyone was like, “The models hallucinate. They make stuff up. They’re never going to be useful. We can’t trust them.” And now, hallucination rates, you can track them over the years, they’ve just dropped dramatically and they’ve gotten much better. With each complaint or with each fundamental barrier, all of us who are building this technology, we work on it and we improve the technology and it surpasses our expectations. I expect that to continue. I see no reason why it shouldn’t.

Do you see a point where hallucinations go to zero? To me, that’s when it unlocks. You can start depending on it in real ways when it stops lying to you. Right now, the models across the board hallucinate in honestly hilarious ways. But there’s a part, to me anyway, that says I can’t trust this yet. Is there a point where the hallucination rate goes to zero? Can you see that on the roadmap? Can you see some technical developments that might get us there?

You and I have non-zero hallucination rates.

Well, yeah, but no one trusts me to run anything. [Laughs] There’s a reason I sit here asking the questions and you’re the CEO. But I’m saying computers, if you’re going to put them in the loop like this, you want to get to zero.

No, I mean, humans misremember stuff, they make stuff up, they get facts wrong. If you’re asking whether we can beat the human hallucination rate, I think so. Yeah, definitely. That’s definitely an achievable goal because humans hallucinate a lot. I think we can create something extremely useful for the world.

Useful, or trustworthy? What I’m getting at is trust. The amount that you trust a person varies, sure. Some people lie more than others. The amount that we have historically trusted computers has been on the order of a lot. And with some of this technology, that amount has dropped, which is really interesting. I think my question is: is it on the roadmap to get to a place where you can fully trust a computer in a way that you cannot trust a person? We trust computers to fly F-22s because a human being cannot operate an F-22 without a computer. If you’re like, “the F-22 control computer is going to lie to you a little bit,” we would not let that happen. It’s weird that we have a new class of computers where we’re like, “Well, trust it a little bit less.”

I do not think that large language models should be prescribing drugs for people or doing medicine. But I promise you, if you come to me, Aidan, with a set of symptoms and you ask me to diagnose you, you should trust Cohere’s model more than me. It knows way more about medicine than I do. Whatever I say is going to be much, much worse than the model. That’s already true, just today, in this exact moment. At the same time, neither I nor the model should be diagnosing people. But is it more trustworthy? You should genuinely trust that model more than this human for that use case.

In reality, who you should be trusting is the actual doctor that’s done a decade of education. So the bar is here; Aidan’s here. [Gestures] The model is slightly above Aidan. We will make it to that bar, I absolutely think, and at that point, we can put on the stamp and say it is trustworthy. It’s actually as accurate as the average doctor. One day, it’ll be more accurate than the average doctor. We will get there with the technology. There’s no reason to believe we wouldn’t. But it’s continuous. It’s not a binary between you can’t trust the technology or you can. It’s, where can you trust it?

Right now, in medicine, we should really rely on humans. But in other places, you can [use AI]. When there’s a human in the loop, it’s actually just an aid. It’s like this augmentative tool that is really useful for making you more productive and doing more or having fun or learning about the world. There are places where you can trust it effectively and deploy it effectively already today. That space of places that you can deploy this technology and put your trust in it, it’s only going to grow. To your question about, will the technology rise to the challenge of all the things that we want it to do? I really deeply believe it will.

That’s a great place to end it. This was really great.

Decoder with Nilay Patel, a podcast about big ideas and other problems.

