Large language models cost a fortune to build. OpenAI, which is reportedly in the process of raising $6.5 billion, *needs* $6.5 billion, because, “by some estimates, it’s burning through $7 billion a year to fund research and new A.I. services and hire more employees.” Anthropic [is expected to spend](https://www.theinformation.com/briefings/anthropic-projected-to-burn-more-than-2-7-billion-in-cash-this-year) $2.7 billion this year. Facebook is spending billions more.
It probably won’t get cheaper. Chips might get better; compute costs might go down; Moore’s law; etc, etc, etc. But as models get better, pushing the frontier further out will likely get more difficult. The research gets harder, and the absolute amount of compute required to train a new model goes up. It’s like climbing Mount Everest: The higher you go, the thinner the air, and the tougher each step gets.1 Even if it gets cheaper to do the math required to build new models, that math has diminishing returns. To build a better model in 2024, you have to do more and harder math than you had to do in 2023.
Despite these costs, people will probably keep building new models. People believe that LLMs are the next technological gold rush, and the companies that build the best ones will make their employees and investors a fortune. They are trying to build artificial general intelligence. Human nature compels us to make everything faster, higher, and stronger.
If the industry does keep building new models, the value of old models decays pretty quickly. Why use GPT-3 when you can start using GPT-4 by changing a dropdown in ChatGPT? If a competitor puts out a better model than yours, people can switch to theirs by updating a few lines of code. To consistently sell an LLM, you have to consistently be one of the best LLMs.
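To make that switching cost concrete, here is a rough sketch of what “updating a few lines of code” looks like in practice, using the two vendors’ real Python SDKs; the prompt and model names are just placeholders, not anyone’s actual integration:

```python
# Hypothetical before/after: the same one-off completion, first against OpenAI's
# API, then against Anthropic's. Prompt and model names are illustrative.

# --- Before: OpenAI ---
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this quarter's sales."}],
)
print(response.choices[0].message.content)

# --- After: Anthropic ---
import anthropic

anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this quarter's sales."}],
)
print(message.content[0].text)
```

The request and response shapes aren’t identical, but the diff is a handful of lines, which is the point: the model, not the integration, is the product.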
Even if the industry doesn’t keep building new models, or if we hit a technological asymptote, the value of old models still decays pretty quickly. There are several open source models like Llama and Mistral that are, at worst, a step or two behind the best proprietary ones. If the proprietary models stop moving forward, the open source ones will quickly close the gap.
Therefore, if you are OpenAI, Anthropic, or another AI vendor, you have two choices. Your first is to spend enormous amounts of money to stay ahead of the market. This seems very risky though: The costs of building those models will likely keep going up; your smartest employees might leave; you probably don’t want to stake your business on always being the first company to find the next breakthrough. Technological expertise is rarely an enduring moat.
So here’s an obvious prediction: AI will follow a nearly identical trajectory [as AWS, Azure, and GCP]. In ten years, a new type of cloud—a generative one, a commercial Skynet, a public imagination—will undergird nearly every piece of technology we use.
Other people have made similar comparisons. And on the surface, the analogy seems roughly reasonable. Foundational models require tons of money to build, just like cloud services do. Both could become ubiquitous pieces of the global computing infrastructure. The market for both is easily in the tens of billions of dollars, likely in the hundreds of billions, and potentially in the trillions.
There is, however, one enormous difference that I didn’t think about: You can’t build a cloud vendor overnight. Azure doesn’t have to worry about a few executives leaving and building a worldwide network of data centers in 18 months. AWS is an internet business, but it dug its competitive moat in the physical world. The same is true for a company like Coca-Cola: The secret recipe is important, but not that important, because a Y Combinator startup couldn’t build factories and distribution centers and relationships with millions of retailers over the course of a three-month sprint.
But an AI vendor could? Though OpenAI’s work requires a lot of physical computing resources, they’re leased (from Microsoft, or AWS, or GCP), not built. Given enough money, anyone could have access to the same resources. It’s not hard to imagine a small team of senior researchers leaving OpenAI, raising a ton of money to rent some computers, and being a legitimate disruptive threat to OpenAI’s core business in a matter of months.
In other words, the billions that AWS spent on building data centers are a lasting defense. The billions that OpenAI spent on building prior versions of GPT are not, because better versions of it are already available for free on GitHub. In stylized terms, Anthropic put itself deeply in the red to build ten incrementally better models; eight are now worthless, the ninth is open source, and the tenth is the thin technical edge that is keeping Anthropic alive. Cloud providers can be disrupted, but it would almost have to happen slowly. Every LLM vendor is eighteen months from dead.3
What, then, is an LLM vendor’s moat? Brand? Inertia? A better set of applications built on top of their core models? An ever-growing bonfire of cash that keeps its models a nose ahead of a hundred competitors?
I honestly don’t know. But AI companies seem to be an extreme example of the market misclassifying software development costs as upfront investments rather than necessary ongoing expenses. An LLM vendor that doesn’t spend tens of millions of dollars a year—and maybe billions, for the leaders—improving their models is a year or two from being out of business.
Though that math might work for huge companies like Google and Microsoft, and for OpenAI, which has become synonymous with artificial intelligence, it’s hard to see how that works for smaller companies that aren’t already bringing in sizable amounts of revenue. Though giant funding rounds, often given to pedigreed founders, can help them jump to the front of the race, it’s not at all obvious how they stay there, because someone else will do the same thing a year later. They have to either raise enormous amounts of money in perpetuity,4 or they have to start making billions of dollars a year. That’s an awfully high hurdle for survival.
In this market, timing may be everything: At some point, the hype will die down, and people won’t be able to raise these sorts of rounds. And the winners won’t be whoever ran the fastest or reached some finish line, but whoever was leading when the market decided the race was over.