This doesn't make sense. Inference alone is profitable, but you have to continually train new models. There isn't a point where you will have a final model and can just serve inference and profit; you always have to train more models.
It's not at all the same as what Amazon was doing. At any point, Amazon could have turned off the expansion engine and turned on the profits. AI companies don't have that luxury: if they stop training, they'll just fall behind and die because they don't have a competitive model. They are locked into training in order to be competitive. They are not profitable by default and merely choosing growth over profit.
I think there's another point of view: let's consider each model as an investment. Then it is sufficient that each model earns more than it cost to develop. And this generally holds (I heard that GPT-4.5 was a notable exception).
> At any point, Amazon could have turned off the expansion engine and turned on the profits
Maybe, but they wouldn't be in the dominant position they are in today if they had turned off the expansion early. Spending to suppress big competitors like Walmart in the online shopping market pays dividends.
Yeah they can probably be almost entirely depreciated within 2-3 years, and pretty substantially front-loaded within that. Like, the expensive part of being a CPU company is making new ones, but Intel can't exactly just rest on their laurels and sell Pentiums for the rest of time.
Amazon would absolutely take a loss on certain products in order to dominate the category, squeeze out competitors and then bring the price back up. It's one of the reasons they're so dominant in general now. Also one of the reasons why Amazon Basics has basically everything that exists and they're usually at or near the top of their respective categories -- third-party sellers simply can't compete.
I have 128 GB of unified memory (M4 Max) and the user experience with local inference is still pretty bad. I'm so glad something like llama.cpp exists so I don't have to wrangle Python (which I hate), but OpenCode is entirely disrespectful of the KV-cache so I had to switch to Pi (but Pi is going relatively well actually).
Even so, I can't really run at hundreds of tokens per second which is practically table stakes for my work. Even if I did manage to run that fast, the model would probably be completely braindead and stomp all over the task.
Wish I could afford an M5 Max but I've been between jobs for months without even a single interview. Sucks to be a developer these days.
I do use DeepSeek, it's exceptionally cheap! Inference is slow though, and it's not particularly intelligent but the experience is better than local inference.
To a certain extent, but not completely. OpenAI and Anthropic are taking losses on their entire offering—that is a huge difference. Amazon, for example, has pumped its profits back into R&D for decades. What AI companies are doing right now is running the Uber playbook on an epic scale. In the US, there isn't much competition, so they can maintain a duopoly. But look at what happened in China: Uber collapsed and pulled out.
Now, the entire world is facing competition from DeepSeek and Qwen at a fraction of the cost. According to a reliable Shenzhen source, they will halve their prices again by the end of this year using newer Huawei GPUs. The current 7nm chips are already bleeding OpenAI dry. By the end of this year, they will upgrade to 5nm, and by June of next year, 3nm. They don't even have to be better—just 95% as good at 1/20th of the price.
I don't see OpenAI and Anthropic surviving much outside of America; they are likely staring down Groupon’s fate. You can research it yourself: China has no issues with electricity because they have a massive power surplus. This is why OpenAI and Anthropic are so scared right now. They must IPO by 2027, because after that, they will suffer the exact same fate as Groupon.
I do love the DeepSeek models, they're so incredibly cheap for functionality that nears Sonnet. Weeks of heavy usage still lands squarely under $10 for me.
Compare that with how I pay $200 a month for Claude and am still hitting the limits with any sort of sustained usage. They even have a special usage limit for Sonnet to prevent you from using too much of that either.
I'm super frustrated with how slow DeepSeek is though. And it's not nearly ready to be unsupervised for long periods of time like Claude is. Just this morning I left Fable 5 unsupervised for about eight hours straight. Single turn. DeepSeek often gets even much shorter turns wrong, so I wouldn't trust it with anywhere near that length of time alone. Not to mention it'd get so much less done because of how slow it is.
Also, did you use an LLM to correct your grammar after you posted? Lol
Increasingly it looks like it will end with a bubble bursting. LLMs and AI will survive, like the internet survived the dotcom bubble. But OpenAI and Anthropic could just be today's AOL and Yahoo.
I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model. Currently I don’t like where we are heading, with models being sabotaged because it's “too dangerous”.
> I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model.
Even if you don't acquire hardware to host local models, a hardware crash means that I should be able to rent the crashed hardware at just above the cost of electricity + bandwidth.
The way I can now, for $7/m, rent a VPS that can run my B2B webapp for a company with 10k users, I look forward to buying a timeshare on GPUs that lets me pay $12/m for all-you-can-eat GPU.
However, I actually think that while it won't give the results expected (AI agents run the company, build all features, etc.), it will nevertheless become a developer tool like IDEs, something "you have to have".
It's here to stay, but probably with more realistic expectations than some CEOs/CTOs are pushing for (agents for everything, nobody writes 1 LOC, self-healing systems, etc).
So the market expectations will probably be resized, but these tools are here to stay. For cybersecurity alone (from CVEs to cyber warfare), that's already worth all the money they are throwing at it.
It reminds me of the Chinese bike wars, where everyone was slashing prices trying to keep market share until the bubble burst and everyone lost billions.