
More goes into cost than just the raw compute to serve inference. They can't fire everyone and keep operating while paying only the compute bill, and they can't stop training new models either. The actual cost is much more than the compute for inference.


Yes, there are some additional operating costs, but they're marginal compared to the cost of the compute. Your suggestion was personnel: Anthropic is reportedly on a run-rate of $3B with O(1k) employees, most of whom aren't directly doing ops. They also have to pay for non-compute infra, but that's a rounding error.

Training is a fixed cost, not a variable cost. My initial comment was about the unit economics, so fixed costs don't matter there. But including the full training costs doesn't change the math that much for any of the popular models, as far as I can tell. E.g. the alleged leaked OpenAI financials for 2024 projected $4B spent on inference and $3B on training. And inference workloads are currently growing insanely fast, meaning the training cost gets amortized over a larger volume of inference (e.g. Google showed a graph of their inference volume at Google I/O: 50x growth in a year, now at 480T tokens/month[0]). A rough sketch of that amortization is below.

[0] https://blog.google/technology/ai/io-2025-keynote/
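To make the amortization concrete, here's a toy model (my own sketch, not from any of the cited sources): assume a constant marginal inference cost per token (the $1 per million tokens below is a made-up placeholder) and a fixed yearly training budget on the order of the alleged $3B figure. The blended cost per token then falls toward the marginal cost as volume grows:

    # Toy amortization model: training is a fixed yearly cost, inference has a
    # constant marginal cost per token, so the blended cost per token falls
    # toward the marginal cost as inference volume grows.
    TRAINING_COST_PER_YEAR = 3e9      # order of the alleged 2024 training spend
    MARGINAL_COST_PER_TOKEN = 1e-6    # hypothetical: $1 per million tokens

    def blended_cost_per_million_tokens(tokens_per_month):
        tokens_per_year = tokens_per_month * 12
        total = TRAINING_COST_PER_YEAR + MARGINAL_COST_PER_TOKEN * tokens_per_year
        return total / tokens_per_year * 1e6

    for monthly in (10e12, 100e12, 480e12):  # 10T, 100T, 480T tokens/month
        print(f"{monthly / 1e12:.0f}T tokens/mo -> "
              f"${blended_cost_per_million_tokens(monthly):.2f} per 1M tokens")

Under these made-up assumptions, at the ~480T tokens/month Google quoted the fixed training spend adds only about $0.50 per million tokens, versus roughly $25 per million at a fiftieth of that volume.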


That's all the more reason to run at a positive margin though - why shovel money into taking a loss on inference when you need to spend money on R&D?


I heart you.

Classic fixed / variable cost fallacy: if you look at the steel and plastic in a $200k Ferrari, it’s worth about $10k. They have 95% gross margins! Outrageous!

(Nevermind the engine R&D cost, the pre-production molds that fail, the testing and marketing and product placement and…)




