ainch's comments | Hacker News

OpenAI are testing ads in the free tier of ChatGPT, but they state that the actual LLM responses won't include advertising/product placement [0].

[0]: https://openai.com/index/our-approach-to-advertising-and-exp...


I'm not sure Anthropic will become ad-supported - the vast bulk of their revenue is b2b. OpenAI have an enormous non-paying consumer userbase who are draining them of cash, so in their case ads make a lot more sense.

As LLMs are productionised/commodified, they're incorporating changes that are enthusiast-unfriendly. Small dense models are great for enthusiasts running inference locally, but for parallel batched inference, MoE models are much more efficient.
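
As a rough back-of-envelope sketch (the dense parameter count is illustrative; the ~671B total / ~37B active figures are DeepSeek-R1's published ones), the per-token compute gap looks something like this:

    # Back-of-envelope FLOPs-per-token comparison (illustrative numbers;
    # a transformer forward pass costs roughly 2 FLOPs per active parameter).
    DENSE_PARAMS = 70e9   # e.g. a large dense model an enthusiast might run locally
    MOE_TOTAL = 671e9     # DeepSeek-R1 total parameters
    MOE_ACTIVE = 37e9     # DeepSeek-R1 parameters active per token

    dense_flops = 2 * DENSE_PARAMS
    moe_flops = 2 * MOE_ACTIVE
    print(f"dense: {dense_flops:.1e} FLOPs/token")
    print(f"MoE:   {moe_flops:.1e} FLOPs/token ({dense_flops / moe_flops:.1f}x cheaper)")
    # The catch for enthusiasts: all MOE_TOTAL parameters must still be resident
    # in memory, so the cheap FLOPs only pay off on big, batched serving hardware.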


Perhaps they need more advertising around the correct spelling of his name.


Good catch, but it was just an honest typo.


Gemini 3.0's knowledge cutoff is January. I think you can get away with it if the model has good search/tool-use capability.


Which other group is that?


Of course people don't say it, but there are many cases where reported algorithmic improvements are attributable to poor baseline tuning or shoddy statistical treatment. Tao is exhibiting a lot more epistemic humility than most researchers, who probably have stronger incentives to market their work and publish.


Inferior in what sense? Genie 3 is addressing a fundamentally different problem to a physics sim or procgen: building a good-enough (and broad-enough) model of the real world to train agents that act in the real world. Sims are insufficient for that purpose, hence the "sim2real" gap that has stymied robotics development for years.


Genie 3 is inferior in the sense you just described: the sim2real gap would be greater, because it's a less accurate model of the aspects of the world that are relevant to robotics.


The reports are definitely bland, but I find them very helpful for discovering sources. For example, if I'm trying to answer an academic question like "has X been done before," sending something to scour the internet and find me examples to dig into is really helpful - especially since LLMs have some base knowledge which can help with finding the right search terms. It's not doing all the thinking, but those kinds of broad overviews are quite helpful, especially since they can just run in the background.


I've noticed that most of my LLM usage is like this:

ask a loaded "filter question" I more or less know the answer to, then mostly skip the prose and go straight to the links to its sources.


The "loaded question" approach works for getting MUCH better pro/con lists, too, in general, across all LLMs.


I do that too. I wonder how much of it is the LLM being helpful, and how much is the RAG algorithm somehow providing better references to the LLM than a Google search can?


Generally you train all the experts simultaneously. The benefit of MoEs is that you get cheap inference, because you only use the active expert parameters, which constitute a small fraction of the total parameter count. For example, DeepSeek R1 (which is especially sparse) only uses 1/18th of the total parameters per-query.


> only uses 1/18th of the total parameters per-query.

only uses 1/18th of the total parameters per token. It may use a large fraction of them in a single query.
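
A quick simulation makes the distinction concrete. The expert counts below follow DeepSeek-V3/R1's published config (256 routed experts per MoE layer, 8 active per token); the uniform routing is a simplification, since real routing is learned and skewed:

    import random

    N_EXPERTS, TOP_K = 256, 8   # DeepSeek-V3/R1-style MoE layer
    QUERY_TOKENS = 500

    # Route each token to TOP_K experts and track the union over the query.
    touched = set()
    for _ in range(QUERY_TOKENS):
        touched.update(random.sample(range(N_EXPERTS), TOP_K))

    print(f"per token: {TOP_K}/{N_EXPERTS} experts active")
    print(f"per query: {len(touched)}/{N_EXPERTS} experts touched over {QUERY_TOKENS} tokens")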


That's a good correction, thanks.

