Foundation Models without significant RLHF *and* access to high-quality proprietary datasets are likely the fastest-depreciating assets in human history. @ericvishria
I think only four are likely to have enduring value and transition into “Foundation Agents” over the next few years:
ChatGPT, Gemini, Grok/Tesla/X and Llama.
ChatGPT by virtue of RLHF and Microsoft's various datasets, plus access to closed, internal data at most enterprises via Copilot. If OpenAI ever separated from Microsoft, then its value would asymptote to zero. OpenAI trying to make both a GPU competitor *and* a phone would be crazy bearish and an epic strategic mistake. Azure OpenAI is doing much better than standalone OpenAI on the enterprise side. Enterprise is hard.
Gemini by virtue of RLHF (via SGE, the Search Generative Experience) and Google's many datasets (YouTube transcripts, Gmail).
Grok by virtue of RLHF via inclusion in X's premium tier and access to X's real-time data. The combination of Grok with Tesla's visual dataset and v12 FSD algorithm will likely create the best AI for robotics and the "real world." Likely to see a much better version of RT-2 in Optimus. Could also be an insertion point for Tesla to enter cloud computing as application SaaS is replaced by MaaS (models as a service).
Llama is the only model on the list that is open-source, and the one with the widest range of outcomes. Might be wise to invest more in this and less in the Metaverse in front of the coming humiliation from Apple's headset. The current virtual personality strategy seems deeply strange, but I am old. Meta could rapidly iterate on Llama, put it in Insta/WhatsApp/BigBlue for RLHF, try to compete in search while agentic AI slowly replaces search, and then go into cloud computing via Llama (open source, but it needs to run in our cloud). Just a thought.
Obviously all four of these will need to be iterated every 12-18 months (GPT-5, etc.) as they are also depreciating.
This isn’t a “Game of Kings.” This is a “Game of Emperors.”
Amazon is trying to enter the game via Anthropic, which is a "Crown Prince" at best right now. Google invested in Anthropic primarily to help the TPU ecosystem - Amazon likely needs both the LLM engineering talent (no great internal LLM yet) and Anthropic's help with the Trainium ecosystem. Bedrock is a good strategy though, and P5 is the best H100 instance. Apple is nowhere, which is a risk to them - their only potential friend is Grok/Tesla, with some shot of Microsoft/OpenAI via a search deal for Bing. Meta could've been a friend, but whoops - ATT sure was fun, but a few years later it turned out to have been irrelevant and to have only increased the competitive advantages of the largest apps like Meta. The exact opposite of what Apple wanted to accomplish. Possible that if Gemini leads to a dramatically superior assistant vs. Siri, then Android starts really gaining share.
Verticalized AIs like Midjourney will also have value. Maybe a *lot* of value.

Regulatory capture - see @bgurley's thoughts - also increases the odds of the outcome described above. As an investor, this is great. As a human, I think some smart regulation is probably wise, but smart regulation is rare. The most important outcome to avoid as a human is a world with only one dominant model. Open source is good in this sense.
The key assumption underpinning all of this is that scaling laws will continue (i.e., the loss-vs-compute prediction from the GPT-4 technical report) such that "intelligence is an engineering problem."
If not, then it might be a free-for-all, although arguably proprietary data and RLHF would be even more important.
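For concreteness, a minimal sketch of the scaling-law form behind that assumption - the GPT-4 technical report fit a power law in compute with an irreducible-loss term and used it to predict GPT-4's final loss from much smaller runs. The constants below are placeholders, since the fitted values were not published:

```latex
% Final pre-training loss predicted as a power law in training compute C,
% plus an irreducible-loss term; a, b, k are fitted constants (placeholders
% here - OpenAI did not publish the fitted values). The report states the
% prediction held from runs using at most ~1/10,000th of GPT-4's compute.
L(C) = a \cdot C^{b} + k
```

The bet is that this curve keeps holding as C grows, so capability gains stay purchasable.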
Nvidia is a wild card - they would like more than four dominant Foundation Models in the same way they want more than a few cloud computing providers. They will be able to impact the outcome in the same way they have given CoreWeave and Lambda unnaturally high incremental share of GPU cloud computing revenues this year. Wonder what the competitive landscape would look like absent Megatron, and what they will do next.


The upcoming launch of the MI300 is likely going to accelerate all of this by roughly cutting the cost of inference in half. High inference costs drive high prices for consumers to prevent negative variable margins, which results in less usage and thus less RLHF. The B100 will probably accelerate things further - especially on the training side, as it will be the first chip whose netlist was completed after the release of ChatGPT. And then the MI400 could further lower inference costs and maybe even compete in training if Infinity Fabric is competitive with NVLink (tough) and Ethernet is competitive with InfiniBand.
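A back-of-the-envelope sketch of that margin logic - every number here is an illustrative assumption, not actual vendor economics:

```python
# Sketch of the variable-margin argument above; prices and costs are
# illustrative assumptions, not actual vendor economics.

def variable_margin(price: float, cost: float) -> float:
    """Variable margin as a fraction of revenue."""
    return (price - cost) / price

price_per_1k_tokens = 0.003  # assumed price, $/1K tokens
cost_per_1k_tokens = 0.002   # assumed serving cost, $/1K tokens

print(f"margin today:        {variable_margin(price_per_1k_tokens, cost_per_1k_tokens):.0%}")

# If MI300-class hardware roughly halves inference cost, a provider can hold
# price (fatter margins) or cut price / expand free tiers, which grows usage
# and therefore the volume of RLHF feedback data.
print(f"margin at half cost: {variable_margin(price_per_1k_tokens, cost_per_1k_tokens / 2):.0%}")
```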

Improving GPU utilization via faster storage (Weka/Vast/Pure), better back-end networking (Astera/Enfabrica/Arista/Broadcom/Annapurna/xSight, and eventually coherent optics inside the datacenter, with linear pluggables/co-packaged optics happening now) and improved CPU/GPU integration (Grace Hopper, MI300 and interestingly Dojo) will combine to shatter the "memory wall" and further improve the ROI on training - both by directly lowering training costs and indirectly by increasing margins via lower inference costs.
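To make the "memory wall" concrete, here is a rough roofline calculation using approximate public H100 SXM specs as assumptions:

```python
# Rough roofline arithmetic behind the "memory wall": FLOPs a kernel must do
# per byte moved from HBM before compute, not memory, becomes the bottleneck.
# Approximate public H100 SXM specs, used here as assumptions.

peak_flops = 989e12       # ~989 TFLOPs dense BF16
hbm_bandwidth = 3.35e12   # ~3.35 TB/s HBM3

break_even_intensity = peak_flops / hbm_bandwidth  # FLOPs per byte
print(f"break-even arithmetic intensity: ~{break_even_intensity:.0f} FLOPs/byte")

# Memory-bound phases (e.g., decode-time attention reading the KV cache) sit
# far below this threshold, so faster storage, networking and tighter CPU/GPU
# memory integration raise realized utilization - and with it, training ROI.
```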

A wise man said that the architecture with the most "useful compute per unit of energy" would win and that upfront costs are essentially irrelevant - agreed, and we are moving fast towards more useful compute per unit of energy. A 1M GPU cluster would be wild but seems possible in 5 years, per commentary at OCP.
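For a sense of why a 1M GPU cluster would be wild, a quick power sanity check - per-GPU draw and datacenter overhead below are assumptions, not quoted specs:

```python
# Power sanity check for a hypothetical 1M GPU cluster; per-GPU draw and
# datacenter overhead are assumptions for illustration.

gpus = 1_000_000
watts_per_gpu = 700   # assumed H100-class accelerator TDP
pue = 1.3             # assumed power usage effectiveness (cooling, etc.)

total_gw = gpus * watts_per_gpu * pue / 1e9
print(f"~{total_gw:.1f} GW")  # roughly the output of a large power plant
```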
Sidenote: Grok is the best name for an LLM thus far. We are all about to be strangers in a strange land. Claude, Pi and Bard are competing for the worst. And Grok looks like it actually might be funny, unlike the others, especially the annoying super dork Pi. I like what Pi was trying to accomplish, but execution is everything - maybe it has improved since I tried it, but I couldn't take it.

A little embarrassed that I exceeded the 2,500-word limit.