
Reasoning Models and the Future of AI Startups

By Justin “Hutch” Hutchens, Trace3 Innovation Principal | October 17, 2024

OpenAI recently released the o1-preview model, its first “reasoning model” and a significant step toward achieving Artificial General Intelligence (AGI). AGI refers to AI that surpasses human capabilities across most tasks, a long-term goal of OpenAI. OpenAI has outlined a five-level roadmap toward AGI[1]:

1. Chatbots: AI with conversational language abilities
2. Reasoners: AI that can solve problems at a human level
3. Agents: AI systems that can take actions on a user’s behalf
4. Innovators: AI that can aid in invention
5. Organizations: AI that can do the work of an entire organization

[1] Cook, J. (2024, July 16). OpenAI’s 5 levels of “super AI” (AGI to outperform human capability). Forbes. https://www.forbes.com/sites/jodiecook/2024/07/16/openais-5-levels-of-super-ai-agi-to-outperform-human-capability/
 
Is This What Ilya Saw?

Nearly a year ago (in November 2023), Sam Altman was briefly ousted from his position as OpenAI’s CEO. One of the leading figures in the attempted coup was Ilya Sutskever, the company’s chief scientist. Sutskever was reportedly alarmed that OpenAI’s scaling efforts were outpacing its safety measures. Rumors swirled that he had witnessed something called Q-Star, a secret project involving reinforcement learning to optimize chain-of-thought reasoning in large language models (LLMs). The episode spawned the meme, “What Did Ilya See?”

Almost a year later, the o1-preview model has emerged, and it aligns closely with the rumored details of the Q-Star project. OpenAI claims that the o1 model improves reasoning through reinforcement learning algorithms that reward effective chain-of-thought, and by spending additional time “thinking” through complex tasks. In its announcement post, OpenAI stated:

“Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process. We have found that the performance of o1 consistently improves with more reinforcement learning (train-time compute) and with more time spent thinking (test-time compute).” [2]

The o1-preview model excels at multi-step tasks that require chain-of-thought reasoning. It breaks complex problems down into sub-tasks, producing impressive results in areas like algorithm generation, advanced mathematics, and scientific analysis. On a qualifying exam for the International Mathematics Olympiad (IMO), the o1 model solved 83 percent of problems, compared to just 13 percent for GPT-4o, OpenAI’s previous leading model: a 70-percentage-point improvement.[3]

[2] Learning to reason with LLMs. OpenAI. (2024, September 12). https://openai.com/index/learning-to-reason-with-llms

[3] Introducing OpenAI o1-preview. OpenAI. (2024, September 12). https://openai.com/index/introducing-openai-o1-preview
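OpenAI has not published the internals of how o1 allocates its extra “thinking” time, but a well-known public analogue of spending more test-time compute is self-consistency sampling: generate several independent chain-of-thought answers and keep the answer that the majority of chains agree on. The Python sketch below is purely illustrative, assuming the standard openai client library and a placeholder model name; it is not o1’s actual mechanism.

```python
# Illustrative sketch of "more test-time compute" via self-consistency
# sampling. This is NOT o1's internal mechanism (which OpenAI has not
# published); it is a public analogue: sample several independent
# chain-of-thought answers, then keep the most common final answer.
from collections import Counter
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def solve_with_majority_vote(question: str, samples: int = 5) -> str:
    """Sample several independent reasoning chains; return the majority answer."""
    answers = []
    for _ in range(samples):
        response = client.chat.completions.create(
            model="gpt-4o",   # placeholder model name
            temperature=1.0,  # encourage diverse reasoning chains
            messages=[
                {"role": "system",
                 "content": ("Think step by step. End with a final line "
                             "that starts with 'ANSWER:'.")},
                {"role": "user", "content": question},
            ],
        )
        text = response.choices[0].message.content or ""
        # Vote only on the final answer line, not the reasoning itself.
        finals = [ln for ln in text.splitlines() if ln.startswith("ANSWER:")]
        if finals:
            answers.append(finals[-1].removeprefix("ANSWER:").strip())
    # More sampled chains means more test-time compute and a more reliable vote.
    return Counter(answers).most_common(1)[0][0] if answers else ""
```

The intuition carries over to o1 even if the mechanism differs: each additional reasoning chain is additional test-time compute, and accuracy on hard problems tends to improve as more compute is spent.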

 

The Impact on AI Startups

Prior to the release of the o1-preview model, there were already many AI startups on the market that had built solutions on top of LLMs to equip them with chain-of-thought reasoning (a sketch of such a wrapper appears at the end of this section). There is no question that the introduction of this capability as a native feature from OpenAI is going to step on a lot of toes. This should come as no surprise: Sam Altman has publicly stated that OpenAI will “steamroll” AI startups that attempt to capitalize on AI hype by building solutions to improve upon the core capabilities of its models. In an interview with Harry Stebbings on 20VC, Altman stated:

I think fundamentally there are two strategies to build on AI right now. There’s one strategy which is to assume the model is not going to get better and then you kind of build all these little things on top of it. There’s another strategy which is build assuming that OpenAI is going to stay on the same rate of trajectory and the models are going to keep getting better at the same pace. It would seem to me that 95% of the world should be betting on the latter category but a lot of the startups have been built in the former category. When we just do our fundamental job, because we have a mission, we’re going to steamroll you.[4]

While the o1 model is impressive, it is also the first step in fulfilling Altman’s promise to steamroll much of the AI startup market. One of the most important factors in evaluating a startup, whether you are founding, investing, or adopting, is the question of the company’s long-term viability. This question is uniquely challenging in the fast-growing generative AI market. As OpenAI and other major players build more advanced models, many adjacent solutions could become obsolete. Anticipating which solutions will become obsolete, and at what pace, is the real challenge.

[4] Levi, D. (2024, April 22). Sam Altman: OpenAI is “going to steamroll you” if your startup is a wrapper on GPT-4. Tech Startups. https://techstartups.com/2024/04/22/sam-altman-openai-is-going-to-steamroll-you-if-your-startup-is-a-wrapper-on-gpt-4/
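To make the “wrapper” concern concrete, the hypothetical sketch below shows the kind of thin chain-of-thought layer many pre-o1 startups shipped: one LLM call to decompose a problem into sub-tasks, then one call per sub-task. The function names, prompts, and model name are all invented for illustration; the point is that o1 now performs this decomposition natively, leaving little room for the layer itself.

```python
# Hypothetical pre-o1 "chain-of-thought wrapper": one LLM call to plan
# sub-tasks, then one call per sub-task. o1 performs this decomposition
# natively, which is exactly why thin layers like this risk being
# steamrolled. Model name and prompts are placeholders, not a real product.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Single LLM call; 'gpt-4o' is a placeholder model name."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""

def solve_by_decomposition(problem: str) -> str:
    # Step 1: ask the model to break the problem into sub-tasks.
    plan = ask(f"List the sub-tasks needed to solve this, one per line:\n{problem}")
    # Step 2: solve each sub-task in order, carrying results forward as context.
    done: list[tuple[str, str]] = []
    for step in (s.strip() for s in plan.splitlines() if s.strip()):
        result = ask(f"Problem: {problem}\nCompleted so far: {done}\nNow do: {step}")
        done.append((step, result))
    # Step 3: synthesize the final answer from the intermediate results.
    return ask(f"Problem: {problem}\nIntermediate results: {done}\nGive the final answer.")
```

Everything of value here lives in the underlying model; the wrapper contributes only orchestration, which is precisely the layer a native reasoning model absorbs.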


The Questions to Ask

The technology industry is still investing massive amounts of money in building larger and more capable frontier AI models. We can expect continued rapid advancement for the foreseeable future, most likely at intervals of two to three years as future generations of larger-scale models are released.

The most important question that founders, investors, and adopters should be asking is how the value proposition of a startup’s solution transforms as general AI capabilities rapidly advance. Will a more advanced AI model soon be able to solve the startup’s problem natively, without any additional solution layered on top? This is the most imperative question of long-term viability in the generative AI boom.

Unfortunately, this is not an easy question to answer; it requires speculating about a still very uncertain future trajectory for AI. But if nothing else, the o1-preview model serves as a stark reminder that we must think ahead, make best-effort forecasts of the trajectory of AI advancement, and draw informed conclusions to best position ourselves in this rapidly changing market.

Justin “Hutch” Hutchens is an Innovation Principal at Trace3 and a leading voice in cybersecurity, risk management, and artificial intelligence. He is the author of “The Language of Deception: Weaponizing Next Generation AI,” a book focused on the adversarial risks of emerging AI technology. He is also a co-host of The Cyber Cognition Podcast, a show that explores the frontier of technological advancement and seeks to understand how cutting-edge technologies will transform our world. Hutch is a veteran of the United States Air Force, holds a Master’s degree in information systems, and routinely speaks at seminars, universities, and major global technology conferences.