According to reports on September 13, after weeks of speculation and anticipation, OpenAI has finally launched o1, its first "reasoning model." The company bills it as one of its most powerful artificial intelligence products to date, with problem-solving abilities that show unprecedented human-like qualities of thought. At least, that's the company's pitch.
Like OpenAI's previous research and product releases, however, o1 is still something of a tease. OpenAI claims the model performs better on complex tasks but has revealed few details about how it was trained. For now, o1 is available only as a limited preview to paying ChatGPT users and select developers.
OpenAI says with confidence that o1 reasons at a depth comparable to that of a doctoral student in fields such as physics, chemistry and biology. The company considers this progress so important that it decided to break from the existing GPT-4 line, reset the model numbering to "1," and even considered dropping the widely recognized "GPT" branding. That brand not only defines OpenAI's chatbots but also marks the takeoff of the entire field of generative artificial intelligence.
The research report and blog post OpenAI released today showcase many of o1's impressive capabilities on complex reasoning tasks. These tasks range from advanced mathematics, programming puzzles and cipher decoding to specialized problems in genetics, economics and quantum physics. Numerous charts show o1 significantly outperforming GPT-4o, OpenAI's previous top language model, in the company's internal evaluations, with especially strong results in programming, mathematics and science.
The key to these improvements is a lesson taught to every child: "think before you act." OpenAI says o1 spends more time "thinking deeply" before it answers, much as a person would. The company calls this process a "chain of thought," a term from AI research for a problem-solving strategy that breaks a problem into multiple intermediate steps. The chain-of-thought mechanism lets the model work through small subtasks one at a time, correct its own mistakes and refine its solution. When a user asks o1 a question, the model displays "Thinking" and then shows some of the steps in its reasoning, such as "tracing historical evolution" or "integrating pieces of evidence." Finally, it reports how long it thought, for example "Thought for 9 seconds," and then gives its answer.
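The idea of breaking a problem into intermediate steps, checking the result and reporting how long the process took can be illustrated with a toy sketch. This is not how o1 works internally, which OpenAI has not disclosed; the order-total problem, the `Trace` class and the `solve_with_steps` function are hypothetical names invented purely for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Records each intermediate step as a (label, value) pair."""
    steps: list = field(default_factory=list)

    def note(self, label, value):
        self.steps.append((label, value))
        return value


def solve_with_steps(price, quantity, discount_rate):
    """Hypothetical toy problem: total cost of an order after a discount."""
    trace = Trace()
    start = time.perf_counter()

    # Break the problem into small intermediate steps instead of one jump.
    subtotal = trace.note("compute subtotal", price * quantity)
    discount = trace.note("compute discount", subtotal * discount_rate)
    total = trace.note("subtract discount", subtotal - discount)

    # Self-check: recompute the answer by a different route and compare.
    check = price * quantity * (1 - discount_rate)
    if abs(check - total) > 1e-9:
        raise ValueError("intermediate steps disagree with the direct formula")
    trace.note("verify result", True)

    elapsed = time.perf_counter() - start
    return total, trace, elapsed


if __name__ == "__main__":
    total, trace, elapsed = solve_with_steps(20.0, 3, 0.25)
    for label, value in trace.steps:
        print(f"{label}: {value}")
    print(f"Thought for {elapsed:.6f} seconds; answer: {total}")
```

The recorded step labels play the role of the short summaries a user sees, while the values behind them stay in the trace, loosely mirroring the split between the visible steps and the fuller reasoning.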
The complete chain of thought that o1 produces while generating an answer is hidden from users. This simplifies the user experience, but it also sacrifices some transparency, making it hard for users to understand how the model reaches its final conclusion. Hiding it also keeps the model's core technology out of competitors' hands. OpenAI has revealed very little about how o1 was built, saying only that its training relies on a new optimization algorithm and a new training dataset.
Despite OpenAI's unprecedented marketing push, it remains unclear whether o1 will transform the ChatGPT experience or amount to an incremental improvement on the existing model. Judging from the research results the company presented and from my own preliminary testing, however, o1's output is indeed more thorough and more logical. This reflects OpenAI's confidence in scaling: larger AI models, more data and more computing power will drive leaps in AI performance. The company says o1 performs better the longer it is trained and the longer it is given to think.
Longer thinking also comes at a higher cost, however. OpenAI charges developers to use its technology, and o1's per-word output fee is roughly four times that of GPT-4o. The high-performance chips, power and cooling systems that generative AI requires are extremely expensive. To meet these massive computing needs, technology companies, energy companies and other industries are expected to invest trillions of dollars. This has raised concerns that AI could become a new bubble, like the cryptocurrency or dot-com bubbles. Because o1 takes longer to respond to a question, it consumes more resources, which further clouds the question of when AI technology will become profitable.
Perhaps the most significant effect of this extended processing time is not a technical or financial burden but a rebranding. Compared with obscure terms such as "transformer" and "diffusion" attached to earlier AI models, OpenAI's "reasoning model" and "chain of thought" sound closer to everyday language and carry a humanizing flavor.
This language strategy is not unique to OpenAI. The startup Anthropic describes its main model, Claude, as having a "personality" and a "brain." Google touts its AI's "reasoning" capabilities, and the AI search startup Perplexity claims its product "understands you." OpenAI's blog states outright that o1 "thinks like a human," "works like a real software engineer," and "has human-like reasoning capabilities." Although the company's research lead stressed that OpenAI does not consider its product equivalent to the human brain, he also acknowledged that o1 does seem more "human" in some respects than previous models.
For an industry whose product positioning is still unclear, humanizing language is undoubtedly a powerful marketing tool. The definition of intelligence is inherently vague, and the actual value of language models is hard to assess accurately. The name "GPT" may seem simple, but it conveys almost no real meaning. OpenAI's chief research officer, Bob McGrew, sees OpenAI o1 as a first step toward "more sensible naming" that describes the company's products more clearly, yet such subtle differences in combinations of letters and numbers mean little to ordinary people.
Marketing a tool that can "think like you," however, draws less on laboratory jargon than on literature. The description is no more precise than other AI terms, and may even be vaguer, but that vagueness is exactly what gives it its appeal. An AI model that claims to "think like a human" leaves room for imagination, letting each user fill in the gaps and picture a machine that "functions like me." Perhaps the key to selling generative AI is to let customers build and fill in the "magic" themselves.