OpenAI's long-hyped agent has just been officially released! The official introduction: Operator is one of our first agents. These are AIs that can do work for you independently: just give it a task and it will execute it. For example, give it a shopping list, and Operator can buy the items fully autonomously.


You can see that the human user's hands have left the keyboard, and every operation on the screen is carried out by Operator itself.

You can also use it to make restaurant reservations:


As soon as Altman's livestream ended, OpenAI President Greg Brockman couldn't wait to announce:

2025 is the year of the intelligent agent.


And this time, Operator went live as soon as it was announced, though for now it is available only to Pro users. Yes, that is the top-tier membership that costs 200 U.S. dollars (approximately RMB 1,458) a month.

After watching the livestream, netizens were very excited and called it a "Crazy Thursday".


But...


Well, Operator is great, but it would be even better if it were open source. Time for DeepSeek and Meta to step up (doge).

Operating the browser without human assistance

Words alone prove nothing, so let's first go through the official demo to see how "independent" Operator really is.

It can be used on almost any website without human assistance.


For example: find a clam linguine recipe on Allrecipes and add all the ingredients to my Instacart cart.


It operates by the same logic a human does: it looks at what is on the screen and decides which buttons to click.

This differs from other agents that work through APIs or other programming interfaces. Its reasoning is based on a text-based chain of thought.


Once the recipe is confirmed, which store should the order be placed with?

The human gives a further instruction, to use Gus's, and Operator then goes to the corresponding website and starts placing the order.


When it reaches steps such as login or payment, Operator hands control back to the user.

In hands-on testing, some bloggers found that when Operator was blocked by Reddit, it added the keyword "Reddit" to its search queries to find the relevant posts another way.
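The hand-off behavior described above can be illustrated with a minimal sketch. It is not OpenAI's implementation: the `Step` type, the `SENSITIVE_KINDS` set, and `run_steps` are hypothetical names invented here, and the only assumption taken from the article is that login and payment steps are returned to the user.

```python
from dataclasses import dataclass

# Hypothetical categories of steps that the agent must not perform itself.
SENSITIVE_KINDS = {"login", "payment"}


@dataclass
class Step:
    kind: str         # e.g. "click", "type", "login", "payment"
    description: str


def run_steps(steps):
    """Work through steps in order, pausing at the first sensitive one.

    Returns (completed, handed_over): the descriptions the agent finished
    on its own, and the step returned to the user (None if there was none).
    """
    completed = []
    for step in steps:
        if step.kind in SENSITIVE_KINDS:
            return completed, step
        completed.append(step.description)
    return completed, None


plan = [
    Step("click", "open cart"),
    Step("type", "enter delivery address"),
    Step("payment", "enter card details"),
    Step("click", "confirm order"),
]
done, handoff = run_steps(plan)
print(done)
print(handoff.description if handoff else "nothing to hand over")
```

In this toy plan the agent finishes the first two steps, then stops and returns the payment step to the human; the final confirmation is left untouched until the user resumes.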


Users can also add custom instructions for a personalized experience, for example setting a preferred airline for flight bookings.

Operator lets users save prompts for quick access from the home page, which suits repetitive tasks such as restocking on a shopping site.

Operator can also run multiple tasks at once, much like having several browser tabs open: for example, ordering a personalized enamel mug on Etsy while booking a campsite on Hipcamp.
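The article does not say how Operator schedules simultaneous tasks, so the following is only a generic sketch of running independent tasks concurrently with Python's standard `asyncio` module. The `run_task` and `run_all` helpers and the step counts are made up for illustration.

```python
import asyncio


async def run_task(name, steps):
    """Pretend to work through `steps` browser actions, yielding between them."""
    for _ in range(steps):
        await asyncio.sleep(0)  # yield so the other tasks can make progress
    return f"{name}: finished after {steps} steps"


async def run_all(tasks):
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(run_task(name, steps) for name, steps in tasks))


results = asyncio.run(run_all([("Etsy mug order", 3), ("Hipcamp booking", 2)]))
print(results)
```

Each task advances a step at a time and yields to the others, so neither has to finish before the other starts.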


Under the hood, Operator is powered by a new model, Computer-Using Agent (CUA).

CUA combines GPT-4o's vision capabilities with advanced reasoning acquired through reinforcement learning, which enables it to interact with GUIs.

Operator can see the contents of the web interface and perform any action that a mouse and keyboard allow. This lets it operate automatically without the need for custom API integration.

If it runs into problems or makes mistakes, Operator can use its reasoning capabilities to self-correct, and it hands control back to the user when it gets stuck and needs help.

CUA achieved state-of-the-art (SOTA) results on both the WebArena and WebVoyager benchmarks.
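The observe, act, and self-correct cycle described above can be sketched with a toy loop. Everything here is an assumption for illustration: `FakeBrowser` stands in for a real browser, the "screen" is a set of button labels rather than pixels, and `propose` is a trivial keyword matcher, not the CUA model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: visible buttons, some of which fail when clicked."""
    buttons: set
    broken: set = field(default_factory=set)
    clicks: list = field(default_factory=list)

    def screenshot(self):
        # A real agent would receive pixels; here the "screen" is a set of labels.
        return set(self.buttons)

    def click(self, label):
        if label in self.broken:
            raise RuntimeError(f"click on {label!r} had no effect")
        self.clicks.append(label)


def propose(screen, goal, tried):
    """Toy policy: pick an untried on-screen label that contains the goal word."""
    for label in sorted(screen):
        if goal in label.lower() and label not in tried:
            return label
    return None


def run_agent(browser, goal, max_failures=2):
    """Observe, act, and retry; hand control back when stuck or out of retries."""
    tried, failures = set(), 0
    while failures <= max_failures:
        screen = browser.screenshot()          # observe
        label = propose(screen, goal, tried)   # decide
        if label is None:
            return "handed_back"               # nothing left to try: ask the user
        tried.add(label)
        try:
            browser.click(label)               # act
            return "done"
        except RuntimeError:
            failures += 1                      # self-correct: try another option
    return "handed_back"


browser = FakeBrowser(buttons={"Checkout", "Checkout as guest", "Help"},
                      broken={"Checkout"})
status = run_agent(browser, "checkout")
print(status, browser.clicks)
```

In this run the first click fails, the loop switches to the alternative button, and the task completes. If no matching button exists, or the failure budget is exhausted, the function returns `"handed_back"` instead, mirroring the hand-off to the user.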


Pro members in the US can already use Operator at operator.chatgpt.com. Paying users on the Plus, Team, and Enterprise plans, as well as users in other regions, will have to wait, but OpenAI promises to integrate these capabilities into ChatGPT in the future.

OpenAI enters “Level 3”

In July 2024, OpenAI laid out its "five levels from AI to AGI":

Level 1: Chatbots. AI can interact with people conversationally.

Level 2: Reasoners. AI can solve problems at a human level.

Level 3: Agents. AI systems can carry out actions and tasks.

Level 4: Innovators. AI can aid in invention and innovation.

Level 5: Organizations. AI can do the work of an entire organization.

At the time, OpenAI said that it was still at Level 1 and was approaching Level 2.

And now, with the release of Operator, Altman announced:

This is the beginning of our entry into Level 3.

It is worth noting that, as mentioned at the beginning, OpenAI quietly stressed a key point: Operator is just one of the "first" agents, not the only one.

During the livestream, Altman also announced:

We will also be launching additional agents in the coming weeks and months.


One More Thing

There was also a small episode just before OpenAI's livestream today.

Two hours before Operator's release, OpenAI tweeted that it had fixed an issue causing elevated error rates in ChatGPT and the API.


Netizens had been faked out once again (doge).


In other good news, Altman also announced that the free tier of ChatGPT will get access to o3-mini.