OpenAI, the maker of ChatGPT, continues to push ahead on AI Agents, which it hopes will set the standard for the next big thing in technology. We have covered the concept of AI Agents several times before. Now OpenAI has launched 'Operator,' its first publicly available AI Agent, although access is limited for now.
OpenAI has released Operator as a research preview. According to the company's blog, Operator performs tasks autonomously with minimal human intervention, unlike current AI chatbots and AI tools, which always require user input.
Operator is designed to independently interact with web pages using a web browser. It can type queries, click, and scroll on its own. Essentially, this means there's little left for you to do.
Operator is currently available only to ChatGPT Pro subscribers in the United States.
How Operator Functions
Using a website normally means clicking and typing with a mouse, keyboard, or touch screen. With Operator, all of these actions happen automatically.
Operator manages everything from clicking to typing and employs reasoning for self-correction. If it encounters difficulties, it will seek your assistance, ensuring users retain control.
Operator comes with a built-in virtual browser and opens websites in response to the commands you type. It interacts with websites the way a human does, but without a physical keyboard or mouse.
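To make the behaviour described above more concrete, here is a minimal Python sketch of an act, check, retry loop that falls back to the user when it gets stuck. It is only an illustration, not OpenAI's implementation: the names `run_task` and `make_flaky_browser`, the step strings, and the retry limit are all assumptions made for this example, and the "browser" is a simple stand-in rather than a real one.

```python
def run_task(steps, perform, max_attempts=3):
    """Run each step in order; retry on failure; hand over to the user if stuck.

    `perform` is any callable that tries one step and returns True on success.
    Returns ("done", None) or ("needs_user", failed_step).
    """
    for step in steps:
        for _ in range(max_attempts):
            if perform(step):
                break  # step succeeded, move on to the next one
        else:
            # Every attempt failed: stop and ask the user to take over.
            return ("needs_user", step)
    return ("done", None)


def make_flaky_browser(fail_once=(), always_fail=()):
    """Hypothetical stand-in for a browser, used only to exercise the loop."""
    already_failed = set()
    history = []

    def perform(step):
        history.append(step)
        if step in always_fail:
            return False
        if step in fail_once and step not in already_failed:
            already_failed.add(step)
            return False  # fail the first time, succeed on retry
        return True

    return perform, history


steps = ["type:running shoes", "click:search", "scroll:down"]

# One click fails once, so the loop retries it and then finishes.
perform, history = make_flaky_browser(fail_once={"click:search"})
print(run_task(steps, perform), history)

# A click that never works ends with a request for the user to step in.
perform, history = make_flaky_browser(always_fail={"click:search"})
print(run_task(steps, perform))
```

A real agent would decide each step by looking at the page rather than following a fixed list, but the control flow (try, check, retry, ask for help) matches the article's description.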
Autonomous Website Interaction
OpenAI has also released a demonstration video on YouTube showcasing how Operator operates. The demo illustrates Operator autonomously navigating a shopping website, adding items to the cart, and proceeding to payment without user input.
The demonstration also shows Operator booking a table at a hotel through its built-in browser without user intervention, with every step visible in real time. If necessary, you can take control at any time and guide Operator. The company says Operator can perform multiple tasks concurrently.
Source: aajtak
Should a website request sensitive details, Operator will prompt you to take over. You have to enter login credentials and card information yourself.
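The hand-over rule can be pictured as a simple check that runs before the agent fills in a form. The Python sketch below assumes a keyword heuristic. The keyword list and the function name `fields_needing_user` are hypothetical, and the article does not say how Operator actually detects sensitive fields.

```python
# Hypothetical keywords that mark a form field as sensitive.
SENSITIVE_KEYWORDS = ("password", "login", "card", "cvv")


def fields_needing_user(field_labels):
    """Return the labels that look sensitive, so the agent can pause on them."""
    return [
        label
        for label in field_labels
        if any(word in label.lower() for word in SENSITIVE_KEYWORDS)
    ]


checkout_form = ["Search", "Card number", "Password"]
flagged = fields_needing_user(checkout_form)
if flagged:
    print("Please take over to fill in:", flagged)
```

Plain keyword matching is crude (it would also flag a label such as "Discard"), so treat it only as a way to see where the pause happens in the flow.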
OpenAI is collaborating with several companies to enhance this AI Agent, including a partnership with Uber to better understand real-world needs. From booking a cab to conversing with the driver, Operator can manage it all.
The model is not yet perfect. It is not optimized for complex interfaces and struggles with tasks such as creating slideshow presentations and organizing calendars. The company has acknowledged that several challenges remain and that Operator is not entirely accurate. However, it plans to extend Operator’s availability to ChatGPT Plus, Team, and Enterprise users shortly.