top of page
Search

ChatGPT Agent

OpenAI has announced a new functionality within ChatGPT – the agent.


I read, watched, researched.

Naturally, I have questions.

But, for starters – just the basics.



Unlike the standard model that simply responds to prompts, the agent plans and executes tasks independently, using a virtual computer, browser, terminal, APIs, and connectivity with external services.

The user gives it a task – the agent decides how to accomplish it. This means it can, for example:


  • open your calendar and prepare a briefing

  • check your CRM status and update a table

  • write an email

  • book a flight

  • create a presentation

  • use your Gmail, Google Drive, Slack, Asana, Notion – if you allow it.


It operates in real time and provides a narrative for every step.

You can pause it, interrupt it, ask for explanations, or intervene manually at any moment.

For any action with consequences (purchases, sending messages, editing files), it asks for explicit user confirmation.


The agent runs in a sandboxed environment, isolated from your computer. For code and files, it uses its own virtual terminal. It browses websites via its own browser. It accesses external services through so-called connectors, each of which must be activated individually by the user.


Currently, it’s available to users on the ChatGPT Pro plan (400 messages per month).

Plus and Team plans will have limited access (40 messages). Rollout coming soon.

Enterprise and Education rollout will begin in the coming weeks.


From a technical standpoint, the agent combines previous prototypes, Operator and Deep research, into a single solution.

In practical terms, this is ChatGPT that no longer just responds – it takes initiative.


What ChatGPT agent knows, sees, and (does not) remember.


When software gains the ability to browse the web, write emails, and launch a terminal on its own, security is no longer a nice-to-have – it becomes the central topic.


OpenAI has (rightly) introduced its strictest security framework yet.


They say the following.


Control remains in your hands. The agent can’t do anything without your knowledge.

Every action with consequences (e.g., purchasing tickets, sending an email, editing a document) requires explicit user confirmation. If you leave the tab while the agent is doing something (important?), Watch Mode activates automatically – pausing until you return.

You can pause it at any time, interrupt, take over manually, or ask for a summary of what has been done so far.


Virtual computer, isolated environment. The agent operates in a sandbox (an isolated virtual environment).

This means it has no access to your local computer; everything runs on OpenAI’s infrastructure, and the terminal is restricted (no unauthorized network calls).


Data. My favorite topic. In takeover mode (e.g., when the agent accesses your Gmail), passwords and cookies are not visible or stored. OpenAI does not remember this data. You enter it directly into the session.

With a single click (Settings → Privacy → “Delete browsing data”) you can fully erase all cookies and log out of all sessions opened by the agent. Or so they claim. Based on experience, we’ll take that with a grain of salt.

Also, memory is disabled in agent mode – what you do in that session does not enter the model’s permanent memory. As I’ve said before, let’s wait and see.


Security. The model is specially trained to ignore manipulation attempts (e.g., third-party instructions injected into sites the agent reads). Prompt-injection protections function for both the text and visual browser. OpenAI uses automated filters and monitors that check the agent’s behavior in real time.


Certification and compliance. OpenAI states compliance with standards like SOC 2 Type 2, CSA STAR Level 1, DPA for users under GDPR, regular external pen-tests, and an active bug bounty program.

Risk classification. Due to potential for abuse, the agent is internally classified as “High Biological & Chemical capability” within the OpenAI Preparedness Framework, meaning the highest level of security protocols apply.


....


Everything you just read is from OpenAI.

Interestingly (or not so), OpenAI (again) does not share information about resource consumption or environmental impact. Why, I wonder?

But, let’s get back to the topic.


...


What you can do


If you plan to use the agent functionality (or already have), here are a few (unsolicited) tips.


Grant the minimum permissions. Each connector (e.g., Gmail, Google Calendar, Slack) is activated separately. If it doesn’t need to read your emails – don’t give it access.

If the task is a one-off – disable access after it’s done.

Less access = less risk. As with any delegation, share only what you would with a human collaborator. Nothing more.


Step in when things get tricky. Watch mode is there for a reason. If the agent is sending an email, editing a presentation, or changing a file, watch what it does. Watch mode brings you back into the loop automatically. Use it. Follow along. Intervene as needed.


Regularly delete sessions and cookies. The agent has its own browser – and creates its own cookies. Click Settings → Privacy → Delete browsing data. This deletes all temporary sessions and instantly disables access you no longer use.


Track the activity log. For every task the agent executes, you receive a detailed narration of its actions. This isn’t just a nice-to-have – it’s your own trace to the source of xyz. If something goes wrong, you can always check where things broke down.


Do not delegate important decisions. The agent can write an email, research a topic, draft a document. But it doesn’t know the context. It doesn’t know reputation. It doesn’t know consequences. Simply – it doesn’t know.

Keep decision-making for yourself – especially for sensitive, legal, financial, and reputational matters.


The agent is not smart. Just capable. The agent can act. It knows how to search, click, run code, process data, send emails. But it doesn’t know why it’s doing something, who it’s sending it to, or what the consequences might be.


No understanding. No intuition. No responsibility.


And that brings us to the most important point.


The more powerful the AI tools, the more important human involvement becomes.

Not the other way around.

The paradox is obvious.


Before, when the model only responded to prompts, the damage was limited to bad advice.

Today, if you give it access to your systems and tasks – consequences occur in the real world.


So, this isn’t about trusting the tool, but about user responsibility. The agent won’t tell you “we shouldn’t do this” – it won’t even know there’s a problem. You are still the only one who understands the context, risk, and consequences.


In this sense, this phase of AI evolution doesn’t require less oversight, but more. Not passive observation, but active participation. Not delegating the decision, but shaping it together with a new operator – who knows nothing about you except what you explicitly tell it.


Reality


In a time when every new (ChatGPT) AI tool is positioned as the solution to everything, the agent functionality in ChatGPT will quickly become part of the sales narrative.


Before you jump in expecting it to replace teams, integrations, and procedures, it’s important to set real boundaries and expectations. For example:


It’s not plug-and-play automation. The agent can execute a series of tasks – but only within a single session, single user context, and with limited permissions. There is no background orchestration. There is no persistent or multi-user workflow. Everything is immediate, conversational, and within a sandbox.

It is not a business system – it’s an operational assistant.


It’s not a replacement for your integration layer. The agent doesn’t understand exceptions. It doesn’t manage dependencies. It has no logic, it doesn’t know what rollback is, and it has no memory outside the active session. It can click, process, send – but that doesn’t mean it understands all process variants.


It’s not ready for enterprise environments. No team-level insight. No granular access control. No audit mechanisms by organizational units. No SLAs. For more serious uses (especially in regulated industries), the agent still requires additional oversight infrastructure.


It’s not a tool that understands your business context. The agent reacts to prompts. It doesn’t know team relations, reputational risks, legal constraints, or commercial implications. Anything not explicitly stated will be omitted from its decision. This isn’t negligence – it’s a limitation.


It’s not instant savings. For every time gain, there’s a learning period – like process mapping, user education, setting boundaries, and defining internal usage rules. I lecture you about AI governance every day, I know.

Without that extra invested effort, the benefits remain fragmented. Without structure, the agent may create more confusion than value.


It is not automatically compliant with regulatory requirements. The fact that the agent runs in a sandbox and uses explicit confirmations doesn’t relieve you of obligations regarding data protection. If you use the agent for business purposes, you are still the data controller under GDPR/ZZPL.


This entails obligations, including:

  • keeping records of processing

  • conducting DPIA when applicable

  • reviewing contractual relationships with OpenAI as a processor

  • implementing additional technical and organizational measures if data is transferred outside the EEA

  • and more


The EU AI Act will further tighten these standards. If the agent in your system processes data considered high-risk (e.g., HR, health, finance, legal documents), you’ll be required to conduct a broader compliance and accountability assessment.


In other words – the agent does not assume your organizational responsibility. It only performs tasks – everything else remains on you.


So.


The agent can be a valuable tool in a business environment, but it is not an out-of-the-box solution. It requires thoughtful introduction, oversight, and a clear understanding of its actual capabilities and limitations.

For organizations able to provide this, the agent can bring operational advancement.

For everyone else – premature trust can carry significant risk.


So what now?


The agent is that next step. Toward what, exactly?

Not for the hype.

But because it changes the relationship between users and AI tools. It’s no longer just about “what to ask it.” It’s about “what I can entrust it with.”


Technically – impressive.

Organizationally – potentially disruptive.

Ethically – we are only now entering uncharted territory.


Because when the tool searches on its own, connects to your accounts, makes micro-decisions, and executes them – there’s no more neutrality. This is an agent. And that obligates it – and you with it.


That’s why it’s extremely important to approach this phase of AI not as consumers, but as active participants.

Not swept up, but not paranoid, either.


Informed. Present. Critical.


Is the agent revolutionary? Depends who you ask.


For users delegating a digital task for the first time – it is.

For companies who know how to use it – it can be.

For those not keeping an eye on what it’s doing – it will be, but probably not in a good way.


In any case – it’s no longer a tool waiting for your prompt. And that changes things.


My metaphor through Midjourney eyes.



 
 
 

Comments


bottom of page