In our first installment, I outlined key strategies for leveraging AI agents to improve business efficiency. I explained how, unlike standalone AI models, agents iteratively refine tasks using context and tools to improve outcomes such as code generation. I also discussed how multi-agent systems foster communication across departments, creating a unified user experience and driving productivity, resilience and faster updates.
Success in building these systems depends on mapping roles and workflows, as well as establishing safeguards such as human oversight and error control to ensure safe operation. Let’s dive into these critical elements.
Safeguards and autonomy
Agents imply autonomy, so various safeguards must be built into an agent within a multi-agent system to reduce errors, waste, legal exposure or harm when agents operate autonomously. Applying all of these safeguards to every agent may be overkill and pose a resource challenge, but I highly recommend considering each agent in the system and consciously deciding which of these safeguards it needs. An agent should not be allowed to operate autonomously if any one of these conditions is met.
Explicitly defined human intervention conditions
Triggering any one of a set of predefined rules determines the conditions under which a human must confirm an agent's behavior. These rules should be defined on a case-by-case basis and can be declared in the agent's system prompt or, in more critical use cases, enforced using deterministic code external to the agent. One such rule, in the case of a purchasing agent, would be: "All purchases should first be verified and confirmed by a human. Call your 'check_with_human' function and do not proceed until it returns a value."
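As a rough illustration of enforcing such a rule in deterministic code outside the agent, here is a minimal sketch. The Purchase structure and execute_purchase function are hypothetical; only the check_with_human name comes from the rule above.

```python
# Minimal sketch of enforcing the rule in deterministic code outside the agent.
# The Purchase structure and execute_purchase function are hypothetical; only
# the check_with_human name comes from the rule above.
from dataclasses import dataclass

@dataclass
class Purchase:
    item: str
    amount_usd: float

def check_with_human(purchase: Purchase) -> bool:
    """Block until a human approves or rejects the proposed purchase."""
    answer = input(f"Approve purchase of {purchase.item} for ${purchase.amount_usd:.2f}? [y/n] ")
    return answer.strip().lower() == "y"

def execute_purchase(purchase: Purchase) -> None:
    # The guard lives outside the agent's prompt, so the agent cannot talk its way past it.
    if not check_with_human(purchase):
        raise PermissionError("Purchase rejected by human reviewer.")
    print(f"Purchasing {purchase.item}...")  # placeholder for the real purchasing API call

execute_purchase(Purchase(item="laptop", amount_usd=1499.00))
```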
Safeguarding agents
An agent can be paired with a safeguarding agent whose role is to check for risky, unethical or non-compliant behavior. The agent can be forced to always verify all or some elements of its behavior with the safeguarding agent, and not to proceed unless the safeguarding agent gives the green light.
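One way to wire this up, as a minimal sketch: route each proposed action through a second LLM call that acts as the safeguarding agent. The call_llm helper, the prompt text and the APPROVE/REJECT convention are illustrative assumptions, not a specific framework's API.

```python
# Minimal sketch of pairing an agent with a safeguarding agent that must
# approve each proposed action before it is carried out. The call_llm helper,
# prompts and APPROVE/REJECT convention are illustrative, not a specific API.
def call_llm(system_prompt: str, user_message: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

SAFEGUARD_PROMPT = (
    "You review proposed agent actions for risky, unethical or non-compliant "
    "behavior. Reply with exactly APPROVE or REJECT, followed by a short reason."
)

def safeguarded_step(worker_prompt: str, task: str) -> str:
    proposed_action = call_llm(worker_prompt, task)
    verdict = call_llm(SAFEGUARD_PROMPT, f"Proposed action:\n{proposed_action}")
    if not verdict.strip().upper().startswith("APPROVE"):
        # Do not proceed without the safeguarding agent's green light.
        raise RuntimeError(f"Action blocked by safeguarding agent: {verdict}")
    return proposed_action
```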
Uncertainty
Our lab recently published a paper on a technique that provides a measure of uncertainty for what a large language model (LLM) generates. Given LLMs' propensity to confabulate (commonly known as hallucination), giving preference to output with lower uncertainty can make an agent much more trustworthy. There is a cost to pay here, too: evaluating uncertainty requires generating multiple outputs for the same request so that we can rank them by uncertainty and choose the behavior that is least uncertain. This can slow down the system and increase costs, so it should be reserved for the most critical agents within the system.
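The paper's exact method is not described here, so as a stand-in, here is a simple self-consistency sketch: sample the model several times and treat disagreement among the samples as a crude uncertainty score. The sample_llm helper and the escalation step are assumptions.

```python
# Minimal sketch of sampling-based uncertainty: generate several candidates for
# the same request and use their agreement as a crude certainty score. This is
# a simple self-consistency proxy, not the specific technique from the paper;
# sample_llm is a placeholder for your own LLM client (temperature > 0).
from collections import Counter

def sample_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def answer_with_uncertainty(prompt: str, n_samples: int = 5) -> tuple[str, float]:
    samples = [sample_llm(prompt).strip().lower() for _ in range(n_samples)]
    best_answer, votes = Counter(samples).most_common(1)[0]
    uncertainty = 1.0 - votes / n_samples  # 0.0 means all samples agreed
    return best_answer, uncertainty

# Route the answer onward automatically only if uncertainty is low enough;
# otherwise escalate to a human or a safeguarding agent.
```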
Disengagement button
There may be times when you need to stop all autonomous agent-based processes. This could be because we need consistency in the system, or because we have detected behavior that needs to be stopped while we figure out what is wrong and how to fix it. For more critical workflows and processes, it is important that this disengagement does not cause all processes to stop or become fully manual, so it is recommended to provide a deterministic fallback mode of operation.
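In practice this can be as simple as a flag that routes each workflow either to the agent or to a deterministic fallback. A minimal sketch follows; the invoice-handling workflow and the flag handling are invented purely for illustration.

```python
# Minimal sketch of a disengagement button: one flag that routes a critical
# workflow either to the autonomous agent or to a deterministic fallback.
# The invoice workflow and flag handling are invented for illustration.
AUTONOMY_ENABLED = True  # in practice, read from a config service or feature flag

def agent_handle_invoice(invoice: dict) -> str:
    raise NotImplementedError("agent-driven processing goes here")

def rule_based_handle_invoice(invoice: dict) -> str:
    # Minimal deterministic behavior that keeps the process alive while disengaged.
    return f"Invoice {invoice.get('id')} queued for manual review."

def handle_invoice(invoice: dict) -> str:
    if AUTONOMY_ENABLED:
        return agent_handle_invoice(invoice)   # autonomous path
    return rule_based_handle_invoice(invoice)  # deterministic fallback
```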
Agent-generated work orders
Not all agents within an agent network need to be fully integrated into apps and APIs. This may take some time and require a few iterations to get the right result. My advice is to add a generic placeholder tool to agents (typically leaf nodes in the network) that would simply issue a report or work order, containing suggested actions to be taken manually on behalf of the agent. This is a great way to get your agent network up and running in an agile way.
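A minimal sketch of such a placeholder tool, assuming a simple JSON-file work-order format; the fields and file layout are illustrative, not a prescribed schema.

```python
# Minimal sketch of a placeholder tool that issues a work order instead of
# calling a real downstream API. The fields and JSON-file layout are illustrative.
import json, os, time, uuid

def issue_work_order(action: str, details: dict) -> str:
    order = {
        "id": str(uuid.uuid4()),
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "suggested_action": action,
        "details": details,
        "status": "pending_manual_execution",
    }
    os.makedirs("work_orders", exist_ok=True)
    path = os.path.join("work_orders", f"{order['id']}.json")
    with open(path, "w") as f:
        json.dump(order, f, indent=2)
    return f"Work order {order['id']} issued."  # returned to the agent as the tool result
```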
Testing
With LLM-based agents, we are gaining robustness at the expense of consistency. Furthermore, given the opaque nature of LLMs, we are dealing with black-box nodes in a workflow. This means that we need a different testing regime for agent-based systems than the one used for traditional software. The good news, however, is that we are used to testing such systems, as we have been running human-driven organizations and workflows since the dawn of industrialization.
While the examples I showed above have a single entry point, all agents in a multi-agent system have an LLM as their brain, and so any of them can act as an entry point for the system. We should use divide and conquer, first testing subsets of the system starting from various nodes within the hierarchy.
We can also use generative AI to come up with test cases that we can run on the network to analyze its behavior and push it to reveal its weaknesses.
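A minimal sketch of combining the two ideas: pick an entry node, have an LLM generate probing queries for that node's role, and run them against just that subtree. The helpers and their signatures are placeholders for whatever agent framework and LLM client you use.

```python
# Minimal sketch of divide-and-conquer testing: enter the network at a chosen
# node, feed it LLM-generated test queries, and inspect the outputs. The
# generate_test_queries and run_subtree helpers are placeholders.
def generate_test_queries(role_description: str, n: int = 10) -> list[str]:
    # Ask an LLM for n diverse, adversarial queries aimed at this role.
    raise NotImplementedError("plug in your LLM client here")

def run_subtree(entry_agent, query: str) -> str:
    # Send the query into the agent network at this node only.
    raise NotImplementedError("plug in your agent framework here")

def test_subtree(entry_agent, role_description: str) -> list[tuple[str, str]]:
    results = []
    for query in generate_test_queries(role_description):
        results.append((query, run_subtree(entry_agent, query)))
    return results  # review the outputs (manually or with another LLM) for failures
```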
Lastly, I’m a big believer in sandboxing. Such systems should initially be launched on a smaller scale within a controlled and secure environment, before being gradually rolled out to replace existing workflows.
Fine-tuning
A common misconception about gen AI is that it gets better the more you use it. This is obviously not the case: LLMs are pre-trained. That said, they can be fine-tuned to bias their behavior in various ways. Once a multi-agent system has been devised, we can choose to improve its behavior by taking each agent's logs and labeling our preferences to build a fine-tuning corpus.
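A minimal sketch of building such a corpus, assuming each agent writes one JSON record per interaction and a human has labeled the preferred ones; the log format and field names are assumptions.

```python
# Minimal sketch of turning labeled agent logs into a fine-tuning corpus.
# Assumes a JSON-lines log with "input", "output" and a human-added "label" field.
import json

def build_tuning_corpus(log_path: str, out_path: str) -> int:
    kept = 0
    with open(log_path) as logs, open(out_path, "w") as out:
        for line in logs:
            record = json.loads(line)  # one agent interaction per line
            if record.get("label") == "preferred":
                out.write(json.dumps({
                    "prompt": record["input"],
                    "completion": record["output"],
                }) + "\n")
                kept += 1
    return kept  # number of examples written to the fine-tuning file
```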
Pitfalls
Multi-agent systems can go haywire, meaning that occasionally a query may never finish, with agents continually talking to each other. This requires some form of timeout mechanism. For example, we can check the communication history for a given query, and if it is getting too long or we detect repetitive behavior, we can stop the flow and start over.
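A minimal sketch of such a guard; the turn and repetition thresholds are arbitrary values for illustration.

```python
# Minimal sketch of a haywire guard: cap the number of messages exchanged for
# one query and stop when agents start repeating themselves.
MAX_TURNS = 30    # arbitrary ceiling on messages per query
MAX_REPEATS = 3   # arbitrary repetition threshold

def should_stop(history: list[str]) -> bool:
    """Return True when the conversation for this query should be cut off."""
    if not history:
        return False
    if len(history) > MAX_TURNS:
        return True
    # Crude repetition check: the latest message has already appeared several times.
    return history.count(history[-1]) >= MAX_REPEATS
```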
Another problem that can occur is a phenomenon I’ll call overload: expecting too much from a single agent. The current state of the art for LLMs doesn’t allow us to give agents long, detailed instructions and expect them to follow them all, every time. Also, did I mention that these systems can be inconsistent?
One mitigation for these situations is what I call granularization: splitting agents into multiple connected agents. This reduces the load on each agent and makes agents more consistent in their behavior and less likely to go haywire. (An interesting area of research our lab is undertaking involves automating the granularization process.)
Another common problem in the way multi-agent systems are designed is the tendency to define a coordinating agent that calls several agents to complete a task. This introduces a single point of failure that can result in a rather complex set of roles and responsibilities. My suggestion in these cases is to think of the workflow as a pipeline, with one agent completing part of the work, then passing it on to the next one.
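As a sketch of the pipeline pattern (and of what granularized agents can look like once split up), here is one possible shape; the agent roles and the call_agent helper are illustrative, not a particular framework's API.

```python
# Minimal sketch of a pipeline of small agents: each finishes its part of the
# work and hands the result to the next, instead of one coordinator fanning
# out to everyone. The roles and the call_agent helper are illustrative.
from typing import Callable

Agent = Callable[[str], str]

def call_agent(system_prompt: str) -> Agent:
    def run(work_item: str) -> str:
        # Send work_item to an LLM with this system prompt and return the reply.
        raise NotImplementedError("plug in your LLM client here")
    return run

PIPELINE: list[Agent] = [
    call_agent("Extract the customer's request from the raw ticket."),
    call_agent("Draft a response plan for the extracted request."),
    call_agent("Turn the plan into a customer-facing reply."),
]

def run_pipeline(ticket: str) -> str:
    work = ticket
    for agent in PIPELINE:
        work = agent(work)  # each agent only sees the previous agent's output
    return work
```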
Multi-agent systems also have a tendency to pass context to other agents along the chain. This can overwhelm other agents, confuse them and is often unnecessary. I suggest allowing agents to maintain their own context and reset it when we know we’re dealing with a new request (kind of like how sessions work for websites).
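A minimal sketch of session-style context handling, where each agent owns its history and resets it when a new request begins; the class layout is an assumption, not a specific framework's API.

```python
# Minimal sketch of session-style context: each agent owns its history and
# resets it when a new request begins, rather than receiving the whole
# upstream conversation.
class AgentSession:
    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.history: list[dict] = []

    def reset(self) -> None:
        """Call when a new request starts, like expiring a web session."""
        self.history.clear()

    def handle(self, message: str) -> str:
        self.history.append({"role": "user", "content": message})
        reply = self._call_llm()  # uses only this agent's own context
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def _call_llm(self) -> str:
        raise NotImplementedError("plug in your LLM client here")
```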
Finally, it is important to note that there is a relatively high bar for the capabilities of the LLM used as the agents' brain. Smaller LLMs may need a lot of prompt engineering or fine-tuning to fulfill requests. The good news is that there are already several commercial and open-source LLMs, albeit relatively large ones, that clear the bar.
This means that cost and speed must be an important consideration when building a large-scale multi-agent system. Furthermore, we should set the expectation that these systems, while faster than humans, will not be as fast as the software systems we are used to.
Babak Hodjat is CTO of AI at Cognizant.