Thinking agents are no longer enough. For Jawad Ashraf, the real goal is building agents that remember and carry context across days, including transactions and failures, instead of waking up fresh every time a user opens a chat. In his view, persistent memory is becoming a structural requirement for AI that runs in the wild rather than just in demos.
To better understand the role of memory in AI agents, TimesCrypto spoke with Jawad Ashraf, founder and CEO of Vanar, an AI-native Layer 1 blockchain focused on semantic memory, on-chain reasoning, and machine-to-machine payments. Its system combines a main blockchain with a memory layer called Neutron and an AI reasoning engine, Kayon, giving agents lasting, verifiable memory directly on-chain rather than adding it as a separate component.

From Cool Agent Demos to Real-World Systems
When asked what changed in the last year and a half to move agents from “cool” prototypes to systems teams actually deploy, Ashraf pointed first to reliability and cost curves and then to the tooling underneath them.
Reliability is now “good enough” to embed agents inside real workflows, he said, while falling inference costs let teams run larger, longer chains rather than treating every call as a budget crisis. Just as important, orchestration frameworks have grown more opinionated about state and control flow, so developers no longer need to hand-build as much scaffolding around each agent.
“You can now treat an agent like a system closer to software engineering than prompt tinkering,” Ashraf said, adding that this is why the industry conversation has shifted toward agent engineering and operational discipline.
A second shift is that agents are beginning to transact on their own. With payment rails such as x402 enabling machine-to-machine payments over HTTP, agents can autonomously pay for APIs or data feeds. That turns “spending” into an action the agent can take, while remaining governed by humans.
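The shape of such a flow can be sketched in a few lines. This is a minimal, illustrative sketch only: the header name, payload fields, and `pay_fn` callback below are assumptions for the example, not the actual x402 wire format. The one fixed point is the HTTP 402 "Payment Required" status code that such rails build on.

```python
import json

def fetch_with_payment(get, url, pay_fn):
    """Fetch a paid resource, settling an HTTP 402 challenge at most once.

    `get(url, headers)` performs the HTTP request and returns (status, body);
    `pay_fn(requirements)` is a hypothetical callback that authorizes and
    settles the payment, returning a proof string. Header name and payload
    shape are illustrative, not the actual x402 format.
    """
    status, body = get(url, {})
    if status == 402:
        requirements = json.loads(body)      # what the server wants paid
        proof = pay_fn(requirements)         # the agent authorizes the spend
        status, body = get(url, {"X-Payment": proof})
    if status != 200:
        raise RuntimeError(f"request failed with status {status}")
    return body
```

The human-governance point lives inside `pay_fn`: that callback is where spending limits, allow-lists, and approval policies would be enforced before any payment is signed.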
Why Memory Is Becoming the Real Constraint
When the discussion turned to memory, Ashraf was clear that long-running agents do not fail because they cannot plan, but because they cannot remember.
“Long-running agents fail at continuity,” he said. Without durable memory, each new session starts from zero. The system can look impressive in a single interaction and still be unusable when the same user returns days later expecting follow-through.
He argued that the first thing to break is multi-day execution. Tasks like onboarding a customer, coordinating approvals, or managing a multi-step workflow depend on remembering what has already been done, which preferences were stated, and what remains blocked. Without persistent memory, agents either re-ask questions users already answered or act on incomplete state, behavior Ashraf described as “the kind of experience you can’t ship without eroding trust.”
Developers often try to compensate by pushing more and more prior context into prompts. That makes systems brittle, expensive, and eventually constrained by context-window limits, he added.
Privacy and Governance as the Silent Blockers
When asked about the real obstacles to scaling memory, including latency, cost, and privacy, Ashraf said the limiting factors have been less technical and more about governance.
Latency and storage costs can usually be engineered around with caching, tiered storage, and compression. What stalls deployments is uncertainty over what exactly is being retained, for how long, who can access it, and whether teams can prove data has been deleted when required.
“Once memory spans users and sessions, the blast radius of a mistake expands,” he said. A leak of one user’s data into another user’s session is no longer a simple bug but “a trust event.”
He pointed to GDPR-style regimes and right-to-erasure rules as forcing functions, pushing teams to build traceability and deletion workflows that actually propagate through every layer of the memory stack. On top of that, he expects a second governance layer to emerge around provenance and identity, where cryptographic credentials and verifiable identifiers are used to attribute which agent wrote what into shared memory.
Turning Data into “Seeds” of Knowledge
Vanar’s own answer to the memory challenge is Neutron, a semantic memory layer that compresses documents and interactions into what Ashraf calls “seeds,” which are compact knowledge objects that preserve meaning rather than full text.

When the conversation turned to what actually survives that compression, he said a seed retains four elements: the factual claims extracted, the relationships between those facts, the intent behind capturing them, and the provenance of the source. The aim is to store “meaning you can reuse, not raw text you hope you can rediscover.”
Formatting, redundancy, and conversational filler are intentionally discarded as noise that clutters retrieval. The original source can remain as an immutable reference, while retrieval itself works over the structured essence of the content.
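Neutron's internal format is not public, but the four elements Ashraf lists suggest a shape like the following. Field names and the toy extraction logic are illustrative, not Neutron's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Seed:
    """A compact knowledge object: meaning kept, formatting and filler dropped.

    Mirrors the four elements Ashraf says survive compression; this is a
    sketch, not Neutron's real data model.
    """
    claims: tuple[str, ...]                       # factual claims extracted
    relations: tuple[tuple[str, str, str], ...]   # (subject, predicate, object)
    intent: str                                   # why this was captured
    provenance: str                               # immutable reference to source

def compress(raw_text: str, source_ref: str, intent: str) -> Seed:
    """Toy compression: keep declarative sentences as claims, drop filler."""
    filler = {"hi", "okay", "thanks", "um"}
    sentences = [s.strip() for s in raw_text.split(".") if s.strip()]
    claims = tuple(s for s in sentences if s.lower() not in filler)
    return Seed(claims=claims, relations=(), intent=intent, provenance=source_ref)
```

The key design choice the sketch illustrates is that `provenance` travels with every seed, so anything retrieved later can be traced back to its source.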
As agents gain the ability to take consequential actions, including spending money over machine-native payment rails like x402, Ashraf argued that this kind of structured, auditable memory becomes non-negotiable. “If the agent spent money, you need to explain what memory it relied on and show the chain back to sources,” he said.
When Memory Goes Wrong at the Worst Time
Turning to how memory fails in production, Ashraf said teams often expect breakdowns to look like a blank search result. In practice, the first bottleneck is not missing memory but incorrect recall.
“Wrong recall at the wrong moment is the first bottleneck I see most often,” he said. Systems sometimes retrieve something adjacent to the right answer, present it confidently, and only reveal the mistake downstream when decisions compound on flawed context.
That risk becomes sharper as agents move from chat interfaces into action space, where they can initiate purchases, reserve resources, or call paid APIs, making even a subtle recall error a direct financial cost.
He added that latency spikes, privacy boundaries, and cross-user data leakage remain high-severity issues, but they typically appear later in a rollout, when teams transition from single-user pilots to multi-tenant production. A related failure mode is “hallucinated memories,” where the agent fabricates continuity and writes it back as if it were real, which is a problem he framed as much about governance and write controls as about the underlying model.
Guardrails for Agents That Remember
When asked what guardrails are non-negotiable for agents with persistent memory, Ashraf highlighted two pillars: auditability and decay.
Auditability means teams must be able to inspect which memories were used, where they came from, and what the agent wrote back. If memory lacks transparency, debugging becomes guesswork, and responsibility is hard to assign when something goes wrong. In his view, modern agent stacks will need traceability, monitoring, and explicit control points as standard features, because multi-step systems fail in ways single prompts do not.
Decay is the counterpart. “Old context should not carry equal weight forever,” he said, as users change preferences, policies evolve, and facts expire. Safe systems should apply recency weighting, time-to-live rules for certain data classes, and refresh paths so outdated information does not keep steering decisions.
On top of that, Ashraf argued for strict access control, defining which agent instances can read or write to which partitions of memory, along with gates and policy checks before high-impact data is committed as durable memory. Finally, he expects identity and provenance for agent actions, including signed events, timestamps, and verifiable claims about which agent did what, to become part of any serious continuity architecture.
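Those three controls, partitioned access, a policy gate before durable writes, and a provenance trail per action, compose naturally into one store. The sketch below is a hypothetical illustration; the partition names, policy callback, and log fields are assumptions, not a real Vanar or Neutron API:

```python
class PartitionedMemory:
    """Sketch of partitioned agent memory with ACLs and a write gate."""

    def __init__(self, acl, policy_check):
        self.acl = acl                    # {(agent_id, partition): {"read","write"}}
        self.policy_check = policy_check  # gate run before any durable commit
        self.store = {}                   # partition -> committed entries
        self.log = []                     # append-only provenance trail

    def write(self, agent_id, partition, entry, timestamp):
        if "write" not in self.acl.get((agent_id, partition), set()):
            raise PermissionError(f"{agent_id} may not write {partition}")
        if not self.policy_check(entry):
            raise ValueError("entry rejected by policy gate")
        self.store.setdefault(partition, []).append(entry)
        # Every durable write is attributed: who wrote what, where, and when.
        self.log.append({"agent": agent_id, "partition": partition,
                         "entry": entry, "ts": timestamp})

    def read(self, agent_id, partition):
        if "read" not in self.acl.get((agent_id, partition), set()):
            raise PermissionError(f"{agent_id} may not read {partition}")
        return list(self.store.get(partition, []))
```

In a serious deployment the log entries would additionally be signed, matching Ashraf's point that identity and provenance for agent actions belong in the continuity architecture itself.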
Will Agent Memory Standards Emerge?
Looking ahead to 2026, Ashraf expects the memory ecosystem to fragment before it converges, as early markets reward speed and each provider is currently building its own memory layer, schema, and governance approach.
However, he does expect standards to emerge over the next couple of years, shaped by the tooling and frameworks that win broad adoption. In his view, “good interoperability” would rest on three pieces: a portable read/write interface so agents can switch memory backends, shared schemas for provenance and access control, and common evaluation hooks to measure memory quality independently of any vendor.
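No such standard exists yet, so the following is purely a sketch of what a portable read/write surface might look like; the method names and record shape are invented for illustration. The point is that an agent coded against this surface could swap memory backends without rewriting its memory logic:

```python
from typing import Any, Protocol

class MemoryBackend(Protocol):
    """Hypothetical portable interface over interchangeable memory backends."""

    def write(self, record: dict[str, Any]) -> str:
        """Store a record (with provenance fields); return its id."""
        ...

    def query(self, text: str, limit: int = 5) -> list[dict[str, Any]]:
        """Return the records most relevant to the current task context."""
        ...

    def delete(self, record_id: str) -> bool:
        """Erase a record; must propagate for right-to-erasure compliance."""
        ...

class InMemoryBackend:
    """Trivial reference backend satisfying the protocol (substring search)."""

    def __init__(self):
        self._records: dict[str, dict[str, Any]] = {}

    def write(self, record):
        rid = str(len(self._records))
        self._records[rid] = record
        return rid

    def query(self, text, limit=5):
        hits = [r for r in self._records.values()
                if text.lower() in str(r).lower()]
        return hits[:limit]

    def delete(self, record_id):
        return self._records.pop(record_id, None) is not None
```

Note that `delete` is part of the contract, not an afterthought: the governance requirements discussed earlier (provable erasure, traceability) would need to be first-class in any standard that emerges.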
He noted that the payment layer is already further along, with protocols such as x402 offering a shared rail for machine-to-machine payments over HTTP. That advancement, he suggested, will put pressure on memory systems to follow a similar path, because enterprises will not want to be locked into closed islands.
Measuring Whether Memory Actually Helps
Beyond proving that semantic retrieval works, Ashraf said the central metric for memory is whether it improves task success over time rather than just at launch.
“The metric I care about most is task success rate over time,” he said, as many systems perform cleanly on day one, only to degrade quietly as history accumulates and the retrieval surface grows. Teams need to run the same workflows repeatedly, track outcomes over weeks, and watch for regressions and new failure modes.
He also looks closely at how often users have to correct the system or restate their preferences. If people keep re-explaining what they want, memory may exist but is not truly helping. Another leading indicator is how often retrieved context turns out to be irrelevant, outdated, or misleading relative to the current task.
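Both signals, success over time and correction rate, fall out of the same replay data. A minimal sketch, assuming a simple per-run record shape invented for the example:

```python
from collections import defaultdict

def weekly_rates(runs):
    """Task success rate and user-correction rate per week.

    `runs` is a list of (week_index, succeeded, user_corrected) tuples from
    replaying the same workflows over time; the record shape is illustrative.
    Bucketing by week exposes the quiet degradation Ashraf describes, which
    a single launch-day number would hide.
    """
    per_week = defaultdict(lambda: [0, 0, 0])   # week -> [successes, corrections, total]
    for week, succeeded, corrected in runs:
        stats = per_week[week]
        stats[0] += int(succeeded)
        stats[1] += int(corrected)
        stats[2] += 1
    return {week: {"success_rate": s / n, "correction_rate": c / n}
            for week, (s, c, n) in sorted(per_week.items())}
```

A falling success rate or a rising correction rate between early and late weeks is the regression signal to alert on.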
Latency and cost still matter but function more as constraints than primary targets, he added, warning that organizations sometimes optimize them too early and inadvertently accept lower recall quality.
What Persistent Memory Enables Today, and What It Still Can’t Do
In closing, Ashraf pointed to persistent, multi-session research as a workflow that has become realistic in the last six to nine months. With durable memory, an agent can keep track of which sources it has checked, which hypotheses have been ruled out, and what remains to be explored. That continuity transforms it from a one-session assistant into something closer to a long-term collaborator on open-ended work.
What remains unsolved, he said, is safe shared memory across multiple agents at scale. Once several agents can read and write to the same store, systems may face conflicting updates, semantic contradictions, and fuzzy lines of responsibility when errors occur.
In multi-agent environments, “state” is not only a set of variables; it also includes plans, summaries, retrieved snippets, and tool traces, and most conflicts are about meaning rather than mechanics, so resolving them depends on verifiable identities, attestable action histories, and shared ways to state permissions and constraints.
“The primitives are arriving,” Ashraf said, “but the composition is still immature,” and that, for now, keeps fully collaborative, memory-rich agent swarms beyond what most teams can realistically deploy.