Memory

Unlimited memory. Zero bloat.

Store everything your agent should know — policies, knowledge, rules, skills — with no ceiling. Iconia injects only what each request actually needs, so your prompts stay lean while your agent's knowledge grows without limit. Remember more. Send less.

Store once

Teach it once, keep it forever

Policies, knowledge, rules, and skills persist across every session, every model, every provider. Nothing drifts, nothing gets re-explained.

Inject precisely

Only what the request needs

Ask about refunds, get the refund policy — not the entire company handbook. Irrelevant questions get zero injection: no bloat, no noise.

Stay coherent

Contradictions get caught

Two policies that can't both be true are flagged when the second one arrives — before your agent starts confidently saying both.

Two lines

Works with the client you already have.

Anthropic-style, OpenAI-compatible, or any agent that can set a base URL. The response headers show every injection: items, milliseconds, cache state.

# any OpenAI-compatible client
client = OpenAI(
  base_url="https://www.iconia.tech/v1",
  api_key="ick_your_key"
)

# response arrives with receipts
← x-iconia-items-injected: 2
← x-iconia-retrieval-ms: 5.3
What lives in memory

Everything your agent should never forget.

Policies

Rules it must honor

Refund windows, escalation paths, tone, compliance lines. Written once, enforced on every reply.

Knowledge

Facts it can cite

Product details, pricing, docs, playbooks. Recalled when a question touches them — not invented.

Identity

Who it is

The system prompt, the persona, the boundaries — stable across every session and every model you route through.

Skills

Tools it reaches for

Tool descriptions and usage notes, surfaced only when a request actually calls for them.

Context

What happened before

Decisions, preferences, prior threads — the continuity that turns a chatbot into a colleague.

No ceiling

As much as you want

Store without limit. Volume never slows retrieval and never bloats a prompt — only the relevant slice is ever sent.

How it's different

No embeddings. No vector database. No drift.

Most memory is a pile of embeddings in a vector store — approximate, GPU-hungry, and slower as it grows. Iconia addresses meaning directly. Retrieval is exact, flat, and CPU-only, so the millionth memory is as fast as the first — and there's no embedding model to host, re-index, or pay for.

  • Exact — the right memory, not the nearest vector
  • Flat — 2.5ms whether you store 10 or a million
  • CPU-only — no GPU, no embedding bill
  • Instant — write it, and it's retrievable now
2.5MSflat · any size
1M+memories, same speed
0GPUcpu-only retrieval
100PCTparaphrase recall
What it costs you

Less. Measurably.

87–90PCTcontext tokens saved per request
2.5MSflat retrieval · any store size
0REWRITESno sdk · no new habits · swap a url
Built for

Agents that can't afford to forget.

Support

Customer agents

Every policy honored, every account rule remembered — across a thousand conversations, without re-priming a single one.

Coding

Dev assistants

Conventions, architecture decisions, and past choices persist between sessions instead of resetting on every new chat.

Operations

Internal copilots

Runbooks, thresholds, and org knowledge stay put — screened on the way in, recalled the moment a task needs them.

Knowledge that remembers

Teach it once.

Get your key