NeuralEngram Part 3: Where Cognition Meets AI

In the previous blog, the memory component was introduced with clarity, but not in complete detail. This article focuses on the actual implementation of the memory system and how each component works together to manage information effectively over time.

The system is designed as a multi-layer cognitive architecture, where each component mirrors a function of the human brain. Instead of relying on a monolithic memory store, responsibilities are divided into three tightly integrated modules: Working Memory (real-time processing), Episodic Memory (experience storage), and the Ebbinghaus Decay Mechanism (forgetting and reinforcement).

`Ebbinghaus Decay Mechanism:`

Beyond the core retention calculation, the decay engine provides several supporting functions that control how memory evolves in the system.

Reinforcement (reinforce_memory) When a memory is recalled, this function increases its stability. The boost depends on recall quality and the number of past reviews. It also applies diminishing returns, ensuring that repeated recalls do not excessively inflate memory strength.

Forgetting Check (is_memory_forgotten) This function determines whether a memory is still usable. If its retention falls below a predefined threshold, it is marked as forgotten and excluded from retrieval.

Time-to-Forget (time_until_forgotten) This estimates how much time remains before a memory becomes irrelevant. It allows the system to identify memories that are close to being forgotten and act on them proactively.

Decay Curve Sampling (decay_curve_points) Generates sample points of the decay trend over time. This is mainly used for debugging, visualization, or analyzing how different stability values affect memory lifespan.

These functions collectively ensure that memory is not just stored, but actively monitored, reinforced, and removed based on its relevance over time.

`Episodic Memory:`

The episodic memory layer manages storage, retrieval, and evolution of past experiences.

Add Memory (add) Stores a new memory as an episode. It serializes content, generates embeddings, and persists everything in the database along with metadata like importance, timestamps, and stability.

Recall (recall) Fetches a memory by ID and updates it if valid. If the memory is not forgotten, it reinforces stability, updates the last accessed time, and increments the review count.

Semantic Search (semantic_search) Retrieves relevant memories based on meaning. It converts the query into an embedding, compares it with stored embeddings using cosine similarity, and returns the most relevant results above a threshold.

Grounded Retrieval (grounded_retrieve) Similar to semantic search but adds a confidence check. It ensures that retrieved memories are sufficiently relevant before using them in downstream processing.

Inconsistency Detection (detect_inconsistency) Checks whether retrieved memories are semantically inconsistent with each other. It compares embeddings and flags low similarity cases.

Replay Weak Memories (replay_weak) Identifies memories that are close to being forgotten and reinforces them automatically by triggering recall.

Prune Forgotten (prune_forgotten) Removes memories that have crossed the forgetting threshold, keeping the system clean and efficient.

Maintenance Loop (start_maintenance) Runs background processes that continuously reinforce weak memories and delete forgotten ones without manual intervention.

`Working Memory:`

The working memory layer manages short-term conversational context.

Add Message (add) Inserts a new message into the memory buffer. It validates input, attaches metadata, and ensures the buffer stays within its fixed capacity.

Get All (get_all) Returns all stored messages as safe copies, preserving thread safety.

Get Recent (get_recent) Retrieves the most recent N messages, allowing selective context usage.

Peek (peek) Returns the latest message without modifying the buffer.

Clear (clear) Removes all messages from the buffer, resetting the working memory.

Resize (resize) Adjusts the capacity of the buffer. If reduced, older messages are automatically discarded.

Format for Prompt (format_for_prompt) Converts stored messages into text or JSON format, making them ready for LLM input.

Prompt Messages (to_prompt_messages) Returns messages in structured role-content format suitable for direct use in chat-based models.

This layer ensures that only the most recent and relevant context is actively used, preventing overload while maintaining conversational continuity.

Our system goes beyond traditional AI memory by introducing dynamic

memory lifecycle management - where information is continuously evaluated, reinforced based on usage, and safely discarded when irrelevant. It also integrates reliability features such as hallucination control and deadlock-safe execution, which are typically absent in standard memory designs.

“Now that memory is built, the next step is bringing it to life - integrating a single agent that can think, recall, and act.”

Where cognition meets AI - teaching machines to remember what matters (Part 3)

`Ebbinghaus Decay Mechanism:`

`Episodic Memory:`

`Working Memory:`

Comments

NeuralEngram

Where cognition meets AI - teaching machines to remember what matters (Part 1)

More from this blog

Where cognition meets AI - teaching machines to remember what matters (Part 2)

Where cognition meets AI - teaching machines to remember what matters (Part 1)

Command Palette

Ebbinghaus Decay Mechanism:

Episodic Memory:

Working Memory:

Comments

NeuralEngram

Where cognition meets AI - teaching machines to remember what matters (Part 1)

More from this blog

`Ebbinghaus Decay Mechanism:`

`Episodic Memory:`

`Working Memory:`