
How does Long Term Memory work in Layla?

Layla's long-term memory allows all characters/personalities in Layla to recall relevant information from past conversations.

Long-term memory construction can be boiled down to 5 main steps:

  1. Parsing conversations

  2. Clustering topics

  3. Generating knowledge graphs

  4. Constructing embeddings

  5. Recalling via similarity search

Parsing conversations

As you chat with your characters, your conversations are parsed and chunked (which simply means splitting your messages into manageable lengths for processing). The chunks are saved for future use. Each saved chunk is known as a conversation shard.
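The chunking step can be pictured with a minimal sketch. This is not Layla's actual implementation: the function name `chunk_messages`, the character-based limit `max_chars`, and the choice to represent a shard as a list of message pieces are all assumptions made for illustration.

```python
def chunk_messages(messages, max_chars=200):
    """Group messages into shards holding at most max_chars characters each.

    Hypothetical helper: the size limit and shard layout are assumptions.
    """
    shards = []
    current, current_len = [], 0
    for msg in messages:
        if not msg:
            continue  # nothing to store for empty messages
        # Break an over-long message into pieces that each fit in one shard.
        pieces = [msg[i:i + max_chars] for i in range(0, len(msg), max_chars)]
        for piece in pieces:
            # Close the current shard if this piece would overflow it.
            if current and current_len + len(piece) > max_chars:
                shards.append(current)
                current, current_len = [], 0
            current.append(piece)
            current_len += len(piece)
    if current:
        shards.append(current)
    return shards
```

Short messages end up grouped together in one shard, while a single very long message is spread over several shards.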

Clustering topics + generating knowledge graph

Your conversation shards are then processed: important information is extracted, and knowledge graph entities and edges are built. This process is referred to as ingestion.

To give you a smooth experience, ingestion runs in the background and does not happen while you chat.
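As a rough sketch of deferred ingestion, the example below queues work during a chat and only builds graph entities and edges when `run` is called later. The post does not describe how information is extracted from a shard, so the sketch assumes already-extracted (subject, relation, object) triples as input; the class names `KnowledgeGraph` and `Ingestor` are hypothetical.

```python
from collections import deque


class KnowledgeGraph:
    """Toy graph: a set of entities plus a list of labelled edges."""

    def __init__(self):
        self.entities = set()
        self.edges = []  # (subject, relation, object) triples

    def add_triple(self, subject, relation, obj):
        self.entities.update((subject, obj))
        self.edges.append((subject, relation, obj))


class Ingestor:
    """Hypothetical background worker; not Layla's real API."""

    def __init__(self, graph):
        self.graph = graph
        self.pending = deque()

    def enqueue(self, triples):
        # Called while chatting: cheap, only stores work for later.
        self.pending.append(triples)

    def run(self):
        # Called in the background: builds entities and edges.
        processed = 0
        while self.pending:
            for subject, relation, obj in self.pending.popleft():
                self.graph.add_triple(subject, relation, obj)
            processed += 1
        return processed
```

Keeping `enqueue` trivial is what keeps the chat responsive; all graph construction happens in `run`.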

Conversations of different depths generate knowledge graphs of different levels, as seen in this video:

Embedding and similarity search

The information in the knowledge graph is clustered by topic, as shown by the different coloured nodes. Clusters have interconnecting edges because your conversation flows naturally between topics.

This builds into the L1 and L2 structure of Layla's long-term memory.

L1 and L2 memory are inspired by the L1 and L2 cache levels used by modern processors: L1 provides quick access, while L2 is relatively slower to access. In the same way, L1 memories represent edges in the knowledge graph, and L2 memories represent the topic clusters.


While you chat, all incoming and outgoing messages are compared with entities in L1 memory. This operation is so quick that you do not feel any impact during chatting. As similarities build up, Layla will decide to access the L2 cache at a reasonable time. This returns the topic summary and gives a wealth of context for the current conversation.

This gives a reasonable balance between memory recall and speed.
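The two-tier recall described above can be sketched as follows. Everything here is an assumption for illustration: the class name `TieredMemory`, the use of cosine similarity, the `match` and `trigger` thresholds, and the toy two-dimensional vectors standing in for real embeddings. The post does not state how Layla scores similarity or decides when to access L2.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TieredMemory:
    """Hypothetical L1/L2 lookup; thresholds are made-up values."""

    def __init__(self, l1, l2, match=0.8, trigger=1.5):
        self.l1 = l1            # list of (vector, cluster_id): fast tier
        self.l2 = l2            # cluster_id -> topic summary: slow tier
        self.match = match      # minimum similarity that counts as a hit
        self.trigger = trigger  # accumulated score that unlocks L2
        self.scores = {}

    def observe(self, message_vector):
        """Compare one message with L1; return a topic summary once
        enough similarity has built up for a cluster, else None."""
        for vector, cluster_id in self.l1:
            sim = cosine(message_vector, vector)
            if sim >= self.match:
                self.scores[cluster_id] = self.scores.get(cluster_id, 0.0) + sim
        for cluster_id, score in self.scores.items():
            if score >= self.trigger:
                self.scores[cluster_id] = 0.0  # reset after recalling
                return self.l2[cluster_id]
        return None
```

With these thresholds a single matching message is not enough to trigger the slower L2 lookup; a second message on the same topic is needed, which mirrors the idea of similarities building up over the conversation.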


Lastly, the current conversation and the recalled context are added back to the long-term memory. This builds rich layers of memory for further interactions. Important information recalled during this session is reinforced in Layla's brain for future recall.
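One simple way to picture this feedback loop is a strength counter per memory, bumped each time that memory is recalled. This is only a sketch under assumptions: the `store` layout, the `commit_session` name, and the additive `boost` are invented here, and the post does not say how Layla actually weights reinforced memories.

```python
def commit_session(store, session_text, recalled_ids, boost=1):
    """Queue the finished session for ingestion and reinforce recalled memories.

    Hypothetical helper: store is a plain dict with a "strength" mapping
    and a "pending_shards" list.
    """
    for memory_id in recalled_ids:
        # Each recall makes the memory a little stronger.
        store["strength"][memory_id] = store["strength"].get(memory_id, 0) + boost
    # The session itself becomes new material for later ingestion.
    store["pending_shards"].append(session_text)
    return store
```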

Background processing

As with all features in Layla, this runs completely offline on your device. The whole process is quite resource-intensive for a mobile phone, so ingestion and knowledge graph construction are only done when your phone is idle. This usually happens during the night, when your phone is plugged in and charging.
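The scheduling rule can be sketched as a small gate that decides whether ingestion may start. The `DeviceState` fields and the `should_ingest` function are hypothetical, and requiring the phone to be charging is an assumption: the post only says ingestion runs when the phone is idle, which is usually while it charges overnight.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    is_idle: bool
    is_charging: bool


def should_ingest(state, pending_shards):
    """Return True only when heavy background work is unlikely to be noticed.

    Assumption: both idle and charging are required; the real policy may differ.
    """
    return state.is_idle and state.is_charging and pending_shards > 0
```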

Interestingly, this is basically how humans work! Our brains process everything we see and do during our sleep: categorising, reinforcing, summarising... This is why a good night's sleep is so important!
