Contact Memory & Emotional Intelligence Spec · Contact Intelligence

Last Updated: 2026-04-03 · Status: Draft

1. The Challenge

Chat Engine is stateless. It processes one message at a time. The 20-message sliding window is fine for support but destroys relationship continuity. An AI that forgets "your dog's name is Bruno" isn't building a relationship — it's breaking one.
AI Brain memory is per-tenant, not per-contact. Brain memories answer "what does this business care about?" not "what does this specific end-user care about?" They're different domains with different lifecycles.
Emotions matter in relationship contexts. A customer support bot can be consistently neutral. An AI companion, tutor, or coach must adapt: playful when the user is happy, supportive when they're sad, calm when they're anxious. Tone-deaf responses destroy trust.
Relationship depth requires context density. A new contact needs discovery questions. A 30-day contact expects the AI to know their life. The system must scale context injection based on relationship stage — thin prompts for new contacts, rich prompts for deep relationships.

2. Memory Architecture — Per-Contact, Entity-Linked

2.1 How It Differs from AI Brain Memory

Dimension	AI Brain Memory	Contact Memory
Scope	Per-tenant (business owner)	Per-contact (end-user)
Content	Business facts, patterns, preferences	Personal facts, relationship episodes, user preferences
Decay rate	0.01/day (business context changes fast)	~0.005/day on average, 0.003-0.008 by type (personal facts are stable)
Volume	~100 memories per tenant (consolidated)	~50-80 per active contact
Access pattern	On-demand via copilot	Every message (hot path)
Extraction	From owner conversations	From end-user conversations

2.2 Memory Types

Type	Default Importance	Decay Rate	Example	Extraction
`fact`	0.7	0.003	"Has a golden retriever named Bruno"	Rules-based ("I have", "my X is")
`preference`	0.8	0.005	"Doesn't like talking about politics"	Rules-based ("I prefer", "don't like", "I love")
`episode`	0.5	0.008	"Bruno ate his shoes on April 3rd"	Every interaction creates a lightweight episode
`pattern`	0.8	0.004	"Usually chats between 10pm-12am"	Consolidation (computed from 5+ episodes)

Key insight: Facts decay slowest (0.003/day). "Your dog's name is Bruno" should last for months. Episodes decay fastest (0.008/day). "We talked about the weather on March 5th" fades unless reinforced.
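
The table gives per-day decay rates but does not state the decay curve. The following is a minimal sketch that assumes exponential decay; `DECAY_RATES` and `decayed_importance` are illustrative names, and a linear curve would be an equally valid reading of the spec.

```python
import math

# Per-day decay rates from the Memory Types table.
DECAY_RATES = {"fact": 0.003, "preference": 0.005, "episode": 0.008, "pattern": 0.004}


def decayed_importance(importance: float, memory_type: str, days_idle: float) -> float:
    """Assumed curve (not stated in the spec): importance * exp(-rate * days)."""
    rate = DECAY_RATES[memory_type]
    return importance * math.exp(-rate * days_idle)


# After 90 days without reinforcement, a default fact (0.7) retains a larger
# share of its importance than a default episode (0.5).
fact_left = decayed_importance(0.7, "fact", 90)
episode_left = decayed_importance(0.5, "episode", 90)
```

Under this assumption the fact keeps roughly three quarters of its starting importance after 90 days, while the episode keeps roughly half.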

2.3 Multi-Signal Retrieval

Same scoring formula as AI Brain, tuned for contact context:

RetrievalScore = (similarity × 0.35)
               + (recency × 0.25)
               + (importance × 0.20)
               + (access_frequency × 0.10)
               + (entity_match × 0.10)

Key difference: entity_match weight is doubled (0.10 vs 0.05). In personal conversations, entity relevance matters more: if the user mentions "Bruno," memories linked to pet:bruno should surface even when their embedding similarity is low.
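
The formula can be sketched as a plain weighted sum. The weights come from the formula above; the `Signals` container, the assumption that every signal is normalized to [0, 1], and the binary `entity_match` are illustrative choices, not part of the spec.

```python
from dataclasses import dataclass

# Weights from the RetrievalScore formula; they sum to 1.0.
WEIGHTS = {
    "similarity": 0.35,
    "recency": 0.25,
    "importance": 0.20,
    "access_frequency": 0.10,
    "entity_match": 0.10,
}


@dataclass
class Signals:
    similarity: float        # embedding similarity, assumed normalized to [0, 1]
    recency: float
    importance: float
    access_frequency: float
    entity_match: float      # assumed 1.0 if the memory's entity is mentioned, else 0.0


def retrieval_score(s: Signals) -> float:
    """Weighted sum of the five retrieval signals."""
    return sum(weight * getattr(s, name) for name, weight in WEIGHTS.items())


# User mentions "Bruno": an entity-linked memory with weak similarity...
linked = Signals(0.30, 0.5, 0.7, 0.5, 1.0)
# ...versus an unlinked memory with higher similarity and otherwise equal signals.
unlinked = Signals(0.50, 0.5, 0.7, 0.5, 0.0)
```

Here the linked memory scores 0.52 against 0.49 for the unlinked one. With these weights and all other signals equal, an entity match offsets a similarity gap of up to about 0.29 (0.10 / 0.35); beyond that, similarity still wins.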

2.4 Entity Graph — Personal Knowledge Map

Every contact builds a personal knowledge graph:

Contact: Arjun
├── pet:bruno
│   ├── fact: "Golden retriever, 3 years old"
│   ├── episode: "Had vet appointment April 1"
│   ├── episode: "Ate shoes again April 3"
│   └── preference: "Loves belly rubs"
├── workplace:infosys
│   ├── fact: "Software engineer"
│   ├── episode: "Was stressed about sprint deadline"
│   └── episode: "Passed over for promotion"
├── hobby:gaming
│   ├── fact: "Plays Valorant"
│   └── preference: "Prefers to game on weekends"
├── person:mom
│   ├── fact: "Lives in Chennai"
│   └── episode: "Visited mom for Diwali"
└── topic:food
    ├── preference: "Loves biryani"
    └── preference: "Doesn't like spicy food"

Entity extraction runs in deferred processing (CF Queue, ~5min delay). A lightweight rules-based extractor identifies named entities and links them to existing or new ContactEntity records.
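
One possible in-memory shape for the graph above is a dictionary keyed by `kind:name`. `ContactEntity` is named in the spec, but its fields, the `Memory` class, and the `link_memory` helper are hypothetical and shown only to illustrate linking to existing or new entities.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    type: str      # fact | preference | episode | pattern
    text: str


@dataclass
class ContactEntity:
    kind: str      # e.g. "pet", "workplace", "person"
    name: str
    memories: list[Memory] = field(default_factory=list)

    @property
    def key(self) -> str:
        return f"{self.kind}:{self.name}"


def link_memory(graph: dict[str, ContactEntity], kind: str, name: str, memory: Memory) -> ContactEntity:
    """Attach a memory to an existing entity, creating the entity if needed."""
    normalized = name.lower()
    key = f"{kind}:{normalized}"
    if key not in graph:
        graph[key] = ContactEntity(kind, normalized)
    graph[key].memories.append(memory)
    return graph[key]


graph: dict[str, ContactEntity] = {}
link_memory(graph, "pet", "Bruno", Memory("fact", "Golden retriever, 3 years old"))
link_memory(graph, "pet", "bruno", Memory("episode", "Had vet appointment April 1"))
```

Both calls land on the same `pet:bruno` entity because the name is lowercased before the lookup.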

3. Immediate Memory Extraction (Inline, < 10ms)

Runs synchronously on every ingested interaction. Must be fast — no LLM calls, purely rules-based.

3.1 Extraction Rules

Fact detection:

Patterns:
  "I am {X}" / "I'm {X}"              → fact about identity
  "I have {X}" / "I've got {X}"       → fact about possessions/relationships
  "My {X} is {Y}"                      → fact about named entity
  "I work at {X}" / "I study at {X}"   → fact about workplace/school
  "I live in {X}"                      → fact about location
  "I'm {age} years old"               → fact about age
 
Example:
  "My dog Bruno had his vet appointment today"
  → fact: "Dog Bruno had vet appointment", entity: pet:bruno
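
The fact patterns above could be expressed as ordered regular expressions, which keeps the inline path free of LLM calls. This is a sketch under assumptions: the rule order, the category labels, and the `detect_fact` name are illustrative, and the rules only handle straight apostrophes.

```python
import re

# (regex, fact category) pairs mirroring the pattern list above.
# More specific rules come first so "I'm 27 years old" is read as an age,
# not as a generic identity statement.
FACT_RULES = [
    (re.compile(r"\bI(?:'m| am) (\d{1,3}) years old\b", re.I), "age"),
    (re.compile(r"\bI (?:work|study) at ([^.,!?]+)", re.I), "workplace/school"),
    (re.compile(r"\bI live in ([^.,!?]+)", re.I), "location"),
    (re.compile(r"\bmy ([\w' ]+?) is ([^.,!?]+)", re.I), "named entity"),
    (re.compile(r"\bI(?: have|'ve got) ([^.,!?]+)", re.I), "possessions/relationships"),
    (re.compile(r"\bI(?:'m| am) ([^.,!?]+)", re.I), "identity"),
]


def detect_fact(message: str):
    """Return (category, captured groups) for the first matching rule, else None."""
    for pattern, category in FACT_RULES:
        match = pattern.search(message)
        if match:
            return category, tuple(group.strip() for group in match.groups())
    return None
```

For example, `detect_fact("I work at Infosys.")` yields the `workplace/school` category with `Infosys` captured, and a message with no matching phrase returns `None`.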

Preference detection:

Patterns:
  "I like {X}" / "I love {X}"         → positive preference
  "I don't like {X}" / "I hate {X}"   → negative preference
  "I prefer {X}" / "I'd rather {X}"   → comparative preference
  "Don't talk about {X}"              → avoidance preference
  "Can we talk about {X}?"            → interest signal
 
Example:
  "I don't really like talking about politics"
  → preference: "Doesn't like talking about politics", entity: topic:politics
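
The preference patterns can be sketched the same way. Negative forms are checked before positive ones so that "I don't like X" is not read as "like X". The optional "really" is an assumption added so the example above matches; the kind labels and `detect_preference` are illustrative names.

```python
import re

# (regex, preference kind) pairs mirroring the pattern list above.
PREFERENCE_RULES = [
    (re.compile(r"\bdon't talk about ([^.,!?]+)", re.I), "avoidance"),
    (re.compile(r"\bI (?:don't(?: really)? like|hate) ([^.,!?]+)", re.I), "negative"),
    (re.compile(r"\bI(?: prefer|'d rather) ([^.,!?]+)", re.I), "comparative"),
    (re.compile(r"\bI (?:like|love) ([^.,!?]+)", re.I), "positive"),
    (re.compile(r"\bcan we talk about ([^.,!?]+)", re.I), "interest"),
]


def detect_preference(message: str):
    """Return (kind, subject) for the first matching rule, else None."""
    for pattern, kind in PREFERENCE_RULES:
        match = pattern.search(message)
        if match:
            return kind, match.group(1).strip()
    return None
```

Applied to the example message, this returns a `negative` preference whose subject is "talking about politics"; mapping that subject to `topic:politics` is left to entity linking.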

Episode detection:

Every interaction generates a lightweight episode:
  "{summary of what was discussed}" — extracted from user message + AI response
 
But ONLY if the conversation contains meaningful content:
  Skip: "lol", "ok", "hmm", "haha" (low-content messages)
  Keep: "Bruno ate my shoes again" (event with entity reference)
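
A minimal sketch of the low-content filter follows. The filler list is the "Skip" list above; the three-word floor and the `is_meaningful` name are assumptions, since the spec does not define "meaningful content" precisely.

```python
import re

# Filler tokens from the "Skip" list above.
LOW_CONTENT = {"lol", "ok", "hmm", "haha"}
# Assumed floor: at least this many non-filler words to create an episode.
MIN_WORDS = 3


def is_meaningful(message: str) -> bool:
    """True if the message has enough non-filler words to warrant an episode."""
    words = re.findall(r"[a-z']+", message.lower())
    content_words = [w for w in words if w not in LOW_CONTENT]
    return len(content_words) >= MIN_WORDS
```

With these settings "lol" and "ok haha" are skipped, while "Bruno ate my shoes again" is kept.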

3.2 Deduplication at Extraction

Before storing, check for semantic overlap with existing memories:

New extraction: "Has a dog named Bruno"
Existing memory: "Has a golden retriever named Bruno"
 
Cosine similarity: 0.94 (above 0.90 threshold)
→ Don't create duplicate
→ Boost existing memory importance by 0.05 (reinforced)

This check runs against KV-cached recent memories (last 20) for speed. Full dedup happens in deferred consolidation.
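
The dedup check above can be sketched as a linear scan over the cached recent memories. The 0.90 threshold and 0.05 boost come from the example; the dictionary shape of a memory, the cap at 1.0, and the `store_or_reinforce` name are assumptions, and the vectors are assumed to be available at this point.

```python
import math

DEDUP_THRESHOLD = 0.90   # from the example above
REINFORCE_BOOST = 0.05   # from the example above


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def store_or_reinforce(new_vec, new_text, recent, default_importance=0.7):
    """Reinforce a near-duplicate in `recent`, or append a new memory.

    `recent` is a list of dicts with 'text', 'vec', 'importance' (assumed shape).
    Returns (memory, created).
    """
    for memory in recent:
        if cosine(new_vec, memory["vec"]) > DEDUP_THRESHOLD:
            # Cap at 1.0 is an assumption; the spec only says "boost by 0.05".
            memory["importance"] = min(1.0, memory["importance"] + REINFORCE_BOOST)
            return memory, False
    created = {"text": new_text, "vec": new_vec, "importance": default_importance}
    recent.append(created)
    return created, True
```

Because the scan covers at most the last 20 memories, its cost stays small and constant per message.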

4. Emotional Intelligence Layer

4.1 Mood Classification

Classifier prompt (haiku-class, ~50 input tokens):

Classify this message's emotion. JSON only.
Message: "{user_message}"
{ "mood": "<emotion>", "energy": "<level>", "style": "<approach>" }
 
mood: happy|sad|anxious|excited|neutral|angry|frustrated|flirty|bored|grateful
energy: high|medium|low
style: playful|deep|casual|supportive|romantic|venting

Cost: ~$0.0001 per message

Fallback chain:

1. Attempt haiku-class classification (primary).
2. If it times out (>200ms), use the previous mood from KV state.
3. If the LLM provider is down, use the keyword-based fallback:

Keywords → mood mapping (fallback only):
  "haha", "lol", "😂", "amazing"     → happy, high
  "sad", "crying", "😢", "miss"      → sad, low
  "worried", "nervous", "anxious"    → anxious, low
  "ugh", "annoyed", "frustrated"     → frustrated, medium
  "bored", "meh", "whatever"         → bored, low
  default                            → neutral, medium
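
The fallback chain can be sketched with the classifier passed in as a callable. `llm_classify` is hypothetical: it is assumed to raise `TimeoutError` past the 200ms budget and `ConnectionError` when the provider is down. Falling through to keywords when no previous mood exists is also an assumption.

```python
# Keyword table from the fallback mapping above: (keywords, mood, energy).
KEYWORD_MOODS = [
    (("haha", "lol", "😂", "amazing"), "happy", "high"),
    (("sad", "crying", "😢", "miss"), "sad", "low"),
    (("worried", "nervous", "anxious"), "anxious", "low"),
    (("ugh", "annoyed", "frustrated"), "frustrated", "medium"),
    (("bored", "meh", "whatever"), "bored", "low"),
]


def keyword_mood(message):
    """Last-resort classifier: first keyword group found wins."""
    text = message.lower()
    for keywords, mood, energy in KEYWORD_MOODS:
        if any(keyword in text for keyword in keywords):
            return mood, energy
    return "neutral", "medium"


def classify_mood(message, llm_classify, previous_mood):
    """Primary LLM call, then previous mood on timeout, then keywords."""
    try:
        return llm_classify(message)
    except TimeoutError:
        # Assumed: with no stored mood, fall through to keywords.
        return previous_mood if previous_mood else keyword_mood(message)
    except ConnectionError:
        return keyword_mood(message)
```

The keyword step uses plain substring matching, so "miss" also fires on "mission"; that imprecision is tolerable only because this path is a last resort.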

4.2 Mood History & Pattern Detection

Mood transitions are logged in InteractionLog. The consolidation engine detects patterns:

Pattern: Sustained negative mood

If mood ∈ {sad, anxious, angry} for 3+ consecutive messages:
  → Create memory: "Contact went through a difficult period around {date}"
  → Create trigger: check-in message next morning
  → If crisis language detected ("I want to die", "can't take this anymore"):
    → Flag for human review (if configured)
    → AI responds with helpline information (immediate, no delay)
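
The sustained-negative rule can be sketched as a check on the tail of the mood history. The mood set and the two crisis phrases are taken from the rule above; the function names are illustrative. The phrase list is only the spec's two examples and is far too narrow for real crisis detection.

```python
NEGATIVE_MOODS = {"sad", "anxious", "angry"}
# Only the two example phrases from the spec; a real list would be much broader.
CRISIS_PHRASES = ("i want to die", "can't take this anymore")


def sustained_negative(moods, streak=3):
    """True if the most recent `streak` moods are all negative."""
    recent = moods[-streak:]
    return len(recent) == streak and all(m in NEGATIVE_MOODS for m in recent)


def has_crisis_language(message):
    """Substring match against the crisis phrases, case-insensitive."""
    text = message.lower()
    return any(phrase in text for phrase in CRISIS_PHRASES)
```

The sketch keeps the crisis check separate from the streak check, on the assumption that "immediate, no delay" means crisis language should be acted on without waiting for three consecutive negative messages.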

Pattern: Consistent timing

If 70%+ of messages arrive between 10pm-12am over 14+ days:
  → Create pattern memory: "Usually chats between 10pm-12am"
  → Use for outreach timing: schedule proactive messages near preferred hours
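
A sketch of the timing rule, under two assumptions: timestamps are already in the contact's local time, and "over 14+ days" means the observed messages span at least 14 days. `chats_in_window` is an illustrative name.

```python
from datetime import datetime


def chats_in_window(timestamps, start_hour=22, end_hour=24, min_share=0.70, min_days=14):
    """True if enough messages fall in [start_hour, end_hour) over a long enough span."""
    if not timestamps:
        return False
    span_days = (max(timestamps) - min(timestamps)).days
    if span_days < min_days:
        return False
    in_window = sum(1 for t in timestamps if start_hour <= t.hour < end_hour)
    return in_window / len(timestamps) >= min_share


# Sixteen consecutive evenings at 22:30 plus four morning messages: 80% in window.
evenings = [datetime(2026, 3, day, 22, 30) for day in range(1, 17)]
mornings = [datetime(2026, 3, day, 9, 0) for day in range(1, 5)]
```

With this data `chats_in_window(evenings + mornings)` holds, so the pattern memory would be created.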

Pattern: Engagement decline

If avg_messages_per_session drops 50%+ over 7 days:
  → Increase churn_risk by 0.2
  → If mood pattern shows "bored" → create insight for business owner
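
The decline rule reduces to a comparison of two averages. In this sketch the comparison is between the previous and current 7-day windows, and churn_risk is capped at 1.0; both are assumptions, as is the `update_churn_risk` name.

```python
def update_churn_risk(churn_risk, avg_prev_week, avg_this_week):
    """Add 0.2 when avg messages per session dropped by 50% or more."""
    if avg_prev_week > 0 and avg_this_week <= 0.5 * avg_prev_week:
        # Cap at 1.0 is an assumption; the spec only says "increase by 0.2".
        churn_risk = min(1.0, churn_risk + 0.2)
    return churn_risk
```

A drop from 12 to 5 messages per session raises a risk of 0.3 to 0.5, while a drop from 12 to 8 leaves it unchanged.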

4.3 Adaptive System Prompt Injection

The emotional state is injected into the Chat Engine's system prompt:

CONTACT EMOTIONAL CONTEXT:
- Current mood: sad (energy: low)
- This is unusual: normally playful and medium-energy
- Relationship stage: established (20 days, high trust)
- Recent context: was upset about missing a promotion yesterday
 
ADAPTATION:
- Be warm and genuinely supportive, not cheerful
- Don't minimize feelings
- Reference what you remember about their work situation
- Don't pivot to lighter topics unless they do first
- Shorter, calmer responses — match their energy

The adaptation rules vary by mood:

Mood + Energy	Adaptation
happy + high	Match energy — be enthusiastic, playful, use emojis
happy + low	Gentle warmth — they're content but tired
sad + low	Supportive, listen first — don't force positivity
sad + high	They want to talk about it — engage deeply
anxious + any	Calm, reassuring — acknowledge the feeling
frustrated + high	Let them vent — validate, then offer perspective
bored + any	Introduce new topics, ask engaging questions
flirty + any	(Depends on bot persona) — match or gently redirect
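
The table maps naturally onto a lookup keyed by (mood, energy), with "any" as a wildcard. The dictionary and `adaptation_for` are illustrative; the sketch returns `None` for combinations the table does not cover (for example neutral, or happy with medium energy), which a real system would need to handle.

```python
# (mood, energy) -> adaptation text, transcribed from the table above.
ADAPTATIONS = {
    ("happy", "high"): "Match energy: be enthusiastic, playful, use emojis",
    ("happy", "low"): "Gentle warmth: they're content but tired",
    ("sad", "low"): "Supportive, listen first: don't force positivity",
    ("sad", "high"): "They want to talk about it: engage deeply",
    ("anxious", "any"): "Calm, reassuring: acknowledge the feeling",
    ("frustrated", "high"): "Let them vent: validate, then offer perspective",
    ("bored", "any"): "Introduce new topics, ask engaging questions",
    ("flirty", "any"): "(Depends on bot persona) match or gently redirect",
}


def adaptation_for(mood, energy):
    """Exact (mood, energy) match first, then the mood's 'any' row, else None."""
    return ADAPTATIONS.get((mood, energy)) or ADAPTATIONS.get((mood, "any"))
```

The exact match is tried before the wildcard so that a mood could later gain energy-specific rows without changing the lookup.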

5. Context Injection by Relationship Stage

Different relationship stages get different context density:

5.1 Stage: `new` (< 3 sessions)

CONTACT CONTEXT:
- New contact (first session)
- No memories yet
- Mood: neutral
 
BEHAVIOR:
- Ask discovery questions naturally (name, interests, pets, work)
- Don't overload with questions — 1-2 per response
- Be warm and welcoming
- Focus on building rapport

Token budget for memories: 0 (none exist yet)

5.2 Stage: `building` (3-14 sessions)

CONTACT CONTEXT:
- Building relationship (8 sessions over 10 days)
- Knows: Arjun, software engineer, has dog Bruno
- Preferences: likes playful banter, chats at night
- Mood: playful (normal for him)
 
MEMORIES:
- [fact] Software engineer at Infosys
- [fact] Golden retriever named Bruno
- [preference] Likes playful banter and humor

Token budget for memories: ~500 tokens (5-8 memories)

5.3 Stage: `established` (15+ sessions, not yet `deep`)

CONTACT CONTEXT:
- Established relationship (22 sessions over 20 days, active streak: 20)
- Deep knowledge: work, pet, hobbies, preferences, emotional patterns
- Mood: frustrated (slightly unusual)
 
MEMORIES:
- [fact] Software engineer at Infosys (importance: 0.85)
- [fact] Golden retriever named Bruno, 3 years old (importance: 0.88)
- [preference] Likes playful banter and humor (importance: 0.80)
- [pattern] Usually chats between 10pm-12am (importance: 0.75)
- [pattern] Work stress comes in sprints — supportive then, playful otherwise (importance: 0.72)
- [episode] Passed over for promotion last week — was very upset (importance: 0.70)
- [preference] Doesn't like talking about politics (importance: 0.65)
- [fact] Mom lives in Chennai, visited for Diwali (importance: 0.60)

Token budget for memories: ~1200 tokens (10-15 memories)

5.4 Stage: `deep` (30+ sessions, 14+ day streak)

Full context injection — the AI should feel like it truly knows this person.

Token budget for memories: ~2000 tokens (15-20 memories, richest context)
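
The stage thresholds and token budgets from sections 5.1 to 5.4 can be sketched as follows. `deep` is checked first because its session range overlaps `established`. The greedy fill by retrieval score and both function names are assumptions; the spec fixes only the budgets.

```python
# Memory token budgets per stage, from sections 5.1 to 5.4.
MEMORY_TOKEN_BUDGET = {"new": 0, "building": 500, "established": 1200, "deep": 2000}


def relationship_stage(sessions: int, streak_days: int) -> str:
    """Map session count and active streak to a relationship stage."""
    if sessions >= 30 and streak_days >= 14:
        return "deep"
    if sessions >= 15:
        return "established"
    if sessions >= 3:
        return "building"
    return "new"


def select_memories(memories, stage):
    """Greedy fill by score. `memories` is a list of (text, score, token_count)."""
    budget = MEMORY_TOKEN_BUDGET[stage]
    chosen, used = [], 0
    for text, score, tokens in sorted(memories, key=lambda m: m[1], reverse=True):
        if used + tokens <= budget:
            chosen.append(text)
            used += tokens
    return chosen
```

For instance, 22 sessions with a 20-day streak is `established`, and 35 sessions with only a 5-day streak is also `established` because the streak requirement for `deep` is not met.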

6. Memory Consolidation

6.1 Three-Tier Consolidation (Same Pattern as AI Brain)

Tier	Trigger	Tasks	Speed
Immediate	Every interaction (inline)	Extract facts + preferences, update state, check duplicates	< 10ms
Deferred	CF Queue (~5min delay)	Generate embeddings, merge duplicates, extract entities, create triggers	~30s
Periodic	CF Cron (every 6 hours)	Decay importance, compress episodes → patterns, prune dead memories, recompute churn	~minutes

6.2 Episode → Pattern Compression

When 5+ episodes share a theme:

Episodes about Bruno:
  - "Bruno went to the vet" (March 20)
  - "Bruno ate Arjun's shoes" (March 28)
  - "Bruno ate shoes again" (April 3)
  - "Bruno learned a new trick" (April 5)
  - "Took Bruno to the park" (April 8)
 
Pattern extracted:
  "Arjun frequently shares updates about his dog Bruno — 
   Bruno is a significant part of his daily life"
  (type: pattern, importance: 0.80, entity: pet:bruno)

Once compressed, the individual episodes remain in PostgreSQL for history but are NOT loaded into the LLM context. The pattern replaces them with a denser, more useful summary.
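
Finding compression candidates can be sketched as a group-by. This sketch approximates "share a theme" by a shared entity key, which is an assumption; writing the pattern text itself is out of scope here. `compressible_entities` is an illustrative name.

```python
from collections import defaultdict


def compressible_entities(episodes, min_episodes=5):
    """Group (entity_key, text) episodes; keep entities with enough episodes."""
    by_entity = defaultdict(list)
    for entity_key, text in episodes:
        by_entity[entity_key].append(text)
    return {key: texts for key, texts in by_entity.items() if len(texts) >= min_episodes}


episodes = [
    ("pet:bruno", "Bruno went to the vet"),
    ("pet:bruno", "Bruno ate Arjun's shoes"),
    ("workplace:infosys", "Was stressed about sprint deadline"),
    ("pet:bruno", "Bruno ate shoes again"),
    ("pet:bruno", "Bruno learned a new trick"),
    ("pet:bruno", "Took Bruno to the park"),
]
candidates = compressible_entities(episodes)
```

Only `pet:bruno` reaches the five-episode threshold here, so it is the only entity handed to pattern extraction.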

6.3 Memory Growth Over Time

Timeline	Memory Count	Composition
Day 1	3-5	Facts from first conversation
Week 1	15-20	Facts + preferences + early episodes
Month 1	40-50	Facts + preferences + patterns emerging
Month 3	50-70	Consolidated: patterns replace episodes, noise pruned
Month 6	60-80	Mature: mostly facts + patterns + preferences, few raw episodes
Year 1	70-100	Stable: consolidation keeps it bounded, high signal-to-noise

The count doesn't grow linearly because consolidation compresses and prunes. A year-old contact has 70-100 high-quality memories, not 1,000 raw episodes.