Goal
Take Karma from recording events to spotting surprises.
We’re not aiming for a PhD-level probabilistic model yet — just enough to:
- Learn what “normal” looks like for each tracked process.
- Flag deviations as entropy.deviation events in real time.
- Feed those events into the action loop.
Scope of the MVP
- Data Source: ClickHouse table populated by normalized events (via Kafka).
- Target Entities: Any (entity_id, step) combination, e.g. supplier response time, file arrival, job completion.
- Metric: Latency between defined step boundaries (or between event types).
- Output (one possible record shape is sketched after this list):
  - Current entropy score.
  - A boolean “out of expectation” flag.
  - Context tags for correlation.
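For concreteness, here is a minimal sketch of what one per-(entity_id, step) output record could look like in downstream code; the class and field names are illustrative assumptions, not a fixed schema:

```python
from dataclasses import dataclass, field

@dataclass
class EntropyCheckResult:
    """Illustrative shape of one (entity_id, step) check result."""
    entity_id: str
    step: str
    entropy: float                  # current entropy score
    is_surprising: bool             # "out of expectation" flag
    tags: dict = field(default_factory=dict)  # context tags for correlation

# Example instance (values are made up)
result = EntropyCheckResult("supplier_A", "quote_response", 2.35, True, {"priority": "high"})
```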
Calculating Entropy
We start with bucketed latency distributions:
SELECT
entity_id,
step,
intDiv(latency_ms, 500) * 500 AS latency_bucket,
count() AS bucket_count
FROM events
WHERE event_ts >= now() - INTERVAL 7 DAY
GROUP BY entity_id, step, latency_bucket
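As a quick sanity check outside ClickHouse, the same 500 ms bucketing can be reproduced in a few lines of Python; this is only an illustrative sketch, not part of the pipeline:

```python
from collections import Counter

def bucket_latencies(latencies_ms, width_ms=500):
    """Mirror ClickHouse's intDiv(latency_ms, width) * width bucketing."""
    return Counter((ms // width_ms) * width_ms for ms in latencies_ms)

# Toy data: most responses near 1 s, one slow outlier
buckets = bucket_latencies([950, 1020, 1100, 980, 4300])
print(buckets)  # counts per 500 ms bucket, e.g. {500: 2, 1000: 2, 4000: 1}
```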
Then apply Shannon’s entropy formula in SQL or downstream code:
SELECT
entity_id,
step,
-sum(p * log2(p)) AS entropy
FROM (
SELECT
entity_id,
step,
latency_bucket,
bucket_count / sum(bucket_count) OVER (PARTITION BY entity_id, step) AS p
FROM latency_distribution  -- the bucketed counts produced by the previous query
) t
GROUP BY entity_id, step
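The same formula is easy to mirror in downstream code when debugging; a minimal sketch that reproduces the -sum(p * log2(p)) expression from bucket counts (the example numbers are made up):

```python
import math

def shannon_entropy(bucket_counts):
    """Shannon entropy (bits) of a bucketed latency distribution."""
    total = sum(bucket_counts.values())
    probs = [c / total for c in bucket_counts.values()]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A tight distribution scores low; a spread-out one scores high
print(shannon_entropy({1000: 98, 1500: 2}))                       # ~0.14 bits
print(shannon_entropy({500: 25, 1000: 25, 1500: 25, 2000: 25}))   # 2.0 bits
```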
Defining “Surprise”
We don’t want to fire on every fluctuation. Instead:
- Baseline: Median entropy over the last N days.
- Tolerance: Median Absolute Deviation (MAD) or % change threshold.
- Trigger: Current entropy > baseline + tolerance.
Example:
SELECT
entity_id,
step,
current_entropy,
baseline_entropy,
(current_entropy - baseline_entropy) > 0.5 AS is_surprising
FROM ...
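The fixed 0.5 above stands in for the tolerance. A minimal sketch of the median + MAD variant, assuming we already have one entropy value per day for the last N days and an arbitrary sensitivity factor k:

```python
import statistics

def is_surprising(current_entropy, daily_entropies, k=3.0):
    """Flag if current entropy exceeds baseline (median) + tolerance (k * MAD).

    daily_entropies: one entropy value per day for the last N days.
    k: assumed sensitivity knob, not part of the spec.
    """
    baseline = statistics.median(daily_entropies)
    mad = statistics.median(abs(e - baseline) for e in daily_entropies)
    return current_entropy > baseline + k * mad

# Seven quiet days around ~1.5 bits, then a jump to 2.35
history = [1.48, 1.52, 1.50, 1.55, 1.47, 1.51, 1.53]
print(is_surprising(2.35, history))  # True
```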
Publishing the Event
Once is_surprising = 1, publish a normalized event to Kafka:
{
"event_type": "entropy.deviation",
"entity_id": "supplier_A",
"step": "quote_response",
"entropy": 2.35,
"baseline_entropy": 1.5,
"tags": {
"priority": "high",
"detected_by": "karma.entropy.v1"
},
"ts": "2025-08-09T19:58:00Z"
}
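A minimal publishing sketch using kafka-python; the topic name (karma.events) and broker address are assumptions, not part of this spec:

```python
import json
from datetime import datetime, timezone
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {
    "event_type": "entropy.deviation",
    "entity_id": "supplier_A",
    "step": "quote_response",
    "entropy": 2.35,
    "baseline_entropy": 1.5,
    "tags": {"priority": "high", "detected_by": "karma.entropy.v1"},
    "ts": datetime.now(timezone.utc).isoformat(),
}

producer.send("karma.events", value=event)  # assumed topic name
producer.flush()
```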
This flows into:
- The action loop (automatic responses, notifications); see the consumer sketch after this list.
- The same ClickHouse table for historical tracking.
- Any downstream analytics or dashboards.
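For the first of those consumers, here is a minimal sketch of how the action loop might pick the events up, again assuming the karma.events topic from the publishing sketch:

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "karma.events",                      # assumed topic name
    bootstrap_servers="localhost:9092",  # assumed broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for msg in consumer:
    event = msg.value
    if event.get("event_type") != "entropy.deviation":
        continue
    # Placeholder action: route high-priority deviations to a notifier
    if event.get("tags", {}).get("priority") == "high":
        print(f"ALERT {event['entity_id']}/{event['step']}: entropy {event['entropy']}")
```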
Why This Works Now
- No new infrastructure: Uses existing ClickHouse + Kafka stack.
- Simple math: Baselines + deviations, no heavy ML yet.
- Extensible: You can swap in more sophisticated models later.
Roadmap After MVP
- Add sequence entropy (order of steps, not just latency); a rough sketch follows this list.
- Model joint entropy of multiple tags (supplier + product type).
- Incorporate forecasting for expected future entropy.
- Feed deviations into self-healing playbooks.
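As a rough illustration of the first roadmap item, sequence entropy could start as Shannon entropy over observed step-to-step transitions rather than latency buckets; a sketch with made-up step names:

```python
import math
from collections import Counter

def transition_entropy(step_sequences):
    """Shannon entropy (bits) over (previous_step, next_step) transition frequencies."""
    transitions = Counter(
        (prev, nxt)
        for seq in step_sequences
        for prev, nxt in zip(seq, seq[1:])
    )
    total = sum(transitions.values())
    return -sum((c / total) * math.log2(c / total) for c in transitions.values())

# A process that always follows the same step order scores low;
# one whose steps arrive in shifting orders scores higher.
print(transition_entropy([["quote_request", "quote_response", "order_placed"]] * 10))  # 1.0
```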