One Stream, Two Very Different Destinations

Karma treats anything that emits changes (databases, infrastructure tools, metrics, logs, file drops) as a CDC-like source: a feed of change events, in the spirit of change data capture.
Events from all of these sources are normalized into a common envelope and published to a single Kafka topic (events.normalized).
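
As a rough sketch of what that envelope could look like, assuming kafka-python; the field names (event_id, source, entity_id, event_type, payload) are invented for illustration, not Karma's actual schema:

```python
# Sketch only: field names and broker address are assumptions.
import json
import time
from uuid import uuid4

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# A hypothetical normalized envelope: every source, whatever its shape,
# is reduced to the same handful of fields plus the raw payload.
envelope = {
    "event_id": str(uuid4()),
    "event_time": time.time(),
    "source": "postgres.orders",       # which system emitted the change
    "entity_id": "order-1234",         # what the change is about
    "event_type": "row_updated",
    "payload": {"status": "shipped"},  # the original change, untouched
}

producer.send("events.normalized", envelope)
producer.flush()
```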

From there, the normalized stream can be consumed by multiple sinks — but ClickHouse and a Graph DB have such different needs that they get separate pipelines.
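
One way the fan-out might be wired, again with kafka-python: each sink runs under its own consumer group, so both receive the full stream and commit their own offsets independently (the group names are illustrative):

```python
# Two independent consumer groups on the same topic: Kafka delivers the
# full stream to each group, so the sinks never compete for events.
import json

from kafka import KafkaConsumer

def make_consumer(group_id: str) -> KafkaConsumer:
    return KafkaConsumer(
        "events.normalized",
        bootstrap_servers="localhost:9092",
        group_id=group_id,  # separate offsets per pipeline
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

clickhouse_consumer = make_consumer("karma-clickhouse-sink")
graph_consumer = make_consumer("karma-graph-sink")
```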


Why Separate Pipelines?

1. Different Data Models

  • ClickHouse: flat, append-only tables; perfect for large-scale time-series analytics, baselines, and aggregations.
  • Graph DB: nodes, edges, and relationship properties; built for traversals, lineage, and dependency analysis (the sketch after this list shows the same event landed in both shapes).
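
To make the contrast concrete, a hedged sketch of the same event landing in both stores; the table name, node label, and AFFECTS edge are invented for illustration:

```python
# The same normalized event, landed two ways. All names are illustrative.

# ClickHouse: one flat, append-only row per event.
CLICKHOUSE_DDL = """
CREATE TABLE IF NOT EXISTS events_normalized (
    event_time DateTime64(3),
    source     LowCardinality(String),
    entity_id  String,
    event_type LowCardinality(String),
    payload    String
)
ENGINE = MergeTree
ORDER BY (source, event_time)
"""

# Graph DB: the event becomes an edge between two entity nodes (Cypher).
CYPHER_UPSERT = """
MERGE (s:Entity {id: $source_id})
MERGE (t:Entity {id: $target_id})
MERGE (s)-[r:AFFECTS]->(t)
SET r.event_type = $event_type, r.last_seen = $event_time
"""
```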

2. Independent Scaling

  • The ClickHouse sink might handle millions of events per minute via batch inserts (a minimal batching loop is sketched after this list).
  • The Graph DB sink might process fewer events but do heavier transformations (merging nodes, recalculating edges).
  • Each can scale up or down without affecting the other.
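
A minimal sketch of the ClickHouse sink's batching loop, assuming the clickhouse-driver client, the events_normalized table from the previous sketch, and the fan-out consumer from earlier; the batch size is an arbitrary illustration value:

```python
# Sketch of the ClickHouse sink's batching loop. It can be tuned and
# scaled without the graph sink ever noticing.
import json
from datetime import datetime

from clickhouse_driver import Client  # pip install clickhouse-driver

client = Client("localhost")
BATCH_SIZE = 10_000

batch = []
for message in clickhouse_consumer:  # KafkaConsumer from the fan-out sketch
    e = message.value
    batch.append((
        datetime.fromtimestamp(e["event_time"]),
        e["source"],
        e["entity_id"],
        e["event_type"],
        json.dumps(e["payload"]),
    ))
    if len(batch) >= BATCH_SIZE:
        # One bulk INSERT per batch: ClickHouse strongly prefers large,
        # infrequent inserts over many small ones.
        client.execute(
            "INSERT INTO events_normalized "
            "(event_time, source, entity_id, event_type, payload) VALUES",
            batch,
        )
        batch.clear()
```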

3. Different Connectors

  • ClickHouse: Kafka Connect sink, or the Kafka table engine (sketched after this list).
  • Graph DB: streaming ingest via the Neo4j Connector for Kafka, or custom consumers writing Gremlin to JanusGraph or Neptune (Neptune Streams captures changes out of the graph rather than ingesting into it).
  • Keeping them separate avoids coupling unrelated logic.
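
For the table engine route, the wiring can live entirely inside ClickHouse; a sketch issued through clickhouse-driver, assuming the envelope arrives as JSONEachRow with the payload already serialized to a string (broker, topic, and group names are placeholders):

```python
# Kafka table engine + materialized view: ClickHouse pulls from the topic
# itself, with no external sink process. All names are placeholders.
from clickhouse_driver import Client

client = Client("localhost")

client.execute("""
CREATE TABLE IF NOT EXISTS events_queue (
    event_time DateTime64(3),
    source     String,
    entity_id  String,
    event_type String,
    payload    String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'events.normalized',
         kafka_group_name = 'clickhouse-table-engine',
         kafka_format = 'JSONEachRow'
""")

# The materialized view moves each consumed row into the MergeTree table.
client.execute("""
CREATE MATERIALIZED VIEW IF NOT EXISTS events_mv TO events_normalized
AS SELECT event_time, source, entity_id, event_type, payload
FROM events_queue
""")
```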

4. Easier Experimentation

  • You can evolve your graph schema without touching the main ledger.
  • You can temporarily disable graph ingestion without losing ClickHouse history.

The Karma Pattern

In short: every source funnels into one normalized Kafka topic (events.normalized), and each destination gets its own purpose-built pipeline, so the ClickHouse ledger and the graph model can scale, evolve, and fail independently.

Real-World Graph DB Use Cases in Karma

  • Root cause tracing: traverse dependencies to see what led to an anomaly (a sample traversal follows this list).
  • Impact analysis: predict what will break if a service fails.
  • State machine modeling: track transitions and detect unexpected states.
  • Multi-hop alerts: notify based on cascading effects, not just single metrics.
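
To illustrate the root-cause case, a hypothetical traversal over the AFFECTS edges sketched earlier, using the neo4j Python driver; the hop limit, URI, and credentials are placeholders:

```python
# Hypothetical root-cause traversal: walk up to four hops upstream from an
# anomalous entity and return candidate causal chains, longest first.
from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

ROOT_CAUSE_QUERY = """
MATCH path = (cause:Entity)-[:AFFECTS*1..4]->(anomalous:Entity {id: $entity_id})
RETURN [n IN nodes(path) | n.id] AS chain
ORDER BY length(path) DESC
LIMIT 10
"""

def trace_root_causes(entity_id: str) -> list[list[str]]:
    with driver.session() as session:
        result = session.run(ROOT_CAUSE_QUERY, entity_id=entity_id)
        return [record["chain"] for record in result]

print(trace_root_causes("checkout-service"))
```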

When to Skip the Graph DB

  • If your needs are limited to metrics, baselines, and statistical anomaly detection, ClickHouse alone may suffice.
  • A Graph DB makes sense when relationships and path-dependent reasoning are core to the problem.