Proposal: Early Ticket Prediction via Mongo–Kafka–ClickHouse Fabric

One-Line Pitch

“By unifying MongoDB changes, system signals, and ServiceNow tickets in one stream, we can give on-call engineers an early warning that a ticket is likely to be opened, improving response time without adding noise.”

Executive Summary

We propose extending the existing MongoDB → Kafka → ClickHouse pipeline with ServiceNow ticket data to create an early-warning signal for incidents. The goal is for the system to recognize conditions that typically lead to a ServiceNow ticket before the ticket is opened.

This is a lightweight, non-disruptive pilot aimed at demonstrating practical usefulness rather than rigorous measurement.

Concept

Inputs:
- MongoDB Change Streams (CDC)
- System logs and metrics
- ServiceNow ticket open/resolve events
Pipeline:
- All events flow into Kafka (events.enriched).
- ClickHouse ingests from Kafka via its Kafka table engine into a consolidated events_merge table.
Outputs:
- An “Early Ticket” signal is generated when the system observes conditions that historically precede tickets (e.g., lag spikes, error bursts, entropy changes).
- Signals are published to the Kafka topic ops.earlywarn and displayed in Grafana.
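
As a rough illustration of what "one stream" could mean in practice, the sketch below wraps events from the three inputs in a shared envelope before they are written to events.enriched. The field names (ts, source, service, kind, payload) and the source labels are assumptions made for this example, not an agreed schema, and the code only builds the message bytes; producing to Kafka is left out.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Envelope:
    """Hypothetical common shape for every event on events.enriched."""
    ts: str        # event time, ISO-8601 in UTC
    source: str    # assumed labels: "mongo_cdc", "system", "servicenow"
    service: str   # service the event relates to
    kind: str      # e.g. "update", "lag_ms", "ticket_open"
    payload: dict  # source-specific details, kept as-is


def to_envelope(source, service, kind, payload, ts=None):
    """Wrap a raw event; defaults to the current UTC time if none is given."""
    ts = ts or datetime.now(timezone.utc)
    return Envelope(ts.isoformat(), source, service, kind, payload)


def encode(env):
    """Serialize an envelope to JSON bytes, usable as a Kafka message value."""
    return json.dumps(asdict(env), sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    ticket = to_envelope("servicenow", "orders", "ticket_open",
                         {"number": "INC-EXAMPLE"}, ts=opened)
    print(encode(ticket).decode("utf-8"))
```

Keeping the source-specific details inside payload means the ClickHouse side only needs a handful of stable columns for the shared timeline, with the rest parsed on demand.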

Value Proposition

One timeline: unify changes, anomalies, lag, and ticket breadcrumbs in a single view.
Earlier visibility: on-call teams see “ticket likely soon” flags, improving response speed.
Operator confidence: helps operators separate noise from real issues when prioritizing.
Non-invasive: no changes to existing monitoring/alerting; runs in shadow mode initially.
Future extensibility: foundation for machine learning or automation once proven useful.

Pilot Plan

Ingest ServiceNow tickets into Kafka and ClickHouse.
Label pre-incident windows (e.g., 60 minutes before each ticket open).
Compute basic features: error rates, lag z-scores, CDC burstiness, entropy/sequence metrics.
Define simple rules to raise “TicketSoon” signals (e.g., lag ≥ 2σ and a high CDC rate).
Display signals in Grafana alongside existing events and ticket breadcrumbs.
Shadow mode trial (2–3 weeks): collect operator feedback on usefulness.
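
The labeling, feature, and rule steps of the pilot could be prototyped in a few lines before anything is wired into ClickHouse. The sketch below is one possible reading of those steps, not a settled design: the function names, the 60-minute window default, the use of a population standard deviation for the baseline, and the CDC rate threshold parameter are all assumptions for illustration. The rule returns its reasons alongside the decision so that operators can see why a signal fired.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

WINDOW = timedelta(minutes=60)  # assumed pre-incident window length


def is_pre_incident(ts, ticket_opens, window=WINDOW):
    """True if ts falls within `window` before (or at) any ticket open time."""
    return any(timedelta(0) <= opened - ts <= window for opened in ticket_opens)


def z_score(value, history):
    """Z-score of value against a baseline; 0.0 if the baseline is too short or flat."""
    if len(history) < 2:
        return 0.0
    sd = pstdev(history)
    return 0.0 if sd == 0 else (value - mean(history)) / sd


def ticket_soon(lag, lag_history, cdc_rate, cdc_rate_threshold, z_threshold=2.0):
    """Example rule: fire only when lag is >= z_threshold sigma AND CDC rate is high."""
    reasons = []
    z = z_score(lag, lag_history)
    if z >= z_threshold:
        reasons.append(f"lag z-score {z:.1f} >= {z_threshold}")
    if cdc_rate >= cdc_rate_threshold:
        reasons.append(f"CDC rate {cdc_rate} >= {cdc_rate_threshold}")
    # Both conditions must hold; partial matches are still reported as reasons.
    return {"signal": "TicketSoon", "fired": len(reasons) == 2, "reasons": reasons}


if __name__ == "__main__":
    opens = [datetime(2024, 1, 1, 12, 0)]
    print(is_pre_incident(datetime(2024, 1, 1, 11, 30), opens))
    print(ticket_soon(150, [100, 110, 90, 100, 100], cdc_rate=500, cdc_rate_threshold=200))
```

In shadow mode, the output of ticket_soon would only be logged and displayed, and the is_pre_incident labels could later be used to check how often fired signals fell inside a labeled window.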

Success Criteria

Visibility: operators can see why the system expects a ticket (transparent “reasons”).
Confidence: feedback indicates signals would have helped in recent incidents.
Adoption potential: leadership sees a clear path to reduce MTTR and noise fatigue.

Next Steps

Stand up Kafka topic for ServiceNow events.
Build ClickHouse ingestion + Grafana panel.
Run 2–3 week shadow pilot with one service.
