domino-effect-challenge

🧩 The Domino Effect Challenge

A legendary RubixKube™ internship puzzle.

Modern systems are dominos. One quiet failure can ripple across services and break things far away from the blast point. Your mission: simulate this world, detect what matters, reason to root cause, and explain your thinking.

This is not RubixKube itself. It is about showing how you think, code, and reason about reliability, and it is a proxy for how you work under ambiguity.


🎯 What you will build

A program (CLI or tiny web service) that:

  1. Simulates a service graph over time (with random glitches)
  2. Detects anomalies against a threshold
  3. Traces downstream blast radius accurately
  4. Prioritizes remediation (root cause first)
  5. Explains the why with clear, human-readable output
  6. Records an incident log we can review

Run locally. No cloud costs.


🧩 World model (formal spec)

Example (sample/services.json provided):

[
  { "name": "service-A", "depends_on": ["service-B", "service-C"], "health": 0.98 },
  { "name": "service-B", "depends_on": ["service-D"], "health": 0.95 },
  { "name": "service-C", "depends_on": ["service-E", "service-F"], "health": 0.99 },
  { "name": "service-D", "depends_on": [], "health": 0.97 },
  { "name": "service-E", "depends_on": ["service-G"], "health": 0.96 },
  { "name": "service-F", "depends_on": ["service-G", "service-H"], "health": 0.94 },
  { "name": "service-G", "depends_on": [], "health": 0.99 },
  { "name": "service-H", "depends_on": ["service-I"], "health": 0.92 },
  { "name": "service-I", "depends_on": [], "health": 0.97 },
  { "name": "service-J", "depends_on": ["service-B", "service-I"], "health": 0.95 }
]
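
Most of the later steps (blast radius, root cause) need the graph in both directions: who a service depends on, and who depends on it. The following is a minimal sketch, assuming the JSON shape shown above; the function names `load_services` and `build_dependents` are hypothetical, and the inlined data is only a subset of the sample so the snippet runs on its own.

```python
import json
from collections import defaultdict


def load_services(text):
    """Parse the JSON list into {name: record} and check every dependency exists."""
    services = {entry["name"]: entry for entry in json.loads(text)}
    for name, entry in services.items():
        for dep in entry["depends_on"]:
            if dep not in services:
                raise ValueError(f"{name} depends on unknown service {dep}")
    return services


def build_dependents(services):
    """Invert depends_on: map each service to the services that depend on it."""
    dependents = defaultdict(list)
    for name, entry in services.items():
        for dep in entry["depends_on"]:
            dependents[dep].append(name)
    return dict(dependents)


# Subset of the sample graph (services C, E, F, G, H, I), inlined for the sketch.
SUBSET = """[
  {"name": "service-C", "depends_on": ["service-E", "service-F"], "health": 0.99},
  {"name": "service-E", "depends_on": ["service-G"], "health": 0.96},
  {"name": "service-F", "depends_on": ["service-G", "service-H"], "health": 0.94},
  {"name": "service-G", "depends_on": [], "health": 0.99},
  {"name": "service-H", "depends_on": ["service-I"], "health": 0.92},
  {"name": "service-I", "depends_on": [], "health": 0.97}
]"""

services = load_services(SUBSET)
dependents = build_dependents(services)
print(dependents["service-G"])
```

Validating dependency names at load time is a design choice, not a stated requirement; it simply surfaces typos in the input before the simulation starts.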

⏱️ Simulation rules

You can externalize parameters in config.yaml:

ticks: 50
threshold: 0.70
alpha: 0.8
cooldown: 1
heal_to: 0.88
seed: 1337
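
One way to carry these parameters through a program is a small typed config object. The sketch below is an assumption-heavy stand-in: it parses only flat `key: value` lines like the ones above rather than full YAML (a real YAML parser would be the usual choice), the names `SimConfig` and `parse_flat_config` are hypothetical, and it does not interpret what `alpha`, `cooldown`, or `heal_to` mean. It also shows how `seed` can make random glitches reproducible.

```python
import random
from dataclasses import dataclass


@dataclass
class SimConfig:
    # Defaults mirror the example values listed above.
    ticks: int = 50
    threshold: float = 0.70
    alpha: float = 0.8
    cooldown: int = 1
    heal_to: float = 0.88
    seed: int = 1337


# Converter per key; any key outside this table is rejected.
CASTS = {"ticks": int, "threshold": float, "alpha": float,
         "cooldown": int, "heal_to": float, "seed": int}


def parse_flat_config(text):
    """Parse flat 'key: value' lines into a SimConfig (not a general YAML parser)."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key not in CASTS:
            raise KeyError(f"unknown config key: {key}")
        values[key] = CASTS[key](value.strip())
    return SimConfig(**values)


cfg = parse_flat_config("ticks: 10\nthreshold: 0.70\nseed: 1337\n")

# Two generators built from the same seed produce the same sequence,
# so a run can be replayed exactly.
rng_a = random.Random(cfg.seed)
rng_b = random.Random(cfg.seed)
print(cfg)
```

Using a dedicated `random.Random` instance instead of the module-level functions keeps the simulation's randomness isolated from any other code that touches the global generator.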

🔎 Detection & RCA expectations


🖥️ CLI contract (suggested)

Your tool should support:

run:    ./domino --input sample/services.json --config config.yaml
query:  ./domino "why is service-A failing?"
help:   ./domino --help
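
The three invocations above could be handled by a single argument parser. This is a sketch using Python's standard `argparse`; treating the quoted question as an optional positional argument is an assumption about how `run` and `query` are told apart, and `build_parser` and `dispatch` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser for the suggested contract; --help is added by argparse itself."""
    parser = argparse.ArgumentParser(
        prog="domino",
        description="Simulate a service graph and explain failures.",
    )
    parser.add_argument("query", nargs="?",
                        help="a quoted question, e.g. 'why is service-A failing?'")
    parser.add_argument("--input", help="path to the services JSON file")
    parser.add_argument("--config", help="path to config.yaml")
    return parser


def dispatch(argv):
    """Decide between query mode and run mode from the parsed arguments."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.query is not None:
        return ("query", args.query)
    if args.input is None:
        # parser.error prints usage and exits with status 2.
        parser.error("provide --input for a run, or a quoted question for a query")
    return ("run", args.input, args.config)


print(dispatch(["--input", "sample/services.json", "--config", "config.yaml"]))
print(dispatch(["why is service-A failing?"]))
```

Returning a tuple from `dispatch` keeps argument handling separate from the simulation and query logic, which makes both easier to test.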

Query semantics (examples):

If you build a web UI instead, document the endpoints (OpenAPI optional).


🧾 Output format (canonical)

Write a human-readable incident log to ./sample/output.log (example already provided), and print key events to stdout. Suggested lines:

[ALERT] service-G fell below threshold (0.62 < 0.70) at T=2
[BLAST] due to service-G → impacted: [service-E, service-F, service-C]
[PRIORITY] roots={service-G} order=[service-G]
[SUGGESTION] Remediate service-G first
[HEAL] service-G -> 0.88 at T=3; recovered: [service-E, service-F, service-C]
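
To show how lines like these might be produced, here is a hedged sketch of threshold detection, a blast-radius walk, and root selection. It assumes impact propagates transitively to every service that depends on the failed one, and that a root is a failing service with no failing dependency; neither rule is stated in this section, and all function names are hypothetical. Under that transitive assumption, the graph inlined here also reaches service-A, so the exact impacted list depends on the propagation rules you choose.

```python
from collections import deque

# Dependency edges from the sample graph, inlined so the sketch is self-contained.
DEPENDS_ON = {
    "service-A": ["service-B", "service-C"],
    "service-B": ["service-D"],
    "service-C": ["service-E", "service-F"],
    "service-D": [],
    "service-E": ["service-G"],
    "service-F": ["service-G", "service-H"],
    "service-G": [],
    "service-H": ["service-I"],
    "service-I": [],
    "service-J": ["service-B", "service-I"],
}


def detect(health, threshold):
    """Return services whose health is strictly below the threshold, sorted by name."""
    return sorted(name for name, value in health.items() if value < threshold)


def blast_radius(depends_on, failed):
    """Breadth-first walk over reverse edges: all transitive dependents of `failed`."""
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)
    seen, order, queue = {failed}, [], deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def root_causes(depends_on, failing):
    """Assumed rule: a failing service is a root if none of its dependencies is failing."""
    failing = set(failing)
    return sorted(s for s in failing if not failing.intersection(depends_on[s]))


def alert_line(name, value, threshold, tick):
    return f"[ALERT] {name} fell below threshold ({value:.2f} < {threshold:.2f}) at T={tick}"


def blast_line(failed, impacted):
    return f"[BLAST] due to {failed} → impacted: [{', '.join(impacted)}]"


health = {name: 0.95 for name in DEPENDS_ON}
health["service-G"] = 0.62  # inject one failure for the demonstration
for name in detect(health, 0.70):
    print(alert_line(name, health[name], 0.70, tick=2))
    print(blast_line(name, blast_radius(DEPENDS_ON, name)))
```

The `seen` set guards against visiting a service twice when it is reachable through several paths, as service-C is here through both service-E and service-F.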

Machine-readable optional: also emit events.jsonl with structured entries.
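
A structured log of this kind is commonly written as one JSON object per line. The sketch below shows one possible shape; the field names (`tick`, `kind`, `service`, and the extra details) are assumptions, since no schema is specified here, and `write_event` is a hypothetical helper. It writes to an in-memory buffer so it runs without touching the filesystem.

```python
import io
import json


def write_event(stream, tick, kind, service, **details):
    """Append one event as a single JSON line (field names are illustrative)."""
    record = {"tick": tick, "kind": kind, "service": service, **details}
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# In a real run, `stream` would be an open file such as events.jsonl.
buffer = io.StringIO()
write_event(buffer, 2, "ALERT", "service-G", health=0.62, threshold=0.70)
write_event(buffer, 3, "HEAL", "service-G", health=0.88)

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```

Writing each event as it happens, rather than dumping one array at the end, means a partial log is still valid if the run is interrupted.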


✅ Quality gates (we will look for)


🧪 Scoring rubric (100 pts)

We don’t punish incomplete work, but we do reward thoughtful work.


🚫 What not to do


🕰️ Timing (read carefully)

This challenge is always open; there is no fixed deadline. Use a 7-day window as your personal measure of fairness: whenever you start, try to finish within 7 days, and be honest about it in your submission. If you crack it now, great. If not, learn from it and come back later. Our doors are always open for builders who love hard problems.


📤 How to submit (no email needed)

  1. Go to Issues → New issue
  2. Choose the “Domino Effect Submission” link
  3. Fill the form (public repo URL required) and submit

A GitHub Action will clone your repo, run basic checks, and comment results.

Prefer a private submission? Send details to connect@rubixkube.ai.


🧱 Repo template you can copy

We’ve included:


🧠 Hints (not required, just helpful)


🏁 Why this matters

If your solution shows strong reasoning, clarity, and care, that’s RubixKube DNA. If AI builds products, you help keep them alive.

Build it like you mean it. 🚀


Good luck. May the dominos fall in your favor.