The scariest bug isn't a crash. It's silence.

A crash is a request for attention. Silence is the failure mode that quietly removes the signal anything is wrong, so you stop looking. One of these is harder to fix because you don't know it's there.

We are trained to fear crashes. Red text, stack traces, the alarm noise. They are loud and they are honest. The system is telling you, in the most direct language it has, that something is wrong.

Silent failures don't do that. The thing keeps running. The dashboard stays green. The metric still updates. The only thing missing is the bit where it was actually doing the job.

By the time you notice, it has usually been broken for longer than you'd like to admit.

———————————————————

Why silence is worse

A crash narrows your search. You know roughly when, you know roughly where, and the error message gives you a thread to pull. Recovery starts with a clear signal.

Silence does the opposite. There is no signal, so there is no thread. You have to first prove the bug exists before you can find it. And you only think to look when something downstream has already gone wrong, which is usually too late to catch the cause cleanly.

I shipped one of these this week. A monitor that compared a system timestamp as a string. My locale formatted the date one way, the comparison expected another. Every live process looked dead. The dashboard said "all clear" for five days while the thing it was watching wasn't being watched at all.

The bug was small. The five days was the problem.

———————————————————

Three rules I work by now

1. Surface the absence, not just the presence.

Most dashboards show what is there. Twelve active workers, four builds passing, two hundred tests green. That is useful when something is happening. It is useless when nothing is happening and that is the failure.

Put "last successful run" at the top. If that number is older than it should be, something silent has broken, even if every row underneath says "all clear."

2. Distrust anything that compares system output as a string.
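
Here is a minimal sketch of the first two rules together. It is not the monitor from the story. It assumes timestamps arrive as ISO 8601 strings with a UTC offset, and the names `parse_utc`, `is_stale`, and `MAX_AGE`, along with the 15-minute threshold, are made up for illustration. The idea is to parse the timestamp into a real point in time before comparing anything, then alarm on how old "last successful run" is.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: how old "last successful run" may get before we alarm.
MAX_AGE = timedelta(minutes=15)


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp string and normalise it to UTC."""
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        # Assumption for this sketch: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def is_stale(last_success: str, now: datetime, max_age: timedelta = MAX_AGE) -> bool:
    """Rule 1: report the absence. True when the last success is too old."""
    return now - parse_utc(last_success) > max_age


# Rule 2: the same two instants, compared as strings and as parsed times.
earlier_looking = "2024-03-01T23:30:00-05:00"  # 04:30 UTC on 2 March
later_looking = "2024-03-02T01:00:00+00:00"    # 01:00 UTC on 2 March

# The string comparison only sees characters, so it gets the order backwards.
string_says_earlier = earlier_looking < later_looking
# The parsed comparison sees instants, so it gets the order right.
parsed_says_earlier = parse_utc(earlier_looking) < parse_utc(later_looking)
```

In this sketch `string_says_earlier` and `parsed_says_earlier` disagree, which is the whole point: the strings sort one way and the instants sort the other. A check like `is_stale(last_success, datetime.now(timezone.utc))` at the top of a dashboard would go red when nothing has succeeded recently, regardless of what the rows underneath say.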