Odds wrote:But there's a difference between "unlikely and boring" and "unlikely and interesting", and it matters.
I think we agree here, though this particular statement makes me feel uneasy, since "boring" and "interesting" are not well-defined. The main issue I had was your claim that the event suggests something is wrong with the code solely due to how unlikely it was, which is why I presented examples of similarly unlikely events that we would not consider to be indicative of bad code.
The problem here is that we have one known probability: given that the code works, what is the probability that the spell miscasts four times in a row? (0.04^4) Here we see that the miscast event happened, and we want to know how much this supports the notion that the code is not working as intended; intuitively it feels that the more unlikely the event, the more it points towards the code being broken, and that's not completely true. If we try to apply Bayes', we'll have to deal with nebulous variables like "what is the probability the code is right" and "if the code is broken, what is the probability we get quadruple miscasts".
So for the poker example, I would be more inclined to suspect cheating against aces than random hands since if you are going to cheat, you probably are going to make your hand good. The crawl code does not have such motives, but perhaps it is easy to mess up the code in a manner that would cause miscast chains to happen more. In the same way, if we had some random-looking sequence of spell failure/successes that come up to the same probability as four failures, we don't regard this as evidence of a bug as we would expect something more spectacular out of a bug. Whether this is correct, I don't know; these things are basically impossible to quantify, leaving us to human intuition, which is pretty bad at this stuff.
For what it's worth, a fun anecdote: I once miscasted fire storm four times in a row at <5% fail. I ate a few torments and was in a pretty bad spot, but the fourth miscast created a smoke cloud, allowing me to teleport away without taking more damage.
goodcoolguy wrote:I mean a mathematical proof that the program does what it's claimed it does, assuming the correctness of the compiler and the operations performed by the host machine. In other words, the usual definition of "proof of correctness" for computer programs.
If by this you mean "take the code as written, and via some formal system of axioms, show that it is equivalent to this simpler statement that summarizes what we think it does", this is hard to do for all but the simplest of programs. If you just want to convince yourself the code works, the best way is just to read it yourself and try to understand it, since any proof that it works will be longer and harder to understand than the code itself.