> 3. If a test has a failure rate of < 3%, it is likely not worth your time fixing it. For these, we retry each failing test up to three times. Not all test frameworks support retrying out of the box, but you can usually find a workaround. The retries can be restricted to specific tests or classes of tests if needed (e.g. only retry browser-based tests).
I find this to be pretty terrifying. I know that folks are under different amounts of pressure, but we'd reject that code from merging here (or revert it when we discovered the flakiness), as it's basically just a half-finished test that requires constant babysitting.
I'm not sure how you'd reject that flaky test. We're talking < 3%, so first let's assume you don't even see the failure until 10 other PRs have been merged. Not only do you not know what caused the failure, it could be that the failure is in a test that has been in the code for ages, and the new code breaks its assumptions / initial environment.
Sometimes you can't just point at one thing and say reject this or revert that without a long investigation.
If the test failure is detected (so, you get super lucky), you should immediately reject the code, including tests that were failing before other fixups that didn't affect that test. Oftentimes it will take a long time for these to surface, but I'm of the opinion that a broken build is show-stopping until it's resolved. That doesn't mean a 5-alarm, all-devs-rush-to-the-scene response, but it does mean a free person picking it up as their next task or, barring that, bumping someone off of feature or other work to address the issue.
It may take a while... but while that flaky test exists in your codebase it will levy a constant cost on all of your developers.
Which code? When you randomly run into the flaky test, in most cases it's not coming from the change that was just tested. You'd be rejecting some random, unrelated PR.
We're the inheritors of a legacy code base. Part of this involved taking a strong stance to go from zero to hero in terms of testing: no minor bugs are fixed in areas of code not covered by automated tests. This has made our feature work slow right down, but we are lucky to have management's support in paying this cost now rather than paying interest on it as time progresses.