> 3. If a test has a failure rate of < 3%, it is likely not worth your time fixing it. For these, we retry each failing test up to three times. Not all test frameworks support retrying out of the box, but you can usually find a workaround. The retries can be restricted to specific tests or classes of tests if needed (e.g. only retry browser-based tests).
I find this to be pretty terrifying. I know that folks are under different amounts of pressure, but we'd reject that code from merging here (or revert it when we discovered the flakiness), as it's basically just a half-finished test that requires constant babysitting.
I'm not sure how you'd reject that flaky test. We're talking < 3%, so first let's assume you don't even see the failure until 10 other PRs have been merged. Not only do you not know what caused the failure, it could be that the failure is in a test that has been in the code for ages, and the new code breaks its assumptions / initial environment.
Sometimes you can't just point at one thing and say reject this or revert that without a long investigation.
If the test failure is detected (so, you get super lucky), you should immediately reject the code, including tests that were failing before other fixups that didn't affect that test. Oftentimes it will take a long time for these to surface, but I'm of the opinion that a broken build is show-stopping until it's resolved. That doesn't mean a 5-alarm, all-devs-rush-to-the-scene response, but it does mean a free person picking it up as their next task or, barring that, bumping someone off of feature or other work to address the issue.
It may take a while... but while that flaky test exists in your codebase it will levy a constant cost on all of your developers.
Which code? When you randomly run into the flaky test, in most cases it's not coming from the change that was just tested. You'd be rejecting some random, unrelated PR.
We're the inheritors of a legacy code base. Part of this involved taking a strong stance to go from zero to hero in terms of testing: no minor bugs are fixed in areas of code not covered by automated tests. This has made our feature work slow right down, but we are lucky to have management's support in paying this cost now rather than paying interest on it as time progresses.