I wonder whether, just like infrastructure resilience, code resilience is also needed for critical services that can never go down. Instead of relying on a single implementation of a critical service, you would run multiple independent implementations, ideally written in different languages, so that one bug cannot take out all of them at once.
Back when I ran my own DNS servers, I always made sure that the primary and secondary ran on different platforms and different software.
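
For what it's worth, the idea above is roughly what the literature calls N-version programming. Here is a minimal sketch of how the results could be combined, assuming every implementation can be called with the same request and returns a hashable result. The names `vote`, `impl_a`, `impl_b`, and `impl_c` are made up for illustration, and in a real setup the implementations would be separate services in different languages rather than Python functions in one process.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def vote(implementations: Sequence[Callable[[str], Hashable]], request: str) -> Hashable:
    """Call every implementation and return the strict-majority answer."""
    results = []
    for impl in implementations:
        try:
            results.append(impl(request))
        except Exception:
            # A crashing implementation simply loses its vote.
            continue
    if not results:
        raise RuntimeError("all implementations failed")
    winner, count = Counter(results).most_common(1)[0]
    # Require agreement from more than half of ALL implementations,
    # not just the ones that happened to answer.
    if count * 2 <= len(implementations):
        raise RuntimeError("no majority among implementations")
    return winner


# Three hypothetical, independently written hostname normalisers.
def impl_a(name: str) -> str:
    return name.strip().lower()


def impl_b(name: str) -> str:
    return "".join(ch.lower() for ch in name if not ch.isspace())


def impl_c(name: str) -> str:
    # Stand-in for an implementation with a bug that the others do not share.
    raise ValueError("simulated bug")


answer = vote([impl_a, impl_b, impl_c], " Example.COM ")
```

With one of the three failing, the other two still agree, so the caller gets an answer. The DNS primary/secondary setup is the simpler failover flavour of the same idea: no voting, just a second implementation that is unlikely to share the first one's bugs.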