The best way I've seen to handle such mistake-prone problems is to use Test-Driven Development.

Write a test for a trivial case, like a zero-length array. Think about the design: what should it return? Should it throw an exception? Then implement that, and only that.
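
A minimal sketch of that first iteration, in Java with JUnit 5 (BinarySearch.indexOf is a hypothetical name, and "return -1 for not found" is just the design decision I've picked here):

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import org.junit.jupiter.api.Test;

    class BinarySearchTest {

        // First iteration: the trivial zero-length case. The method under
        // test doesn't exist yet - that's the point of test-first.
        @Test
        void emptyArrayReturnsNotFound() {
            assertEquals(-1, BinarySearch.indexOf(new int[] {}, 42));
        }
    }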

In each iteration, after you add a test, first focus on getting every test to pass, then commit to version control, then refactor to get the clearest implementation.

The next iteration would be to test with a single element - a matching one, in which case it should return its index (zero in zero-indexed languages).

Then with a non-matching one - again, you have to make a design decision: throw an exception, return -1, or something else.

Then a case with two elements, where the first is the match. Then two with the second matching. Then two with none matching.
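
Continuing the sketch from above (same assumed class and -1 convention), those iterations turn into tests like:

    // ...continuing the BinarySearchTest class sketched earlier.

    @Test
    void twoElementsFirstMatches() {
        assertEquals(0, BinarySearch.indexOf(new int[] {3, 7}, 3));
    }

    @Test
    void twoElementsSecondMatches() {
        assertEquals(1, BinarySearch.indexOf(new int[] {3, 7}, 7));
    }

    @Test
    void twoElementsNoneMatches() {
        assertEquals(-1, BinarySearch.indexOf(new int[] {3, 7}, 5));
    }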

Keep thinking of edge cases for as long as you can come up with tests that break your implementation. Eventually you end up with a formidable collection of test cases, and you will not be afraid of refactoring or changing things, because you will find out quickly and easily which tests break.

Edit: boxed mentioned Property-Based Testing and Mutation Testing. These are amazing ways to come up with more corner cases to cover.



The reason there are so many buggy binary search implementations, however, is also the reliance on TDD and similar approaches. Very few tests attempt a binary search over a container large enough for integer overflow to become an issue. Yet sooner or later someone will actually try to use it at that scale and discover that it fails.
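
The classic instance: in languages with fixed-width integers, the midpoint computation (low + high) / 2 wraps negative once the array is longer than about 2^30 elements - the bug that famously sat in the JDK's own Arrays.binarySearch for years. A sketch of the overflow-safe form, in Java (names illustrative):

    // Returns the index of key in the sorted array a, or -1 if absent.
    static int indexOf(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            // Broken form: int mid = (low + high) / 2;
            // low + high can exceed Integer.MAX_VALUE and wrap negative.
            // This form cannot overflow while low <= high:
            int mid = low + (high - low) / 2;
            if (a[mid] < key) {
                low = mid + 1;
            } else if (a[mid] > key) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

The unsigned shift, (low + high) >>> 1, is the other standard fix in Java.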


This process would be satisfied with a linear search. Even to see that a log n algorithm exists when the input is sorted, you need to think analytically about the problem and its solution.

Suppose the problem is presented with runtime constraints for some sufficiently large test cases, such that this difference in performance matters. Even in that case, what are the chances that a TDDer who is unaware of the binary search algorithm will stumble on the correct solution, while never looking beyond the next test to be satisfied? In the small cases, the difference is not discernible. I suggest that to be successful, one must think analytically about the problem, and code to the insights revealed by that analysis, in addition to the specifications themselves.

A lot of problems are like that - e.g.:

http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s...


Agreed. The idea that a developer would start coding with TDD and subsequently arrive at an O(n log n) solution that works on all edge cases is... fantasy, to put it mildly.


*this is supposed to say O(log n)


> Then a case with two elements,

Then a case with three elements,

Then a case with four elements,

...

Ah yes, I see why you end up with a formidable collection of test cases.


Property-Based Testing and Mutation Testing are much better than trying to come up with examples by thinking hard.


PBT is awesome but I would still like to have tests I can easily reason about when implementing a tricky algorithm. The way I see it, TDD is the specification, PBT is for finding gaps in my specification.
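
For example, with a PBT library like jqwik (a sketch; it assumes the hypothetical BinarySearch.indexOf from upthread and that the library generates int[] inputs), one property covers every "element is present" example I could write by hand:

    import java.util.Arrays;

    import net.jqwik.api.ForAll;
    import net.jqwik.api.Property;

    class BinarySearchProperties {

        // Property: in any sorted array, every element that is present
        // is found at an index that actually holds that value.
        @Property
        boolean presentElementsAreFound(@ForAll int[] values) {
            Arrays.sort(values);
            for (int v : values) {
                int i = BinarySearch.indexOf(values, v);
                if (i < 0 || values[i] != v) {
                    return false;
                }
            }
            return true;
        }
    }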


Mutation testing is for finding holes in the tests. PBT is for finding holes in your code.


I stand corrected. These methods are indeed even better when you can't think of more examples.



