At a previous company, we too administered a technical test. Our pass rate was close to the one described in the article (40% for ours vs. 25%). However, our test was incredibly simple. At most, it should have taken a competent developer two hours to complete, including writing comments and a README.
The assignment was to read a file containing a list of numbers (some formatted incorrectly, so some very simple parsing logic was involved), call an API with each correctly formatted number as a parameter, and store the API's responses to a file. To this day I am stunned that 60% of people who passed a phone screen could not solve this task. Note that we gave them the exact input file, so it wasn't a matter of an unseen edge case tripping them up: they weren't handed one input file only to be graded against a different one containing other edge cases.
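For scale, the whole assignment as described could be sketched in a few dozen lines. This is a minimal illustration, not their actual test; the endpoint URL and query parameter are made-up placeholders:

```python
import urllib.request

# Hypothetical endpoint and parameter name; the real API isn't named in the thread.
API_URL = "https://api.example.com/lookup"

def parse_numbers(lines):
    """Keep only the entries that parse as integers; skip malformed ones."""
    valid = []
    for line in lines:
        try:
            valid.append(int(line.strip()))
        except ValueError:
            continue  # incorrectly formatted entry: skip it
    return valid

def run(input_path, output_path, fetch=None):
    """Read numbers from input_path, call the API for each valid one,
    and write each response on its own line to output_path."""
    if fetch is None:
        def fetch(n):
            with urllib.request.urlopen(f"{API_URL}?number={n}") as resp:
                return resp.read().decode()
    with open(input_path) as f:
        numbers = parse_numbers(f)
    with open(output_path, "w") as out:
        for n in numbers:
            out.write(fetch(n) + "\n")
```

The `fetch` hook is just there to make the sketch testable without hitting a live API; a candidate's version wouldn't need it.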
My point here is that it may be possible to get the same screening value with much less investment from the candidate.
I hate the assumption that "this should take 2 hours", because I have been given tests like that. One involved setting up an OAuth token for Instagram or some similar service. I wasted two hours trying to get that done, only to be told that I would have to wait a week for it to be approved.
I am sure half of these things are never thought through. In Python, setting up a new project and downloading dependencies may mean installing a load of other crap, and that often takes more than two hours. Some libraries are incompatible with others.
If you are assuming the test will take two hours, make sure it involves minimal dependencies on third-party stuff.
Hate it all you want. In this case, it's true. There are no hidden factors in my description. There was no token and it was a public API.
I'm sorry you've been burned, but that doesn't mean there aren't tests that actually take < 2 hours. I can't speak to every language, but what modern toolset can't open an input file, make an http(s) call, and write to a file?
I also don't understand why we shouldn't figure out how long something takes before administering it. Several people took the test and the time ranged from 15 minutes to an hour and a half, depending on language and experience level. I will say that if someone couldn't do it in 2 hours, they wouldn't have been a good fit for the team. If several team members took it, of course we're going to make an assumption about how long it takes.
Furthermore, since we didn't prescribe a specific language, there's no reason why someone wouldn't have all of the tools pre-installed. Even so, if you had to install your favorite development environment, you'd have been fine. That also wouldn't have been part of the two hour time frame (which wasn't a limit, BTW, just how long it ended up taking competent developers).
At my current workplace we also administer a technical test. It is designed to take less than 15 minutes and this is communicated when it is sent out.
It consists of a small chunk of code in Java/C#/Go or otherwise that has some obvious and other not so obvious mistakes. The candidate is asked to point out any issues they see in the code.
It takes them 15 minutes to do and about 15 minutes to review the response, which I feel values time on both sides.
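For a concrete flavor of what such a question can look like, here is an illustrative sketch (in Python rather than the Java/C#/Go mentioned above, and not one of the actual questions). The function is shown already corrected, with comments marking the kinds of mistakes a candidate would be asked to spot:

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    # Obvious planted mistake: the original version had no guard here,
    # so an empty list raised ZeroDivisionError.
    if not values:
        return 0.0
    # Not-so-obvious planted mistake: the original used integer
    # division (//), silently truncating the mean for integer inputs.
    return sum(values) / len(values)
```

A mix like this separates candidates who only skim for crashes from those who also reason about silent correctness bugs.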
That could be an even more efficient version of our test. As long as it screens for what you're looking for, I would definitely agree that shorter is better.
It feels like it tests something different than writing code, though both may be a proxy for "quality candidate".
We also gave similar take-homes that should take an hour at most, or two if someone really went overboard. I was amazed at what was returned: submissions that wouldn't compile, or candidates who didn't follow simple directions. I called it our version of a take-home FizzBuzz.
Were the other numbers in your funnel similar as well (phone screen pass, on-site pass, offer accepted)? I ask because the numbers in the article look remarkably similar to approximate numbers I've seen, or been told about, at other companies.
So by some arguments you could say Firebase was 2.5x as selective (40 offers vs 100). With a funnel like this, even small changes to the percentages end up having a larger overall effect.
Unfortunately, we don't have the Applications number from the blog post, though he says "we considered a great deal more applicants than that [1000] on paper." I suppose a "great deal" could be anywhere from double to 10x...
What it looks like to me is Firebase put more emphasis on the technical test. If you keep your exact numbers, except change the test pass rate to 25%, then you come out with 62-63 offers, which, by the argument you reference, would mean Firebase was 56% more selective.
That makes sense to me, because a smaller company needs to filter out as many people who couldn't possibly get hired at earlier stages, since the later stages are even more time intensive than code reviewing the technical test.
The ROI on LeetCode hard questions is probably low. Find someone who can solve LeetCode easy and has some sense of design patterns (so you know they can think big picture too), and you've got yourself a good candidate.