
I wonder if these kinds of things happen within the Cardano ecosystem, given their focus on peer review.


As someone who spent many years in academic computer science, I have zero confidence in the peer review system. Zero.

Hoskinson talking up peer review for Cardano makes him look, to me personally, like more of a fraud.

My micro-specialty was worse than some others, though. The quality of peer review is not a constant across all of computer science.


Computer Science has a large reproducibility problem.


Isn't Computer Science one of the easiest things to reproduce? It's all... computation and math?!


Sure; but lots of papers don't publish their raw data or their code.

Or they do publish their code, but it's full of obvious problems. Abstract: "Algorithm X (ours) outperforms algorithm Y". Then you go read the code, and they have a sloppy implementation of algorithm Y which is missing all the obvious optimizations. Actually, they often have sloppy implementations of both algorithms X and Y, because they're a student and they've never really optimized anything before.

So the conclusion of the paper is that some sloppily written code that you can't inspect outperforms some other sloppily written code that you also can't inspect. Wow.
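
To make that failure mode concrete, here is a toy sketch (purely illustrative; neither function comes from any real paper): the same "baseline" computation written carelessly, and then written the way a practitioner actually would. Any headline speedup claim depends entirely on which version you compared against.

    import timeit
    import numpy as np

    # Hypothetical illustration; the functions and numbers are made up.

    def baseline_sum_naive(xs):
        # "Baseline" implemented carelessly: Python-level loop over a NumPy array.
        total = 0.0
        for x in xs:
            total += x
        return total

    def baseline_sum_tuned(xs):
        # Same algorithm, implemented the way a practitioner actually would.
        return np.sum(xs)

    xs = np.random.rand(1_000_000)
    naive = timeit.timeit(lambda: baseline_sum_naive(xs), number=10)
    tuned = timeit.timeit(lambda: baseline_sum_tuned(xs), number=10)
    print(f"naive baseline: {naive:.3f}s, tuned baseline: {tuned:.3f}s")
    # "Our algorithm beats the baseline" means very little if the paper only
    # ever benchmarked against the naive version.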


A somewhat fun anecdote about that:

For my thesis at uni I was parallelizing a genetic analysis algorithm that a team had published a paper about. I ended up with clean code and a good speedup, but my results were completely off from what could be found in the initial paper. After weeks of pulling my hair out, I looked into the actual code the paper was based on and found that at some point they had hardcoded a constant in such a way that there would always be a division by zero somewhere down the line, which messed with their results. I've been very sceptical of "peer-reviewed" CS stuff ever since.
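
The nasty part of that failure mode is that, in numerical code, a division by zero often doesn't crash anything. Here is a purely hypothetical sketch (the constant, function, and data are invented, not taken from that paper) of how a hardcoded constant can leave empty bins and silently turn downstream statistics into NaNs:

    import numpy as np

    NUM_BINS = 8  # hardcoded constant, "tuned" once and never revisited

    def bin_averages(values):
        values = np.asarray(values, dtype=float)
        counts = np.zeros(NUM_BINS)
        sums = np.zeros(NUM_BINS)
        # Assign each value to a bin; with this hardcoded NUM_BINS, some bins
        # can stay empty depending on the input range.
        bins = np.minimum((values // 10).astype(int), NUM_BINS - 1)
        for b, v in zip(bins, values):
            counts[b] += 1
            sums[b] += v
        # Empty bins give 0/0 here. NumPy emits only a RuntimeWarning and
        # produces NaNs, so the program finishes and reports plausible-looking
        # but wrong numbers instead of crashing.
        return sums / counts

    print(bin_averages(range(25)))  # bins 3..7 are empty -> NaN averages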


Even I am guilty of this. There is basically no incentive for anyone to properly cross-verify all these papers. At best, the reviewers will check for egregious errors in the calculations.


And awfully fragile pseudo-code, and hard-to-reproduce base data, and ridiculously extensive hyperparameter searches, and specialty hardware, and proprietary libraries, and magic circumstances that a grad student with horrible sleep deprivation happened upon and now couldn't reproduce at gunpoint...


I am that grad student. Lol


Having read my fair share of papers, I've noticed a few large areas which call into question the reproducibility of a worrying percentage of papers, to the point that I view CS as not much better than the social sciences:

- Papers that can't publish their code due to NDAs (thankfully, reputable journals aren't accepting this with the same frequency they used to, but there's still a large body of existing papers published in reputable journals that fall into this category)

- Papers whose datasets are not disclosed despite being essential, since you can't just go and generate your own data unless you have ultra-specialized equipment lying around (typically this is also due to NDAs)

- Papers that rely on specific software versions in specific configurations and will break after the paper is published and the authors stop maintaining it ("oh btw this only runs on kernel 3.6" - security conferences are still rife with this stuff)

- Papers that rely on not just specific software versions but things like environment size and link order for their benchmarks (this was called out specifically in [0])

In theory, this should all be easy to reproduce, but in practice, it rarely is. Especially since many papers that do have code and datasets available upon request will have said code and data lost to time as professors move around, files get routinely purged, and the project is forgotten about.
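
One cheap mitigation for the "specific software versions in specific configurations" problem is to snapshot the environment next to the results. A minimal sketch (the filename and the choice of what to record are assumptions on my part, not any standard):

    import json
    import platform
    import sys
    from importlib import metadata

    # Record the exact interpreter, OS, and installed package versions so the
    # configuration the results depend on is at least written down somewhere.
    env = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {d.metadata["Name"]: d.version for d in metadata.distributions()},
    }

    with open("environment_snapshot.json", "w") as f:
        json.dump(env, f, indent=2, sort_keys=True)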

It sucks because there have been some interesting papers I've read that I really wanted to check out the code for, only to find out that everything was lost when the fileserver went belly-up (and nobody had been testing the backups) or when the first author left and IT automatically purged the home folder it was stored in.

[0]: https://users.cs.northwestern.edu/~robby/courses/322-2013-sp...


They don’t share the random number generator seed :)
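
In the same spirit, a minimal sketch of what actually sharing the seed looks like (the seed value and the particular libraries here are assumptions; a real project would also need to pin versions and any framework-specific RNGs):

    import random
    import numpy as np

    SEED = 42  # the number that should be reported alongside the results

    random.seed(SEED)
    np.random.seed(SEED)

    sample = np.random.rand(3)
    print(sample)  # identical on every run, so a reviewer can re-derive the figures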



