
This is a bit of a tangent, but can anyone provide an opinion on which of the static-typing big 3 (Haskell, SML, OCaml) is most suited to scientific computing?


It's going to depend a lot on what you're trying to do and what your personality is like.

OCaml is probably the most pragmatic of the three, but the parallelism and concurrency story is pretty weak. There is a book, OCaml for Scientists, which is a tutorial intended for the scientific audience, though. OCaml is also heavily used in finance, and it compiles quite fast.

Haskell is the hardest of the three to learn and get up to speed with. If you have a heavy mathematical bent, it may make the most sense. Parallelism and concurrency are simple and easy, but I don't know if you can do MPI or OpenMP style supercomputing; it really seems optimized for desktop/server processors.
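To give a feel for the "simple and easy" claim, here is a minimal sketch using only `forkIO` and `MVar` from base. The names `sumSquares` and `parSumSquares` are made up for illustration, and the sum of squares is just a stand-in for a real numeric kernel; this is one way to split work across threads, not the only or the recommended one.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.List (foldl')

-- Hypothetical kernel: sum of squares over an inclusive range.
sumSquares :: Int -> Int -> Int
sumSquares lo hi = foldl' (\acc x -> acc + x * x) 0 [lo .. hi]

-- Split [1..n] into two halves, evaluate each half in its own
-- lightweight thread, then combine the two partial results.
parSumSquares :: Int -> IO Int
parSumSquares n = do
  let mid = n `div` 2
  left  <- newEmptyMVar
  right <- newEmptyMVar
  -- ($!) forces each partial sum inside the worker thread, so the
  -- work is not deferred to whoever reads the MVar.
  _ <- forkIO (putMVar left  $! sumSquares 1 mid)
  _ <- forkIO (putMVar right $! sumSquares (mid + 1) n)
  (+) <$> takeMVar left <*> takeMVar right

main :: IO ()
main = parSumSquares 1000 >>= print
```

With GHC, the two threads only run on separate cores if the program is built with `-threaded` and run with `+RTS -N`; without that it still works, just on one core.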

I love SML for its simplicity, clarity and power, but it is essentially a dead language at this point, with most of its userbase having migrated to Haskell or OCaml. I don't know if Concurrent ML supports true SMP, but that would leave you limited to certain implementations. MLton is fantastic, but it comes at a price of very slow compiles with no separate compilation.

If I were in your shoes, I'd look at some sample code in OCaml and Haskell and see which one looks more reasonable, and then write a small program in each and see which one you like the feel of more. You're probably going to run into a sticky spot at some point in your project whichever one you choose; the more you like the option you have, the more likely you are to stick it out.


That depends on the kind of scientific computing you want to do.

Haskell has the repa library [1] which is very nice for working with (multi-dimensional) arrays at a high level. Performance is decent (I don't know if they have a BLAS/LAPACK binding). Overall, the main advantage of Haskell is its runtime system and its great support for concurrency. The downside is, it does not have OpenMP and the MPI bindings don't look very nice to use (I don't know how OCaml or SML fare in this area). There are OpenCL bindings, but I've never used them. Data parallel Haskell is still under heavy development, so that's probably going to take a few years to become production-ready.

OCaml's advantage is that C-like algorithms are easier to transcribe and use (no monads). OCaml's main disadvantage is that its runtime doesn't support multicore well (or even at all?). If you want that you can use F#, though.

I don't know anything about the current state of SML implementations.

[1]: http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_T...


Answering a sub-question of yours: indeed, the only way to use more than one core in OCaml is to use multiple processes. If a lot of data needs to be exchanged between them, it may not be very fast.

That said, there is this patched version (funded by a one-off summer of code by Jane Street, I think)

http://www.algo-prog.info/ocmc/

that gives an API for using threads. I am fairly new to OCaml, so I will not be able to provide details. Another language that I am looking at is Felix

http://felix-lang.org:8080/ (Note the port; it's not the one that the search engines will give you).

I am OK with OCaml not giving its users a threading API, but a runtime that executes many of its higher-order functions in parallel would be really nice. Well, higher-order functions and the other parallelism exposed by the functional semantics, with some helpful directives from the user, of course.


There have been a lot of projects, of which ocamlnet/netmulticore and Jane St's async are (I think, but not very confidently) the only current ones. Others are:

poly/ML, ocamlP3, OC4MC, functory, JoCaml

coThreads, LWT

http://www.reddit.com/r/programming/comments/q9cro/real_worl...

http://stackoverflow.com/questions/6588500/what-is-the-state...


There's also hmatrix for a pretty nice Haskell API over BLAS. The one current caveat is that because certain vector code currently uses GSL under the covers, the core hmatrix lib has to be GPL in turn (as GSL is GPL). That said, there's some work underway (by me and some others) to replace the offending pieces of code with some under BSD or another permissive license, so that core hmatrix can be rereleased as a BSD-licensed lib and thus see broader use.


You might be interested in an HMM library for Haskell that I just finished the first version of. I wrote a tutorial using it for finding genes in strings of DNA: http://izbicki.me/blog/using-hmms-in-haskell-for-bioinformat...


I'm pretty sure Haskell is the fastest of the three, if that's one of your concerns.


Actually, I would probably bet that SML with MLton is the fastest (in some benchmarks) by a fairly large margin, but I can't find any up to date benchmarks.

Haskell is usually about 1-10x slower than C for the same task, while MLton has actually outperformed C in some benchmarks.

Edit: My information is out of date -- dons is almost certainly right on this one. I'd go with his estimation.


MLton has been mostly unmaintained for 5+ years - GHC has a performance consortium improving it ( llvm, new code gen, new register allocator, parallel GC)... I'd expect it to be a tie now for sequential code.


I see a 2 year old llvm branch for MLton

http://mlton.org/cgi-bin/viewsvn.cgi/mlton/branches/llvm/#di...

but have no idea how functional (aargh! no not that functional) it is. There seems to be some activity on the trunk in the last 7~8 months, but possibly maintenance edits. Just putting it out there if anyone wants to poke. It would be sad for MLton to bit rot.


There's always this (#include standard disclaimer that some languages' submissions have been heavily optimized, others have not, and none of them may work like your apps). Languages with modern type systems do pretty well: ATS, Scala, GHC, Clean, OCaml, even F# under Mono.

http://shootout.alioth.debian.org/u32/which-programming-lang...


Unfortunately, as dons noted, MLton (SML) is out of date and no longer appears in that list.


You could use Haskell for prototyping your idea and call a, hopefully faster, C implementation for portions of your code. Having said that, you should always use a profiler to determine what portions of your code can potentially benefit from going lower level.

If you do a lot of matrix computations, for example, you can always call a fast Fortran library like BLAS or LAPACK through a C layer.
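As a rough illustration of the "call C from Haskell" route, here is a minimal sketch using GHC's foreign function interface. It binds `cos` from the C math library only because that needs no extra setup; a real BLAS or LAPACK binding would follow the same pattern but with pointer arguments and a library to link against. The wrapper name `cosViaC` is made up for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- Bind the C library's cos. It is declared pure because it has no
-- side effects, and "unsafe" because it never calls back into Haskell.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

-- Hypothetical wrapper that hides the C types from the rest of the program.
cosViaC :: Double -> Double
cosViaC = realToFrac . c_cos . realToFrac

main :: IO ()
main = print (map cosViaC [0, pi])
```

The same profiling advice applies here: the wrapper keeps the C call behind an ordinary Haskell function, so you can start with a pure Haskell version and swap in the foreign one only where the profiler says it matters.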



