Hacker News

Consider this use case: in addition to your web app, you have a reporting service that generates heavy-duty reports. If you run one at a bad time, bad things might happen, like users not being able to log in or do any other important work, because the database is busy with the reports.

So in a traditional DB you might have a DBA set up a reporting database so the operational one is not affected. With Datomic, the reporting service gets a Datomic peer that holds a copy of the DB in memory, without any extra DBA work and without affecting any web services. This also works nicely for batch jobs, or in any situation where you don't want different services to affect each other's performance.

It's true that a lot more memory gets used, but memory is relatively cheap; the biggest cost when hosting in the cloud is usually the vCPUs. And in a Clojure/Datomic web application you usually don't need to put cache services like Redis in front of your DB.

The assumption here is that the usual bottleneck for most information systems and business applications is reading and querying data.
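To make the peer model above concrete, here is a minimal sketch in Python (this is not the Datomic API; the class names and fact shapes are made up for illustration). The point is that each service syncs an in-process copy of the data and queries it locally, so a heavy reporting peer cannot slow down the web app's peer:

```python
class Storage:
    """Shared durable log of facts (stand-in for Datomic's storage)."""
    def __init__(self):
        self.log = []  # append-only list of (entity, attribute, value)

class Transactor:
    """Single writer: serializes all writes into storage."""
    def __init__(self, storage):
        self.storage = storage

    def transact(self, facts):
        self.storage.log.extend(facts)

class Peer:
    """In-process reader: syncs the log, then queries its local copy."""
    def __init__(self, storage):
        self.storage = storage
        self.cache = []
        self.offset = 0

    def sync(self):
        # Pull only the facts appended since the last sync.
        self.cache.extend(self.storage.log[self.offset:])
        self.offset = len(self.storage.log)

    def query(self, attr):
        # Runs entirely against local memory: no round trip,
        # no load on the transactor or on other peers.
        return [(e, v) for (e, a, v) in self.cache if a == attr]

storage = Storage()
tx = Transactor(storage)
tx.transact([(1, "user/name", "alice"), (2, "user/name", "bob")])

web_peer = Peer(storage)      # serves the web app
report_peer = Peer(storage)   # serves heavy reports, independently
web_peer.sync()
report_peer.sync()
print(report_peer.query("user/name"))  # [(1, 'alice'), (2, 'bob')]
```

Real Datomic peers stream a transaction log and index segments rather than a Python list, but the isolation property is the same: the reporting peer's queries burn its own CPU and memory, not the database's.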



I appreciated this insight into other people's use cases, thank you for that! This architecture brings RethinkDB to mind, which also had some ability to run your client as a cluster node that you alone get to query. (Although there it was more about receiving the live feed than about caching a local working set.)


> which also had some ability to run your client as a cluster node that you alone get to query

FoundationDB does this as well.


Do you have a pointer to some doc explaining how to do that in FoundationDB?


https://blog.the-pans.com/notes-on-the-foundationdb-paper/ is one:

> Client (notice not Proxy) caches uncommitted writes to support read-uncommitted-writes in the same transaction. This type of read-repair is only feasible for a simple k/v data model. Anything slightly more complicated, e.g. a graph data model, would introduce a significant amount of complexity. Caching is done on the client, so read queries can bypass the entire transaction system. Reads can be served either locally from client cache or from storage nodes.


Is RethinkDB still around?

They actually have recent commits, and a release last year.


The company is gone, but the open source project lives on. We still use it in production.


How's it been for production?

Would you recommend using it, or would it be better to go with a safer option?


RethinkDB user here. I've been running it in production for the last 8 years or so. It works. It doesn't lose data. It doesn't corrupt data (unlike many distributed databases; read the Jepsen reports).

I am worried about it being unmaintained. I do have some issues that are more smells than anything else — like things becoming slower after an uptime of around three weeks (I now reboot my systems every 14 days). I could also do with improved performance.

I'm disappointed that the Winds of Internet Fashion haven't been kind to RethinkDB. It was always a much better proposition than, say, MongoDB, but got less publicity and became somewhat marginalized. I guess correct and working are not that important.

I'm slowly rebuilding my application to use FoundationDB. It lets me implement changefeeds, it is a correct distributed database with fantastic guarantees (you get strict serializability in a fully distributed DB!), and it lets me avoid the unneeded serialization to/from JSON as a transport format.


We've never had any issue with it on a typical three-node install in Kubernetes. It requires essentially no ongoing management. That said, it can't be ignored that the company went under and now it's in community maintenance mode. If you don't have a specific good use for Rethink's changefeed functionality, which sets it apart from the alternatives, I'm not sure I could recommend it for new development. We've chosen not to rip it out, but we're not developing new things on it.


Interesting thank you!

I remember back when it came out, it was a big deal that it could easily scale master-master nodes, and its object format was a big thing because of Mongo's popularity at the time.

That was back before Kubernetes was a thing, and most other databases didn't have good failover stories yet. I'm too scared to use it now because, while it does have a community, it's nowhere near as active as the Postgres community and others.


With AWS you can create a replica with a few clicks, and replication latency is measured in milliseconds for three-nines of observed usage.

I don’t think this is a huge challenge (anymore) for Postgres or whatever traditional database.


That is a read replica; Datomic peers can do writes as well, which further expands the possible use cases.


A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)


ah okay. in that case add in a couple hours to refactor those N+1 queries and you’re all good
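For anyone unfamiliar with the term, here is a small sketch of the refactor being suggested, using Python's sqlite3 (table names and data are made up). The N+1 version pays one round trip per item; the batched version fetches everything in one query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(100)])
ids = list(range(100))

# N+1 style: one query (and one latency hit) per item in the list.
names_slow = [conn.execute("SELECT name FROM users WHERE id = ?",
                           (i,)).fetchone()[0]
              for i in ids]

# Batched: a single query with an IN clause covers the whole list.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders})",
    ids).fetchall()
names_fast = [name for _id, name in sorted(rows)]

assert names_slow == names_fast
```

Against an in-memory SQLite file both run fast, but over a network each per-item query in the first version adds a full round trip, which is exactly the "few milliseconds per query adds up" problem from the parent comment.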


> A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)

If you're doing this, then you need to stop :)


when do you ever have to do one query per array of items? genuinely curious


I suppose I think of this the other way around. When the query engine is inside your app, it doesn't need to do loop-like things, so you can have a much simpler query language and mix it with plain code, somewhat similar to how you don't need a templating language in React (or in Clojure when you use Hiccup to represent HTML).

Additionally, this laziness means your business logic can dynamically choose which data to pull in based on the results of earlier queries, and you end up running fewer queries because you never need to pull extra data in one huge query just in case the business logic might need it.
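A toy Python sketch of that idea (the data shapes and threshold here are made up, and in-memory dicts stand in for locally cached data): queries are plain code, and the branch decides lazily whether the extra lookup happens at all.

```python
# Locally cached working set: with a peer-style architecture this data
# is already in process memory, so "querying" it is just plain code.
orders = [
    {"id": 1, "user": "alice", "total": 40,  "items": [101, 102]},
    {"id": 2, "user": "bob",   "total": 900, "items": [103]},
]
item_names = {101: "book", 102: "pen", 103: "laptop"}

def order_summary(order):
    summary = {"id": order["id"], "total": order["total"]}
    # Business logic decides lazily: only large orders need their line
    # items resolved, so small orders never trigger the extra lookup.
    # No "just in case" mega-query pulling everything up front.
    if order["total"] > 500:
        summary["items"] = [item_names[i] for i in order["items"]]
    return summary

summaries = [order_summary(o) for o in orders]
```

With a remote database the same branch would either mean an N+1 query pattern or a big eager join; with local data it is just a conditional.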



