Consider this use case: in addition to your web app, you have a reporting service that builds heavy-duty reports. If you run one at a bad time, bad things can happen, like users being unable to log in or do any other important work, because the database is busy with the reports.
With a traditional DB you might have a DBA set up a separate reporting database so the operational one is not affected. With Datomic, the reporting service gets a Datomic peer that holds its own copy of the DB in memory, without any extra DBA work and without affecting any web services. This also works nicely for batch jobs, or in any situation where you don't want different services to affect each other's performance.
It's true that a lot more memory gets used, but memory is relatively cheap; the biggest cost when hosting in the cloud is usually the vCPUs. On the other hand, in a Clojure/Datomic web application you usually don't need to put cache services like Redis in front of your DB.
The assumption here is that the usual bottleneck for most information systems and business applications is reading and querying data.
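To make that concrete, here's a minimal sketch of the reporting side, assuming the on-prem Datomic peer API; the connection URI and the :order/* attributes are invented for illustration:

    (require '[datomic.api :as d])

    ;; The reporting service is just another peer on the same storage;
    ;; no separate reporting database has to be provisioned.
    (def conn (d/connect "datomic:dev://localhost:4334/app-db")) ; hypothetical URI

    ;; `db` is an immutable snapshot. The heavy aggregation below runs
    ;; inside this peer's own process and cache, so it doesn't compete
    ;; with the web app's peers for query capacity.
    (def db (d/db conn))

    (def revenue-by-customer
      (d/q '[:find ?customer (sum ?amount)
             :with ?order
             :where
             [?order :order/customer ?customer]
             [?order :order/amount ?amount]]
           db))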
I appreciated this insight into other people's use cases, thank you for that! This architecture brings RethinkDB to mind, which also had some ability to run your client as a cluster node that you alone get to query. (Although there it was more about receiving the live feed than about caching a local working set.)
> Client (notice not Proxy) caches uncommitted writes to support read-uncommitted-writes in the same transaction. This type of read-repair is only feasible for a simple k/v data model. Anything slightly more complicated, e.g. a graph data model, would introduce a significant amount of complexity. Caching is done on the client, so read queries can bypass the entire transaction system. Reads can be served either locally from client cache or from storage nodes.
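For a k/v model, that read-uncommitted-writes check really is just a map lookup in front of storage. A toy sketch in Clojure (the names are mine, not from any particular client library):

    ;; Pending writes live in a client-side map that shadows storage
    ;; until commit.
    (defn begin-tx [] (atom {}))

    (defn tx-write! [tx k v]
      (swap! tx assoc k v))

    (defn tx-read
      "Read `k`, preferring this transaction's own uncommitted write;
      fall back to `storage-get` (a function of k) otherwise."
      [tx storage-get k]
      (if-let [entry (find @tx k)]
        (val entry)
        (storage-get k)))

For a graph model the same trick would mean merging pending writes into every index scan and traversal, which is where the extra complexity comes from.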
RethinkDB user here. I've been running it in production for the last 8 years or so. It works. It doesn't lose data. It doesn't corrupt data (like most distributed databases do, read the Jepsen reports).
I am worried about it being unmaintained. I do have some issues that are more smells than anything else — like things becoming slower after an uptime of around three weeks (I now reboot my systems every 14 days). I could also do with improved performance.
I'm disappointed that the Winds of Internet Fashion haven't been kind to RethinkDB. It was always a much better proposition than, say, MongoDB, but got less publicity and sort of became marginalized. I guess correct and working are not that important.
I'm slowly rebuilding my application to use FoundationDB. It lets me implement changefeeds, it's a correct distributed database with fantastic guarantees (you get strict serializability in a fully distributed DB!), and it lets me avoid the unneeded serialization to/from JSON as a transport format.
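As a sketch of the changefeed part, assuming FoundationDB's Java bindings used from Clojure via interop (watch-key and the API version are my choices, not an established recipe):

    (import '[com.apple.foundationdb FDB Database Transaction]
            '[java.util.function Function])

    (def fdb (FDB/selectAPIVersion 710))

    (defn watch-key
      "Return a CompletableFuture that completes the next time the key
      `k` (a byte array) is modified."
      [^Database db ^bytes k]
      ;; Clojure fns don't implement java.util.function.Function,
      ;; hence the reify.
      (.run db (reify Function
                 (apply [_ tr]
                   (.watch ^Transaction tr k)))))

A changefeed loop would then read the value, arm a watch, block on the future, and repeat; FDB watches only signal that a key changed, so the new value has to be re-read in a fresh transaction.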
We've never had any issue with it on a typical three-node install in Kubernetes. It requires essentially no ongoing management. That said, it can't be ignored that the company went under and now it's in community maintenance mode. If you don't have a specific good use for Rethink's changefeed functionality, which sets it apart from the alternatives, I'm not sure I could recommend it for new development. We've chosen not to rip it out, but we're not developing new things on it.
I remember when it came out: it was a big deal that it could easily scale master-master across nodes, and the object format was a big thing because of Mongo's popularity at the time.
That was back before k8s was a thing, and most of the failover tooling for other databases didn't exist yet. I'm too scared to use it now because, while it has a community, it's obviously nowhere near as active as the Postgres and other communities.
I suppose I think of this the other way around. When the query engine is inside your app, the query engine doesn't need to do loop-like things, so you can have a much simpler query language and mix it with plain code, kind of similar to how you don't need a templating language in React (and in Clojure when you use hiccup to represent HTML).
Additionally, this laziness means your business logic can dynamically choose which data to pull in based on the results of earlier queries, and you end up running fewer queries, since you never have to pull in extra data in one huge query just in case the business logic needs it.
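A sketch of that mixing, assuming the Datomic peer API with invented :order/* and :customer/* attributes (`db` and `order-id` are presumed already bound):

    (require '[datomic.api :as d])

    (let [order (d/entity db order-id)]
      (if (> (:order/amount order) 10000)
        ;; plain `if` decides, at runtime, whether the expensive
        ;; customer history gets pulled at all
        (d/pull db
                '[:customer/name {:customer/orders [:order/amount]}]
                (:db/id (:order/customer order)))
        ;; small orders never touch the customer data
        {:order/amount (:order/amount order)
         :order/status (:order/status order)}))

The `if` here is ordinary Clojure, not a conditional inside the query language; the lazy entity navigation only reads the attributes the branch actually touches.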