Hacker News

Consider this use case: in addition to your web app, you have a reporting service that generates heavy-duty reports. If you run one at a bad time, bad things might happen, like users not being able to log in or do any other important work, because the database is busy with the reports.

So in a traditional DB you might have a DBA set up a reporting database so the operational one is not affected. With Datomic, the reporting service gets a Datomic peer that holds a copy of the DB in memory, without any extra DBA work and without affecting any web services. This also works nicely for batch jobs, or in any situation where you don't want different services to affect each other's performance.

It's true that a lot more memory gets used, but memory is relatively cheap; the biggest cost when hosting in the cloud is usually the vCPUs. And in a Clojure/Datomic web application you usually don't need to put cache services like Redis in front of your DB.

The assumption here is that the usual bottleneck for most information systems and business applications is reading and querying data.
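To make the peer model above concrete, here is a minimal sketch in Python (this is not the Datomic API; the class names and fact shapes are made up for illustration). The point is that each service syncs an in-process copy of the data and queries it locally, so a heavy reporting peer cannot slow down the web app's peer:

```python
class Storage:
    """Shared durable log of facts (stand-in for Datomic's storage)."""
    def __init__(self):
        self.log = []  # append-only list of (entity, attribute, value)

class Transactor:
    """Single writer: serializes all writes into storage."""
    def __init__(self, storage):
        self.storage = storage

    def transact(self, facts):
        self.storage.log.extend(facts)

class Peer:
    """In-process reader: syncs the log, then queries its local copy."""
    def __init__(self, storage):
        self.storage = storage
        self.cache = []
        self.offset = 0

    def sync(self):
        # Pull only the facts appended since the last sync.
        self.cache.extend(self.storage.log[self.offset:])
        self.offset = len(self.storage.log)

    def query(self, attr):
        # Runs entirely against local memory: no round trip,
        # no load on the transactor or on other peers.
        return [(e, v) for (e, a, v) in self.cache if a == attr]

storage = Storage()
tx = Transactor(storage)
tx.transact([(1, "user/name", "alice"), (2, "user/name", "bob")])

web_peer = Peer(storage)      # serves the web app
report_peer = Peer(storage)   # serves heavy reports, independently
web_peer.sync()
report_peer.sync()
print(report_peer.query("user/name"))  # [(1, 'alice'), (2, 'bob')]
```

Real Datomic peers stream a transaction log and index segments rather than a Python list, but the isolation property is the same: the reporting peer's queries burn its own CPU and memory, not the database's.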



I appreciated this insight into other people's use cases, thank you for that! This architecture brings RethinkDB to mind, which also had some ability to run your client as a cluster node that you alone get to query. (Although there it was more about receiving the live feed than about caching a local working set.)


> which also had some ability to run your client as a cluster node that you alone get to query

FoundationDB does this as well.


Do you have a pointer to some doc explaining how to do that in FoundationDB?


https://blog.the-pans.com/notes-on-the-foundationdb-paper/ is one:

> Client (notice not Proxy) caches uncommitted writes to support read-uncommitted-writes in the same transaction. This type of read-repair is only feasible for a simple k/v data model. Anything slightly more complicated, e.g. a graph data model, would introduce a significant amount of complexity. Caching is done on the client, so read queries can bypass the entire transaction system. Reads can be served either locally from client cache or from storage nodes.


Is RethinkDB still around?

They actually have recent commits, and a release last year.


The company is gone, but the open source project lives on. We still use it in production.


How's it been for production?

Would you recommend using it, or would it be better to go with a safer option?


RethinkDB user here. I've been running it in production for the last 8 years or so. It works. It doesn't lose data. It doesn't corrupt data (unlike many distributed databases; read the Jepsen reports).

I am worried about it being unmaintained. I do have some issues that are more smells than anything else — like things becoming slower after an uptime of around three weeks (I now reboot my systems every 14 days). I could also do with improved performance.

I'm disappointed that the Winds of Internet Fashion haven't been kind to RethinkDB. It was always a much better proposition than, say, MongoDB, but got less publicity and became somewhat marginalized. I guess correct and working are not that important.

I'm slowly rebuilding my application to use FoundationDB. It lets me implement changefeeds, it is a correct distributed database with fantastic guarantees (you get strict serializability in a fully distributed DB!), and it lets me avoid the unneeded serialization to/from JSON as a transport format.


We've never had any issue with it on a typical three-node install in Kubernetes. It requires essentially no ongoing management. That said, it can't be ignored that the company went under and now it's in community maintenance mode. If you don't have a specific good use for Rethink's changefeed functionality, which sets it apart from the alternatives, I'm not sure I could recommend it for new development. We've chosen not to rip it out, but we're not developing new things on it.


Interesting thank you!

I remember back when it came out, it was a big deal that it could easily scale master-master nodes, and its object format was a big thing because of Mongo's popularity at the time.

That was back before Kubernetes was a thing, and most other databases didn't have good failover stories yet. I'm too scared to use it now because, while it does have a community, it's nowhere near as active as the Postgres community and others.


With AWS you can create a replica with a few clicks, and replication latency is measured in milliseconds for three-nines of observed usage.

I don’t think this is a huge challenge (anymore) for Postgres or whatever traditional database.


That is a read replica; Datomic peers can do writes as well, which further expands the possible use cases.


A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)


ah okay. in that case add in a couple hours to refactor those N+1 queries and you’re all good
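For anyone unfamiliar with the term, here is a small sketch of the refactor being suggested, using Python's sqlite3 (table names and data are made up). The N+1 version pays one round trip per item; the batched version fetches everything in one query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(100)])
ids = list(range(100))

# N+1 style: one query (and one latency hit) per item in the list.
names_slow = [conn.execute("SELECT name FROM users WHERE id = ?",
                           (i,)).fetchone()[0]
              for i in ids]

# Batched: a single query with an IN clause covers the whole list.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders})",
    ids).fetchall()
names_fast = [name for _id, name in sorted(rows)]

assert names_slow == names_fast
```

Against an in-memory SQLite file both run fast, but over a network each per-item query in the first version adds a full round trip, which is exactly the "few milliseconds per query adds up" problem from the parent comment.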


> A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)

If you're doing this, then you need to stop :)


when do you ever have to do one query per array of items? genuinely curious


I suppose I think of this the other way around. When the query engine is inside your app, it doesn't need to do loop-like things, so you can have a much simpler query language and mix it with plain code, somewhat similar to how you don't need a templating language in React (or in Clojure when you use Hiccup to represent HTML).

Additionally, this laziness means your business logic can dynamically choose which data to pull in based on the results of earlier queries, and you end up running fewer queries because you never need to pull extra data in one huge query just in case the business logic might need it.
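A toy Python sketch of that idea (the data shapes and threshold here are made up, and in-memory dicts stand in for locally cached data): queries are plain code, and the branch decides lazily whether the extra lookup happens at all.

```python
# Locally cached working set: with a peer-style architecture this data
# is already in process memory, so "querying" it is just plain code.
orders = [
    {"id": 1, "user": "alice", "total": 40,  "items": [101, 102]},
    {"id": 2, "user": "bob",   "total": 900, "items": [103]},
]
item_names = {101: "book", 102: "pen", 103: "laptop"}

def order_summary(order):
    summary = {"id": order["id"], "total": order["total"]}
    # Business logic decides lazily: only large orders need their line
    # items resolved, so small orders never trigger the extra lookup.
    # No "just in case" mega-query pulling everything up front.
    if order["total"] > 500:
        summary["items"] = [item_names[i] for i in order["items"]]
    return summary

summaries = [order_summary(o) for o in orders]
```

With a remote database the same branch would either mean an N+1 query pattern or a big eager join; with local data it is just a conditional.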



