ZFS encryption is much more space efficient than dmcrypt+unencrypted ZFS when combined with zstd compression. This is because it can do compress-then-encrypt instead of encrypt-then-(not-really-)compress. It is also much much faster.
Source: I work for a backup company that uses ZFS a lot.
Can you explain this in more detail? It doesn't seem true at first glance.
If you enable compression on ZFS that runs on top of dmcrypt volume, it will naturally happen before encryption (since dmcrypt is the lower layer). It's also unclear how it could be much faster, since dmcrypt generally is bottlenecked on AES-NI computation (https://blog.cloudflare.com/speeding-up-linux-disk-encryptio...), which ZFS has to do too.
Disclaimer: I am the ntfy maintainer. Pleasantly surprised to be mentioned, hehe.
Pushover is an amazing tool and works well. In my obviously biased opinion though, I think that ntfy has a ton more features than Pushover and is fully open source. You can self host all aspects of it or you can use the hosted version on ntfy.sh for free, without signups, or pay for higher limits.
I love hearing that. Anything worth sharing? I always enjoy hearing how people use it. My favorite is the guy protecting his apple tree from thieves: he added a camera and a motion sensor and sends himself a notification with the picture to catch the apple thief.
I have a few things. One is home security, I get basic notifications when something is in the driveway. I'm working on getting Frigate running to hopefully give me the names or license plates of people when they arrive.
I also have one tied to a manufacturing database at my company. When a batch of products rolls off the line I get an updated count of units made. Kind of a way to know production systems are running and there are no problems at the work cells.
I also made a rickety-ass system that scrapes the local commuter rail API and fires off a notification when one of my trains is late or cancelled. That's been pretty helpful. The rail company has a Twitter account, but I don't go there anymore. So I rolled my own.
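For anyone who wants to roll something similar: publishing to ntfy is just an HTTP POST to a topic URL, so the alerting part boils down to something like this (a Go sketch; the topic name and message are made up):

    package main

    import (
        "log"
        "net/http"
        "strings"
    )

    func main() {
        // Publish by POSTing the message body to https://ntfy.sh/<topic>.
        // "my-train-alerts" is a made-up topic; anyone who knows the name
        // can subscribe, so pick something hard to guess.
        req, err := http.NewRequest("POST", "https://ntfy.sh/my-train-alerts",
            strings.NewReader("The 07:42 inbound train is cancelled"))
        if err != nil {
            log.Fatal(err)
        }
        req.Header.Set("Title", "Commuter rail alert")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()
        log.Println("published:", resp.Status)
    }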
Thank you to all the Debian volunteers that make Debian and all its derivatives possible. It's remarkable how many people and businesses have been enabled by your work. Thank you!
On a personal note, Trixie is very exciting for me because my side project, ntfy [1], was packaged [2] and is now included in Trixie. I only learned that it had been included very late in the cycle, when the package maintainer asked for license clarifications. As a result, the Debian-ized version of ntfy doesn't contain the web app (which is a reaaal bummer), and has a few things "patched out" (which is fine). I approached the maintainer and just recently added build tags [3] to make it easier to remove Stripe, Firebase and WebPush, so that the next Debian-ized version will not have to carry (so many) awkward patches.
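Roughly, the build-tag pattern looks like this (a simplified sketch with made-up tag, package and function names, not the exact layout in the ntfy codebase): the Stripe-dependent file is compiled by default, and a stub replaces it when a packager builds with something like "go build -tags nostripe,nofirebase,nowebpush ./...":

    // ---- billing_stripe.go (compiled by default) ----

    //go:build !nostripe

    package payments

    import "fmt"

    // ChargeCustomer would call the Stripe API in the real build.
    func ChargeCustomer(id string, cents int64) error {
        fmt.Printf("charging customer %s %d cents via Stripe\n", id, cents)
        return nil
    }

    // ---- billing_stub.go (compiled with -tags nostripe) ----

    //go:build nostripe

    package payments

    import "errors"

    // ChargeCustomer is the stub used when Stripe support is compiled out.
    func ChargeCustomer(id string, cents int64) error {
        return errors.New("built without Stripe support")
    }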
As an "upstream maintainer", I must say it isn't obvious at all why the web app wasn't included. It was clearly removed on purpose [4], but I don't really know what to do to get it into the next Debian release. Doing an "apt install ntfy" is going to be quite disappointing for most if the web app doesn't work. Any help or guidance is very welcome!
> The webapp is a nodejs app that requires packages that are not currently in debian.
Since vendoring dependencies inside packages is frowned upon in Debian, the maintainer would have needed to add those packages themselves and maintain them. My guess is that they didn't want to take on that effort.
Node.js itself is, but when you install a Node project manually, you type "npm install" and wait while it downloads the 500 different packages it depends on.
Debian follows the same philosophy as for other more traditional languages and expects that all these dependencies are packaged as individual Debian packages.
Just jumping in to say that this is making me genuinely reconsider adopting a licence/policy that forbids repackaging: the fact that someone can repackage my project, but worse, and still use my project's name? Absolutely not. I do not want the burden that inevitably comes when people complain to me that this or that is missing from a repackage.
I mean, that's just how OSS works. Anyone can fork your thing, do whatever, and call it a day. Going MIT or whatever won't save you either - this repackaging business is basically the entire business model of AWS.
Forking is fine, do whatever, but as soon as you make actual changes to the code, adopt your own name. This is what trademarks are for; it's just that official trademark registration is somewhat inaccessible (e.g. cost) for open source projects. Could you imagine trademarking every little project you make just in case it gets repackaged by someone who tears huge chunks of it out?
As it turns out, trademarks for small open-source projects are effectively worthless (https://news.ycombinator.com/item?id=44883634), so there's no real solution to people appropriating your project's name while repackaging their inferior fork of it.
Debian sources need to be sufficient to build. So for npm projects, you usually have a Debian-specific package.json where each npm dependency (transitively, including devDependencies needed for the build) needs to either be replaced with its equivalent Debian package (which may also need to be ported), vendored (usually less ideal, especially for third-party code), or removed. Oh, and enjoy aligning versions for all of that. That's doable but non-trivial work with such a sizable lockfile. If I had to guess, the maintainer couldn't justify the extra effort of combing through all those packages.
I also think in either case the Debian way would probably be to split it out as a complementary ntfy-web package.
> As a result the Debian-ized version of ntfy doesn't contain a web app (which is a reaaal bummer), and has a few things "patched out" (which is fine).
My advice to you is to decline all support requests from people using the Debian version of your software and automatically close all bug tickets from Debian, saying you don't support externally patched software.
You would be far from the first to do so, and it's a completely rational and sane decision. You don't have to engage with the insanity that Debian's own policies force on its maintainers and users.
I agree that maintainers should not be expected to support patched versions of their software, but as a user I like the Debian policies you call insane. I would actually pick Debian exactly because they are cautious with the dependencies.
Debian is not cautious with dependencies. Debian breaks a lot of what they ship, sometimes flagrantly, like removing a whole feature, sometimes insidiously, by introducing new bugs. I don't really care that Debian doesn't view it as breaking things. From my point of view, users trying to get my product get a subpar experience in a way that is far from explicit.
I personally wouldn't use Debian, but people are free to do whatever they want. I don't want to waste my time dealing with Debian maintainers and how they think software should work, however. I advise all software developers to do the same, and I am vocal about it because it's easy to get guilt-tripped into the idea that you should somehow support their users because they want to use your product, or that introducing changes to support their esoteric targets somehow makes sense because they have done the work, despite the burden of future support actually landing on you.
I want to make clear to people who decide they have no interest in it that they are not alone and it's perfectly fine.
And to be clear, I am singling out Debian here because they are by far the worst offender when it comes to patching, but the comment applies equally to any distribution that applies invasive patches.
Debian IS more cautious with dependencies, in that you won't get hidden dependencies that aren't in the repos.
I don't want to install an app that downloads and executes 500 node packages that I don't know what they do. Those packages should already be vetted and in Debian. If not, then I'm not interested.
Sidestepping the distro repos for dependencies of software in the repos leads to unexpected behavior.
> Debian IS more cautious with dependencies, in that you won't get hidden dependencies that aren't in the repos.
For a definition of cautious I don't personally share.
Debian doesn't vet packages. Debian maintainers are less competent than the "upstream" they second-guess approximately all the time, which is why they keep breaking stuff in more or less severe ways (OpenSSL, anyone?). And let's not even talk about the insane stuff, like when maintainers decide to support a fork they like instead of the piece of software users actually want (Libav, anyone?).
> If not, then I'm not interested.
And that's your choice. That doesn't mean developers should care, nor that it is actually a good idea.
Eventually, someone must take source code and build and package the software.
When it's Debian maintainers, one at least knows the rules they are operating by.
For random people on the internet, it's usually more difficult to evaluate, vet them, and trust what they are doing.
Of course, I don't know you personally nor any software package that you are releasing so this is not an observation directed to you.
I can agree that Debian maintainers are generally less competent, but they do actually vet dependencies for conformance with Debian ideology.
Upstream may be developing malware, or they may be adding telemetry or ads. So if we just allow them to install 500 node packages that we don't know what they do... that's suspicious. That's asking for trouble.
Debian keeps tight control of its supply chain. It's not perfect or bug-free, but it is within Debian's goals.
So if you want a free distro with almost completely free sources, then Debian is really one of your only choices.
Thank you for sharing. A curious read. I am looking forward to the next post.
I've been working on backup and disaster recovery software for 10 years. There's a common phrase in our realm that I feel obligated to share, given the nature of this article.
> "Friends don't let friends build their own Backup and Disaster Recovery (BCDR) solution"
Building BCDR is notoriously difficult and has many gotchas. The author hinted at some of them, but maybe let me try to drive some of them home.
- Backup is not disaster recovery: In case of a disaster, you want to be up and running near-instantly. If you cannot get back up and running in a few minutes/hours, your customers will lose trust in you and your business will suffer. Being able to restore a system (file server, database, domain controller) with minimal data loss (<1 hr) is vital for the survival of many businesses. See Recovery Time Objective (RTO) and Recovery Point Objective (RPO).
- Point-in-time backups (crash consistent vs application consistent): A proper backup system should support point-in-time backups. An "rsync copy" of a file system is not a point-in-time backup (unless the system is offline), because the system changes constantly. A point-in-time backup is a backup in which each block/file/.. maps to the same exact timestamp. We typically differentiate between "crash consistent backups", which are similar to pulling the plug on a running computer, and "application consistent backups", which involve asking all important applications to persist their state to disk and freeze operations while the backup is happening. Application consistent backups (which are provided by Microsoft's VSS, as mentioned by the author) significantly reduce the chances of corruption. You should never trust an "rsync copy" or even crash consistent backups.
- Murphy's law is really true for storage media: My parents put their backups on external hard drives, and all of r/DataHoarder seems to buy only 12T HDDs and put them in a RAID0. In my experience, hard drives of all kinds fail all the time (though NVMe SSD > other SSD > HDD), so having backups in multiple places (3-2-1 backup!) is important.
(I have more stuff I wanted to write down, but it's late and the kids will be up early.)
Ha. That quote made me chuckle; it reminded me of a performance by the band Alice in Chains, where a similar quote appeared.
Re: BCDR solutions, they also sell trust among B2B companies. Collectively, these solutions protect billions, if not trillions, of dollars' worth of data, and no CTO in their right mind would ever allow an open-source approach to backup and recovery. This is also largely because backups need to be highly available. Scrolling through a snapshot list is one of the most tedious tasks I've had to do as a sysadmin. Although most of these solutions are bloated and violate userspace like nobody's business, it is ultimately the company's reputation that allows them to sell products. And although I respect Proxmox's attempt at cornering the Broadcom fallout, I could go on at length about why it may not be able to permeate the B2B market, but it boils down to a simple formula (not educational, but rather from years of field experience):
> A company's IT spend grows linearly with valuation up to a threshold, then increases exponentially between a certain range, grows polynomially as the company invests in vendor-neutral and anti-lock-in strategies, though this growth may taper as thoughtful, cost-optimized spending measures are introduced.
- Ransomware Protection: Immutability and WORM (Write Once Read Many) backups are critical components of snapshot-based backup strategies. In my experience, legal issues have arisen from non-compliance in government IT systems. While "ransomware" is often used as a buzzword by BCDR vendors to drive sales, true immutability depends on the resiliency and availability of the data across multiple locations. This is where the 3-2-1 backup strategy truly proves its value.
Would like to hear your thoughts on more backup principles!
> An "rsync copy" of a file system is not a point-in-time backup (unless the system is offline), because the system changes constantly. A point-in-time backup is a backup in which each block/file/.. maps to the same exact timestamp.
You can do this with some extra steps in between. Specifically you need a snapshotting file system like zfs. You run the rsync on the snapshot to get an atomic view of the file system.
Of course if you’re using zfs, you might just want to export the actual snapshot at that point.
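A rough sketch of that flow, assuming a dataset named tank/data mounted at /tank/data and a reachable backup host (all names are placeholders):

    package main

    import (
        "log"
        "os"
        "os/exec"
    )

    func run(name string, args ...string) error {
        cmd := exec.Command(name, args...)
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        return cmd.Run()
    }

    func main() {
        const snap = "tank/data@rsync-backup"
        if err := run("zfs", "snapshot", snap); err != nil {
            log.Fatal(err)
        }
        // Snapshots appear read-only under <mountpoint>/.zfs/snapshot/<name>,
        // so rsync sees a frozen, atomic view of the file system.
        if err := run("rsync", "-a", "--delete",
            "/tank/data/.zfs/snapshot/rsync-backup/",
            "backuphost:/backups/data/"); err != nil {
            log.Println("rsync failed:", err)
        }
        // Remove the temporary snapshot either way.
        if err := run("zfs", "destroy", snap); err != nil {
            log.Fatal(err)
        }
    }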
> having backups in multiple places (3-2-1 backup!) is important
Yeah and for the vast majority of individual cybernauts, that "1" is almost unachievable without paying for a backup service. And at that point, why are you doing any of it yourself instead of just running their rolling backup + snapshot app?
There isn't a person in the world living in a different city from me (and that "1" isn't protection against a tornado, flood, or wildfire unless it's in a different city) whom I'd ask to run a computer 24/7 and do maintenance on it when it breaks down.
My solution for this has been to leave a machine running in the office (in order to back up my home machine). It doesn't really need to be on 24/7, it's enough to turn it on every few days just to pull the last few backups.
The 3-2-1 rule is old. Unlike before cloud servers existed, we now have enormous flexibility in where we can put data.
I'd at least keep file system snapshots locally for easy recovery from manual mistakes, copy the data to a remote location using implementation A and snapshot it there too, and copy the same data to another location using implementation B with snapshots there as well. That way you get durability, and implementation bugs in any one backup process are also mitigated.
ZFS is a godsend for this, and I use Borg as the secondary implementation, which seems like enough for almost any disaster.
> You should never trust an "rsync copy" or even crash consistent backups.
This leads you to the secret forbidden knowledge that you only need to back up your database(s) and file/object storage. Everything else can be, or has to be depending on how strong that 'never' is, recreated from your provisioning tools. All those Veeam VM backups some IT folks hoard like dragons are worthless.
Exactly. There is no longer any point in backing up an entire "server" or a "disk". Servers and disks are created and destroyed automatically these days. It's the database that matters, and each type of database has its own tooling for creating "application consistent backups".
This strongly depends on your environment and on your RTO/RPO.
Sure, there are environments that have automatically deployed, largely stateless servers. Why back them up if you can recreate them in an hour or two ;-)
Even then, though, if we're talking about important production systems with an RTO of only a few minutes, then having a BCDR solution with instant virtualization is worth its weight in gold. I may be biased though, given that I professionally write BCDR software, hehe.
However, many environments are not like that: There are lots of stateful servers out there with bespoke configurations, lots of "the customer needed this to be that way and it doesn't fit our automation". Having all servers backed up the same way gives you peace of mind if you manage servers for a living. Being able to just spin up a virtual machine of a server and run things from a backup while you restore or repair the original system is truly magical.
Databases these days are pretty resilient to restoring from crash consistent backups like that, so yes, you'll likely be fine. It's a good enough approach for many cases. But you can't be sure that it really recovers.
However, ZFS snapshots alone are not a good enough backup if you don't off-site them somewhere else. A server/backplane/storage controller could die or corrupt your entire zpool, or the place could burn down. Lots of ways to fail. You gotta at least zfs send the snapshots somewhere.
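A minimal sketch of the off-siting part (dataset, snapshot and host names are placeholders; a real setup would use incremental sends and a restricted SSH key):

    package main

    import (
        "log"
        "os"
        "os/exec"
    )

    // Equivalent of: zfs send tank/data@nightly | ssh backuphost zfs receive -u backup/data
    func main() {
        send := exec.Command("zfs", "send", "tank/data@nightly")
        recv := exec.Command("ssh", "backuphost", "zfs", "receive", "-u", "backup/data")

        pipe, err := send.StdoutPipe()
        if err != nil {
            log.Fatal(err)
        }
        recv.Stdin = pipe
        send.Stderr, recv.Stderr = os.Stderr, os.Stderr

        if err := recv.Start(); err != nil {
            log.Fatal(err)
        }
        if err := send.Run(); err != nil {
            log.Fatal(err)
        }
        if err := recv.Wait(); err != nil {
            log.Fatal(err)
        }
    }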
What do you mean, you can't be sure it recovers?
It's not a matter of hoping the db recovers from an inconsistent state; with file system snapshotting the data is supposed to be in a good state.
Ha! I did not expect a reference to `innodb_flush_log_at_trx_commit` here. I wrote a blog post a few years ago about MySQL lossless semi-sync replication [1] and I've had quite enough of innodb_flush_log_at_trx_commit for a lifetime :-)
Depending on the database you're using, and on your configuration, they may NOT recover, or require manual intervention to recover. There is a reason that MSSQL has a VSS writer in Windows, and that PostgreSQL and MySQL have their own "dump programs" that do clean backups. Pulling the plug (= file system snapshotting) without involving the database/app is risky business.
Databases these days are really resilient, so I'm not saying that $yourfavoriteapp will never recover. But unless you involve the application or a VSS writer (which does that for you), you cannot be sure that it'll come back up.
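For example, an application-level backup of a PostgreSQL database boils down to asking the database itself for a consistent dump rather than copying its files (a sketch; assumes pg_dump is on PATH, connection settings come from the usual PG* environment variables, and the path and database name are placeholders):

    package main

    import (
        "fmt"
        "log"
        "os"
        "os/exec"
        "time"
    )

    func main() {
        // Placeholder output path and database name.
        out := fmt.Sprintf("/backups/mydb-%s.dump", time.Now().Format("2006-01-02"))

        // -Fc writes PostgreSQL's custom archive format, restorable with pg_restore.
        cmd := exec.Command("pg_dump", "-Fc", "-f", out, "mydb")
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        if err := cmd.Run(); err != nil {
            log.Fatalf("pg_dump failed: %v", err)
        }
        log.Println("wrote", out)
    }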
My personal external backup is two external drives in RAID1 (RAID0 wtfff?). One already failed, of course the Seagate one. It failed silently, too - a few sectors just do not respond to read commands and this was discovered when in-place encrypting the array. (I normally would avoid Seagate consumer drives if it wasn't for brand diversity. Now I have two WD drives purchased years apart.)
It's a home backup so not exactly relevant to most of what you said - just wanted to underscore the point about storage media sucking. Ideally I'd periodically scrub each drive independently (can probably be done by forcing a degraded array mode, but careful not to mess up the metadata!) against checksums made by the backup software. This particular failure mode could also be caught by dd'ing to /dev/null.
ZFS really shines here with its built-in "zpool scrub" command and checksumming.
Even though I am preaching "application consistent backups" in my original comment (because that's what's important for businesses), my home backup setup is quite simple and isn't even crash consistent :-) I do: pull via rsync to a backup box & ZFS snapshot, then rsync to a Hetzner storage box (ZFS snapshotted there, weekly).
My ZFS pool consists of multiple mirrored vdevs, and I scrub the entire pool once a month. I've uncovered drive failures and storage controller failures this way. At work, we also use ZFS, and we've even uncovered failures of entire product lines of hard drives.
Cryptography Engineering definitely does not hold up. It predates (almost willfully, given the chronology) modern notions of AEAD, key derivation, random number generation, and elliptic curve asymmetric cryptography.
The standard recommendation these days is Aumasson's Serious Cryptography. I like David Wong's Real-World Cryptography as well.
I really enjoyed the book and it certainly helped me, but it's also the only cryptography book I've ever read. I appreciate you challenging my suggestion!
I just checked and it has been a whopping 12 years since I purchased/read the book, so I retract my recommendation.
Sorry, you're right, I should have been less clinical about this. Practical Cryptography (which is essentially the exact same book by the same authors) was also the first cryptography book that clicked in any meaningful way for me, and really lit me up about the prospect of finding vulnerabilities in cryptosystems.
I would actively recommend against using it as a guide in 2025. But you're not crazy to have liked it before. Funny enough, 12 years ago, I wrote a blog post about this:
I read the beginning of the post and it looks quite interesting. I'll read the rest tomorrow when my mind is sharper.
I checked my blog and I also wrote a post about some crypto related things shortly after I purchased the book. It's a post about a bug in the JDK that I stumbled across, which I am certain I would not have understood without Bruce's book:
I am a lot more cynical about Schneier's influence on the practice of cryptography engineering today than I was when he and Ferguson (who I am not cynical about at all) wrote the book back in 2003.
I was thinking exactly this. I am the maintainer of ntfy.sh, and my costs are $0 at the moment because DigitalOcean is covering 100% of it since the project is open source. It would otherwise be around $100, though I must admit the setup is quite oversized. However, my volume is much, much higher than what is described in the blog post.
I suspect that the architecture can be improved to get the cost down.
There are some other interesting repos from the same author, namely https://github.com/pijng/goinject, which lets you inject code as part of preprocessing. Feels a lot like Java’s annotation magic.
Thanks for sharing. I wasn’t even aware Go had pre-processors, or that modifying the AST like that is even possible.
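To make the AST part concrete, here's a tiny, self-contained illustration using only the standard library's go/ast and go/format (this isn't how goinject hooks into the build, just the kind of rewrite it enables): it parses a snippet, prepends a println call to every function body, and prints the modified source.

    package main

    import (
        "go/ast"
        "go/format"
        "go/parser"
        "go/token"
        "os"
        "strconv"
    )

    func main() {
        src := "package demo\n\nfunc Add(a, b int) int { return a + b }\n"

        fset := token.NewFileSet()
        file, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments)
        if err != nil {
            panic(err)
        }

        // Walk the AST and inject a trace statement at the top of each function.
        ast.Inspect(file, func(n ast.Node) bool {
            fn, ok := n.(*ast.FuncDecl)
            if !ok || fn.Body == nil {
                return true
            }
            trace := &ast.ExprStmt{X: &ast.CallExpr{
                Fun: ast.NewIdent("println"),
                Args: []ast.Expr{&ast.BasicLit{
                    Kind:  token.STRING,
                    Value: strconv.Quote("enter " + fn.Name.Name),
                }},
            }}
            fn.Body.List = append([]ast.Stmt{trace}, fn.Body.List...)
            return true
        })

        // Print the rewritten source.
        if err := format.Node(os.Stdout, fset, file); err != nil {
            panic(err)
        }
    }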
I wholeheartedly disagree. I think the Stripe docs and developer experience are among the best, if not the best, I've ever seen.
It has great user docs, API docs, and developer-centric UI elements like copy-pastable IDs, a webhook event browser, time-travel features, test mode (!!), and you can even look at the exact API calls that the Stripe UI itself is making. I've told many colleagues how awesome the docs and the experience are.
IMHO, for most things, the data model is straightforward and well explained. Of course there are complicated topics and quirks, but that's just because payments is not easy in general.
I'm clearly a Stripe fanboy, but I am not affiliated in any way.
I agree. I think Stripe is complicated because accepting payments is complicated. It's easy to start a new service that only supports 80% of the use cases, especially if you don't have to consider fraud or regulatory requirements. But that remaining 20% is what kills your simplicity.