AFAIK fly.io run firecracker and cloud-hypervisor VMs. This seems to have a copy-on-write filesystem underneath.
Given their principled take on only trusting full-VM boundaries, I doubt they moved any of the storage stack into the untrusted VM.
So maybe a virtio-block device passing through discard to some underlying CoW storage stack, or maybe virtio-fs if it's running on ch instead of fc? Would be interesting to hear more about the underlying design choices and trade-offs.
Edit: from their website, "Since it's just ext4, you won't run into weird edge cases like you might with NFS or FUSE mounts. You can happily use shared memory files, for example, so you can run SQLite in all its modes." So it's a virtio block device supporting discard that's exposed to the VM. Interesting; fc doesn't support virtio discard passthrough, and support for ch is still in progress...
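To make that mechanism concrete, here's a minimal sketch (the path and sizes are made up, nothing fly.io-specific) of how space gets handed back from inside the guest: punching a hole in a file on an ext4 volume mounted with -o discard (or followed by an fstrim run) shows up as discard requests on the virtio-blk device, which the host-side CoW store can then turn into freed extents.

    /* Minimal sketch: punch a hole in a file on the guest's ext4 volume.
     * With the filesystem mounted with -o discard (or after fstrim), the
     * freed extents reach the virtio-blk device as discard requests, which
     * the host side can translate into "free this range" in its CoW backing
     * store. The path is purely illustrative. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/data/scratch.bin", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        /* Deallocate the first 64 MiB without changing the file size. */
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      0, 64L * 1024 * 1024) < 0) {
            perror("fallocate");
            close(fd);
            return 1;
        }

        close(fd);
        return 0;
    }

Whether the host actually sees anything depends on the guest's mount options (or a periodic fstrim), which is exactly the kind of trade-off I'd like to hear more about.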
I have a post coming next week about the guts of this thing, but I'm curious why you think we'd avoid running the storage stack inside the VM. From my perspective that's safer than running it outside the VM.
My impression is that you (very reasonably) treat anything inside the VM as untrusted. If you want trusted rollback, presumably that implies that the VM can't have any ability to tamper with the snapshot?
But maybe you have parts of the stack that don't need to be trusted inside the VM somehow? Looking forward to the article.
It claims London E1W for my IPv4 and IPv6 RIPE PI space, and the same location/postcode for my EE home broadband. Generic 'London data centre' postcode? Or the location of the nearest Cloudflare POP, maybe? (Whois on the PI blocks shows a correct postcode some distance from London FWIW.)
[Edit: same location and postcode for a Vodafone UK v4 address too.]
I hack on various C projects on a linux/musl box, and I'm pretty sure I've seen musl's malloc() return 0, although possibly the only cases where I've triggered that fall into the 'unreasonably huge' category, where a typo made my enormous request fail some sanity check before even trying to allocate.
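For illustration, a minimal sketch of that 'unreasonably huge' case - nothing musl-specific, any allocator will refuse a request this size before trying to map anything, so the point is just that the return value has to be checked:

    /* Minimal sketch of the "unreasonably huge request" case: a size
     * computation gone wrong (here deliberately) that no allocator can
     * satisfy, so malloc() returns 0 and the caller must handle it. */
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t n = (size_t)-1;   /* e.g. the result of a typo'd or overflowed calculation */
        void *p = malloc(n);
        if (!p) {
            fprintf(stderr, "malloc(%zu) failed: %s\n", n, strerror(errno));
            return 1;
        }
        free(p);
        return 0;
    }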
Whenever I think about writing a central privileged daemon to grant capabilities to other processes, I'm puzzled by the choice to remove the old version of CAP_SETPCAP in 2.6.24: "grant or remove new capabilities to/from an existing running process". Sadly the name still exists, but it means something else in newer kernels with file capabilities.
(In a sense, not having this capability in processes running as root is theatre anyway: you have /dev/kmem access so could just edit the kernel data structures. It's just doing so cleanly that is no longer possible.)
Being able to briefly escalate my editor to have the capabilities to write /etc/wibble.conf when I start editing it as a non-privileged user, then take the capability away again, would be more convenient than always needing to run the editor as root. (So convenient, in fact, that people fake this with little editor helpers that do the equivalent of 'really tee FILE-TO-WRITE >/dev/null', but that's an ugly hack.)
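For the record, a minimal sketch of that kind of helper (the name and invocation are hypothetical): a tiny program that copies stdin to the named file, so only it needs privilege - run under sudo, or given something like CAP_DAC_OVERRIDE as a file capability via setcap - while the editor itself stays unprivileged.

    /* Minimal sketch of a "really tee FILE >/dev/null" style helper: copy
     * stdin to the named file, so only this tiny program needs elevated
     * privilege while the editor stays unprivileged. Illustrative only,
     * not a hardened tool. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s FILE-TO-WRITE\n", argv[0]);
            return 2;
        }

        FILE *out = fopen(argv[1], "w");
        if (!out) { perror(argv[1]); return 1; }

        int c;
        while ((c = getchar()) != EOF)
            if (putc(c, out) == EOF) { perror("write"); return 1; }

        if (fclose(out) != 0) { perror("close"); return 1; }
        return 0;
    }

It works, but it's a per-file workaround rather than the general "briefly grant the process a capability" mechanism the old CAP_SETPCAP semantics would have allowed.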
I was already able to host onion services last year by using the crate directly. There were a few footguns related to flushing, but it generally works as expected. I will say, though, that the code quality could be improved. When I tried to contribute, I found some questionable practices, such as direct file reads/writes littered around without any abstraction, which made refactoring difficult (I was trying to add different storage/cache options, such as in-memory only or encrypted).
Opting not to over-engineer the solution with abstractions nobody asked for until you came along is the definition of best practice. Something not being designed for any and all use cases doesn't make it bad practice. Reading from and writing to a filesystem you always expect to be available is more than reasonable. Modular code for the sake of modularity is a recipe for fizz buzz enterprise edition.
Not disagreeing or agreeing, but "best practice" is probably one of those concepts, like "clean code", that has as many definitions as there are programmers.
Most of the time it depends: on context, on what else is going on in life, on where the priorities lie, and so on. I don't think anyone can claim for others what is or isn't "best practice", because we simply don't have enough context to know what they're basing their decisions on or what they plan for the future.
Reading and hacking on the Chez Scheme codebase is always a treat and rather inspiring, especially compared with more mainstream compilers and code generators. As well as Kent Dybvig, Andy Keep's contribution (nanopass) is super-impressive. The whole thing is so cleanly designed and beautifully expressed.
And the N150 had mainline linux support from day one, whereas I'm not sure if there's proper support for pi5-family devices in a released mainline kernel even now, two years after the launch.
They used to do a good-to-adequate job of linux support, but nowadays they seem rubbish at it. Nobody wants to be stuck on a downstream kernel full of cobbled-together device support that's too poorly-written to upstream.
This has been the primary hurdle for me. I like it when I can just install regular linux and be on my way. Having to do a bunch of kernel nonsense is just not fun. I don't even mind messing with the kernel, but I want to use the mainline kernel.
This is certainly the reputation but I'm not sure they deserve it. They've always had the horrible closed-source bootloader with threadx running on the gpu, without a free alternative. At least up to pi4 they weren't bad at linux mainlining, but progress on upstreaming pi5 support has been glacial.
Cf. the various Beagle boards which have mainline linux and u-boot support right from release, together with real open hardware right down to board layouts you can customise. And when you come to manufacture something more than just a dev board, you can actually get the SoC from your normal distributor and drop it on your board - unlike the strange Broadcom SoCs rpi use.
I'm quite a lot more positive about rp2040 and rp2350, where they've at least partially broken free of that Broadcom ball-and-chain.