This is very exciting research. There doesn't seem to be much new information on the page (it's been posted a few times before). In this recent presentation[1] (PDF), MSR's Galen Hunt gives a nice high-level overview of Drawbridge, possible applications, and some more details on their current progress.
The Graphene Library OS[2] is a similar implementation for Linux and was released a few months ago. In particular the Graphene Host ABI[3] is adapted mostly from Drawbridge.
The "picoprocess" here seems very similar to Linux's "seccomp" mechanism for restricting the kernel API surface. Current sandboxing mechanisms on Linux (such as Chrome's various sandboxes) use seccomp to restrict almost all syscalls, and their APIs come from IPC to more privileged processes.
Two notable differences:
On the one hand, seccomp provides much more flexibility in choosing the subset of the kernel API offered to the process, rather than just saying "here are 45 syscalls".
On the other hand, Drawbridge claims to run unmodified Windows applications; they may have an efficient mechanism for trapping NT "syscalls" and redirecting them to their "user-mode NT kernel", ntoskrnl.dll. However, this might just mean that they run unmodified applications making Win32 library calls, with the libraries themselves having been modified, in which case programs making NT kernel syscalls directly would not run unmodified.
I'd really like to see a standard mechanism on Linux, similar to the "personality" mechanism, that augments seccomp with an efficient way of defining a new "syscall" layer. That would make sandboxing much simpler and more efficient.
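You can approximate that layering with seccomp today, just not efficiently: a BPF filter traps every syscall outside a small allowlist and bounces it to a SIGSYS handler, which plays the role of the user-mode "syscall" layer. A minimal sketch, assuming x86-64 Linux and glibc (the allowlist and the -ENOSYS behaviour are placeholders, and error handling is elided):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <signal.h>
    #include <stddef.h>
    #include <unistd.h>
    #include <ucontext.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <linux/filter.h>
    #include <linux/seccomp.h>

    /* User-space "syscall layer": every trapped syscall lands here as SIGSYS. */
    static void sigsys_handler(int sig, siginfo_t *info, void *vctx)
    {
        ucontext_t *ctx = vctx;
        long nr = info->si_syscall;   /* the trapped syscall number */
        (void)sig; (void)nr;          /* a library OS would dispatch on nr here */
        /* Write the "result" back into RAX (x86-64 specific); we just fail. */
        ctx->uc_mcontext.gregs[REG_RAX] = -ENOSYS;
    }

    int main(void)
    {
        struct sigaction sa = {0};
        sa.sa_sigaction = sigsys_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSYS, &sa, NULL);

        /* BPF filter: allow write/exit_group/rt_sigreturn, trap everything else. */
        struct sock_filter filter[] = {
            BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_write,        3, 0),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit_group,   2, 0),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_rt_sigreturn, 1, 0),
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
            .len = sizeof(filter) / sizeof(filter[0]),
            .filter = filter,
        };

        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);

        getpid();                      /* trapped, "handled" in user space */
        write(1, "still here\n", 11);  /* allowed through to the host */
        return 0;
    }

The inefficiency is that every trapped call costs a signal delivery and return, which is exactly what a first-class mechanism for defining a syscall layer could avoid.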
Strategically this could allow Microsoft to drop lots of backwards compatibility cruft in their mainstream host OS and vastly reduce development cost and complexity.
This seems to me more along the lines of exokernel work: most of the "kernel" runs as a library in user mode.[1] The application still has a full range of functionality available. The "45 API calls" are not what the user process can access, but the interface between the untrusted user-mode kernel and the secure kernel-mode kernel.
> they may have an efficient mechanism for trapping NT "syscalls"
I believe this is not really necessary. The syscall ABI is not stable from one Windows version to the next; ABI stability is instead provided by the userspace DLLs (kernel32, user32, etc.), which are the official API applications are expected to use.
Some processes might invoke the syscalls directly, but this is a narrow use case (e.g. security software or copy-protection wrappers might use raw syscalls to bypass userspace API hooks).
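For illustration, a hedged sketch of the two routes (the file name is a placeholder; NtClose is a real ntdll export, resolved here at runtime purely for demonstration):

    #include <windows.h>
    #include <winternl.h>

    /* Signature of the ntdll stub we resolve at runtime. */
    typedef NTSTATUS (NTAPI *NtClose_t)(HANDLE);

    int main(void)
    {
        HANDLE h = CreateFileA("test.txt", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        /* The documented, version-stable route would be CloseHandle(h).
         * Below is the "direct" route: call the Nt* stub in ntdll, skipping
         * kernel32. Note that even this still goes through ntdll's stub;
         * only hardcoding the raw syscall number (which changes between
         * Windows builds) truly bypasses it, which is why doing so is
         * fragile. */
        NtClose_t pNtClose = (NtClose_t)
            GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtClose");
        if (pNtClose)
            pNtClose(h);
        else
            CloseHandle(h);
        return 0;
    }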
This seems to map more to Chrome's Native Client/PPAPI than to anything like container virtualization: a reduced set of pretend syscalls that actually go to an interop library that talks to the host OS. It's just missing the "static analysis to ensure it only uses those syscalls" step.
Docker isn't anything like this: Docker is a wrapper for cgroups and namespaces, which are also what LXC uses.
In fact, Docker started as a wrapper around LXC. It literally configured and shelled out to lxc-start in order to orchestrate containers.
This, however, is very different from cgroups/namespaces.
What the Drawbridge paper describes is a full user-mode kernel. If you want the analogous implementation on Linux, look at User Mode Linux, or the Graphene stuff that has already been linked to.
> "why not make an operating system that has only 45 syscalls?"
Because much of Microsoft's licensing revenue is contingent upon continuing to support the edge cases that are inevitably not part of the set of programs that can be dropped into such a sandbox without problems.
You missed the joke. NT was originally a pretty hardcore microkernel, with even things like graphics drivers being in userspace. However, in the name of performance, more and more stuff was brought into kernel space.
NT was never a true microkernel (much less a hardcore one), and was never intended to be. It was never designed to have any protection between its internal "subsystems" (file system I/O, security, HAL, drivers, etc.) — everything runs in kernel mode, in the same address space, and communicates using direct calls, not IPC.
NT is sometimes mistaken for a microkernel partly because of the graphics driver problem, and partly because it contains a module which Microsoft actually refers to as "the microkernel". This part consists mostly of the scheduler.
What is true about the NT kernel is that it's modular, with strict internal API separation between each subsystem, and that kind of design was (as far as I know) inspired by actual microkernels, as was the idea of hiding the kernel behind "OS personalities" such as Win32.
IRPs are the kernel "IPC" mechanism used by NT, and for whole classes of drivers/subsystems IRPs are the core processing mechanism. The I/O manager's calls to process IRPs could just as well be traditional microkernel message-passing APIs, complete with task switching and message copies.
So while NT is not a microkernel, at the kernel API level it basically looks like one; the only thing missing is that the pieces are not actually isolated from each other.
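To make that concrete, here is a hedged sketch of a WDM-style dispatch routine (a bare skeleton; it would need the WDK to build): the I/O manager hands the driver an IRP that looks a lot like a microkernel message, except that it arrives by direct call in the same address space.

    #include <ntddk.h>

    /* A read request arrives as an IRP: a self-describing "message" whose
     * parameters live in the current stack location. The driver receives
     * it, acts on it, and "replies" by completing it. */
    NTSTATUS DispatchRead(PDEVICE_OBJECT DeviceObject, PIRP Irp)
    {
        PIO_STACK_LOCATION sl = IoGetCurrentIrpStackLocation(Irp);
        ULONG requested = sl->Parameters.Read.Length;   /* message payload */

        UNREFERENCED_PARAMETER(DeviceObject);
        UNREFERENCED_PARAMETER(requested);

        Irp->IoStatus.Status = STATUS_SUCCESS;
        Irp->IoStatus.Information = 0;            /* bytes actually "read" */
        IoCompleteRequest(Irp, IO_NO_INCREMENT);  /* the "reply" */
        return STATUS_SUCCESS;
    }

    NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
    {
        UNREFERENCED_PARAMETER(RegistryPath);
        /* Register the "message handler" for read requests. */
        DriverObject->MajorFunction[IRP_MJ_READ] = DispatchRead;
        return STATUS_SUCCESS;
    }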
There's a leak[1] talking about Windows 9 possibly being partly free, with some features subscription-based. That would give them an incentive to solve the security issues.
It is important to note how this could shake up the current state of the application virtualization market.
There is no Docker-like solution for Windows. All the big players (VMware, Microsoft, Symantec, etc.) do tricks to isolate applications: instrumenting API calls and adding filter drivers. With these solutions less than 70% of applications can be virtualized, and the process can be really difficult.
Huh? Can you elaborate on what you mean by these (App-V and other Windows sandbox/packaging tools) not being as complete as Docker?
Because from what I've seen of the current state of Docker, non-trivial Linux applications seem to have issues in Docker as well, because they depend on specific things that are not namespaced well (/sys manipulations, ioctls, or filesystem-specific APIs, for example).
Docker seems to work well as long as one stays close to web-server functionality (i.e. LAMP-like stacks, which tend to only manipulate network sockets and ordinary files).
Yes, it is very simple. Docker.io uses LXC, where the virtualization layer lives in kernel space, while applications such as VMware ThinApp work at the user level, intercepting Windows APIs at a higher level than the kernel. App-V and SWV add a filter driver as a way to sandbox the registry and filesystem.
One difference in approach is that, for example, with Docker.io you can have your own isolated network interface, while with the current Windows approaches this is not possible.
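That difference is easy to demonstrate, because the isolation is a kernel primitive rather than API interception. A minimal sketch, assuming Linux with CLONE_NEWNET support and root privileges: the child is cloned into a fresh network namespace and sees only its own loopback device, something no user-level interception layer can fake.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Child runs in its own network namespace: `ip link` shows only an
     * isolated loopback device; the host's interfaces are invisible. */
    static int child(void *arg)
    {
        (void)arg;
        execlp("ip", "ip", "link", (char *)NULL);
        perror("execlp");
        return 1;
    }

    int main(void)
    {
        char *stack = malloc(1024 * 1024);
        if (!stack)
            return 1;
        /* CLONE_NEWNET gives the child a fresh network namespace. */
        pid_t pid = clone(child, stack + 1024 * 1024,
                          CLONE_NEWNET | SIGCHLD, NULL);
        if (pid < 0) { perror("clone (need root?)"); return 1; }
        waitpid(pid, NULL, 0);
        free(stack);
        return 0;
    }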
Not quite. Bromium's solutions make use of VT-x/VT-d/EPT, whereas this is a purely para-virtualised approach. At least from what I can see; it's possible it uses VT-x/VT-d/EPT to implement the process isolation, it's just somewhat unlikely given how it's presented.
So, is this a means to bridge the gap between Hyper-V, Hyper-V app streaming, and Docker on the Windows side? I'm kinda confused about what the use case is compared to other existing product offerings.
I suspect that would run into copyright problems, as the Win32 replacement is still owned by Microsoft. You could re-implement the 40-some-odd kernel calls, but the real magic here is the full Win32 replacement, with its 800+ calls.
At this point I doubt Wine would be interested, unless somehow Microsoft released most of it as OSS.
I don't think so. The idea is to provide for Windows the same kind of lightweight containerization that Docker/LXC provide on Linux. So if it were to work the same way, you'd install IIS into your container and run the container on a host. Having read through the post now (it doesn't have a lot of detail), it seems to me that the picoprocess and library OS concepts are a consequence of not having true kernel-level support for namespaces and something like cgroups. Docker and LXC containers don't host a minimal OS; they share the existing kernel in well-defined ways.
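To illustrate "sharing the existing kernel in well-defined ways": a cgroup is just a directory in a kernel-provided filesystem. A hedged sketch, assuming a cgroup-v1 memory controller mounted at /sys/fs/cgroup/memory and root privileges (the "demo" group name is made up, and on cgroup v2 the paths and file names differ):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static void write_file(const char *path, const char *val)
    {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); return; }
        fputs(val, f);
        fclose(f);
    }

    int main(void)
    {
        /* Create the group, cap its memory, and move ourselves into it. */
        mkdir("/sys/fs/cgroup/memory/demo", 0755);
        write_file("/sys/fs/cgroup/memory/demo/memory.limit_in_bytes",
                   "67108864");                      /* 64 MB cap */
        char pid[32];
        snprintf(pid, sizeof pid, "%d", getpid());
        write_file("/sys/fs/cgroup/memory/demo/tasks", pid);
        /* From here on this process (and its children) still share the
         * host kernel, but are accounted against the 64 MB limit. */
        return 0;
    }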
[1] http://vee2014.cs.technion.ac.il/docs/VEE14-present601.pdf
[2] https://github.com/oscarlab/graphene
[3] https://github.com/oscarlab/graphene/wiki/Graphene-Host-ABI