
mmap is not a C feature, but POSIX. There are C platforms that don't provide mmap, and on those that do you can use mmap from other languages (there's an mmap module in Python's standard library, for example).
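For what it's worth, here is a minimal sketch of the Python side of that, using the standard library's mmap module. The helper name read_via_mmap and the scratch file are made up for illustration and are not from the article's snippet:

```python
import mmap
import os
import tempfile


def read_via_mmap(path):
    """Map a (non-empty) file read-only and return its contents as bytes."""
    with open(path, "rb") as f:
        # Length 0 means "map the whole file"; ACCESS_READ asks for a read-only mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


if __name__ == "__main__":
    # Hypothetical scratch file, only here so the sketch is self-contained.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello from mmap\n")
        os.close(fd)
        print(read_via_mmap(path))
    finally:
        os.remove(path)
```

Note that CPython's module is itself implemented on top of the platform's mapping facility, so this shows the feature being reachable from another language, not that no C is involved underneath.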


And it's not just mmap(): none of the functions in the code snippet except printf() are actually C stdlib functions.


I think this is sort of missing the point, though. Yes, mmap() is in POSIX[1] in the sense of "where is it specified".

But mmap() was implemented in C because C is the natural language for exposing Unix system calls and mmap() is a syscall provided by the OS. And this is true up and down the stack. Best language for integrating with low level kernel networking (sockopts, routing, etc...)? C. Best language for async I/O primitives? C. Best language for SIMD integration? C. And it goes on and on.

Obviously you can do this stuff (including mmap()) in all sorts of runtimes. But it always appears first in C and gets ported elsewhere. Because no matter how much you think your language is better, if you have to go into the kernel to plumb out hooks for your new feature, you're going to integrate and test it using a C rig before you get the other ports.

[1] Given that the pedantry bottle was opened already, it's worth pointing out that you'd have gotten more points by noting that it appeared in 4.2BSD.


If we're going to be pedantic, mmap is a syscall. It happens that the C version is standardized by POSIX.

The underlying syscall doesn't use the C ABI; you need to wrap it to use it from C in the same way you need to wrap it to use it from any language, which is exactly what glibc and friends do.

Moral of the story is mmap belongs to the platform, not the language.


it also appears in operating systems that aren't written in c. i see it as an operating system feature, categorically.


No, that's too far down the pedantry rabbit hole. "mmap()" is quite literally a C function in the 4.2BSD libc. It happens to wrap a system call of the same name, but to claim that they are different when they arrived in the same software and were written by the same author at the same time is straining the argument past the breaking point. You now have a "C Erasure Polemic" and not a clarifying comment.

If you take a kernel written in C and implement a VM system for it in C and expose a new API for it to be used by userspace processes written in C, it doesn't magically become "not C" just because there's a hardware trap in the middle somewhere.

mmap() is a C API. I mean, duh.


and if i directly do an mmap syscall on linux from a freestanding forth that doesn't go through libc for anything? sure, c unfortunately defines how i have to, say, pass a string, but that's effectively an arbitrary calling convention at that point; there's no c runtime on the calling side so it's not particularly useful to contend that what i'm using is a c api.

or perhaps mmap is incontrovertibly a c function on platforms where libc wrappers are the sole stable interface to the kernel but something else entirely on linux?


> and if i directly do an mmap syscall on linux from a freestanding forth

... mmap() remains a system call to a C kernel designed for use from the C library in C programs, and you're running what amounts to an emulator.

The fact that you can imagine[1] an environment where that might not be the case doesn't mean that it isn't the case in the real world.

Your argument appears to be one of Personal Liberty: de facto truths don't matter because you can just make your own. This is sort of a software variant of a Sovereign Citizen, I think.

[1] Can you even link a "freestanding forth" with an mmap() binding on any Unix that doesn't live above the libc implementation? I mean, absent everything else it would have to open code all the flag constants, whose values change between systems. This appears to be a completely fictitious runtime you've invented, which if anything sits as evidence in my favor and not yours.
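On the narrow point that flag values differ between systems, a quick sketch for checking what your own platform uses. It reads the constants Python's mmap module exposes; the helper name mmap_flag_values is hypothetical, and which flags exist at all depends on the platform, so missing ones come back as None:

```python
import mmap
import sys


def mmap_flag_values():
    """Return the numeric values this platform's Python exposes for a few mmap flags."""
    names = ("MAP_SHARED", "MAP_PRIVATE", "MAP_ANONYMOUS")
    # getattr with a default, because not every platform defines every flag.
    return {name: getattr(mmap, name, None) for name in names}


if __name__ == "__main__":
    print(sys.platform, mmap_flag_values())
```

Running it on two different Unixes and comparing the output is the easy way to see whether a binding that hardcodes these numbers would port.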


?

i'm not so much imagining an environment per se¹ as describing one i've already written, so i'm not entirely sure where any of this is coming from. if you care to have some additional assurance this isn't somehow an elaborate rhetorical trap, a previous comment about forth tail call elimination with a bit of demonstrative assembly is presumably only a short scroll down my profile. ctrl-f for cmov if you want to find it quickly. as i recall, it came up for similar reasons then because people often make similar incorrect generalizations about lots of things that implicitly sit atop a c runtime in their minds. that said, you're the first one to call me a sovcit before asking any clarifying questions so at least there's some new pizzazz there.

i was clear that i was talking specifically about linux precisely because this isn't something one can do portably for exactly the reasons you're describing (which, yes, makes porting things built like this off of linux before the point you've built up enough to be able to go through libc annoying and ad hoc at the very least).

the fact remains that i can, right now, non-theoretically, on a well supported common unixlike os, and entirely unrelated to whatever weird crusade you seem to have invented to stand in for my side of this discussion, link a pile of assembly with -static -nolibc, fire up the repl, and mmap files into memory as i please with nary a bit of c on the userspace side.

as i originally said, i'm happy to consider linux a weird exception to the point you're making in a wider context since this isn't something you can do portably, but there still are entirely useful things one can do today with mmap that involve zero userspace c code on a widely supported platform.

edit: lol forgot to even get to this part. i'm also somewhat curious what you mean with this bit: "you're running what amounts to an emulator." perhaps i'm not firing on all cylinders today but i fail to see how it's useful to characterize performing bare syscalls from assembly (or something more high-level built out of assembly legos) as an emulator in any way, but i'm open to having missed some interesting nuance there.

¹ unless you mean trivially (seeing as this is code i imagined and then proceeded to write) in which case i suppose i agree



> C is the natural language for exposing Unix system calls

No, C is the language _designed_ to write UNIX. Unix is older than C, C was designed to write it and that's why all UNIX APIs follow C conventions. It's obvious that when you design something for a system it will have its best APIs in the language the system is written in.

C also has multiple weird and quirky APIs that suck, especially in the ISO C libc.



>> C is the natural language for exposing Unix system calls

> No, C is the language _designed_ to write UNIX. [...]

This is one of those hilarious situations where internet discussion goes off the rails. Everything you wrote, to the last word, would carry the same meaning and the same benefit to the discussion had you written "Yes" instead of "No" as the first word.

Literally you're agreeing with me, but phrasing it as a disagreement only because you feel I left something out.


If I write an OS in Basic, surely the 'natural' language for exposing the system calls is Basic?

Yes, Unix predates C. But at this point in time, 50+ years down the road, where the majority of *nix users don't use anything that ever contained that code, and the minority use a *nix that has been thoroughly ship of Theseused, Unix is to all intents and purposes a C operating system.


> If I write an OS in Basic, surely the 'natural' language for exposing the system calls is Basic?

For that specific OS, that would probably be the case? I think every API is bound to reflect the specific constraints of the language it has been written in. What I was trying to clarify was that UNIX and C are intertwined in an especially deep way, more than basically any other OS that doesn't have a UNIX API, because both were born and written alongside each other, so some Unix APIs rely on C-specific behaviour and quirks, and some C features were born and designed around the same historical context UNIX was born in.


>> Best language for SIMD integration? C

Uh, no. C intrinsics are so much worse than just writing assembly that it's not even comparable.


Agree to disagree there. For casual "I need to vectorize this code" tasks, modern compilers are almost magic. I mean, have you looked at the generated code for array-based numerics processing? It's like, you start the process of "vectorizing" the algorithm and realize the compiler already did 80% of it for you.



