The user specifically asked about stackful coroutines, which is not the direction we're going. But I think the pain point the user talked about (futures right now are difficult to deal with) will be resolved by our stackless coroutine approach, which is also more in line with Rust's values around "zero-cost abstractions."
That's fair. This space is quite complex, with so so many options. I guess we'll know if and when the OP elaborates with more :) I very well could be wrong.
You are both right. I was specifically asking about "stackful coroutines", as tatterdemalion says, but I think stackless coroutines with async/await would be just as good for code readability.
The difference between stackful and stackless is whether you can yield execution across a function that doesn't know about async/await.
The problem with stackless concerns code reusability. With stackless you have to duplicate all your intermediate functions: once for code that calls functions or methods which can yield, and again for code that calls functions or methods which won't yield. Or you simply don't reuse code at all, bifurcating the entire ecosystem.
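A minimal sketch of that duplication, with a made-up helper (`fetch_len_*`) that just measures fetched data; the logic is identical, but the async caller cannot reuse the sync version:

    use std::future::Future;

    // The same helper written twice: once for callers that can never yield,
    // once for callers that may yield at an .await point.
    fn fetch_len_sync(read: impl Fn() -> Vec<u8>) -> usize {
        let data = read(); // cannot yield
        data.len()
    }

    async fn fetch_len_async<F, Fut>(read: F) -> usize
    where
        F: Fn() -> Fut,
        Fut: Future<Output = Vec<u8>>,
    {
        let data = read().await; // may yield back to the executor
        data.len()
    }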
Stackful vs. stackless effectively refers to whether the implementation uses the same stack discipline for calls that can yield as for those that cannot. In either case you're always going to have to construct some kind of stack to support nested function invocation; the question is whether you're going to duplicate all that infrastructure.
Also, it helps if you don't conflate asynchronous I/O with coroutines. Coroutines are a meta abstraction over functions (an abstraction over call chains) that can be used to create ergonomic async I/O, but have other uses, like inverting producer/consumer caller/callee relationships (e.g. converting a push parser into a pull parser with a couple of lines of wrapper code). Stackful coroutines reuse the normal call stack discipline; stackless coroutines require function annotations and compiler rewriting and lead to the code reuse problems mentioned above.
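As a rough illustration of that inversion (not how a coroutine library would actually do it): a thread plus a channel can stand in for a stackful coroutine and turn a made-up push-style `push_parse` callback API into a pull-style iterator. A real coroutine would do the same job without spawning an OS thread.

    use std::sync::mpsc;
    use std::thread;

    // Made-up push-style API: it drives the callback; the caller cannot "pull".
    fn push_parse(input: &str, mut emit: impl FnMut(&str)) {
        for token in input.split_whitespace() {
            emit(token);
        }
    }

    // Wrapper that inverts it into a pull-style iterator.
    fn pull_tokens(input: String) -> impl Iterator<Item = String> {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            push_parse(&input, |tok| {
                let _ = tx.send(tok.to_string());
            });
        });
        rx.into_iter()
    }

    fn main() {
        for tok in pull_tokens("now a pull parser".to_string()) {
            println!("{}", tok);
        }
    }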
Stackful coroutines are actually a perfect fit for Rust's ownership model, and would simplify much of the work, or obviate it altogether. But for other reasons--C compatibility, poor OS constructs for minimizing memory use, and an unfortunate early conflation of coroutines with async I/O--Rust has chosen the path of stackless coroutines a la async/await as the official model.
Signal's use of SGX was to increase users' trust in Signal - OpenWhisperSystems couldn't log your contacts even if they wanted to. It's up to potential users to decide if they think OpenWhisperSystems will surreptitiously perform an attack on their own secure enclave to secretly log your contact information.
> The major difficulty here is that in general, unsafe pieces of code cannot be safely composed, even if the unsafe pieces of code are individually safe. This allows you to bypass runtime safety checks without unsafe code just by composing "safe" modules that internally use unsafe code in their implementation.
This comment suggests you don't have much domain knowledge about how `unsafe` in Rust works, so I'm surprised you speak with such confidence. Your comment is flatly wrong: users writing only safe code are not responsible for guaranteeing the composed safety of the components they use (whether or not those components are implemented with unsafe code).
Interfaces marked safe must uphold Rust's safety guarantees, or they are incorrect. They are just wrong if they have additional untyped invariants that need to be maintained to guarantee their safety; interfaces like this must be marked `unsafe`.
Because they cannot depend on untyped invariants, any correct implementation with a safe interface can be composed with any other. This ability to create safe abstractions over unsafe code which extend the reasoning ability of the type system is a fundamental value proposition of Rust.
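The standard library itself shows the split. "These bytes are valid UTF-8" is an untyped invariant, so the constructor that trusts the caller with it is marked `unsafe`, while the safe constructor checks it and can therefore be composed freely:

    fn main() {
        let bytes = vec![0xF0, 0x9F, 0xA6, 0x80]; // UTF-8 encoding of U+1F980

        // Safe interface: validates the invariant, returns an error on bad input.
        let checked = String::from_utf8(bytes.clone()).expect("valid UTF-8");

        // Unsafe interface: the caller promises the invariant holds.
        let unchecked = unsafe { String::from_utf8_unchecked(bytes) };

        assert_eq!(checked, unchecked);
    }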
> This comment suggests you don't have much domain knowledge about how `unsafe` in Rust works, so I'm surprised you speak with such confidence.
I hate being tone police, but jeez, we're having a discussion about Rust here and talking about my personal competency is inappropriate and unwelcome.
The problem I'm talking about happens when you write libraries that contain "unsafe" blocks. You want to prove (or at least assure yourself) that no unsafe behavior is observable by clients of the library. However, the way to do this is not entirely clear, although there is research being done in this area. One known trap is that it is not sufficient to demonstrate that Rust code without "unsafe" blocks cannot observe unsafe behavior in your library.
Proving the correctness of unsafe code is totally different from what you talked about, which was composing different abstractions with unsafe internals together.
Users of safe Rust do not need to worry about whether the composition of two safe interfaces that use unsafe internally is safe, unless one of those interfaces is incorrect. Your comment would suggest that users need to think about the untyped invariants of each library they use, but this is not correct: libraries are not allowed to rely on untyped invariants for the correctness of their safe APIs.
The problem with talking about this subject is that "safe" and "unsafe" are overloaded terms in Rust, so I can understand why you think I was talking about something different.
Let R be arbitrary Rust code with no "unsafe" blocks. Let X and Y be libraries with "unsafe" blocks. You can prove that R + X is safe, and prove that R + Y is safe, but you haven't yet proven R + X + Y is safe. This is the hard part, because without an understanding of what property of X and Y individually makes R + X + Y + Z + ... safe, we don't have a good definition for what makes an interface "safe".
And this is what I mean when I say that this is not only a pedagogical problem.
You have restated your position, but it is still incorrect in the context of this discussion. Even your original statement of "R be[ing] arbitrary Rust code with no 'unsafe' blocks" is problematic: any Rust code is unavoidably built upon a foundation of unsafe code. It has to be, because it's running on an "unsafe" processor. And yet any safe Rust code in the core library (barring a soundness bug) is obviously safely composable with any other safe Rust code, precisely because it upholds the safety guarantees when transitioning from unsafe to safe. The fact that you can mistakenly conceive of Rust code that somehow avoids any internal unsafety simply reinforces how obvious this simple fact is.
But using your original problem statement, if R is safe and X and Y use unsafe code but do not expose any unsafe interfaces, then either R + X + Y is safe or one of [X, Y] has a safety bug and is inaccurately marking an unsafe interface as safe.
This is a generally unsolvable problem, and every other language has this problem as well; the difference is that in most other languages you're typically forced to write the unsafe code in C (where one has a much greater variety of footguns at their disposal). If I write a Ruby FFI wrapper for buggy C code whose interfaces bleed "unsafe" (from the perspective of the Ruby VM) behavior, then I am liable to experience crashes and memory corruption bugs. The only difference here is that Rust allows you to break the seal on the warranty without switching to a different language.
> ...is obviously safely composable with any other safe Rust code precisely because it obeys safety guarantees when transitioning from unsafe to safe.
And what are those safety guarantees? This is the part where I see a lot of handwaving.
> ...either R + X + Y is safe or one of [X, Y] has a safety bug and is inaccurately marking an unsafe interface as safe.
Correct, but the problem is that we don't have a way to identify which library is incorrect without a definition for what a "safe interface" is. If R + X were unsafe or R + Y were unsafe we would have an easy answer to that question.
> This is a generally unsolvable problem...
The fact that the problem is unsolvable in general did not stop people from inventing the Rust language in the first place. The point of Rust is to solve this problem for a larger and more useful class of programs. Likewise, the research into defining what a "safe interface" is in Rust is important and useful research, e.g., RustBelt.
On a minor note, these kind of negative interactions with individual Rust community members have given me a bad impression of the Rust community as a whole.
> And what are those safety guarantees? This is the part where I see a lot of handwaving.
I think this is the contention: correct me if I'm wrong, but you're saying that, in practice, the safety guarantees of Rust are currently too nebulous to be enforced reliably, whereas most other people in this thread are, I think, visualising the "platonic Rust"/post-RustBelt Rust, where the currently vague conditions for safety have been tweaked as needed and proved correct, treating the current situation more like "just a bug" (and the success of RustBelt so far hints that this isn't vapourware/imagination; there's significant concrete progress towards it).
That is to say, most people are talking about the potential of Rust's safety, whereas you're talking about the reality, right now. I think both positions are reasonable to think about, but it obviously leads to confusion when the positions aren't distinguished in a discussion. (I also think that most people would agree with you about Rust right now: there isn't a definite set of safety rules, so it can be hard to work out whether "edge-cases" are correct or not.)
It is true that we are still working this out; this is what we're cooperating with academia on, formalizing the exact semantics. Such things take time.
> One known trap is that it is not sufficient to demonstrate that Rust code without "unsafe" blocks cannot observe unsafe behavior in your library.
I'm curious: what does this mean/could you point me to the part of the paper that describes it? (Unfortunately, I don't have time to read all 34 pages at the moment.)
I'm not convinced that the statement in the paper translates into what you said: the key piece of that paragraph is "or seems to be". The Leakpocalypse problem was that one piece of code (crossbeam's scoped threads API) was relying on an invariant that doesn't actually hold ("destructors will always run"). It was, fundamentally, a bug in the `unsafe` code in crossbeam, meaning it was incorrect for crossbeam to call its API safe: the fact that it took multiple libraries to trigger in that case means nothing; those just happen to be the circumstances under which the problem was noticed.
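A minimal sketch of why "destructors will always run" is not an invariant safe code is required to uphold: `std::mem::forget` is a safe function, so any API whose memory safety depends on a destructor running (as crossbeam's did) can be broken from entirely safe code.

    use std::mem;

    struct Guard;

    impl Drop for Guard {
        fn drop(&mut self) {
            println!("cleanup ran"); // whatever safety-critical teardown was assumed
        }
    }

    fn main() {
        let g = Guard;
        mem::forget(g); // no `unsafe` needed; the destructor never runs
    }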
Of course, to be fair, no-one had thought about this destructor property before, just implicitly relied on it, and so it does demonstrate the necessity for better understanding of/tools for unsafe code, which is what projects like RustBelt are pushing towards.
To summarise, I still don't see how these two sentences are different:
> no unsafe behavior is observable by clients of the library
> [clients] without "unsafe" blocks cannot observe unsafe behavior in [the] library
Indeed, I don't think it makes sense to even attempt to prove that clients with unsafe code can't observe unsafe behaviour (which seems to be the only way for the second sentence to differ from the first). The typical framing is that the safe code can be arbitrarily bad and there'll still be no undefined behaviour, but arbitrary `unsafe` can do anything, including writing directly to another library's data structures, which of course can easily cause UB (e.g. replace a Vec's data pointer with a null one).
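For concreteness, a deliberately broken (do-not-run) sketch of what "arbitrary `unsafe` can do anything" means: an unsafe client can lie to `Vec` about its own invariants, and the resulting read is undefined behaviour even though `Vec` itself is correct.

    fn main() {
        let mut v: Vec<u32> = Vec::new();
        unsafe {
            // Violates Vec's invariant that `len` only covers initialized capacity.
            v.set_len(1);
        }
        println!("{}", v[0]); // undefined behaviour
    }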
> To summarise, I still don't see how these two sentences are different: ...
To "observe unsafe behavior" means I can write a program that does something safe, e.g., a data race or invalid memory access. It's possible to write library X and Y in such a way that I can observe unsafe behavior using both X and Y in my program, without putting "unsafe" blocks in my program. This is possible even if I can't do the same thing with either X or Y alone.
This is surprising, because it means that the naive definition of "safe interface" is not actually safe enough!
I'm still not understanding: other than the hardware/global state thing in another comment in this thread[1], what's a program that demonstrates this "composing safe interfaces is unsafe" property? The example in the paper is not one, it was a bug for crossbeam to mark its API safe.
[1]: I'm ignoring this case, because it's essentially impossible to solve: there's no way Rust (or any language) can control this situation. And there's a strong argument in my mind that this sort of scenario should have an `unsafe` constructor or something, to act as an assertion from the programmer that they're guaranteeing unique access to the resource.
There is a way. You have to meticulously prove operations down to machine code to not have externally observable side effects.
You can weaken the condition by excluding, say, timing effects or cacheline effects. (Say hello to Spectre)
This means you get to prove bounded access and data race freedom on any piece of memory that safe code touches. Likewise, prove bounded access for all unsafe code and correct CPU flag and state handling.
It's not as bad as it seems - you can use the machine code prover designed for seL4 as a good starting point.
The definition of what you're allowed to do with `unsafe`, however nebulous its specifics may be at the moment, is that such a situation is a bug in one or both of those libraries, not their composition.
To put it another way, if you can't observe unsafety with X or Y alone, but you can with both together, then at least one of them has given you a new capability that you did not have before. Either that new capability is not truly safe, and thus the bug is providing that capability, or it exposes the other library relying on something not truly safe, and thus the bug is relying on that property.
The important point here is that, by definition, at least one of X or Y will have to change when such a situation is discovered, in order to preserve the property that composing safe interfaces is safe.
Correct me if I'm wrong, but I think GP was stating that two "unsafe" blocks composed together (both of which are manually verified to work well on their own) might interfere with each other when run simultaneously.
'unsafe' doesn't mean unsafe. Unsafe means "I can't convince the compiler that this is safe. But in my context, it is."
If there is any way in which a function containing an `unsafe` block may be used unsafely (specifically, violating memory-safety), then that function must also be marked as unsafe.
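A small sketch of that rule, with made-up names: the inner function can cause UB for an out-of-range index, so it must itself be `unsafe`; the outer function checks the precondition and may be exposed as safe.

    /// Safety: the caller must guarantee `i < v.len()`.
    unsafe fn index_unchecked(v: &[u32], i: usize) -> u32 {
        // Safety: caller guarantees i is in bounds.
        unsafe { *v.get_unchecked(i) }
    }

    fn first_or_zero(v: &[u32]) -> u32 {
        if v.is_empty() {
            0
        } else {
            // The invariant was just checked, so this is sound to expose as safe.
            unsafe { index_unchecked(v, 0) }
        }
    }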
That's what it means if you use it properly. If you write bad code, it means "this code will break everything and the compiler won't protect you." An `unsafe` block does nothing to guarantee that you're doing something safe, which is what you seem to be saying, even if it's not what you mean to say.
This is pretty much tautological, and nobody's arguing this point. However, you have the benefit of being able to narrow your search scope to the parts of your code marked `unsafe` instead of the entire project.
Most things don't need unsafe code. For the things that do, you must yourself uphold the invariant that all requirements of safety are being obeyed when transitioning out of an unsafe block. If you don't do this, bad things can happen. Other languages don't have this because they either don't offer Rust's safety guarantees in the first place, or the only way to circumvent them is to write code in C.
> However, you have the benefit of being able to narrow your search scope to the parts of your code marked `unsafe` instead of the entire project.
I may be missing some context, but this is certainly not true in Rust. In order to understand whether an individual piece of code marked `unsafe` is actually correct, you need to examine the context in which it is run and in general you could have to examine a large section of "safe" code in order to figure out whether the "unsafe" block is correct. Usually you will have to examine the entire module.
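A sketch of why the whole module is the unit of review (names made up): the `unsafe` block below is only correct if every safe method in the module maintains the `len <= data.len()` invariant, and the safe `grow_len` silently breaks it without containing any `unsafe` of its own.

    pub struct Buf {
        data: Vec<u8>,
        len: usize, // invariant: len <= data.len()
    }

    impl Buf {
        pub fn new(data: Vec<u8>) -> Buf {
            let len = data.len();
            Buf { data, len }
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // Relies on the module-wide invariant len <= data.len().
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }

        // A "safe" method that breaks the invariant and makes `last` UB.
        pub fn grow_len(&mut self) {
            self.len += 1;
        }
    }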
Think of two libraries that use unsafe Rust and interact with the same hardware, but work correctly when used on their own.
A program written only in pure, not-unsafe Rust might use these two libraries in a way that breaks, because the assumptions the programmers of the libraries made, for example having exclusive access to the hardware, no longer hold.
One could argue the pure not-unsafe Rust program is wrong, not the libraries.
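A hedged sketch of that scenario, with a `static mut` standing in for a memory-mapped register: each module is arguably fine under its own "I am the only user of the device" assumption, yet a purely safe caller can combine them and invalidate both assumptions.

    // Stand-in for a hardware register.
    static mut REGISTER: u32 = 0;

    mod lib_x {
        pub fn start_transfer() {
            // Sound *if* lib_x has exclusive access to the device.
            unsafe { super::REGISTER = 1 };
        }
    }

    mod lib_y {
        pub fn reset_device() {
            // Also assumes exclusive access.
            unsafe { super::REGISTER = 0 };
        }
    }

    fn main() {
        // Entirely safe code, but it breaks the assumption both libraries made.
        lib_x::start_transfer();
        lib_y::reset_device();
    }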
I think klodolph's comment is very thoughtful and shows a good deal of experience and domain knowledge.
There is a conflation happening here. What is the nature of this bug when you compose these two libraries together?
If it is a violation of Rust's safety guarantees, then at least one of those libraries has a bug: it exposes as safe an abstraction which is not actually safe. One could not argue that the safe Rust program is wrong; the library exposing an unsafe interface as safe is unarguably wrong.
If the library just behaves incorrectly in a manner disconnected from the type system because some global state was changed in a way it doesn't expect ("the hardware" in this case), then that's a normal bug & it is not connected to unsafe code at all.
Yes, we agree about this point. However, the process for determining if these bugs exist is not well understood. That's what I mean when I say that this is not only a pedagogical problem--even Rust experts struggle to prove that a library containing "unsafe" blocks is safe, and more research into the area is needed.
My apologies if I misunderstood you - I read your comment as suggesting that safe abstractions are "leaky" and therefore create additional responsibilities for users to validate that they are using them safely when composing them together. This is not the case unless those abstractions are incorrect - which is the same situation you're in with any language; it's just that in most languages those abstractions exist within the language runtime and not in libraries.
- We are directing material resources (as in, hours of paid work) to improving compiler performance. We consider this a serious problem, and "the compiler performance is always improving" is not an accurate gloss of the amount of work people are putting into solving it.
- We would love to have a REPL but there are nontrivial technical challenges, and it has not been a major requested feature by our users.
- I have never heard the complaint about the size of packages before now; I have no idea how we compare to other languages in this regard.
I've witnessed some of the conversations you're talking about (I think I engaged with you about REPLs in the past), and I think the "true-believer syndrome" has more to do with how you interpret the answers you receive than with the behavior of anyone else.
Only listening to current users for feature requests is selection bias.
I'm not currently using Rust, and the lack of a REPL is a big strike against it. Not as much of a strike as the lack of a good stable async story, but still.
What we could do better than we do today - and your comment about ergonomics alludes to our work to improve this - is ease the onboarding of that complexity we have to be "honest" about. It's a design constraint of Rust that it must maximize user control, but that does not imply that users have to be faced with all of those choices as soon as they first try to write Rust. In some respects I think this article is an attempt at a counterargument to that work (suggesting that it is "dishonest"), but I fundamentally do not believe that there is a contradiction between giving advanced users control and making it easier for new users to write correct code before they have a full understanding of the entire system.
Indeed, I used the term "ergonomics" specifically as such an allusion.
But I am not sure I read the article as claiming that such efforts are "dishonest", I instead see it as a reference to some other languages and runtimes that run up technical debt for the sake of easy living in the short term.
I would like a language that has knobs (e.g. file-level pragmas) for strictness, which you could turn all to one side (to get something as strict as Rust) or all the way to the other side (to get something like Ruby) or somewhere in between.
For example, a REPL would be pretty non-strict: When you define a function, you don't have to annotate types on the arguments, and everything gets passed around as generic objects. However, the standard library would be pretty strict [1], so in your unstrict REPL code, types are checked (and implicit clones are made to satisfy the borrow checker) as soon as you call into the strict code.
[1] In fact, the package repository for this hypothetical language (the analog to crates.io etc.) should only accept strict code, or alternatively put HUGE warning signs on libraries that contain unstrict code.
For the record, in PL jargon, "strictness" commonly refers to how soon the arguments in a function call are to be scheduled for evaluation. Using it to talk about the extent of static correctness checks can be slightly confusing!
Do you stay away from all type systems? This would be an accurate negative framing for any type system, which all necessarily reject some correct programs.
Not really. It's just that it's super difficult to understand exactly how the borrow checker operates and when it will allow a program to compile, causing all kinds of unpredictable situations and, unless you are at "black belt" level, difficulty providing any estimates. Not to mention the loss of morale when you have to fight it every single day.
This is not my experience using Rust. I had six months of programming experience when I first tried Rust. I definitely got confusing errors at first, but I grasped the system within a month or two. And that was in 2014 - the borrow checker, and especially its errors, have improved a lot since then!
But this is also a very different objection from your initial objection, which was just a statement of fact about type systems for turing complete languages.
Not at all; currently I am able to write non-trivial programs in >60 languages, ranging from low-level distributed transactions (I've done my own Paxos already) through 3D visualizations, business software, mobile apps, Deep Learning models and AI, and system software, all the way to advanced ETL pipelines, using imperative, functional, logic, coroutine-based, generic, reactive, declarative, etc. concepts, and I am always on the lookout for a better language as I can't find the perfect one ;-) So I want to understand where Rust stands and whether it is a meaningful investment to learn it, both programming "pleasure"-wise and business-sense-wise.
Is it? Could you give an example? I've found a borrow checker bug in the current release but the rest of the time its behavior is perfectly predictable. Granted, I wouldn't be surprised if the documentation was horrible, my understanding of the checker is based on one way I would do it.
It's simply one more thing you need to keep in your internal "context" while writing a program, taking mental resources you could spend elsewhere. Perhaps an IDE guiding you and outright rejecting code, or offering code completion that is compatible with the borrow checker, might be a solution, though I believe we aren't there yet.
You have to keep the kind of problems the borrow checker nags you about in your mind anyway. Memory errors and concurrency problems don't go away because the compiler doesn't complain about them. Personally, I find it easier to use Rust than C++, because I don't have to worry so much about iterator invalidation and stuff like that.
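For example, the classic iterator-invalidation pattern that compiles (and may corrupt memory) in C++ is rejected outright by the Rust compiler; this snippet intentionally fails to build:

    fn main() {
        let mut v = vec![1, 2, 3];
        for x in &v {
            if *x == 2 {
                v.push(4); // error[E0502]: cannot borrow `v` as mutable
            }              // because it is also borrowed as immutable
        }
    }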
Experienced Rust programmers usually say the opposite: the borrow checker frees you from thinking about it since the compiler lets you know when you mess up.
Yes, and sometimes I wish C++ stayed somewhere in pre C99 levels where a single person could master it. As much as new features are useful, they make codebases unreadable to anyone that doesn't grasp all concepts. Even Google internally "javaizes" C++ and uses a strict subset to keep some sanity. Scala is another language that can go insane in the same fashion if teams don't enforce strict rules.
There's another reason Rust's ecosystem uses futures (as opposed to having some library-based greenthreading system): each future is like the stack of a userspace thread, but perfectly sized (it will be as large as the largest stack space needed at a yield point). This reduces the memory footprint of services using futures.
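A small sketch of that point: the future returned by an `async fn` is a plain value whose size is fixed at compile time, roughly the largest set of locals live across any `.await`, so no separate stack has to be allocated or guessed at. (The exact size printed depends on the compiler version.)

    async fn work() -> u64 {
        let buf = [0u8; 64];           // lives across the await, so it becomes
        std::future::ready(()).await;  // part of the future's state
        buf.iter().map(|&b| b as u64).sum()
    }

    fn main() {
        let fut = work();
        // Typically a little over 64 bytes here; no thread stack is reserved.
        println!("future size: {} bytes", std::mem::size_of_val(&fut));
    }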