Safety vs. Performance. A case study of C, C++ and Rust sort implementations (github.com/voultapher)
172 points by bubblehack3r on Oct 5, 2023 | 130 comments


Author here, I've spent the last 1.5ish years researching sort implementations. While developing a test suite and understanding prior art, I've accumulated a list of results and properties I thought would be insightful to share. Feel free to ask me questions.


Glad to see C doing so well at what it's best at - efficiency and size. I worked in the embedded space for years. Rust does not suit that environment.

Different tools for different needs!

I appreciate this, as looking for better tools in the embedded space is welcome. It's just a pity that so many of them come with so many dependencies and such large library sizes.


It actually does suit it. You can build Rust without the standard library (no_std), reducing its size significantly. There are also tons of tricks to make this work really well, so it's very close to C performance.

https://esp-rs.github.io/book/ https://github.com/avr-rust/ruduino
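
As a rough illustration of what "without stdlib" means in practice, here is a minimal bare-metal skeleton (a sketch only; the exact entry-point symbol and panic behavior depend on the target and runtime crate you use):

    // A minimal no_std program: only `core` is available, no heap, no OS.
    #![no_std]
    #![no_main]

    use core::panic::PanicInfo;

    // With no OS to unwind to, the program must define what a panic does.
    #[panic_handler]
    fn panic(_info: &PanicInfo) -> ! {
        loop {}
    }

    // Entry point exported with a C-compatible symbol; on real hardware this
    // is usually provided by a runtime crate such as cortex-m-rt instead.
    #[no_mangle]
    pub extern "C" fn _start() -> ! {
        loop {}
    }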


C of course generally does not require as many of these tricks, which is why people reach for it first.


It's not much of a trick in the case of Rust. Rust has a "std" library that I guess is the equivalent of "libc", but is far more comprehensive.

Rust also has a "core" library which is the absolute minimum the compiler assumes. It's well defined, well documented and how you use it is standardised. All C implementations have a similar library. For gcc you get it with -lgcc. But it is not "well defined, well documented, with standardised usage".

So I'd disagree with you. I've done it, and avoiding libc in different C implementations is trickier than in Rust.


C also has decades of maturity behind it. Rust will get there.


There are many situations where C libraries are amazingly small. This however is not one of those cases. ipnsort is aggressively optimized for binary-size. A sort instantiation for u64 is ~4.5k while crumsort was ~53k last time I measured.

I know several easy ways to boost ipnsort's performance by 10% but don't because they don't align with my binary-size and compile-time goals.


Honestly, Rust is a lot closer to ready in the embedded space than you probably think. It's perfectly adequate to replace most of the embedded C++ out there today and coexist with the remaining C.


I kept hearing Rust was ready for embedded pretty early on (like 2018), and I fooled with it for a bit in 2021 and it definitely was not, on the platform I chose at least. Updating the compiler version (a minor update) broke existing code, and I was relying on one guy's hobby project to support the relatively popular (though fading) MCU I pulled out of my parts bin.

What about rust today makes it suitable for replacing some code but not all of it? What’s better, what still isn’t there yet?

I guess from my limited experience, register-fiddling ergonomics in Rust were miserable, but I was blessed with an extremely safe and ergonomic set of C macros at work (you could do something like rmw(i2s, clockconfig, enable, set) and know, if it compiled, that such a value corresponded to a valid value in a field that existed in such a register in such a peripheral). I know some vendors provided C “pac” equivalents that were pretty sloppy and error prone, if still nicer than all the punctuation needed to set a bit in a register in Rust. How is HAL quality for popular platforms? How much does that matter? The STM-provided C HAL is extremely limiting in my experience, anything fancy requires bypassing it, and I worked places without ever touching the vendor-provided libraries, but I guess having it as an option for popular platforms is important.


I'm definitely not the best person to answer this, but honestly it's not bad. Here's an example of a moderately complex peripheral, the cortex-m MPU, and how one rust OS handles it:

https://github.com/tock/tock/blob/3a0527d586702b8ae8cb242391...

Reads and writes turn into volatile reads and writes, so everything works out under the hood. You get the benefits of everything having good names, declared sizes, and proper typing on your register accesses. You can extend that to bit accesses as well.
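
To make that concrete, here is a minimal sketch of the pattern (the register address, name, and bit position are invented for illustration): a named, typed wrapper still compiles down to plain volatile accesses.

    use core::ptr::{read_volatile, write_volatile};

    // Hypothetical memory-mapped register; address and field are made up.
    const CLOCK_CONFIG: *mut u32 = 0x4000_0000 as *mut u32;
    const ENABLE_BIT: u32 = 1 << 0;

    fn enable_clock() {
        // SAFETY: assumes CLOCK_CONFIG points at a valid, mapped device register.
        unsafe {
            let val = read_volatile(CLOCK_CONFIG);
            write_volatile(CLOCK_CONFIG, val | ENABLE_BIT);
        }
    }

Crates like tock-registers or svd2rust-generated PACs wrap exactly this pattern in named, typed fields, so invalid register/field combinations fail to compile.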

Rust still has a few areas it isn't competitive in, like hyper-limited or obscure chips (e.g. 8051s, XAP), mature tooling around formal methods, and a certification story for safety-critical code. People are working on the latter two issues (e.g. Ferrocene) and are supposedly very close to public delivery, but you know how slow the industry is to adopt new things even then.


> Updating compiler version (minor update) broke existing code

Could you elaborate on that? That's not something that's supposed to happen and if it has, I would like to make sure we register it in our issue tracker.


I don't really remember, it's approaching 3 years now, but the target was msp430-none-elf, and some function I had to implement had its signature change? abort! I guess? I don't think it was accidental or unknown, I think I found some conversation on a mailing list when I went to look for it.


Panic changed quite a bit in no_std around 2018 [0], perhaps that's what you're remembering? You could only see that issue on nightly though, and at that point no_std / embedded rust was very much a work in progress that was limited to nightly no matter what random people were saying.

[0] https://users.rust-lang.org/t/psa-breaking-change-panic-fmt-...


I think this was the issue, but it wasn't 2018 when I ran into it (looking at some notes it seems like 2020, not 2021). Perhaps it was slow to propagate to the tier 3 supported toolchain I was using? Somehow I got an old version? Idk, it was something I spent a few hours on years ago. I was on nightly; I had to be for msp430 iirc.


Wow, you really are vigilant and omnipresent. Thank you for the DX!


For what it's worth, there's bare metal Rust in orbit right now (https://www.gamaspace.com/). We'll see how that goes, but I'd like to see it work!


> I worked in embedded space for years. Rust does not suit that environment.

It seems to be getting better though. For example... https://tweedegolf.nl/en/blog/65/async-rust-vs-rtos-showdown


I consider embedded to be a core competency of Rust. It's one of a few languages that can run on these devices, and compared to the others, it is a nicer experience. The language itself, the tooling, and the compile/flash/debug process make it appealing in this domain.

Support outside of ARM is a bit rough, but it's improving for RISC-V and Espressif.


Two notes from a C++ user in gamedev. Please make a realistic ValWithPtr implementation for C++. Please implement comparison inlining in C and C++ where the algorithm API allows it.


The C and C++ code pulls in all thirdparty code as header only, making for one single translation unit, and the compiler is able to do comparison inlining for the C++ implementations. I've written a bit about this previously [here](https://github.com/Voultapher/sort-research-rs/blob/main/wri...).

Regarding `ValWithPtr` my goal was to make it as close semantically as I could to the Rust code while keeping the example to a minimum to avoid distracting from the main point. If you have a concrete idea how `ValWithPtr` could have been modeled better given these criteria, please let me know.


C and C++ are often used to achieve max performance. For max performance devs have to guarantee inlining. Which means programmers intentionally choose bigger code size and instantiation of sort algorithm code for every type and comparison combination. Function pointers to comparison operators are not relevant for max performance part of the community.
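
For illustration, here is a sketch of the two styles in Rust (function names are invented): a generic comparator is monomorphized per comparator type and can be inlined, while a function-pointer comparator generally cannot, mirroring the qsort-style situation described above.

    use std::cmp::Ordering;

    // Generic: a separate copy is instantiated per comparator type, so the
    // comparison can be inlined, at the cost of code size and compile time.
    fn sort_generic<T, F: FnMut(&T, &T) -> Ordering>(v: &mut [T], mut cmp: F) {
        v.sort_unstable_by(|a, b| cmp(a, b));
    }

    // Function pointer: one instantiation per element type, with an indirect
    // call per comparison that the compiler usually cannot inline.
    fn sort_fn_ptr<T>(v: &mut [T], cmp: fn(&T, &T) -> Ordering) {
        v.sort_unstable_by(cmp);
    }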

For ValWithPtr to make sense it needs its semantics defined. Correct C++ types can be deep copying, move only, reference counted, singleton instance etc. For example typed deep copy version without attempting any verification (link to compiler explorer): https://godbolt.org/z/3sesYY8of

If it is not helpful for you please feel free to ignore all of the above.


I’ll allow that I’m entirely misunderstanding the goal, but why is it important to stick close to the semantic meaning of rust code?

How does this differ from implementing GC in C++ then showing how much better Java does?


I'll try to explain it. I want to demonstrate that something can be used incorrectly and cause issues. To reach a wider audience and make the claims comparable, I wanted to show the same logic in C++ and Rust. For ValWithPtr there was pretty much only one way to do it in Rust, so I used that as the baseline and tried to mirror the semantics and layout of that type. Does that make sense?


Not really. It comes across as an extremely contrived example, motivated by a desire to show one language is better, and misusing the other language in order to arrive at that goal. In other words, typical tiresome language evangelism. I think your abuse of "mutable" is about on the same level of someone writing bogus code in an "unsafe" block and then complaining that his foot is gone and Rust did nothing to prevent it.

No one sane would mutate elements, free memory or throw exceptions in a sort comparison function. To demonstrate fairness, you therefore need to at least also show how you could use Rust "unsafe" to break the sort, which is also a language feature you probably wouldn't want to use in this context.


> No one sane would mutate elements, free memory or throw exceptions in a sort comparison function.

No one sane would intentionally do so, but we are not talking about that. The OP has a section that links to a blog post by Danila Kutenin, which mentions `std::sort` often just segfaults if a comparator doesn't satisfy irreflexivity or asymmetry, yet such comparators were found in the wild, even passing all reviews and even tests (!) because this behavior greatly depends on the number of elements. Given this occurrence of the actual bug in user comparators, it should be no surprise that they may also contain mutations, memory deallocations and exception throws (all of which can easily be a side effect from internal routines).
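
For illustration (not the exact code from the blog post), the classic form of this bug is using less-or-equal where a strict less-than is required, which breaks asymmetry for equal elements:

    use std::cmp::Ordering;

    fn main() {
        let mut v = vec![1, 3, 3, 2, 3, 1];
        // Buggy: equal elements compare as Less in both directions, so this
        // comparator is not a strict weak ordering.
        v.sort_unstable_by(|a, b| if a <= b { Ordering::Less } else { Ordering::Greater });
        println!("{v:?}");
    }

With a comparator like this, C++ std::sort can read past the end of the array and crash depending on the input size, while the Rust implementations discussed here stay memory-safe (and may detect the violation and panic).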


It is a surprise. I have seen and written buggy comparators that violated strict weak ordering (these usually involve floating points...). But I have never seen mutating or throwing comparators.

It is great that Rust can prevent those additional issues, but in the grand scheme of things it is not something I lose sleep over.


TBH I find it difficult to understand:

1. How do you compile the C++ code? E.g. what flags do you use with GCC, clang and MSVC

2. How exactly is C++ binary run through a Rust benchmark suite? FFI?

3. Is it possible for rust code-under-test to be at any advantage here because it is run and built natively from the benchmark written in the rust itself?

What I find questionable though is the actual C++ code found in the benchmark:

1. Exception-handling code around std::sort and std::stable_sort - nobody does that. What problem are you trying to solve with this? Comparators do not throw exceptions.

2. Using function pointers for the comparators - surprising to see such code in C++ benchmark - it's very non-idiomatic and essentially making it impossible for a compiler to inline the code. std::sort is rarely used like that.

3. Passing over some magical third argument to the comparator function - ?

4. Passing over the context to the comparator "function" - I fail to see if "ctx" has been used anywhere?

5. "Making" the comparator function with make_compare_fn which in turn instantiates a lambda that, again, throws an exception from its body.

6. Storing a comparison result - why? You only need to return true or false from a comparator.

7. Modeling rust "panics" by storing a boolean is_panic plus exception-handling code plus throwing exceptions when is_panic is "ON" - why?

8. Confusing exceptions with rust panics - even if you wanted to do so, which still would be an arguable thing to do, std::abort or std::terminate is a replacement for std::panic. Not throwing exceptions and implementing the is_panic logic around it.

9. What is https://github.com/Voultapher/sort-research-rs/blob/main/src... being used for?

I think you're presenting things a bit dishonestly here for the reasons above. And I have not delved very much into depth, nor have I covered all the code, just a fragment of it. Also, it is not quite clear how everything is put together and run because, ideally, you would want to have multiple binaries built with their representative toolchains/scripts _regardless_ of your benchmarking framework. And only then would I want to point the benchmarking framework to the respective binary to run the test. Here, it seems it's the other way around.


You claim yourself that "I have not delved very much into depth" yet you make a lot of assumptions here. Here are the key facts: the `_by` functions that contain the complex exception-handling stuff are only used for tests, and the benchmarked ones are as native as possible, e.g. `std::sort(data, data + len)`. I pick either gcc or clang based on what gives better performance, and use these settings with the cc crate https://github.com/Voultapher/sort-research-rs/blob/b131ed61... plus sort-specific options. It all gets statically linked into one big binary. Also note the C++ code is header-only, so one big translation unit per sort. The only disadvantage, based on me looking at the generated asm, is one additional function call versus possible inlining for the initial part of the sort. This adds a total of ~20 cycles or fewer to the total time, which in the benchmark is measured in microseconds, where 1 microsecond is ~4900 cycles.
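
For readers unfamiliar with that setup, this is roughly what building the C++ sorts through the cc crate looks like (a simplified sketch, not the repository's actual build.rs; file names and flags are illustrative):

    // build.rs (sketch): compile the header-only C++ sort as one translation
    // unit and statically link it into the benchmark binary.
    fn main() {
        cc::Build::new()
            .cpp(true)
            .file("src/cpp/cpp_pdqsort.cpp")
            .flag("-O3")
            .flag("-march=native")
            .compile("cpp_pdqsort");
        println!("cargo:rerun-if-changed=src/cpp/cpp_pdqsort.cpp");
    }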


And yet you ignore the majority of my questions above. It was honest feedback, but now I will bite and say that the report looks "good on paper" but the underlying code is low quality, very non-idiomatic, without explanations, and very difficult to understand in terms of what is being tested and how, the testing/benchmarking methodology, etc. Many important pieces are missing, and all of that raises the question of what the genuine intention of this experiment was - was it to actually find a bottleneck in implementations, if any, or was it something else? Now that you're avoiding addressing the feedback, it begins to become obvious.


I'm sure the author would accept pull requests if you've got the time to show ValWithPtr safety.


Why didn't you consider bitonic sort? It's also a comparison sort, and it's typically the fastest one.


Bitonic sort has a couple of properties that make it a poor fit for a generic sorting algorithm:

* The number of items being sorted must be a power of 2 (2^n).

* The number of comparisons it makes is larger than other sorting algorithms like merge sort, which will make it slower in non-parallel environments in many cases.

Bitonic sort has the advantage that it can be implemented to run in highly parallel environments (hardware, FPGAs, etc.), where the cost of comparisons is offset by them operating in parallel, so the sort completes faster even though more comparisons are occurring.


Bitonic sort is O(n log²(n)), which hardly makes a difference at normal sizes (benchmarks are on 10k, which is small).

The power of 2 requirement only applies to the core kernel.

All CPUs in usage today are parallel (in 3 to 4 different ways), and bitonic sort is the best performing sort on both amd64 and aarch64.


Adding onto what the other commenter said, these sort implementations are practically all hybrids, containing multiple sort algorithms. In fact ipnsort uses optimal sorting networks as part of its small-sort https://github.com/Voultapher/sort-research-rs/blob/bf90e294.... And for example vqsort benchmarked in another one of my writeups uses a bitonic merge network to combine smaller blocks created from optimal sorting networks.
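
For readers unfamiliar with sorting networks, here is a tiny sketch of the idea (simplified; real small-sort implementations typically use branchless compare-exchanges rather than this naive branching swap):

    // Optimal 5-comparison sorting network for 4 elements: the sequence of
    // compare-exchanges is fixed and independent of the data, which is what
    // makes networks friendly to superscalar CPUs and SIMD.
    fn sort4<T: Ord>(v: &mut [T; 4]) {
        fn cswap<T: Ord>(v: &mut [T; 4], a: usize, b: usize) {
            if v[b] < v[a] {
                v.swap(a, b);
            }
        }
        cswap(v, 0, 1);
        cswap(v, 2, 3);
        cswap(v, 0, 2);
        cswap(v, 1, 3);
        cswap(v, 1, 2);
    }

    // Usage: let mut a = [3, 1, 4, 2]; sort4(&mut a); assert_eq!(a, [1, 2, 3, 4]);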


None of those are particularly optimized to make the most of the hardware capabilities though.

The obvious thing is to use SIMD.


Sorry, but what exactly do you base that on? They are very much designed to exploit the out-of-order, speculative, superscalar nature of modern CPUs. vqsort is pure, well-done SIMD, and yet how do you explain that graph https://github.com/Voultapher/sort-research-rs/blob/main/wri... if your claims are true? Also, there are reasons why SIMD is not a good fit for every situation.


That's before the performance bugfix we talked about, right? (We re-initialized the RNG on every sort.)

Vqsort performance on M1 is indeed about half that of AVX-512. The NEON instruction set is missing some important operations for Quicksort, including Compress and popcount of vector masks.


Yes, like everything before the addendum, the results are for the same tested vqsort version without the bug fix, as I motivate in the section "Author's conclusion and opinion" at the end. I suspect though that even the new version will not outperform crumsort or ipnsort on this machine.


There are plenty of other SIMD sort libraries, some of which specifically optimized for AVX2, AVX512, NEON or SVE.


Vqsort is indeed using SIMD :)


I just looked at the link he gave and it wasn't.

Intel publishes its own fast library for this, which isn't even in the benchmark list.


Measurements (by the same author) for Intel's x86-simd-sort and vqsort are here: https://github.com/Voultapher/sort-research-rs/blob/main/wri...


Pretty impressive results on the new version of vqsort.


Thanks :) We fixed a performance bug (re-seeding an RNG on each sort).


> Often safety and performance are characterized as a set of zero sum tradeoffs, yet often it's possible to find better tradeoffs who's holistic properties improve upon a previously seen "either or".

There is truth in this, but I'm not sure whether the reader can/should extrapolate this to larger situations (not that the author implied we should, but it was my first interpretation).

We know that in certain situations, borrow checking works really well and allows us to guarantee safety with minimal friction and no overhead.

But there are other cases where safety and performance _are_ in contention, and we must choose one or the other. Anyone who has been forced to satisfy the borrow checker by using a .clone(), using Rc, or refactoring objects into a hash map and referring to them by an ID (which must be hashed to be exchanged for a reference), has felt this contention. In https://verdagon.dev/blog/myth-zero-overhead-memory-safety, I concluded that there's no general approach that always has zero overhead, at least not yet.
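
A small sketch of that last pattern (types and names are invented for illustration): entities live in a central store and point at each other with IDs rather than references, trading a hash lookup per hop for borrow-checker friendliness.

    use std::collections::HashMap;

    struct Entity {
        name: String,
        // An ID into the store instead of a `&Entity`, avoiding borrow conflicts.
        target: Option<u64>,
    }

    // Each hop costs a hash lookup where a plain reference would be a dereference.
    fn target_name(store: &HashMap<u64, Entity>, id: u64) -> Option<&str> {
        let target_id = store.get(&id)?.target?;
        Some(store.get(&target_id)?.name.as_str())
    }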

So perhaps the best interpretation from this study is that often, for small enough programs/areas, there is no conflict between safety and performance.

For larger programs with more complex requirements and data interrelationships, the question becomes much more interesting.

> I see no reason why a straight port from Rust to C++ wouldn't have been possible while satisfying their requirements.

Like the author, I also don't see a reason for this, but I've never tried myself. I've always thought that with the restrict keyword, one could make any C++ as performant as any Rust code. Perhaps something else got in the way there.


Honestly I think unsafe Rust is the solution here. Often you have an isolated data structure that is relatively simple and needs to break the borrow checker rules (without invoking UB such as multiple mutable references). And oftentimes a safe API can be written on top of that.

However the current state of this is absolutely terrible. A lot of work needs to be done to improve the development experience with this model if it is to be used effectively in this way.
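
The textbook instance of that pattern is std's slice::split_at_mut: a few lines of unsafe hidden behind an API that is safe to call. A simplified re-implementation for illustration:

    /// Split a mutable slice into two disjoint mutable halves.
    fn split_at_mut<T>(v: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
        assert!(mid <= v.len());
        let len = v.len();
        let ptr = v.as_mut_ptr();
        // SAFETY: the ranges [0, mid) and [mid, len) do not overlap, so no
        // aliasing mutable references are created.
        unsafe {
            (
                std::slice::from_raw_parts_mut(ptr, mid),
                std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }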


Sorry, but you are just assuming that the borrow checker is an authority on safety when, as even Rust language developers have stated, it is not.

The borrow checker is known to be far into “overly cautious” territory.


I made no assumption like that, though do let me know if I've said something that could be interpreted that way.

Rather, I think that the static analysis we see in today's languages just isn't powerful/flexible enough to reason about safety in a lot of the patterns that we know are safe. I'm also uncertain if it can _ever_ catch up to what we know to be safe, but I wouldn't be surprised if we get there in a few hundred years.

For example, borrow checking is a step forward and can guarantee safety, but does nothing about the other half of correctness, specifically liveness. [0]

Linear types (like in Austral [1] and Vale's higher RAII [2]) can help guarantee liveness, but we still have further to go.

Both are based on single-ownership (in the C++ sense) like Rust, which introduces errors that e.g. Haskell would not.

But even Haskell (and LiquidHaskell which has linear types) don't go far enough; Coq goes even further.

So yes, like you say, we have a long way to go w.r.t. correctness, even past the borrow checker though it is a big step forward.

To my original point though, even all of these tools put together will put restrictions on a program such that it sometimes won't be allowed to take the most optimal approach. Perhaps someday we'll get there!

[0] https://en.wikipedia.org/wiki/Safety_and_liveness_properties

[1] https://austral-lang.org/linear-types

[2] https://verdagon.dev/blog/higher-raii-7drl


> But there are other cases where safety and performance _are_ in contention, and we must choose one or the other. Anyone who has been forced to satisfy the borrow checker by using a .clone(), using Rc, or refactoring objects into a hash map and referred to them by an ID (that must be hashed to exchange with a reference), has felt this contention.

Yes. You literally did assume that the borrow checker is an authority, as seen here.

To be frank with you, given that Zig is 95% of the way there, I feel that you are “diving off the deep end” when stating things like “Haskell doesn’t go far enough”. Haskell's type system is a nice experiment, but I don’t believe it to be good in any capacity, let alone “not going far enough”.

Haskell is, in my opinion, a great case of “solving a problem before even asking what the problem really is”.


They obviously meant safety in the sense of "knowing the operation is safe because it is checked". You can tell that was the intention because they say "forced to satisfy", which indicates the stated operation is unnecessary but needed to appease an overly cautious checker. In addition, in the previous sentence they say "allows us to guarantee safety with minimal friction" to indicate they are talking about automatic safety guarantees, not the abstract concept of safety.


So what problem do you think Haskell thinks it's solving?


I haven’t the foggiest clue what problem Haskell is solving, because as far as I can tell, it doesn’t solve any problem particularly well. Hence “solving the problem before even asking what their problem actually is”.


"Designed for teaching, research, and industrial applications" - from wikipedia (2nd sentence)

Seems to have done pretty damn well IMO. Not that I've used it for 25 years, but I liked it and it introduced me to FP which totally changed how I thought of programming. I guess we have to differ on this.


Yes. It changed how you thought about programming *for the worse*.

Whereas I believe that concepts are tools for programmers to reach for when appropriate, functional programmers believe concepts are rules and reaching for them should be mandatory, no matter how much bullshit they force you to add for no reason other than you accepted from the get go that, for example, immutability should be mandatory.

This is a massive fundamental problem with Haskell and all languages that take hardline stances on things that are better left to the users. In this regard, I'd say it's a complete failure. It's horrible for teaching: you need to know more than you need for Java just to use it. It's horrible for research: it has hardline fundamental stances that reject exploration, and is therefore research averse. It is horrible for industrial applications, as there are massive ranges of industry that simply cannot bend to the whims of Haskell for one reason or another - but probably multiple reasons, because Haskell is terrible.


I'm pretty sure you don't have a clue or any significant experience, and too much arrogance to stop and question your own assumptions which you've just imposed on others including me, but carry on anyway.


> functional programmers believe concepts are rules

Based on what? It’s almost like “functional programmer” is not a single entity controlled by Big Haskell - you are just spewing bullshit about a made up boogeyman.


> I believe that concepts are tools for programmers to reach for when appropriate, functional programmers believe concepts are rules and reaching for them should be mandatory

Could you give some examples of such concepts? It's rather hard to understand what you mean in the abstract.


> "This is a massive fundamental problem with Haskell and all language that take hardline stances on things that are better left to the users".

Thank you. This is completely and utterly true.


I'm surprised to see that the author's ipnsort is not published on crates.io, even though it performs at first or second place on most of the benchmarks for unstable sorts and it passes all the safety and correctness criteria, all while also being able to deterministically panic to inform the user of a logic bug in their comparison function.

https://github.com/Voultapher/sort-research-rs/tree/main/ipn...


It's designed as the new `slice::sort_unstable`; stay tuned.


Is there a PR or internals discussion that we can follow?


No PR yet, still have to do some things before that. Sorry for being so vague.


>To me the Results across all tested implementations is indicative of a pervasive mindset in the C and C++ world, that argues it's the users responsibility to be careful, even if that has been proven impossible at scale.

I mean, even if the sort function is implemented in such a way that it's impossible to use it incorrectly, the user is still programming in C/++. Yes, all else being equal, the harder it is to introduce bugs the better, but if the user is not careful they will shoot themselves in the foot one way or another.


> that argues it's the users responsibility to be careful,

If you try to sort with a function that's not a valid comparison operator, I don't know what to tell you. What should it do?

Semantics cannot be validated at compile time. The best that can be done is to annotate the function as "yes this should have the right semantics" as is done in Rust and C++ concepts, but that's still not a guarantee.


> What should it do?

If I'm reading the article correctly, the user can be informed at runtime with little to no performance impact. That certainly sounds preferable to returning an unsorted list (that may not even match the input).

Almost every Rust sort does this, but no C/C++ sorts do. That appears to support the author's conclusion that this is a cultural thing.


> If I'm reading the article correctly, the user can be informed at runtime with little to no performance impact.

It's not possible to verify a function is a strict weak ordering without enumerating the entire domain.


I'm not sure enumerating the entire domain is even possible in this setting, as the comparison function could return random values for each call.

The idea is that the implementation doesn't promise it will detect a strict weak ordering violation, but it has code that will detect the effects of such violations with high probability. It's not a sampled approach; rather, as part of the small-sort a bi-directional merge is performed, and if the pointers don't line up, the only explanation is a strict weak ordering violation, so the result of the merge is discarded and a panic is raised.


Right, but restricting the domain to just the list you're already enumerating makes things easier.


Yep, so now your sort function isn't just a sort. It's a sort with a unit test at the beginning. This burden is placed on EVERY use.


There are still ways for a sort function to not work properly with the comparer, even if the criterion is correct. As mentioned in the article, the comparer might have internal state that the sorter duplicates during intermediate steps, thus breaking assumptions made by the caller. The example the article gives is a comparer that increments a variable each time it's called, to count the number of comparisons. Depending on how this is done and how the sorter is implemented, copying the comparer may break this behavior.
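
A Rust sketch of that comparison-counting example (simplified): the comparator carries state via a captured Cell, so an implementation that duplicated the comparator internally (as a C++ sort may copy its comparator object) could end up with a count that no longer means what the caller thinks.

    use std::cell::Cell;

    fn main() {
        let calls = Cell::new(0usize);
        let mut v = vec![5, 3, 1, 4, 2];
        // Stateful comparator: counts how often it is invoked.
        v.sort_by(|a, b| {
            calls.set(calls.get() + 1);
            a.cmp(b)
        });
        println!("comparisons: {}", calls.get());
    }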


In C++ your relational operators can return std::partial_ordering etc which is a bit more semantically meaningful, for the reader and the compiler.


Agreed. And I think that's a good idea to do. It's still possible to do wrong.


This work is significant enough that it should be published in an academic venue.


What would the purpose of that be? Genuinely wondering as a very non-academic person. Surely academics can just read this document?


Discoverability. They will not find this document. If it's in a journal (or maybe only on the arXiv), it's much much easier to find.


Ah, got it, thanks.


You are not the first person to tell me that. Honestly I'd like to but the limiting factor is time. I'm doing all this in my personal time.


If you don't have the time then it is what it is, though to be honest you've basically written the paper already in Markdown. I would guess the bigger issue is that you don't really get a lot out of it, and reviewers could potentially give you a hard time + travel time.


It is an interesting comparison, but it would be nice to compare the same algorithms to understand the cost of each language (genericity of C++ over C, safety of Rust over C++).


For what it's worth, rust_std_unstable is mostly a port of cpp_pdqsort_unstable.


"Unspecified order" seems a poor guarantee.

I think a sort implementation should return elements in an order consistent with a subset of the comparison calls it makes, and that subset should be such that it fully determines the order.

An even better guarantee is that the subset should be the whole set of comparison calls.


This isn't just <= vs <.

It is absolutely possible to write a comparator that evaluates things in a circle e.g. 1<2 && 2<3 && 3<1.

At that point it is impossible to "do your best": there are no correct answers, only wrong ones. (You have to violate a comparison here.)


The sort function has no reason to ask for all three comparisons.

Rather, it can ask for two comparisons and act according to the results (e.g. it learns that 1 < 2 and 2 < 3 and returns [1, 2, 3]).

This makes the output unspecified in general, but it will be consistent with the information received and the calls made even if the comparator doesn't form an ordering, which is what one would expect.


Who's going to write the specification for how a sort algorithm behaves when the comparator lies?


Is there any existing sort implementation that does this? The closest I can think of is stable sorts which aren’t quite what you described.


> I think a sort implementation should return elements in an order consistent with a subset of the comparison calls it makes, and that subset should be such that it fully determines the order.

> An even better guarantee is that the subset should be the whole set of comparison calls.

For consistent comparator functions, all decent implementations “return elements in an order consistent with a subset of the comparison calls it makes, and that subset should be such that it fully determines the order”, because that’s the definition of sorting.

They also have the added feature that that subset is the full set of the comparison calls they make. Why would they make more calls than necessary?

For buggy comparator functions, once you hit even a single inconsistency it can’t be the whole set of comparison calls.

Also, for a buggy comparator function, you can’t count on the comparator function to be antisymmetric, so it may both say that a < b and b < a, and you can’t even count on it returning the same value when called twice with the same arguments (a comparator could return a coin flip, for example)

However, if you’re willing to make all the n × (n - 1) comparison calls, it seems reasonable to me that you can find a subset that defines an ordering.

I can only think of a heuristic argument for that, though, not a proof. That argument is that there are n! possible orderings you can return, and each has a ‘chance’ of 1/2^(n-1) of only containing links that are consistent with the comparator function, and the former is way larger than the latter. Given that, chances are at least one would be consistent and define the order.

However, I don’t see what good that would do. If your code can detect that the comparator function is buggy, it’s better to signal that than to spend time finding some semi-random ordering that partially satisfies the comparator function.


Interesting work!

I haven’t ever even thought about the issues laid out here in such detail.

At first I was scratching my head, but the strength of a sorting guarantee actually might matter a lot more than I first thought.

Assuming that something is sorted can have quite substantial effects on code. It’s a very strong assumption in a sense.


What is @stjepang (author of Rust's unstable sort, and many other contributions) doing now?


TLDR: C is fastest, Rust is safest.


You could replace Rust with almost any language in that assertion and it would be true, so it would be good to clarify that Rust does not trail far behind C, and is even faster than C on 4 out of 14 tests.


I feel like these tests are really testing the ability of the language to give hints to the compiler in its attempt to generate efficient machine code.


I'm kind of sad that's your takeaway :(


Literally the second paragraph :

“ Overall no correlation between performance and safety could be found, nor whether safe or unsafe internal abstractions are used.”


The comparison involved different and not completely overlapping sort algorithms. If you want to say this you have to first prove it's not just the algorithm.


"As seen in the benchmarks, the current Rust standard library unstable sort implementation outperforms the C++ standard library counterparts. "


I don't think the article made that conclusion? I see this in there:

> As seen in the benchmarks, the current Rust standard library unstable sort implementation outperforms the C++ standard library counterparts

And I can't find any generalizations like that from the author, though I may have missed it.


You're talking about C++, he is talking about C.


and C++ would be easiest?


easiest to blow your foot off with, yes.


easier to blow your whole leg off according to Bjarne Stroustrup


Is the plussest.


can we just move on from C/C++ already


We can't even move past FORTRAN and Cobol, let alone C/++.


Why would we want to move away from them? Fortran is VERY good at what it is intended for (implementing formulas in code).


Fortran added the most important OO feature, methods bound to types, around 2003. Therefore, it is a more modern object-oriented language than C++, in the sense that the object-oriented features were Frankensteined on more recently. So maybe we could move past C++ to it.


You don't just "move on" from a multi million line code base.

Hell everything only works by interoping with those languages given the OS is written in one of them.


C no. Linux is C, and it works well.

C++ yes, there is no real reason to use C++ anymore outside of big libraries that require it.


> C++ yes, there is no real reason to use C++ anymore outside of big libraries that require it.

Some of us like C++ because it gives us the freedom to work at the lowest level when we need to, while also giving us plenty of higher-level APIs for most day-to-day situations, while having great compatibility with tons of software that is already out there. It seems we are at peak Rust fanboyism these days. I have nothing against Rust, but the idea that Rust has already replaced C++, or will certainly do so in the future, is ludicrous. C++ has a ton of activity in the standards committee and has very interesting developments like cppfront. It is a vibrant community that continues to reinvent itself and in all likelihood is a lot larger than the Rust community.


As a C++ programmer since 1992, all through the 90's and oughts, I picked up Rust in 2019 or so... I'm not suggesting Rust has replaced C++, but IMHO, the writing is on the wall that C++ is past the point of saving. There has been a ton of activity in the standards committee since C++03, but "a ton of activity" has not been sufficient to save it thus far... I guess we can always keep hoping. Meanwhile, I can write more and more safe code in Rust that performs like C++ or better and "just works" while cppfront tries to gain traction. For me, it's not so much fanboyism as pragmatism... C++ has had its run, but life is short and I'm tired of waiting for the committee to stop adding features and start addressing UB and mutation in the presence of aliasing.


>freedom to work at the lowest level when we need to, while also giving us plenty of higher level APIs for most day to day situations,

Mixing of these two is why nobody really takes C++ seriously anymore.

If C++ disallowed C-style memory accessors, removed reinterpret_cast, and everything was done through the stdlib and smart pointers, it would probably be above Rust right now. A good portion of Rust's borrow semantics were already built into the smart pointer system.

But with mixing, you not only have to deal with all the typing syntax, you also have no idea if you are just going to segfault because there is a C-style dereference of a null pointer somewhere.


There is no real reason to use C++ anymore, except that it's often the only language you can practically use. Or, in other words, you end up using C++ because there are too many people designing new programming languages and too few people writing libraries for the existing languages.


What is holding you? And if you mean the rest of us - live and let live.


What's holding me? People writing software who are choosing C/C++ for legacy reasons only.


When I use C or C++, it's certainly not for legacy reasons. And my use of it doesn't prevent you from using a different language.


Other languages need tooling and featurefulness that is overall on par with C++ first. So far no low level language has profilers and debuggers that are as good, none have nearly as many compiler hints, and most are far harder to write zero overhead abstractions in. Not all developers need these tools all the time, but their availability is what keeps C++ attractive.


I do not use C++ just for legacy reasons. And frankly I do not give a flying fuck about holding people like yourself. You are free to do as you please, do not expect others to provide for you.


Currently employed writing new C++ btw.


Currently employed writing new C btw. It's not holding me back one bit.


why? is there a real alternative? (no there isn't)


Zig is getting there. Very ergonomic at doing the kind of bit bashing stuff people might reach to C for. It's about as easy to use C libraries in Zig as it is in C++.


Zig makes me think of the people that reject C++ and stick to C, with some shitty Go thrown in.

No one will ever take it seriously.


Zig is the future, Zig and C as a combination is amazing and in my opinion will change everything.


Zig comptime will yield the fastest result ;P


That would only work for data known at compile-time...


Any language that reifies types at compile time should have reasonably similar performance characteristics given similar code. Zig, though, should still end up being easier to make fast, simply because it's not RAII-heavy and doesn't push you over into dynamic dispatch whenever it feels like it.

The reason I responded to you though, is because comptime is not strictly for performing business logic at comptime. Most comptime uses are for the reification of code.


What is the connection between RAII and dynamic dispatch?


I don’t believe I implied that there was one?

Dynamic dispatch is slow for usually no reason.

RAII encourages patterns that are slow.


Ah I misread :) Apologies


But will it be correct and not encounter a compiler bug? ;P


compiler miscompiles and outputs a noop

Fastest program ever.



