
Not sure what your point is calling gcc both ‘awful’ and ‘so fast overall’. Nor why you're conflating gcc and glibc.

FWIW my experience has been that clang provides a superior development experience for C++ code, but that gcc is better for C. LLVM does have a more modular architecture than gcc, but it's not really fair to say that it's more modern overall. GCC got several features ahead of LLVM, like hot/cold LTO separation and parallel compilation (the former is now in LLVM and the latter is apparently impractical due to architectural problems).

> The compilers are only good for short memcpy's. For the rest the asm hinders modern compiler optimizations

Also not sure what you mean by that. TFA isn't performing memcpys with inline assembly. They're using inline assembly to perform a call to the exact same ‘memcpy’ function you would have called anyway, but also to inform the compiler that memcpy clobbers fewer registers. So it's true that this will mainly improve performance for copies of small memory regions; but the performance of large copies will be unaffected.



If anyone's curious, here's the link to memcpy() as it's actually implemented in the Cosmopolitan headers: https://github.com/jart/cosmopolitan/blob/de09bec215675e9b0b... One thing that the web page doesn't mention (for the sake of simplicity) is that the Cosmopolitan headers do call __builtin_memcpy() as well, but only for power-of-two sizes that are compile-time constants. That's the only case where GCC and Clang both do the optimal thing. In all other cases it's faster to use asm("call MemCpy"), which implements something faster than what the builtin would otherwise generate. See https://github.com/jart/cosmopolitan/blob/de09bec215675e9b0b...
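
For anyone who wants the shape of that dispatch without reading the headers, here's a rough sketch. It is not the Cosmopolitan code: COPY, IS_POW2 and copy_out_of_line are names made up for illustration, and the fallback is an ordinary function call standing in for the asm("call MemCpy") path, so the reduced-clobber part of the trick is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the out-of-line routine. The real headers
   reach theirs through inline asm; this is just a normal call. */
static void *copy_out_of_line(void *dst, const void *src, size_t n) {
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}

/* True for 1, 2, 4, 8, ... and false for 0 and everything else. */
#define IS_POW2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* Use the builtin only when the size is a compile-time constant AND a
   power of two; otherwise take the out-of-line path.
   Note: n is evaluated more than once, so don't pass n++ and friends. */
#define COPY(dst, src, n)                          \
    (__builtin_constant_p(n) && IS_POW2(n)         \
         ? __builtin_memcpy((dst), (src), (n))     \
         : copy_out_of_line((dst), (src), (n)))
```

So COPY(b, a, 8) is eligible for the builtin, while COPY(b, a, len) with a runtime len takes the call. Keep in mind __builtin_constant_p can also come out true for a variable the optimizer has already folded, so which branch survives may depend on optimization level.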


> __builtin_constant_p

Aww, I was hoping you would use the evil ICE_P - https://lkml.org/lkml/2018/3/20/805
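
For reference, the macro in question looks roughly like this. This is a sketch along the lines of what's discussed in that thread (check the link for the exact wording), and ice_p_demo is just a hypothetical helper for poking at it. It leans on the GNU extension that sizeof(void) is 1, so expect warnings under -pedantic.

```c
#include <assert.h>

/* If x is an integer constant expression, (long)(x) * 0l is a constant
   zero, so the (void *) cast yields a null pointer constant and the ?:
   has type int *. Otherwise the ?: has type void *, and sizeof(*(void *))
   is 1 as a GNU extension. */
#define ICE_P(x) \
    (sizeof(int) == sizeof(*(1 ? ((void *)((long)(x) * 0l)) : (int *)1)))

/* Hypothetical helper: literals and sizeof are constant expressions,
   a function parameter is not. */
static void ice_p_demo(int runtime_n) {
    assert(ICE_P(16) == 1);
    assert(ICE_P(sizeof(long)) == 1);
    assert(ICE_P(runtime_n) == 0);
}
```

Unlike __builtin_constant_p, the answer here comes from the language rules rather than from what the optimizer managed to fold, so it shouldn't change with -O level.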



