Barely. On a reasonably configured kernel (you need both syscall auditing and context tracking turned off, which is doable at compile time or runtime), a modern CPU should be able to round-trip a syscall in under 40 ns. At 1M syscalls per second that is 40 ms out of every second, so it only eats about 4% of one CPU.
(It's slightly worse than that due to extra cache and TLB pressure, but I doubt that matters in this workload.)
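
If you want to check the round-trip number on your own box, something like the following is a rough sketch. The helper names (now_ns, measure_syscall_ns, cpu_fraction) are made up for illustration, and getppid is just my pick for a syscall that does almost no work in the kernel; it goes through syscall(2) directly so that every iteration really enters the kernel. Expect the result to vary with the CPU and the kernel configuration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average wall-clock cost of one cheap syscall, in nanoseconds.
 * The loop overhead is included, so this is a slight overestimate. */
static double measure_syscall_ns(long iterations)
{
    assert(iterations > 0);

    uint64_t start = now_ns();
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid);   /* raw syscall, enters the kernel each time */
    uint64_t end = now_ns();

    return (double)(end - start) / (double)iterations;
}

/* Fraction of one core spent on syscall entry/exit at a given rate:
 * (ns per call) * (calls per second) / (1e9 ns per second). */
static double cpu_fraction(double ns_per_call, double calls_per_sec)
{
    return ns_per_call * 1e-9 * calls_per_sec;
}
```

Call measure_syscall_ns with a few million iterations from a small main of your own, build with -O2, and feed the result into cpu_fraction along with your expected syscall rate. With the 40 ns figure and 1M syscalls per second, cpu_fraction gives 0.04, which is the 4% above.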