This is a bit faster at -O2 because memcpy()/memmove()/memset() are vectorized. The code is also cleaner.
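
For illustration, here is a minimal sketch of the kind of substitution this describes. It assumes the old code used hand-written byte loops, which the comment above does not actually show. All function names below are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; none of these names
 * come from the actual change. */

/* Hand-written byte loop of the sort the library calls are assumed to replace. */
void copy_loop(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Same copy via memcpy(); source and destination must not overlap. */
void copy_lib(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Drop the first `shift` bytes by moving the rest to the front.
 * The regions overlap, so memmove() is required rather than memcpy(). */
void shift_left(unsigned char *buf, size_t len, size_t shift)
{
    if (shift >= len)
        return; /* nothing left to move */
    memmove(buf, buf + shift, len - shift);
}

/* Zero a buffer with memset() instead of a store loop. */
void clear_buf(unsigned char *buf, size_t n)
{
    memset(buf, 0, n);
}
```

Whether the library versions actually come out faster depends on the compiler, the libc, and the buffer sizes involved, so it is worth measuring on the target build.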