Hand-written assembly is pretty common in video, no matter what they say. All modern video codecs have hand-written assembly for every modern SIMD extension, even on ARM. They didn’t say where these numbers come from, but the comparison is likely against unoptimized C code. There will never be a case where having AVX-512 gives you that kind of speedup in practice, because there will already be hand-optimized fallbacks for the more common extensions.
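For anyone wondering what "fallbacks" looks like in practice, here is a rough sketch of runtime dispatch, not taken from any particular codec. All the names (sad_fn, sad_c, sad_avx2, sad_avx512, pick_sad) are made up for illustration, and the two "SIMD" kernels are just stand-ins that call the C version where real hand-written assembly would go. The CPU check assumes GCC or Clang on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical kernel type: sum of absolute differences over n bytes. */
typedef uint32_t (*sad_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Portable C reference kernel; the last-resort fallback. */
static uint32_t sad_c(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1

/* Stand-ins only: a real codec would have hand-written AVX2 and AVX-512
 * routines here. They forward to the C kernel so the sketch runs on any
 * x86 machine regardless of which extensions it has. */
static uint32_t sad_avx2(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_c(a, b, n);
}

static uint32_t sad_avx512(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_c(a, b, n);
}
#endif

/* Pick the best kernel the CPU supports, widest extension first. */
static sad_fn pick_sad(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f"))
        return sad_avx512;
    if (__builtin_cpu_supports("avx2"))
        return sad_avx2;
#endif
    return sad_c;
}
```

The point is that a CPU without AVX-512 lands on the AVX2 path rather than on plain C, so the realistic baseline for an AVX-512 speedup is the next-best SIMD kernel, not the unoptimized one.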
It’s mostly because compilers still don’t make good use of AVX-512, even today.
What makes this impressive to me, though, is that it’s x86. ARM is far easier to write assembly for.