
I attended a talk by a PhD student who specializes in CPU/architectural research, on the M1 as an architecture. There are some key differences in design, such as unified memory, but what accounts for a large part of the performance gap with x86 is the fact Apple's compilers can target their specific microarchitecture and optimize for that, because they know its performance characteristics well.

We know the performance characteristics of various x86 cores too, but there are simply a lot of them, so most precompiled x86 code targets a generic baseline and is likely suboptimal in places on any particular microarchitecture. You can, of course, use -mtune= with gcc, but you'd basically need to rebuild your whole distribution. In other words, the ideal comparison would be someone running Gentoo vs. macOS.

So, that was the gist of the talk, and I more or less accept the conclusions. As for my own view, I think what makes ARM processors interesting is their efficiency wrt power consumption. I would love a non-Apple ARM laptop, if only because Intel cores are much less efficient in terms of the power they require.



> what accounts for a large part of the performance gap with x86 is the fact Apple's compilers can target their specific microarchitecture and optimize for that, because they know its performance characteristics well.

OP is running Arch though, which is probably built with gcc/clang targeting generic ARMv8(.0?).


I've a friend in the industrial and academic chip development space.

They basically agreed with what you said. ARM isn't some magical compute/watt wand that Apple waved; it's their incredibly tight software and hardware integration that allowed them to optimize.


Interestingly, I actually ran Gentoo for a while on that machine precisely for that reason, but the overall latency improvement was marginal.

Certainly not worth recompiling things all the time in my opinion, even with a beefy machine like mine.


Owning every layer of the platform helps with the microarchitecture design as well: they’ve got massive quantities of real-world profiling data / instruction traces. These are invaluable in assessing the trade-offs in how you do branch prediction, cache associativity & a thousand other decisions.



