
AMD claims this APU delivers more than twice the tokens per second of an RTX 4090.

So it's better than a 4090, at least for this workload.

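The reason it's better with a less powerful GPU is context switching: a 70B model presumably doesn't fit in the 4090's 24 GB of VRAM, so weights have to be shuttled between system RAM and the GPU, while the APU works out of one shared memory pool.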

"AMD also claims its Strix Halo APUs can deliver 2.2x more tokens per second than the RTX 4090 when running the Llama 70B LLM (Large Language Model) at 1/6th the TDP (75W)."

https://www.tomshardware.com/pc-components/cpus/amd-slides-c...
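
For what it's worth, here is a rough back-of-the-envelope on what the quoted numbers imply for efficiency. It only uses the figures from the quote (2.2x, 1/6th the TDP, 75W); the variable names are mine, nothing here is measured, and TDP is not the same as actual power draw.

```python
# Figures taken from the quoted AMD claim; nothing here is measured.
speedup = 2.2      # claimed tokens/s relative to the RTX 4090
apu_tdp_w = 75     # quoted APU TDP in watts
tdp_divisor = 6    # quoted "1/6th the TDP"

# TDP of the comparison card implied by the quote.
gpu_tdp_w = apu_tdp_w * tdp_divisor

# Claimed tokens/s per watt of TDP, relative to the 4090.
perf_per_watt_ratio = speedup * tdp_divisor

print(f"implied 4090 TDP: {gpu_tdp_w} W")
print(f"claimed tokens/s per watt vs 4090: {perf_per_watt_ratio:.1f}x")
```

So if you take the slide at face value, the claim works out to roughly 13x the tokens per second per watt of TDP, on this one model.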


