Hacker News

I'm doing fast AVX512 embeddings on Ryzen, and fast ONNX AVX512 reranking on Ryzen as well. Though I do the actual heavy lifting on GPU, doing all the RAG stuff on CPU is helpful. AI on CPU is still mostly a gimmick, but as models get smaller and more capable, it's becoming less of one.
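The CPU-side RAG step the commenter describes (scoring a query embedding against a corpus and keeping the top hits) can be sketched as below. This is a minimal illustration, not the commenter's actual pipeline: the random vectors stand in for real model embeddings, and the dimensions are made up. The matrix product dispatches to the BLAS backend, which uses SIMD (AVX2/AVX512 where available) on the CPU.

```python
import numpy as np

# Hypothetical stand-ins: in practice these vectors would come from an
# embedding model; random data here just demonstrates the retrieval math.
rng = np.random.default_rng(0)
doc_embs = rng.standard_normal((1000, 384)).astype(np.float32)   # corpus embeddings
query_emb = rng.standard_normal(384).astype(np.float32)          # query embedding

def top_k_cosine(query, docs, k=5):
    """Return (indices of the k most similar docs, all scores) by cosine similarity.

    The matrix-vector product below runs on the CPU via the BLAS backend,
    which is vectorized with SIMD instructions where the hardware supports them.
    """
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = docs_n @ query_n
    top_idx = np.argsort(scores)[::-1][:k]
    return top_idx, scores

idx, scores = top_k_cosine(query_emb, doc_embs)
```

In a real pipeline this dense-retrieval pass would typically be followed by a heavier reranker (e.g. an ONNX cross-encoder, as mentioned above) over just the top-k candidates.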


Yeah, but "AI" on CPU usually means a basic NPU doing (sparse) matrix multiplication, not AVX512.


I'm handling dense, sparse, and ColBERT vectors generated by the BGE-M3 model on CPU; pretty sure that counts as "AI".
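Of the three BGE-M3 outputs mentioned, the ColBERT-style multi-vector one is scored with "late interaction" (MaxSim): for each query token, take its best similarity against all document tokens, then sum. A minimal sketch of that scoring, with random stand-in vectors (the real per-token vectors would come from the model, and the shapes here are illustrative, not the model's):

```python
import numpy as np

# Random stand-ins for per-token vectors; a real run would take these
# from the model's multi-vector (ColBERT-style) output.
rng = np.random.default_rng(1)
query_vecs = rng.standard_normal((8, 128)).astype(np.float32)   # one vector per query token
doc_vecs = rng.standard_normal((50, 128)).astype(np.float32)    # one vector per doc token

def maxsim_score(q_vecs, d_vecs):
    """Late-interaction score: sum over query tokens of the max
    cosine similarity to any document token."""
    q = q_vecs / np.linalg.norm(q_vecs, axis=1, keepdims=True)
    d = d_vecs / np.linalg.norm(d_vecs, axis=1, keepdims=True)
    sim = q @ d.T                       # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())

score = maxsim_score(query_vecs, doc_vecs)
```

This is plain dense linear algebra, so it runs fine on CPU; the dense and sparse (lexical-weight) scores from the same model can be combined with it as a weighted sum.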



