Hacker News

I'm doing fast AVX512 embeddings on Ryzen, and fast ONNX AVX512 reranking on Ryzen as well. Though I do the actual heavy lifting on GPU, doing all the RAG stuff on CPU is helpful. AI on CPU is still mostly a gimmick, but as models get smaller and more capable, it's becoming less of one.
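The CPU-side RAG step the commenter describes (scoring a query embedding against a corpus and keeping the top hits) can be sketched as below. This is a minimal illustration, not the commenter's actual pipeline: the random vectors stand in for real model embeddings, and the dimensions are made up. The matrix product dispatches to the BLAS backend, which uses SIMD (AVX2/AVX512 where available) on the CPU.

```python
import numpy as np

# Hypothetical stand-ins: in practice these vectors would come from an
# embedding model; random data here just demonstrates the retrieval math.
rng = np.random.default_rng(0)
doc_embs = rng.standard_normal((1000, 384)).astype(np.float32)   # corpus embeddings
query_emb = rng.standard_normal(384).astype(np.float32)          # query embedding

def top_k_cosine(query, docs, k=5):
    """Return (indices of the k most similar docs, all scores) by cosine similarity.

    The matrix-vector product below runs on the CPU via the BLAS backend,
    which is vectorized with SIMD instructions where the hardware supports them.
    """
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = docs_n @ query_n
    top_idx = np.argsort(scores)[::-1][:k]
    return top_idx, scores

idx, scores = top_k_cosine(query_emb, doc_embs)
```

In a real pipeline this dense-retrieval pass would typically be followed by a heavier reranker (e.g. an ONNX cross-encoder, as mentioned above) over just the top-k candidates.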


Yeah, but "AI" on CPU usually means a basic NPU doing (sparse) matrix multiplication, not AVX512.


I'm handling dense, sparse, and ColBERT vectors generated by the BGE-M3 model on CPU; pretty sure that counts as "AI".
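Of the three BGE-M3 outputs mentioned, the ColBERT-style multi-vector one is scored with "late interaction" (MaxSim): for each query token, take its best similarity against all document tokens, then sum. A minimal sketch of that scoring, with random stand-in vectors (the real per-token vectors would come from the model, and the shapes here are illustrative, not the model's):

```python
import numpy as np

# Random stand-ins for per-token vectors; a real run would take these
# from the model's multi-vector (ColBERT-style) output.
rng = np.random.default_rng(1)
query_vecs = rng.standard_normal((8, 128)).astype(np.float32)   # one vector per query token
doc_vecs = rng.standard_normal((50, 128)).astype(np.float32)    # one vector per doc token

def maxsim_score(q_vecs, d_vecs):
    """Late-interaction score: sum over query tokens of the max
    cosine similarity to any document token."""
    q = q_vecs / np.linalg.norm(q_vecs, axis=1, keepdims=True)
    d = d_vecs / np.linalg.norm(d_vecs, axis=1, keepdims=True)
    sim = q @ d.T                       # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())

score = maxsim_score(query_vecs, doc_vecs)
```

This is plain dense linear algebra, so it runs fine on CPU; the dense and sparse (lexical-weight) scores from the same model can be combined with it as a weighted sum.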



