I shudder to think of what it means to be storing the _results_ of processing 21... | Hacker News


sunrunner 9 months ago | on: 21 GB/s CSV Parsing Using SIMD on AMD 9950X

I shudder to think of what it means to be storing the _results_ of processing 21 GB/s of CSV. Hopefully some useful kind of aggregation, but if this was powering some kind of search over structured data then it has to be stored somewhere...

devmor 9 months ago

Just because you’re processing 21 GB/s of CSV doesn’t mean you need to keep all of it.

If your data is coming from a source you don’t own, it’s likely to include data you don’t need. Maybe there are 30 columns and you only need 3, or 200 columns and you only need 1.

Enterprise ETL is full of such cases.
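
To make the column-pruning point concrete, here is a minimal sketch in Python using the standard-library `csv` module. The `project_columns` helper, the column names, and the sample feed are all hypothetical, made up for illustration. Note that this parser still tokenizes every field; the saving is in what you keep and store afterwards, not in parse time.

```python
import csv
import io


def project_columns(lines, wanted):
    """Yield only the `wanted` columns from each CSV row, in the order given.

    `lines` is any iterable of text lines (an open file, a StringIO, ...).
    Hypothetical helper for illustration, not part of any library.
    """
    reader = csv.reader(lines)
    header = next(reader)
    # Resolve column names to positions once, so per-row work is plain indexing.
    indices = [header.index(name) for name in wanted]
    for row in reader:
        yield [row[i] for i in indices]


# Made-up feed with six columns, of which we only care about two.
sample = io.StringIO(
    "id,name,email,country,signup_ts,plan\n"
    "1,Ada,ada@example.com,GB,2024-01-01,pro\n"
    "2,Linus,linus@example.com,FI,2024-02-03,free\n"
)

rows = list(project_columns(sample, ["id", "plan"]))
print(rows)  # [['1', 'pro'], ['2', 'free']]
```

Because the helper is a generator, rows can be aggregated or written out as they arrive, so only the projected columns ever need to be held or stored.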
