
We built something tangentially related at SoundTrace.

Basically, when we onboard a new client, they dump all their audiograms on us as PDFs.

The data extraction needs to be perfect because the table values are used to detect hearing loss over time.

We settled on a pipeline that looks roughly like

PDF -> gpto pre-filter phase -> OCR to extract text, tables, and forms -> things branch out here

We do a direct parse of forms and text through an LLM

Extract audiogram graphs and send them to a foundation convnet

Attempt to parse tables programmatically

-> An audiogram might have the values in 3 separate places, so we pass the results of all three of these routes through Claude Sonnet. If they match, they get auto-approved. If they don’t, they get flagged for manual review.
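
A minimal sketch of that match-or-flag step, for illustration only. It assumes each route returns hearing thresholds keyed by (ear, frequency in Hz), and it swaps in a plain exact comparison where the comment describes an LLM doing the check. The route names and the `reconcile` function are hypothetical, not part of the pipeline described above.

```python
from typing import Dict, Optional, Tuple

# Assumed shape: threshold in dB HL keyed by (ear, frequency in Hz).
Thresholds = Dict[Tuple[str, int], int]


def reconcile(routes: Dict[str, Optional[Thresholds]]) -> dict:
    """Auto-approve only if every route that produced values agrees exactly."""
    # Drop routes that found nothing (None or empty dict).
    found = {name: vals for name, vals in routes.items() if vals}
    if len(found) < 2:
        # A single source cannot be cross-checked, so send it to a human.
        return {"status": "manual_review", "values": None, "disagreements": []}

    names = list(found)
    ref_name, reference = names[0], found[names[0]]
    disagreements = []
    for name in names[1:]:
        other = found[name]
        # Compare over the union of keys so a missing value counts as a mismatch.
        for key in sorted(set(reference) | set(other)):
            if reference.get(key) != other.get(key):
                disagreements.append(
                    (key, ref_name, reference.get(key), name, other.get(key))
                )

    if disagreements:
        return {"status": "manual_review", "values": None,
                "disagreements": disagreements}
    return {"status": "auto_approved", "values": reference, "disagreements": []}


if __name__ == "__main__":
    values = {("L", 1000): 20, ("R", 1000): 25}
    print(reconcile({
        "forms_llm": values,
        "graph_convnet": dict(values),
        "table_parser": {("L", 1000): 30, ("R", 1000): 25},
    })["status"])
```

Treating a missing value as a mismatch errs toward manual review, which fits the requirement that the extracted values be perfect.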

All in all it’s been a journey, but the accuracy is near 100 percent. These tools are incredible.



Super cool! This aligns with our experience. These tools are great and can get to near 100% accuracy, but it's quite a lot of work on the Eng side to get there reliably.



