We built something tangentially related at SoundTrace.

Basically when we onboard a new client they dump all their audiograms on us as PDFs.

The data extraction needs to be perfect because the table values are used to detect hearing loss over time.

We settled on a pipeline that looks roughly like this:

PDF -> gpto pre-filter phase -> OCR to extract text, tables, and forms -> things branch out here

We do a direct parse of forms and text through an LLM

Extract audiogram graphs and send them to a foundation convnet

Attempt to parse tables programmatically

-> An audiogram might have the values in 3 separate places, so we pass the results of all three of these routes through Claude Sonnet. If they match, they get auto-approved. If they don’t, they get flagged for manual review.
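
A minimal Python sketch of the match-or-flag step, for illustration only. It uses a plain equality check where the pipeline above uses Claude Sonnet, and the `reconcile` function, the frequency-to-threshold dict shape, and the `min_routes` rule are all assumptions rather than anything from the real system.

```python
from typing import Dict, List, Optional

# Hypothetical shape: test frequency in Hz -> hearing threshold in dB HL.
Audiogram = Dict[int, int]


def reconcile(routes: List[Optional[Audiogram]], min_routes: int = 2) -> dict:
    """Auto-approve only when enough routes produced values and all agree."""
    # Skip routes that extracted nothing (None or an empty dict).
    found = [r for r in routes if r]
    if len(found) < min_routes:
        # Assumption: a single source is not enough to skip human review.
        return {"status": "manual_review", "reason": "too few routes", "conflicts": []}

    first = found[0]
    if all(r == first for r in found[1:]):
        return {"status": "auto_approved", "values": first, "conflicts": []}

    # List the frequencies where the routes disagree, to help the reviewer.
    freqs = sorted(set().union(*found))
    conflicts = [f for f in freqs if len({r.get(f) for r in found}) > 1]
    return {"status": "manual_review", "reason": "mismatch", "conflicts": conflicts}
```

Passing the per-route results as a list keeps the check independent of how many places a given audiogram actually carries its values.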

All in all it’s been a journey, but the accuracy is near 100 percent. These tools are incredible.