
We face challenges similar to the ones you listed, and we handle all of the above.

1. Out-of-the-box OCR doesn't perform well on complex documents (with tables, images, etc.). We use a vision model to help process those documents.

2. Recall (for longer documents) and accuracy are also major problems. We built in validation systems and references to help users validate the results.

3. Maintaining these systems in production, integrating with the data sources, and refreshing when new data comes in are quite annoying. We manage that for the end users.

4. For non-technical users, we let them iterate on different business logic and give them one unified place to manage data workflows.

