
> a major commercial bank I work with couldn’t improve credit risk models because critical data was stuck in PDFs and emails.

Great use case! Worked on exactly this a decade ago. It was Hard™ then. Could only make so much progress. Getting this right is a huge value unlock. Congrats!



Make sure you have an on-premise option for this type of customer. I've worked at two software companies in Europe with tangentially similar products related to document analysis. On-premise is a key requirement.

Even in 2024, banks and other financial institutions such as insurance companies tend to be _very_ cautious with valuable documents involving customers. There are also regional regulations that prevent things like patient data being shared with _any_ 3rd parties. Even one of the big 4 oil companies that I've dealt with as a prospective customer had very strict rules requiring on-premise solutions.

The good news is many are using things like Kubernetes and OpenShift internally, so it should be possible to port what you do on AWS to on-premise.


On-premise will be a lot more difficult than just launching a few pods in Kubernetes. These AI tools (LLMs / vision models) will require some high-powered GPUs as well.


On-prem is theater if the OS isn't libre.


I have just been working through the same problem (though just PDFs). Google DocAI helped enormously after a bit of initial input.


Who is liable when the ML model hallucinates™ while parsing some critical data?

Better still if it can then become a source of truth for further departures from reality.


Great to hear that you saw similar use cases. Doing this before LLMs seems like a big challenge.



