Right tool for the right job.

With a large amount of data, a loose query can make a huge fraction of it "relevant".

I think it's fine in those situations, using a model with an extra-large context and keeping the similarity (and other) filters quite tight.

Developing it to realise when there are too many results, and to prompt the user to clarify or be more specific, would also help.
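
A rough sketch of that shape (hypothetical code, not any library's actual API): the query vector and chunk embeddings are assumed to come from whatever embedding model you already use, and SIM_THRESHOLD / MAX_HITS are made-up knobs to tune per corpus.

    # Hypothetical sketch: tight similarity filter + "too many results" bail-out.
    import numpy as np

    SIM_THRESHOLD = 0.82   # "quite tight" -- tune per corpus
    MAX_HITS = 200         # past this, the query is probably too loose

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def retrieve(query_vec, chunks):
        """chunks: list of (text, embedding) pairs from your vector store."""
        scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
        hits = [h for h in scored if h[1] >= SIM_THRESHOLD]
        if len(hits) > MAX_HITS:
            # Too much of the corpus matches: bounce back to the user
            # instead of stuffing the context window.
            raise ValueError(f"{len(hits)} chunks matched; please narrow "
                             "the query (date range, entity, doc type...)")
        return sorted(hits, key=lambda h: h[1], reverse=True)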

Companies that want to trawl data like this can just deal with it and pay for hardware that can run a model with >100k context.

If >all< of the 70GB of data is meant to be relevant, i.e. "summarise all financial activity over 5 years into one report", then well... it has to be developed to do what a human would. 100k context already far exceeds what the human brain is capable of "keeping in your head", imo; you just need multiple passes to summarise, take notes, and compress the overall data down smaller and smaller with each pass until it fits in a single 100k query.

It's totally doable, but not "out of the box".
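
As a hedged sketch of that note-taking loop: summarise() is a stand-in for whatever 100k-context model call you'd use, and both token budgets are illustrative.

    # Map-reduce style note-taking: summarise batches, then summarise the
    # summaries, until everything fits in one context window.
    CONTEXT_BUDGET = 100_000   # tokens per model call (assumed)
    NOTE_BUDGET = 5_000        # rough size of each intermediate note

    def summarise(text: str, instruction: str) -> str:
        """Placeholder for a single LLM call returning ~NOTE_BUDGET tokens."""
        raise NotImplementedError("plug your model in here")

    def count_tokens(text: str) -> int:
        return len(text) // 4  # crude estimate; use a real tokeniser

    def compress(corpus: list[str], instruction: str) -> str:
        layer = corpus
        while True:
            joined = "\n\n".join(layer)
            if count_tokens(joined) <= CONTEXT_BUDGET:
                return summarise(joined, instruction)  # final single pass
            # Pack chunks into context-sized batches and take notes on each.
            batches, current, size = [], [], 0
            for chunk in layer:
                t = count_tokens(chunk)
                if current and size + t > CONTEXT_BUDGET:
                    batches.append("\n\n".join(current))
                    current, size = [], 0
                current.append(chunk)
                size += t
            if current:
                batches.append("\n\n".join(current))
            layer = [summarise(b, f"Take notes relevant to: {instruction}")
                     for b in batches]

Each pass shrinks the data by roughly CONTEXT_BUDGET/NOTE_BUDGET (~20x here), so even 70GB collapses to a single query's worth in a handful of layers.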


