
I have uploaded entire books to the latest Gemini and had the model reliably and accurately answer specific questions requiring knowledge of multiple chapters.


I think it works for info but not so well for instructions/guidance. That's why the standard advice is instructions at the start and repeated at the end.


Or under the covers they're just putting all the text you feed in into a RAG database, doing an embedding search to find relevant snippets, and answering your questions when asked directly. Which is a different approach than recalling instructions.


I wonder if the serial-position effect is happening with LLMs.

https://en.wikipedia.org/wiki/Serial-position_effect


Something like it, definitely, though not exactly. We also know that recall improves when related bits are positioned close together within the context.

Adherence to context is lossy in a way reminiscent of human behavior but also different in crucial ways.


I wonder if those books were already in the training set, i.e. in a way "hardcoded" before you even steered the model that way.


Should be easy to test: ask the question without the book in the context window, ask again with the book in the context window.


That’s pretty typical, though not especially reliable. (Although in my experience, Gemini currently performs slightly better than ChatGPT for my use case.)

In one repetitive workflow, for example, I process long email threads, large Markdown tables (which is a format from hell), stakeholder maps, and broader project context, such as roles, mailing lists, and related metadata. I feed all of that into the LLM, which determines the necessary response type (out of a given set), selects appropriate email templates, drafts replies, generates documentation, and outputs a JSON table.

It gets it right on the first try about 75% of the time, easily saving me an hour a day - often more.

Unfortunately, 10% of the time, the responses appear excellent but are fundamentally flawed in some way. Just so it doesn't get boring.
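One cheap guard against that "looks excellent but is fundamentally flawed" failure mode is to validate the structured part of the output before acting on it. A minimal sketch, assuming the model returns JSON with a `response_type` field chosen from a fixed set (the field name and allowed values here are hypothetical):

```python
import json

# Hypothetical set of allowed response types; replace with your own.
ALLOWED_TYPES = {"reply", "forward", "escalate", "archive"}

def validate_llm_output(raw: str) -> dict:
    """Parse the model's JSON and fail fast on an out-of-set response
    type, instead of trusting an answer that merely looks plausible."""
    data = json.loads(raw)
    if data.get("response_type") not in ALLOWED_TYPES:
        raise ValueError(f"unexpected response_type: {data.get('response_type')!r}")
    return data

checked = validate_llm_output('{"response_type": "reply", "template": "T-12"}')
```

This won't catch a well-formed but wrong reply, but it does reject the cases where the model invents a response type or returns malformed JSON.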


Try reformatting the data from the markdown table into a JSON or YAML list of objects. You may find that repeating the keys for every value gives you more reliable results.
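A minimal sketch of that conversion, assuming a simple pipe-delimited Markdown table with no escaped pipes (the table contents are made up for illustration):

```python
import json

# Hypothetical example table; the point is that each row becomes an
# object that repeats the column names as keys.
md_table = """\
| name  | role     | status |
|-------|----------|--------|
| Alice | sponsor  | active |
| Bob   | reviewer | paused |
"""

# Drop the |---|---| separator line, keep header and data rows.
lines = [l for l in md_table.strip().splitlines() if not set(l) <= set("|- ")]
headers = [c.strip() for c in lines[0].strip("|").split("|")]
rows = [
    dict(zip(headers, (c.strip() for c in line.strip("|").split("|"))))
    for line in lines[1:]
]

print(json.dumps(rows, indent=2))
```

Each row comes out as `{"name": ..., "role": ..., "status": ...}`, so the model never has to count columns to figure out which value belongs to which key.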


Thanks for the suggestion! I’ll start benchmarking my current md table setup against one using YAML. It's apparently slightly less verbose than JSON.


Gemini does a lot better at long context.


Mind if I ask how you’re doing this? I have uploaded short stories of <40,000 words in .txt format and when I ask questions like “How many chapters are there?” or “What is the last sentence in the story?” it gets it wrong. If I paste a chapter or two at a time then ask, it works better, but that’s tedious…


Try multi-turn and agent-to-agent; it will break down, but Gemini is a lot better at larger contexts.



