> but using code generated from an LLM is pure madness unless what you are building is truly going to be thrown away and rewritten from scratch, as is relying on it as a linting, debugging, or source of truth tool.
That does not match my experience at all. You obviously have to use your brain to review it, but for a lot of problems LLMs produce close to perfect code in record time. It depends a lot on your prompting skills though.
Perhaps I suck at prompting, but what I've noticed is that if an LLM has hallucinated something or learned a fake fact, it will keep using that fact no matter how you try to steer it away. The only way out of the loop is to already know the answer yourself, but in that case you wouldn't need the LLM.
I’ve found a good way to get unstuck here is to use another model, either of comparable or superior quality, or interestingly sometimes even a weaker version of the same product (e.g. Claude Haiku vs. Sonnet*). My mental model here is similar to pair programming, or simply bringing in a colleague when you’re stuck.
*I don’t know to what extent it’s worthwhile debating whether you could call these the same model vs. entirely different models, for any two products in the same family, beyond the case of simply quantising the same model and nothing else. Maybe you could include distillations of a base model too?
The idea of using a smaller version of the same (or a similar) model as a check is interesting. Overfitting is a very basic phenomenon, and it tends to be less pronounced in systems with fewer parameters. When this works, you may be seeing examples of exactly that.
> The idea of using a smaller version of the same (or a similar) model as a check is interesting.
I built my chat app around this idea, and to save money. When it comes to coding, I feel Sonnet 3.5 is still the best, but I don't start with it. I tend to use cheaper models at the beginning, since it usually takes a few iterations to get to a certain point and I don't want to waste tokens in the process. When I've reached a certain state, or when it's clear the LLM is not helping, I bring in Sonnet to review things.
Here is an example of how the conversation between models will work.
The reason why this works for my application is, I have a system prompt that includes the following lines:
# Critical Context Information
Your name is {{gs-chat-llm-model}} and the current date and time is {{gs-chat-datetime}}.
When I make an API call, I replace the template strings with the model name and the current date. I also include instructions in the first user message telling the model to sign off on each of its messages. With the system prompt and the message signature in place, you can then ask one model "what do you think of <LLM>'s response?".
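A minimal sketch of that flow, assuming a Python backend: the placeholder names come from the system prompt quoted above, while `call_model`, the model identifiers, and the message layout are illustrative stand-ins rather than the actual app's code.

```python
from datetime import datetime

# Placeholder names are taken from the system prompt above; everything else
# (function names, model identifiers, message format) is an illustrative guess.
SYSTEM_TEMPLATE = (
    "# Critical Context Information\n"
    "Your name is {{gs-chat-llm-model}} and the current date and time is "
    "{{gs-chat-datetime}}."
)

def render_system_prompt(model_name: str) -> str:
    """Substitute the template strings just before making the API call."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M")
    return (SYSTEM_TEMPLATE
            .replace("{{gs-chat-llm-model}}", model_name)
            .replace("{{gs-chat-datetime}}", now))

def call_model(model_name: str, system: str, messages: list[dict]) -> str:
    """Stand-in for whatever chat-completion client the app actually uses."""
    return f"(reply from {model_name})\n-- {model_name}"

# The first user message asks the model to sign each reply, so later turns
# can point at a specific model's answer by name.
history = [{
    "role": "user",
    "content": ("Please sign off every reply with your name.\n\n"
                "How should I structure retries in this HTTP client?"),
}]

# Iterate with a cheaper model first...
cheap = "claude-haiku"
history.append({"role": "assistant",
                "content": call_model(cheap, render_system_prompt(cheap), history)})

# ...then bring in the stronger model to review the signed response.
history.append({"role": "user",
                "content": f"What do you think of {cheap}'s response?"})
strong = "claude-3.5-sonnet"
print(call_model(strong, render_system_prompt(strong), history))
```

Because each reply carries a signature and the system prompt tells each model its own name, the review request can unambiguously refer to the earlier model's answer.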
That is not my experience. I wrote recently [1] about how I use it; it's more like an intern, a pair programmer, or a rubber duck, none of which makes you worse.