Technically, yes it is still burglary.

It's an odd position to take - that a crime was not committed, or that the offense isn't as bad, if the difficulties of committing the crime have been removed or reduced.


> odd position [...] offense isn't as bad if the difficulties of committing the crime have been removed or reduced

Not really; intent is part of the crime. If the barrier to committing the crime is extremely small, the crime itself is less egregious.

Planning a robbery is not the same as picking up a wallet on the sidewalk. This is a feature, not a bug.


This. 1000x this.

Yes, it’s still wrong to take things, but the guy should get, like, community service teaching white-hat techniques or something. The CEO should be charged with gross negligence, fraud, and any HIPAA/medical-records laws he violated - per capita. Meaning he should face 1M+ counts of …


What does "the crime is less egregious" even mean?

Morally, you burglarized a home.

Legally, at least in CA, the charge and sentencing are equivalent.

If someone also commits a murder while burglarizing you could argue the crime is more severe, but my response would be that they've committed two crimes, and the severity of the burglary in isolation is equivalent.


Now, how do we apply that to today’s events?

Is it still a crime if the roadblocks to commit the crime are removed? Even applauded by some? What happens when the chief of police is telling you to go out and commit said crimes?

Law and order is dictated by the ruling party. What was a crime yesterday may not be a crime today.

So if all you did was turn a key and now you’re a burglar going to prison, when the CEO of the house spent months setting up the perfect crime scene, shouldn’t the CEO at least get an accomplice charge? Insurance fraud starts the same way…


It's a common attitude with people from low-trust societies. "I'm not a scammer - I'm clever. If you don't want us to scam your system why do you make it so easy?"

The Internet is the ultimate low-trust society. Your virtual doorstep is right next to ~8 billion other people's doorsteps. And attributing attacks and enforcing consequences is extremely difficult and rather unusual.

When people from high-trust societies move to a low-trust society, they either adapt to their new environment and take an appropriately defensive posture or they will get robbed, scammed, etc.

Those naïfs from high-trust societies may not be morally at fault, but they must be blamed, because they aren't just putting themselves at risk. They must make at least reasonable efforts to secure the data in their custody.

It's been like this for decades. It's time to let go of our attachment to heaping all the culpability on attackers. Entities holding user data in custody must take the blame when they don't adequately secure that data, because that incentivizes an improved security posture.

And an improved security posture is the only credible path to a future with fewer and smaller data breaches.

See also: https://news.ycombinator.com/item?id=25574200


We can start by retiring the word “posture”, which sounds like you’re squirming in your seat. I’ve heard that term for the last 10 years and it has never been useful. Policy, yes; practice, if you must; mandate, absolutely; governance, required.

Using “posture” is akin to modeling or showing off clothes that will never see the streets. Let’s all agree that the term is a rug covering whatever security wants it to be, without checks and balances.

If your posture is having your rear end exposed and up in public then…


It's a generic, albeit somewhat euphemistic term. I agree we could do with some better messaging. Dirty and direct is usually more effective. How about this framing?

The Internet is a dark street in rural India and your dumbass company is a pretty young white woman walking around naked and alone at 2AM. It's not your fault morally if someone rapes you, but objectively you're an idiot if you do not expect it. Now, you getting raped doesn't just hurt you; it primarily hurts people your company stores data about. Those rapists aren't going away, so we need you to take basic precautions against getting raped and we're gonna hold you accountable for doing dumb shit that predictably leads you to getting raped.

> If your posture is having your rear end exposed and up in public then…

Right, that is most companies' current security posture: Naked butt waving in the air. "Improving your security posture" is just a euphemism for "pull your pants up and put your butt down".

> Using “posture” is akin to modeling or showing off clothes that will never see the streets. Let’s all agree that the term is a rug covering whatever security wants it to be, without checks and balances.

No, I will not agree with that; that's ridiculous. "Improve [y]our security posture" is not some magic talisman used to seize unchecked power within an organization. It's basically just the Obama Doctrine brought to computer security: "Don't do stupid shit".


“Improve [y]our security posture” absolutely is, without a definition of “posture”. Does that mean more monitoring? More security team members?

Posture is no replacement for a plan.

Originally it was “how we follow our plan”, but that has since been thrown out the window. Now, posture is a code word for cover.

I don’t mean to vent; it’s just tiring having to deal with varying degrees of posturing where everyone is just haphazardly lying on a couch watching TV.


Welcome to America

Powerful.

This is surprisingly basic knowledge for ending up on the front page.

It’s a good intro, but I’d love to read more about how to know when it’s time to replace my synchronous inter-service HTTP requests with a queue. What metrics should I consider, and what are the trade-offs? I’ve learned some answers to this question over time, but these guys are theoretically message queue experts. I’d love to learn about more things to look out for.

There are also different types of queues/exchanges, and the right choice is critical depending on the type of consumer or consumers you have. Should I use direct, fanout, etc.? (A sketch of the difference is below.)
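
For a concrete flavor of the difference, here's a minimal sketch using the pika client against a local RabbitMQ broker; the exchange names ("orders", "events") and routing key are made up for illustration:

    import pika

    # connect to a local RabbitMQ broker
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()

    # direct: routes a message only to queues bound with a matching routing key
    ch.exchange_declare(exchange="orders", exchange_type="direct")

    # fanout: broadcasts every message to all bound queues; the key is ignored
    ch.exchange_declare(exchange="events", exchange_type="fanout")

    ch.basic_publish(exchange="orders", routing_key="created", body=b"order 42")
    conn.close()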

The next interesting question is when should I use a stream instead of a queue, which RabbitMQ also supports.

My advice, having just migrated a set of message queues and streams from AWS (ActiveMQ) to RabbitMQ, is to think long and hard before you add one. They become a black box of sorts and are way harder to debug than simple HTTP requests.

Also, as others have pointed out, there are other important use cases for queues which come way before microservice comms. Async processing to free up servers is one. I’m surprised none of these were mentioned.


> This is surprisingly basic knowledge for ending up on the front page.

Nothing wrong with that! Hacker News has a large audience of all skill levels. Well written explainers are always good to share, even for basic concepts.


In principle, I agree, but “a message queue is… a medium through which data flows from a source system to a destination system” feels like a truism.


For me, I've realized I often cannot possibly learn something if I can't compare it to something prior first.

In this case, as another user mentioned, the decoupling use case is a great one. Instead of two processes/APIs talking directly, having an intermediate "buffer" process/API can save you headaches, as in the toy sketch below.
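
A toy in-process sketch of that decoupling, with queue.Queue standing in for a real broker (names made up):

    import queue
    import threading

    buf = queue.Queue(maxsize=8)   # the intermediate "buffer"

    def producer():
        for i in range(5):
            buf.put(f"job-{i}")    # blocks if the consumer falls too far behind
        buf.put(None)              # sentinel: no more work

    def consumer():
        while (item := buf.get()) is not None:
            print("processing", item)

    threading.Thread(target=producer).start()
    consumer()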


To add to this,

The concept of connascence, rather than coupling, is what I find more useful for trade-off analysis.

Synchronous connascence means that you only have a single architectural quantum, in Neal Ford’s terminology.

As Ford is less religious and more respectful of real-world trade-offs, I find his writings more useful for real-world problems.

I encourage people to check his books out and see if it is useful. It was always hard to mention connascence as it has a reputation of being ivory tower architect jargon, but in a distributed system world it is very pragmatic.


Agree! In fact, I would appreciate more well written articles explaining basic concepts on the front page of Hacker News. It is always good to revisit some basic concepts, but it is even better to relearn them. I am surprised by how often I realize that my definition of a concept is wrong or just superficial.


Also it's nice to have a set of well-written explainers for when someone asks about a concept.


This has more depth on System V/POSIX IPC, and a YouTube video.

https://www.softprayog.in/programming/interprocess-communica...

Fun fact: IPC was introduced in "Columbus UNIX."

https://en.wikipedia.org/wiki/CB_UNIX


> how to know when it’s time to replace my synchronous inter-service HTTP requests with a queue

I've found that once it's inconveniently long for a synchronous client-side request, it's less about the performance or metrics and more about reasoning. Some things are queue-shaped, or async-job-shaped. The worker -> main app communication pattern can even remain sync HTTP calls or not (like callback-based or something), but if you have something that has high variance in timing or is a background thing, then just kick it off to workers.

I'd also say start simple and only go to Kafka or some other high dev-time overhead solution when you start seeing Redis/Rabbit stop being sufficient. Odds are you can make the simple solution work.


I think the article would be a little bit more useful to non-beginners if it included an update on the modern landscape of MQs. Are people still using Apache Kafka, lol?

it is a fine enough article as it is though!


Kafka is a distributed log system. Yes, people use Kafka as a message queue, but it's often the wrong tool for the job; it wasn't designed for that.


> but I’d love to read more about how to know when it’s time to replace my synchronous inter-service HTTP requests with a queue. What metrics should I consider, and what are the trade-offs? I’ve learned some answers to this question over time, but these guys are theoretically message queue experts. I’d love to learn about more things to look out for.

Not OP but I have some background on this.

An Erlang loss system is like a set of phone lines. Imagine a special call center where you have N operators, each of whom takes calls, talks for some time (serving the customer) and hangs up. Unlike many call centers, however, they don’t keep you in line. Therefore, if all operators are busy the system hangs up on you and you have to explicitly call again. This is somewhat similar to a server with N threads.

Let's assume N=3.

Under common mathematical assumptions (constant arrival rate with Poisson arrivals, i.e. exponentially distributed inter-arrival times, and exponential service times) you can define:

1) “traffic intensity” (rho) is the ratio between arrival rate and service rate (intuitively, how “heavy” arrivals are with respect to “departures”)

2) the blocking probability is given by the Erlang B formula (see the sketch below) for parameters N (number of threads) and rho (traffic intensity). Basically, if traffic intensity = 1 (arrival rate = service rate), the blocking probability is 6.25%. If the service rate is twice the arrival rate, this drops to approximately 1%. If the service rate is 1/10 of the arrival rate, the blocking probability is 73.3%.
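
Since the formula is hard to typeset here, a minimal Python sketch of Erlang B - B(N, rho) = (rho^N / N!) / sum(rho^k / k! for k = 0..N) - reproducing the numbers above:

    from math import factorial

    def erlang_b(n_servers: int, rho: float) -> float:
        # blocking probability of an Erlang loss system
        numerator = rho ** n_servers / factorial(n_servers)
        denominator = sum(rho ** k / factorial(k) for k in range(n_servers + 1))
        return numerator / denominator

    print(erlang_b(3, 1.0))   # 0.0625  -> 6.25%
    print(erlang_b(3, 0.5))   # ~0.0127 -> ~1%
    print(erlang_b(3, 10.0))  # ~0.732  -> ~73%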

I will try to write down part 2 when I find some time.

EDIT - Adding part 2

So, let's add a buffer. We said we have three threads, right? Let's say the system can handle up to 6 requests before dropping: 1 processed by each thread plus an additional 3 buffered requests. Under the same distribution assumptions, this is known as an M/M/3/6 queue.

Some math crunching under the previous service and arrival rate scenarios:

- if service rate = arrival rate, blocking probability drops to about 0.2%. Of course there is now a non-zero wait probability (close to 9%).

- if service rate = twice the arrival rate, blocking probability is 0.006% and there is a roughly 1% wait probability.

- if service rate = 1/10 of the arrival rate, blocking probability is 70% and waiting probability is 29%.

This means that a buffer reduces request drops due to busy resources, but also introduces a waiting probability. Pretty obvious. Another obvious thing is that you need additional memory for that queue length. Assuming queue length = 3, and 1 KB messages, you need 3 KB of additional memory.

A less obvious thing is that you are adding a new component. Assuming "in series" behavior, i.e. requests cannot be processed when the buffer system is down, this decreases overall availability if the queue is not properly sized. What I mean is that, if the system crashes when more than 4 KB of memory are used by the process, but you allow queue sizes up to 3 (3 KB + 3 KB = 6 KB), availability is not 100%, because in some cases the system accepts more requests than it can actually handle.

An even less obvious thing is that things, in terms of availability, change if you consider server and buffer as having distinct "size" (memory) thresholds. Things get even more complicated if server and buffer are connected by a link which itself doesn't have 100% availability, because you also have to take into account the link unavailability.
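
For the curious, here's a minimal sketch of the standard birth-death solution behind these M/M/3/6 numbers; it's my own cross-check of the textbook formulas, so the figures could differ slightly from the ones above depending on the exact assumptions:

    from math import factorial

    def mmck_probs(c: int, k: int, a: float) -> list[float]:
        # steady-state probabilities p_0..p_k of an M/M/c/k queue,
        # where a = arrival rate / service rate (offered load)
        weights = []
        for n in range(k + 1):
            if n <= c:
                weights.append(a ** n / factorial(n))
            else:
                weights.append(a ** n / (factorial(c) * c ** (n - c)))
        total = sum(weights)
        return [w / total for w in weights]

    probs = mmck_probs(c=3, k=6, a=1.0)
    blocking = probs[-1]        # an arrival finds the system full
    waiting = sum(probs[3:6])   # all servers busy, buffer not yet full
    print(f"blocking={blocking:.4f}, waiting={waiting:.4f}")  # ~0.0022, ~0.0876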


I only really ever play one game, so that's not a blocker for me.

I would have switched by now, but film and audio production software, including VSTs, doesn't seem to be well supported on Linux. I'd love to hear from someone if you are successfully doing this.


> I only really ever play one game, so that's not a blocker for me.

I play loads of games; it's mainly AAA multiplayer games that can't run on Linux due to kernel anti-cheat - nearly everything else runs well with minimal effort using Proton via Steam (either installed via Steam or imported as a non-Steam game).


Music production is indeed still a blocker. I used to use Windows for that; I am now on macOS for work and music (much better than Windows in every way! I use an old trashcan Mac Pro with Monterey for my studio computer) and Debian for my personal machines.


I'd say less than .00000001 percent of the world is in the same use case as you.


I vibe coded a Windows shell extension that renders thumbnails for 10-bit videos. Windows does not do this out of the box.

I also built a Preview Pane Handler for 10-bit videos.

The installers (WiX) were vibe coded as well.

So was the product website and Stripe integration. I created a bespoke license generation system on checkout.

I don’t think I wrote a single line of C++ code, although the WiX installers and website did receive minimal manual adjustments.

Started with Claude but then at some point during development Codex got really good so I used only that.

https://ruptureware.com


Netflix on Apple TV has an issue if "Match Content" is "off" where it will constantly downgrade the video stream to a lower bitrate unnecessarily.

Even after fixing that issue, the video quality is never great compared to other services.


I just launched a 10-Bit Video Thumbnail Provider for Windows.

Windows does not natively support rendering thumbnails for 10-bit videos, which are commonly produced by cameras like the Sony A7IV.

When I started working on a short film the video clips were piling up on my hard drive. Opening them one by one to find what I was looking for was tedious.

I could not find a reputable solution to this problem, so I started a company and built one. I went through the process of EV Certification to have the installer and executable code signed.

I hope to be in the Microsoft Store soon.

I'm also building other utilities with similar purpose.

https://ruptureware.com/thumbprovider


Let's not conflate the two things that were said.

It is absolutely true that companies were rushing to rewrite their code every few years when the new shiny JS library or framework came out. I was there for it. There was a quick transition from [nothing / mootools?] to jQuery to Backbone to React, with a short Angular detour about 13 years ago. If you had experience with the "new" framework you could pretty much get a front-end gig anywhere with little friction. I rewrote many codebases across multiple companies to Backbone during that time per the request of engineering management.

Now, is React underappreciated? In the past 10 years or so I've started to see a pattern of lack of appreciation for what it brings to the table and the problems it solved. It is used near universally because it was such a drastic improvement over previous options that it was instantly adopted. But as we know, adoption does not mean appreciation.

> React is used near universally, despite there being alternatives that are better in almost every way.

Good example of under-appreciation.


Having worked in both over the years, the main technical thing React had going for it over Vue, in my humble opinion, was much better TypeScript support. Otherwise they are both so similar that it comes down to personal preference.

However, zero of the TypeScript projects (front and back end) I've worked on (unless I was there when they started) used strict mode, so the TypeScript support was effectively wasted.


No, I was also around when React was new, moving to it from tangles of jQuery and Backbone. I absolutely know React brought several lasting innovations, in particular the component model, and I do appreciate that step change in front-end development. But other frameworks have taken those ideas and iterated on them to make them more performant, less removed from the platform, and generally nicer to work with. That is where we are today.

I agree that there was a period where many organizations did rewrite their apps from scratch, many of them switching to React, but I think very few did it ”every couple of years”, and I think very few are doing it at all today (at least not because of hype - of course there might always be other reasons you do a big rewrite). We should not confuse excitement about new technologies for widespread adoption, especially not in replacing existing code in working codebases.


I read parent's comment as an assertion that the current "fast-moving JavaScript world" expects everyone to rewrite their app. Personally I've never seen this, but since React became popular ~13+ years ago, I struggle to believe this has actually been true for others in any meaningful way.


Mootools is still around! "Copyright © 2006-2025". I don't know anyone who uses it, but glad it see it's still going.

https://mootools.net/core


MooTools also features in the infamous SmooshGate:

https://developer.chrome.com/blog/smooshgate



You can also just grab the same piece from Substack: https://thenoosphere.substack.com/p/just-how-many-more-succe...


Any odd blurring, distortion, or vignetting you might find around the edges could be caused by anamorphic lenses. Vignetting is often also added in post.


Is anyone using any of these tools to write non boilerplate code?

I'm very interested.

In my experience ChatGPT and Gemini are absolutely terrible at these types of things. They are constantly wrong. I know I'm not saying anything new, but I'm waiting to personally experience an LLM that does something useful with any of the code I give it.

These tools aren't useless. They're great as search engines and pointing me in the right direction. They write dumb bash scripts that save me time here and there. That's it.

And it's hilarious to me how these people present these tools. It generates a bunch of code, and then you spend all your time auditing and fixing what is expected to be wrong.

That's not the type of code I'm putting in my company's code base, and I could probably write the damn code more correctly in less time than it takes to review for expected errors.

What am I missing?


>What am I missing?

That you are trying to use LLMs to create the giant, sprawling, feature-packed software packages that define the modern software landscape. What's being missed is that any one user might only utilize 5% of the code base on any given day. Software is written to accommodate every need every user could have in one package. Then the users just use the small slice that accommodates their specific needs.

I have now created 5 hyper narrow programs that are used daily by my company to do work. I am not a programmer and my company is not a tech company located in a tech bubble. We are a tiny company that does old school manufacturing.

To give a quick general example, Betty uses Excel to manage payroll. A list of employees, a list of wages, a list of hours worked (which she copies from the time clock software's .csv that she imports into Excel).

Excel is a few-million-LOC program and costs ~$10/mo. Betty needs maybe 2k LOC to do what she uses Excel for. Something an LLM can do easily: a Python GUI wrapper on an SQLite DB. And she would be blown away at how fast it is, and how it is written for her use specifically.
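
To make that concrete, here's a toy sketch of the rough shape such a tool might take - Tkinter over SQLite, with a hypothetical table and fields, not Betty's actual program:

    import sqlite3
    import tkinter as tk

    conn = sqlite3.connect("payroll.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hours (employee TEXT, week TEXT, hours REAL)"
    )

    def add_row():
        # insert one row of hours worked, then persist it
        conn.execute(
            "INSERT INTO hours VALUES (?, ?, ?)",
            (emp.get(), week.get(), float(hrs.get())),
        )
        conn.commit()

    root = tk.Tk()
    root.title("Hours entry")
    emp, week, hrs = tk.Entry(root), tk.Entry(root), tk.Entry(root)
    for widget in (emp, week, hrs):
        widget.pack()
    tk.Button(root, text="Add hours", command=add_row).pack()
    root.mainloop()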

How software is written and how it is used will change to accommodate LLMs. We didn't design cars to drive on horse paths, we put down pavement.


The Romans put down paved roads to make their horse paths more reliable.

But yes, I hope we get away from the giant conglomeration of everything, ESPECIALLY the reality of people doing 90% of their business inside a Google Chrome window. Move towards the UNIX philosophy of tiny single-purpose programs.


> I have now created 5 hyper narrow programs that are used daily by my company to do work. I am not a programmer and my company is not a tech company located in a tech bubble. We are a tiny company that does old school manufacturing.

OK, great.

> That you are trying to use LLMs to create the giant, sprawling, feature-packed software packages that define the modern software landscape. What's being missed is that any one user might only utilize 5% of the code base on any given day. Software is written to accommodate every need every user could have in one package. Then the users just use the small slice that accommodates their specific needs.

With all due respect, the fact that you made a few small programs to help with your tasks is wonderful, but this last statement alone rather disqualifies you from making an assessment of software engineering in general.

There's a great number of reasons why codebases get large. Complex problems inherently come with complexity and scale in both code and integrations. You can choose to move the complexity around but never fully get rid of it.


But how much of the software industry is truly solving inherently complex problems?

At a very conservative guess I'd say no more than 10% (and my actual guess would be <1%)


A lot of people are deeply invested in these things being better than they really are, from the OpenAIs and Googles spending hundreds of billions EACH developing LLMs to VC-backed startups promising their "AI agent" can replace entire teams of white-collar employees. That's why your experience matches mine and that of every other developer I personally know, yet you see comments everywhere making much grander claims.


I agree, but I'd add that it's not just the tech giants who want them to be better than they are, but also non-programmers.

IMO LLMs are actually pretty good at writing small scripts. First, it's much more common for a small script to be in the LLM's training data, and second, it's much easier to find and fix a bug. So the LLM actually does allow a non-programmer to write correct code with minimal effort (for some simple task), and then they are blown away thinking writing software is a solved problem. However, these kinds of people have no idea of the difference between a hundred line script where an error is easily found and isn't a big deal and a million line codebase where an error can be invisible and shut everything down.

Worst of all is when the two sides of tech-giants and non-programmers meet. These two sides may sound like opposites but they really aren't. In particular, there are plenty of non-programmers involved at the C-level and the HR levels of tech companies. These people are particularly vulnerable to being wowed by LLMs seemingly able to do complex tasks that in their minds are the same tasks their employees are doing. As a result, they stop hiring new people and tell their current people to "just use LLMs", leading to the current hiring crisis.


TBH, this website in the last few years has attracted an increasingly non-technical audience. And the field, in general, has attracted a lot of less experienced folks that don't understand the implications of what they're doing. I don't mean that as a diss-- but just a reflection of reality.

Indeed, even codex (and i've been using it prior to this release) is not remotely at the level of even a junior engineer outside of a set of tasks.


Occasionally. I find that there is a certain category of task that I can hand over to an LLM and get a result that takes me significantly less time to clean up than it would have taken me to write from scratch.

A recent example from a C# project I was working in. The project used builder classes that were constructed according to specified rules, but all of these builders were written by hand. I wanted to automatically generate these builders, and not using AI, just good old meta-programming.

Now I knew enough to know that I needed a C# source generator, but I had absolutely no experience with writing them. Could I have figured this out in an hour or two? Probably. Did I write a prompt in less than five minutes and get a source generator that worked correctly in the first shot? Also yes. I then spent some time cleaning up that code and understanding the API it uses to hook into everything and was done in half an hour and still learnt something from it.

You can make the argument that this source generator is in itself "boilerplate", because it doesn't contain any special sauce, but I still saved significant time in this instance.


I've built a number of personal data-oriented and single purpose tools in Replit. I've constrained my ambitions to what I think it can do but I've added use cases beyond my initial concept.

In short, the tools work. I've built things 10x faster than doing it from scratch. I also have a sense of what else I'll be able to build in a year. I also enjoy not having to add cycles to communicate with external contributors -- I think, then I do, even if there's a bit of wrestling. Wrangling with a coding agent feels a bit like "compile, test, fix, re-compile". Re-compiling generally got faster in subsequent generations of compiler releases.

My company is building internal business functions using AI right now. It works too. We're not putting that stuff in front of our customers yet, but I can see that it'll come. We may put agents into the product that let them build things for themselves.

I get the grumpiness & resistance, but I don't see how it's buying you anything. The puck isn't underfoot.


I think it all depends on your platform and use cases. In my experience AI tools work best with Python and JS/Typescript and some simple use cases (web apps, basic data science etc). Also, I've found they can be of great help with refactorings and cases when you need to do something similar to already existing code, but with a twist or change.


What you're missing is how to use the tools properly. With solid documentation, good project management practices, a well-organized code structure and tests, any junior engineer should be able to read up on your codebase, write linted code following your codebase style, verify it via tests and write you a report of what was done, challenges faced etc. State of the art coding agents will do that at superhuman speeds.

If you haven't set things up properly (important info lives only in people’s heads / meetings, tasks don't have clear acceptance criteria, ...) then you aren't ready for Junior Developers yet. You need to wait until your Coding Agents are at Senior level.


Firstly, LLM chat interfaces != agentic coding platforms.

ChatGPT is good for asking questions about languages, SDKs, and APIs, or generating boilerplate, but it's useless if you want to give an AI a ticket and have it raise PRs for you.

This is where you need agentic solutions like Codex which will be far more useful because they will actually have access to your codebase and a dev environment where they can test and debug changes.

They still do really dumb things, but a lot of this can be avoided if you prompt well and give it the right types of problems to solve.

In my experience at the moment there's a sweet spot with these agentic coding platforms which makes them useful for semi-complicated tasks – assuming you prompt well they can generate 90% of the code you need, then you just need to spend the extra 10% fixing it up before it's ready for prod.

Tasks too simple (a few lines) it's a waste of time. You spend longer prompting and going back and forth with the agent than it would take to just make the change yourself.

Then obviously very complicated tasks, especially tasks that require some thought around architecture and performance, coding agents really struggle with. Less because they can't do it, but because for certain problems simply meeting ACs is far less important than how the ACs are being met. Ideally here you want to get the architecture right first, then once that's in place you can break down the remaining work for the AI to pick up.


I think most code these days is boilerplate, though the composition of boilerplate snippets can become something unique and differentiated.


you might be missing small things to create more guardrails like effective prompting and maintaining what's been done using files, carefully controlling context, committing often in-between changes, but largely, you're not missing anything. i use AI constantly, but always for subtasks of a larger complicated thing that my brain has thought through. and often use higher cost models to help me abstractly think through complex things/point me in the right directions.

personally, i've always operated in a codebase in a way that i _need_ to understand how things work for me to be productive and make the right decisions. I operate the same way with AI. every change is carefully reviewed, if it's dumb, i make it redo it and explain why it's dumb. and if it gets caught in a loop, i reset the context and try to reframe the problem. overall, i'm definitely more productive, but if you truly want to be hands off--you're in for a very bad time. i've been there.

lastly, some codebases don't work well with AI. I was working on a problem that was a bit more novel/out there and no model could solve it. Just yapped endlessly about these complex, very potentially smart-sounding solutions that did absolutely nothing. went all the way to o1-pro. the craziest part to me was the fact that across Claude, DeepSeek and OpenAI, they used the same specific vernacular for this particular problem, which really highlights how a lot of these models are just a mish-mash of the same underlying architecture/internet data. some of these models use responses from other models for their training data, which to me is like incest. you won't get good genetic results


It’s probably what you’re asking. You can’t just say “write me an app”, you have to break a big problem into small problems for it.


I tried using Gemini 2.5 Pro for a side-side-project; it seemed like a good project to explore LLMs and how they'd fit into my workflow. 2-3 weeks later it's around 7k loc of Python auto-generating about 35k loc of C from a JSON spec.

This project is not your typical webdev project, so maybe that's an interesting case study. It takes a C-API spec in JSON, loads and processes it in Python, and generates a C library that turns a UI marked up in YAML/JSON into C-API calls to render that UI. [1]

The result is pretty hacky code (by my design, can't/won't use FFI) that's 90% written by Gemini 2.5 Pro Pre/Exp but it mostly worked. It's around 7k lines of Python that generate a 30-40k loc C-library from a JSON LVGL-API-spec to render an LVGL UI from YAML/JSON markup.

I probably spent 2-3 weeks on this, I might have been able to do something similar in maybe 2x the time but this is about 20% of the mental overhead/exhaustion it would have taken me otherwise. Otoh, I would have had a much better understanding of the tradeoffs and maybe a slightly cleaner architecture if I would have to write it. But there's also a chance I would have gotten lost in some of the complexity and never finished (esp since it's a side-project that probably no-one else will ever see).

What worked well:

* It mostly works(!). Unlike previous attempts with Gemini 1.5 where I had to spend about as much or more time fixing than it'd have taken me to write the code. Even adding complicated features after the fact usually works pretty well with minor fixing on my end.

* Lowers mental "load" - you don't have to think so much about how to tackle features, refactors, ...

Other stuff:

* I really did not like Cursor or Windsurf - I half-use VSCode for embedded hobby projects but I don't want to then have another "thing" on top of that. Aider works, but it would probably require some more work to get used to the automatic features. I really need to get used to the tooling, not an insignificant time investment. It doesn't vibe with how I work, yet.

* You can generate a *significant* amount of code in a short time. It doesn't feel like it's "your" code though, it's like joining a startup - a mountain of code, someone else's architecture, their coding style, comment style, ... and,

* there's this "fog of code", where you can sorta bumble around the codebase but don't really 100% understand it. I still have mid/low confidence in the changes I make by hand, even 1 week after the codebase has largely stabilized. Again, it's like getting familiar with someone else's code.

* Code quality is ok but not great (and partially my fault). Probably depends on how you got to the current code - ie how clean was your "path". But since it is easier to "evolve" the whole project (I changed directions once or twice when I sort of hit a wall) it's also easier to end up with a messy-ish codebase. Maybe the way to go is to first explore, then codify all the requirements and start afresh from a clean slate instead of trying to evolve the code-base. But that's also not an insignificant amount of work and also mental load (because now you really need to understand the whole codebase or trust that an LLM can sufficiently distill it).

* I got much better results with very precise prompts. Maybe I'm using it wrong, ie I usually (think I) know what I want and just instruct the LLM instead of having an exploratory chat but the more explicit I am, the more closely the output is to what I'd like to see. I've tried to discuss proposed changes a few times to generate a spec to implement in another session but it takes time and was not super successful. Another thing to practice.

* A bit of a later realization, but modular code and short, self-contained modules are really important though this might depend on your workflow.

To summarize:

* It works.

* It lowers initial mental burden.

* But to get really good results, you still have to put a lot of effort into it.

* At least right now, it seems you will still eventually have to put in the mental effort at some point, normally it's "front-loaded" where you have to do the design and think about it hard, whereas the AI does all the initial work but it becomes harder to cope with the codebase once you reach a certain complexity. Eventually you will have to understand it though even if just to instruct the LLM to make the exact changes you want.

[1] https://github.com/thingsapart/lvgl_ui_preview


Yes, think of it as a search engine that auto-applies that Stack Overflow fix to your code.

But I have done larger tasks (writing device drivers) using Gemini.


I feel things get even worse when you use a more niche language. I get extremely disappointed any time I try to get it to do anything useful in Clojure. Even as a search engine, especially when asking it about libraries, these tools completely fail expectations.

I can't even fathom how frustrating such tools would be with poorly written confusing Clojure code using some niche dependency.

That being said, I can imagine a whole class of problems where this could succeed very well at and provide value. Then again, the type of problems that I feel these systems could get right 99% of the time are problems that a skilled developer could fix in minutes.


Hey there!

Lots missing here, but I had the same issues; it takes iteration and practice. I use Claude Code in terminal windows, and a text expander to save explicit reminders that I have to inject super regularly, because Anthropic obscures access to system prompts.

For example, I have 3-to-8-paragraph-long instructions I will place regularly about not assuming, checking deterministically, etc., and for most things I have the agents write a report with a specific instruction set.

I pop the instructions into the text expander, so I just type "- docs" when telling it to go figure this out and give me the path to the report when done.

They come back with a path, and I copy it and search in VS Code.

It opens as an .md and I use preview mode; it's similar to a Google Doc.

And I'll review it. Always, things will be wrong: tons of assumptions, failures to check deterministically, etc... but I see that in the doc and have it fix it - correct misunderstandings, update the doc until it's perfect.

From there I'll say to add a plan in a table with a status for each task based on this (another text expander snippet with instructions).

And WHEN that's 100% right, I'll say to implement and update as you go. The "update as you go" forces it to recognize and remember the scope of the task.

Greatest point of failure in the system is misalignment. The ethics teams got that right. It compounds FAST if allowed: you let them assume things, they state assumptions as facts, that becomes what other agents read, and you get true chaos unchecked.

I started rebuilding Claude Code from scratch, literally because they block us from accessing system prompts and I NEED these agents to stop lying to me about things that are not done or assumed, which highlights the true chaos possible when applied to system-critical operations in governance or at scale.

I also built my own tool, like Codex, for managing agent tasks and making this simpler, but getting them to use it without getting confused is still a gap.

Let me know if you have any other questions. I am performing the work of 20 engineers as of today; I rewrote 2 years of back-end code that had required the full-time work of a team of 2 engineers, in 4 weeks, by myself, with this system... so I am, I guess, quite good at it.

I need to push my edges further into this latest tech, have not tried codex cli or the new tool yet.


It's a total of about 30 snippets, averaging 6 paragraphs long, that I have to inject. For each role switch it goes through, I have to re-inject them.

It's a pain but it works.

Even with TDD it will hallucinate the mocks without management, and hallucinate the requirements. Each layer has to be checked atomically, but the text expander snippets, done right, can get it close to 75% right.

My main project faces 5000 users so I can't let the agents run freely, whereas with isolated projects in separate repos I can let them run more freely, then review in GitKraken before committing.


You could just use something like Roo Code with custom modes rather than manually injecting them. The orchestrator mode can decide on the other appropriate modes to use for subtasks.

You can customize the system prompts, baseline prompts, and models used for every single mode, and have as many or as few as you want.


It may depend on what you consider boilerplate. I use them quite a bit for scripting outside of direct product code development. Essentially, AI coding tools have moved this chart's decision-making math for me: https://xkcd.com/1205/ The cost to automate manual tasking is now significantly lower, so I end up doing more of it.
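
A rough sketch of that xkcd-1205 break-even arithmetic, with purely illustrative numbers:

    def worth_automating(build_hours: float, minutes_saved: float,
                         runs_per_week: float, horizon_years: float = 5.0) -> bool:
        # automation pays off when the build cost is less than
        # the total time saved over the horizon
        saved_hours = minutes_saved / 60 * runs_per_week * 52 * horizon_years
        return build_hours < saved_hours

    # a daily 5-minute task saves ~152 hours over 5 years
    print(worth_automating(build_hours=8, minutes_saved=5, runs_per_week=7))  # True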

