
hnthrowaway0315 · 2026-02-28T04:43:40 1772253820

Ah, is it the time when Skynet starts to manifest itself...

hnthrowaway0315 · 2026-02-27T19:36:58 1772221018

Ah, I have been meaning to read Hyperion for a long, long time :/

hnthrowaway0315 · 2026-02-22T22:38:30 1771799910

This. It is just a mental drug.

cagenut · 2026-02-22T22:50:48 1771800648

so is love

this level of reductive thought termination goes nowhere

hnthrowaway0315 · 2026-02-19T14:02:45 1771509765

Thank you. I love the Paged Out wallpapers and always set one as my default wallpaper on macOS.

hnthrowaway0315 · 2026-02-13T19:09:27 1771009767

What's the point of this web page?

hnthrowaway0315 · 2026-02-10T17:41:47 1770745307

I have given the topic some thought. I concluded that the ONLY way for ordinary people (non-genius, IQ <= 120) to get really good, to get really close to the geniuses, is to sit down and condense the past 40 or so years of tech history in three topics (Comp-Arch, OS and Compiler) into 4-5 years of self-education.

Such an education is COMPLETELY different from the one offered in school, but closer to those offered in premium schools (MIT/Berkeley). Basically, I'd call it "Software engineering archaeology". Students are supposed to take on ancient software, compile it, and figure out how to add new features.

For example, for the OS kernel branch:

- Course 0: MIT xv6 lab, then figure out which subsystem you are interested in (fs? scheduler? drivers?)

- Course 0.5: System programming for modern Linux and NT, mostly to get familiar with user space development and syscalls

- Course 1: Build Linux 0.95, running all of your toolchains in a Docker container. Move it to 64-bit. Say you are interested in fs -- figure out the VFS code and write a couple of filesystems for it. Linux 0.95 only has Minix fs, so there are a lot of simpler options to choose from.

- Course 2: Maybe build a modern Linux, like 5.9, and then do the same thing. This time the student is supposed to implement a much more sophisticated fs, maybe something from SunOS or WinNT that was not there.

- Course 3 & 4: Do the same thing with leaked NT 3.5 and NT 4.0 kernel. It's just for personal use so I wouldn't worry about the lawyers.

For reading, there are a lot of books about Linux kernels and NT kernels.

hnthrowaway0315 · 2026-02-09T16:28:05 1770654485

I wonder who left the team recently. Must be someone loaded with shadow knowledge. Or maybe they sent the devops/dev work to another continent.

jsheard · 2026-02-09T16:39:26 1770655166

They're in the process of moving from "legacy" infra to Azure, so there's a ton of churn happening behind the scenes. That's probably why things keep exploding.

estimator7292 · 2026-02-09T16:57:34 1770656254

I don't know jack about shit here, but genuinely: why migrate a live production system piecewise? Wouldn't it be far more sane to start building a shadow copy on Azure and let that blow up in isolation while real users keep using the real service on """legacy""" systems that still work?

chickenpotpie · 2026-02-09T17:08:07 1770656887

Because it's significantly harder to isolate problems, and you'll end up in this loop:

* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding problem in one system and then fix it
* Deploy everything
* It explodes
* Rollback everything
* Spend two weeks finding a new problem that was created while you were fixing the last problem
* Repeat ad nauseam

Migrating iteratively gives you a foundation to build upon with each component

wizzwizz4 · 2026-02-09T17:28:21 1770658101

So… create your shadow system piecewise? There is no reason to have "explode production" in your workflow, unless you are truly starved for resources.

paulddraper · 2026-02-09T20:58:08 1770670688

Does this shadow system have usage?

Does it handle queries, trigger CI actions, run jobs?

wizzwizz4 · 2026-02-09T22:23:33 1770675813

If you test it, yes.

Of course, you need some way of producing test loads similar to those found in production. One way would be to take a snapshot of production, tap incoming requests for a few weeks, log everything, then replay it at "as fast as we can" speed for testing; another way would be to just mirror production live, running the same operations in test as run in production.

Alternatively, you could take the "chaos monkey" approach (https://www.folklore.org/Monkey_Lives.html), do away with all notions of realism, and just fuzz the heck out of your test system. I'd go with that, first, because it's easy, and tends to catch the more obvious bugs.
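A minimal sketch of the "log everything, then replay" idea, assuming requests and their production responses were recorded as plain values. `Request`, `replay`, `legacy_handler` and `shadow_handler` are hypothetical names for illustration, not anyone's real tooling:

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    body: str = ""


Handler = Callable[[Request], str]
Recorded = tuple[Request, str]  # (request, response production gave at the time)


def replay(log: Iterable[Recorded], shadow: Handler) -> list[tuple[Request, str, str]]:
    """Feed each logged request to the shadow system and collect mismatches."""
    mismatches = []
    for request, expected in log:
        actual = shadow(request)
        if actual != expected:
            mismatches.append((request, expected, actual))
    return mismatches


def legacy_handler(request: Request) -> str:
    # Stand-in for the production system whose traffic was tapped.
    return f"{request.method} {request.path} ok"


def shadow_handler(request: Request) -> str:
    # Stand-in for the migrated system, with a deliberate bug on /jobs.
    if request.path.startswith("/jobs"):
        return "error"
    return f"{request.method} {request.path} ok"


traffic = [Request("GET", "/repos"), Request("POST", "/jobs/42"), Request("GET", "/user")]
log = [(request, legacy_handler(request)) for request in traffic]
mismatches = replay(log, shadow_handler)
```

Replaying from a log compares against what production actually answered, so it can run "as fast as we can" without touching the live system. It does nothing about state drift between the snapshot and the log, which is where the real work would be.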

chickenpotpie · 2026-02-09T23:51:27 1770681087

So just double your cloud bill for several few weeks, costing a site like GitHub millions of dollars?

How do you handle duplicate requests to external services? Are you going to run credit cards twice? Send emails twice? If not, how do you know it's working with fidelity?

paulddraper · 2026-02-10T01:43:59 1770687839

> several few weeks

*many months

1718627440 · 2026-02-10T10:46:50 1770720410

You can mirror all requests to the shadow system.
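A rough sketch of what that could look like, assuming the shadow side talks to a recording stand-in instead of the real external service, so nothing gets sent twice. `Outbox`, `make_handler` and `mirror` are made-up names for illustration, and this does not settle the fidelity question raised above:

```python
from typing import Callable


class Outbox:
    """Stands in for an external service such as an email provider."""

    def __init__(self, deliver: bool) -> None:
        self.deliver = deliver
        self.recorded: list[str] = []   # everything the system tried to send
        self.delivered: list[str] = []  # what actually left the building

    def send(self, message: str) -> None:
        self.recorded.append(message)
        if self.deliver:
            self.delivered.append(message)  # the only real side effect


def make_handler(outbox: Outbox) -> Callable[[str], str]:
    def handle(request: str) -> str:
        outbox.send(f"notification for {request}")
        return f"handled {request}"
    return handle


def mirror(request: str, primary: Callable[[str], str],
           shadow: Callable[[str], str], diffs: list) -> str:
    """Serve from primary, send a copy to shadow, and note any disagreement."""
    response = primary(request)
    try:
        shadow_response = shadow(request)
        if shadow_response != response:
            diffs.append((request, response, shadow_response))
    except Exception as exc:  # a broken shadow must never hurt real users
        diffs.append((request, response, f"shadow raised {exc!r}"))
    return response


live = Outbox(deliver=True)
dark = Outbox(deliver=False)
diffs: list = []
result = mirror("push #1", make_handler(live), make_handler(dark), diffs)
```

The user only ever sees the primary response. The shadow's outgoing messages are recorded rather than delivered, so they can be compared with the live ones afterwards.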

paulddraper · 2026-02-09T17:07:42 1770656862

A few reasons:

1. Stateful systems (databases, message brokers) are hard to switch back-and-forth; you often want to migrate each one as few times as possible.

2. If something goes sideways -- especially performance-wise -- it can be hard to tell the reason if everything changed.

3. It takes a long time (months/years) to complete the migration. By doing it incrementally, you can reap the advantages of the new infra, and avoid maintaining two things.

---

All that said, GitHub is doing something wrong.

throwway120385 · 2026-02-09T17:00:08 1770656408

Why would you avoid a perfect opportunity to test a bunch of stuff on your customers?

toast0 · 2026-02-09T17:57:04 1770659824

If you make it work, migrating piecewise should be less change/risk at each junction than a big jump between here and there of everything at once.

But you need to have pieces that are independent enough to run some here and some there, and ideally pieces that can fail without taking down the whole system.

literallyroy · 2026-02-09T17:05:03 1770656703

That’s a safer approach but will cause teams to need to test in two infrastructures (old world and new) til the entire new environment is ready for prime time. They’re hopefully moving fast and definitely breaking things.

helterskelter · 2026-02-09T17:23:33 1770657813

It took me a second to realize this wasn't sarcasm.

hnthrowaway0315 · 2026-02-09T16:56:05 1770656165

Are they just going to tough it out through the process and whatever...

perdomon · 2026-02-09T16:39:06 1770655146

I think it's more likely the introduction of the ability to say "fix this for me" to your LLM + "lgtm" PR reviews. That or MS doing their usual thing to acquired products.

persedes · 2026-02-09T19:20:25 1770664825

The rumor I've heard is that GitHub is mostly run by contractors? That might explain the chaos more than simple vibe coding (which probably aggravates this).

arccy · 2026-02-09T16:39:24 1770655164

nah, they're just showing us how to vibecode your way to success

hnthrowaway0315 · 2026-02-09T16:42:08 1770655328

If the $$$ they saved > the $$$ they lose then yeah it is a success. Business only cares about $$$.

collingreen · 2026-02-09T17:51:15 1770659475

Definitely. The devil is in the details though since it's so damn hard to quantify the $$$ lost when you have a large opinionated customer base that holds tremendous grudges. Doubly so when it's a subscription service with effectively unlimited lifetime for happy accounts.

Business by spreadsheet is super hard for this reason - if you try to charge the maximum you can before people get angry and leave then you're a tiny outage/issue/controversy/breach from tipping over the wrong side of that line.

hnthrowaway0315 · 2026-02-09T18:04:04 1770660244

Yeah, but who cares about the long term? In the long term we are all dead. The CEO only needs to be good for 5-10 years max, pump up the stock price, get applause everywhere, and be called the smartest guy in the world.

hnthrowaway0315 · 2026-02-09T16:27:24 1770654444

Someone should make a timeline chart from that, lol.

jakub_g · 2026-02-09T17:01:55 1770656515

https://updog.ai/status/github

mrshu · 2026-02-09T18:30:19 1770661819

Here it is. It looks like they are down to a single 9 at this point across all services:

https://mrshu.github.io/github-statuses/

dreadnip · 2026-02-09T18:33:34 1770662014

Can you add a line graph with incidents per month? It would be useful to see whether the number of incidents is going up or down over time.

matt_kantor · 2026-02-09T19:21:21 1770664881

I threw together <https://mkantor.github.io/github-incident-timeline/>. It's by day rather than month, and only shows the last 50 incidents since that's all their API returns.

bckmn · 2026-02-10T14:45:56 1770734756

Prompting

> Let's make a timeline chart of https://www.githubstatus.com/history for the past 1yr and upload it as a gist

yields [GitHub Status Incident Timeline — Feb 2025 to Feb 2026](https://htmlpreview.github.io/?https://gist.githubuserconten...)

219 total incidents across 12 full months, averaging 18.3/month. January 2026 was the worst month, and August 2025 was the calmest.
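For the arithmetic: 219 / 12 = 18.25, which is where the 18.3/month figure comes from. A minimal sketch of the per-month bucketing, assuming you already have one date per incident. The sample dates below are invented for illustration and are not taken from the status page:

```python
from collections import Counter
from datetime import date


def incidents_per_month(days: list[date]) -> dict[str, int]:
    """Count incidents per calendar month, keyed as YYYY-MM in sorted order."""
    counts = Counter(day.strftime("%Y-%m") for day in days)
    return dict(sorted(counts.items()))


# Invented sample data, only to show the shape of the result.
sample = [date(2026, 1, 5), date(2026, 1, 19), date(2026, 2, 9), date(2025, 8, 3)]
per_month = incidents_per_month(sample)
average = sum(per_month.values()) / len(per_month)
worst_month = max(per_month, key=per_month.get)
```

Note that months with zero incidents never show up in the counter, so an average computed this way is over months that had at least one incident.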

stefankuehnel · 2026-02-09T16:33:59 1770654839

Haha, that would be awesome!

gowld · 2026-02-09T16:57:31 1770656251

Light work for an LLM

nozzlegear · 2026-02-09T17:03:49 1770656629

But not Copilot.

peartickle · 2026-02-09T17:20:04 1770657604

Copilot is shown as having policy issues in the latest reports. Oh my, the irony. Satya is like "look ma, our stock is dropping...", Gee I wonder why Mr!!

hnthrowaway0315 · 2026-02-06T14:54:57 1770389697

> you lay out a huge specification that would fully work through all of the complexity in advance, then build it.

I have tried this a couple of times, even for small projects (a few sprints), and it never worked out. I'd argue it never works out for non-system programming projects, has only a theoretical non-zero chance of working out for system programming projects, and perhaps a 5-10% chance for very critical projects where no patch is possible (like a moon landing).

Because requirements always change. Humans always change. That's it. No need to elaborate.

hnthrowaway0315 · 2026-02-02T19:21:20 1770060080

The guy wanted to be his own sysadmin. Quite understandable though.