mathisfun123 · 2026-02-28T06:32:38 1772260358

> Given this understanding, I don't see why I should quit.

https://en.wikipedia.org/wiki/Motivated_reasoning

mathisfun123 · 2026-02-26T17:15:41 1772126141

in theory that's what a compiler is - a thin wrapper over a SAT solver. in practice most compilers just use heuristics <shrug>.

mathisfun123 · 2026-02-24T22:11:46 1771971106

> I think this is solid proof that the bedrock of academia is deeply motivated by money and still defaults to optimizing where it impacts its bottom line.

no shit - could've asked literally anyone that's finished their phd to save yourself the conjecturing/hypothesizing about this fact.

mathisfun123 · 2026-02-20T06:28:07 1771568887

https://yuri.is/not-julia/

> My conclusion after using Julia for many years is that there are too many correctness and composability bugs throughout the ecosystem to justify using it in just about any context where correctness matters.

mathisfun123 · 2026-02-20T06:26:29 1771568789

i hope you realize this is purely because julia uses LLVM and LLVM has backends for those targets (noticeably absent are GPUs which do not have LLVM backends). any other language which uses LLVM could do the exact same thing (and would be hampered in the exact same way).

majoe · 2026-02-20T16:43:00 1771605780

Probably true, but one unique thing about Julia is that it exposes almost all stages of the compilation to the user. From typed IR to native code generation, you can customise the compilation in many ways. Together with the power of LISP's metaprogramming features, that's a really fine basis for powerful and performant DSLs and code transformations.

All those GPU targets are powered by libraries that are not part of Julia itself (GPUCompiler.jl). The same goes for automatic differentiation. That's remarkable in my opinion.

So you're right that many programming languages could do it, but it's no wonder that other languages are lacking in this regard compared to Julia.

mathisfun123 · 2026-02-20T06:24:26 1771568666

i write C++ every day (i actually like it...) but absolutely no one is going to switch from C to C++ just for dtors.

kibwen · 2026-02-20T07:40:04 1771573204

No, RAII is one of the primary improvements of C++ over C, and one of the most ubiquitous features that is allowed in "light" subsets of C++.

flohofwoe · 2026-02-20T08:16:51 1771575411

> but absolutely no one is going to switch from C to C++ just for dtors

The decision would be easier if the C subset in C++ were compatible with modern C standards instead of being a non-standard dialect of C stuck in ca. 1995.

gpderetta · 2026-02-20T10:48:00 1771584480

Of course not! Those that would have, already did!

Pay08 · 2026-02-20T06:38:22 1771569502

Weren't dtors the reason GCC made the switch?

uecker · 2026-02-20T09:32:50 1771579970

I don't think so. As a contributor to GCC, I also wished it hadn't.

Pay08 · 2026-02-20T12:46:15 1771591575

Why do you think so?

uecker · 2026-02-20T12:50:32 1771591832

For two reasons: First, where C++ features are used, it makes the code harder to understand rather than easier. Second, it requires newer and more complex toolchains to build GCC itself. Some people still maintain the last C version of GCC just to keep the bootstrap path open.

Pay08 · 2026-02-22T08:19:14 1771748354

I'm very far from compiler development, but in my experience, while C++ is hard to read, the equivalent C code would be much more unreadable.

uecker · 2026-02-22T09:18:53 1771751933

This is not my experience at all. In fact, my experience is that where C++ is used in GCC, the code became harder to read. Note that GCC was written in C and only later introduced C++ features, so this is not hypothetical.

In general, I think clean C code is easier to read than C++ because it has much less complexity and lacks the language features that hide crucial information from the reader (overloading, references, templates, auto, ...).

mathisfun123 · 2026-02-18T15:08:11 1771427291

> list object is constructed once and assigned to both variables

Ummm no the list is constructed once and assigned to b and then b is assigned to a. It would be crazy semantics if `a = b = ...` meant `a` was assigned `...`.

Edit: I'm wrong it's left to right not right to left, which makes the complaint in the article even dumber.

kccqzy · 2026-02-18T15:58:13 1771430293

It’s assigned left to right, not right to left. It’s documented in the Python language reference.

> An assignment statement evaluates the expression list and assigns the single resulting object to each of the target lists, from left to right.

Consider this:

    a = [1, 2]
    i = a[i] = 1

If assignment were to happen right to left, you would get a NameError exception because the first assignment would require an unbound variable.
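A small sketch of both points, for anyone who wants to try it: the right-hand side is evaluated once, and the targets are bound left to right. `Recorder` is a made-up helper that only exists to log the order of assignments; it is not from the article or the language reference.

```python
# One list object is built, then bound to each target from left to right.
a = b = []
same_object = a is b  # both names refer to the single list


class Recorder(dict):
    """Hypothetical helper: records the order in which keys are assigned."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        self.order.append(key)
        super().__setitem__(key, value)


ns = Recorder()
ns["first"] = ns["second"] = 0  # ns.order shows which target was bound first

# The example from above: i is bound before xs[i] is assigned,
# so there is no NameError.
xs = [1, 2]
i = xs[i] = 1
```

Inspecting `same_object`, `ns.order`, and `xs` afterwards shows the single shared list and the left-to-right binding order.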

mathisfun123 · 2026-02-18T16:40:20 1771432820

Fine, but that illustrates even more how goofy the expectation is that the "ctor" for [] would be called twice.

ayhanfuat · 2026-02-18T15:37:20 1771429040

> then b is assigned to a

Wouldn't that require a LOAD_FAST? Also a is assigned first (from left to right) so a = ... happens either way.
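One way to check the LOAD_FAST point is to disassemble a chained assignment. This is only a sketch: the function name `chained` is made up, and the exact opcode names depend on the CPython version (the instruction that duplicates the top of the stack is named differently across releases), so the code only looks for a list build and for the absence of local loads.

```python
import dis


def chained():
    # Hypothetical function used only so the names are fast locals.
    a = b = []


ops = [ins.opname for ins in dis.get_instructions(chained)]

# The list is built exactly once.
build_count = ops.count("BUILD_LIST")

# If b were read back and then stored into a, a LOAD_FAST would show up here.
local_loads = [op for op in ops if op.startswith("LOAD_FAST")]

print(ops)
```

If `local_loads` is empty, neither name is read back: the value is duplicated on the stack and stored into each target.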

mathisfun123 · 2026-02-18T05:28:56 1771392536

it's astounding to me how many people pop off about "AMD SHOULD SUPPORT CUDA" not knowing that HIP (and hipify) has been around for literally a decade now.

colordrops · 2026-02-18T05:30:31 1771392631

Please explain to me why all the major players are buying Nvidia then? Is HIP a drop in replacement? No.

You have to port every piece of software you want to use. It's ridiculous to call this a solution.

woctordho · 2026-02-18T06:51:44 1771397504

Major players in China don't play like that. MooreThreads, Lisuan, and many other smaller companies all have their own porting kits, which are basically copied from HIP. They just port every piece of software and it just works.

If you want to fight against Nvidia monopoly, then don't just rant, but buy a GPU other than Nvidia and build on it. Check my GitHub and you'll see what I'm doing.

mathisfun123 · 2026-02-18T06:59:37 1771397977

> Is HIP a drop in replacement? No.

You don't understand what HIP is - HIP is AMD's runtime API. It resembles the CUDA runtime APIs but it's not the same thing and it doesn't need to be - the hard part of porting CUDA isn't the runtime APIs. hipify is the thing that translates both runtime and kernels. Now, is hipify a drop-in replacement? No, of course not, but that's because the two vendors have different architectures. So it's absolutely laughable to imagine that some random could come anywhere near "drop-in replacement" when AMD can't (again: because of fundamental architecture differences).

colordrops · 2026-02-18T07:05:49 1771398349

Who said "some random"? Read the whole thread. I was suggesting AMD invest BILLIONS to make this happen. You're arguing with a straw man.

bigyabai · 2026-02-18T07:30:06 1771399806

I think you misunderstand what's fundamentally possible with AMD's architecture. They can't wave a magic wand for a CUDA compatibility layer any better than Apple or Qualcomm can, it's not low-hanging fruit like DirectX or Win32 translation. Investing billions into translating CUDA on raster GPUs is a dead end.

AMD's best option is a greenfield GPU architecture that puts CUDA in the crosshairs, which is what they already did for datacenter customers with AMD Instinct.

colordrops · 2026-02-18T08:32:07 1771403527

I do not misunderstand.

Let's say you put 50-100 seasoned devs on the problem. Within 2-3 years you could probably get ZLUDA to the point where most mainstream CUDA applications (ML training/inference, scientific computing, rendering) run correctly on AMD hardware at 70-80% of the performance you'd get from a native ROCm port. Even if it's not optimal due to hardware differences, it would be genuinely transformative and commercially valuable.

This would give them runway for their parallel effort to build native greenfield libraries and toolkits and get adoption, and perhaps make some tweaks to future hardware iterations that make compatibility easier.

zvr · 2026-02-18T14:05:53 1771423553

Before the "ZLUDA" project completion, they would be facing a lawsuit for IP infringement, since CUDA is owned by NVIDIA.

colordrops · 2026-02-18T16:45:19 1771433119

They would win, compatibility layers are not illegal.

bigyabai · 2026-02-18T18:21:53 1771438913

Win against who? AMD is the one that asked them to take it down: https://www.tomshardware.com/pc-components/gpus/amd-asks-dev...

And while compatibility layers aren't illegal, they ordinarily have to be a cleanroom design. If AMD knew that the ZLUDA dev was decompiling CUDA drivers to reverse-engineer a translation layer, then legally they would be on very thin ice.

bigyabai · 2026-02-18T18:14:27 1771438467

ROCm is supported by a minority of AMD GPUs, and is accelerated inconsistently across GPU models. 70-80% of ROCm's performance is an unclear target, to the point that a native ROCm port would be a more transparent choice for most projects. And even then, you'll still be outperformed by CUDA the moment tensor or convolution ops are called.

Those billions are much better off being spent on new hardware designs, and ROCm integrations with preexisting projects that make sense. Translating CUDA to AMD hardware would only advertise why Nvidia is worth so much.

> it would be genuinely transformative and commercially valuable.

Bullshit. If I had a dime for every time someone told me "my favorite raster GPU will annihilate CUDA eventually!" then I could fund the next Nvidia competitor out of pocket. Apple didn't do it, Intel didn't do it, and AMD has tried three separate times and failed. This time isn't any different, there's no genuine transformation or commercial value to unlock with outdated raster-focused designs.

KeplerBoy · 2026-02-18T07:51:24 1771401084

This is a big part of AMD still not having a proper foothold in the space: AMD Instinct is quite different from what regular folks can easily put in their workstation. In Nvidia-land I can put anything from a mid-range gaming card, through a 5090, to an RTX 6000 Pro in my machine and be confident that my CUDA code will scale somewhat acceptably to a datacenter GPU.

bigyabai · 2026-02-18T08:07:21 1771402041

This is where I feel like Khronos could contribute, making a Compute Capability-equivalent hardware standard for vendors to implement. CUDA's versioning of hardware capabilities plays a huge role in clarifying the support matrix.

...but that requires buy-in from the rest of the industry, and it's doubtful FAANG is willing to thread that needle together. Nvidia's hedged bet against industry-wide cooperation is making Jensen the 21st century Mansa Musa.

mathisfun123 · 2026-02-18T15:02:52 1771426972

No I'm arguing with someone who clearly doesn't understand GPUs

> invest BILLIONS to make this happen

As I have already said twice, they already have, it's called hipify and it works as well as you'd imagine it could (i.e. poorly, because this is a dumb idea).

KennyBlanken · 2026-02-18T07:49:50 1771400990

Wow you're so very smart! You should tell all the LLM and Stable Diffusion developers who had no idea it existed! /s

HIP has been dismissed for years because it was a token effort at best. Linux only until the last year or two, and even now it only supports a small number of their cards.

Meanwhile CUDA runs on damn near anything, and both Linux and Windows.

Also, have you used AMD drivers on Windows? They can't seem to write drivers or Windows software to save their lives. AMD Adrenalin is a slow, buggy mess.

Did I mention that compute performance on AMD cards was dogshit until the last generation or so of GPUs?

mathisfun123 · 2026-02-16T19:51:17 1771271477

what's your point?

mathisfun123 · 2026-02-15T12:43:29 1771159409

> It might be necessary to create a legal basis, but it's just a matter of doing it

Tell me you don't know anything about the law without telling me