Well, Python doesn't really do proper multi-threading currently, thanks to the GIL letting only one thread execute Python bytecode at a time. So removing it would make it possible to write Python code that actually runs in parallel across threads, without resorting to extra processes and their overhead.
So if you are writing a small single-process Python script, then removing the GIL shouldn't really change much. If you are doing some heavier CPU-bound computing or, e.g., running a server back-end, then there are significant performance gains available with this change.
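To make that concrete, here is a minimal sketch (not a benchmark) that runs the same pure-Python, CPU-bound job serially and in a thread pool. The function names and workload are made up for illustration. On a regular GIL build the two timings should come out roughly the same; on a free-threaded build the threaded run would be expected to scale with cores. `sys._is_gil_enabled()` only exists on CPython 3.13+, so the sketch assumes the GIL is on when that function is missing.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """Pure-Python, CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_serial(limit: int, jobs: int) -> list[int]:
    # Run the same job `jobs` times, one after another.
    return [count_primes(limit) for _ in range(jobs)]


def run_threaded(limit: int, jobs: int) -> list[int]:
    # Run the same job `jobs` times, one thread per job.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(count_primes, [limit] * jobs))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: interpreters without sys._is_gil_enabled (pre-3.13) have the GIL.
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    serial, t_serial = timed(run_serial, 200_000, 4)
    threaded, t_threaded = timed(run_threaded, 200_000, 4)
    print(f"GIL enabled: {gil_enabled}")
    print(f"serial:   {t_serial:.2f}s")
    print(f"threaded: {t_threaded:.2f}s")
```

The results are identical either way; only the wall-clock time is expected to differ between builds.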
You don’t have to use separate processes to get the benefit of multithreading in Python today: you can also call into a library written in native code that drops the GIL (e.g. NumPy or PyTorch).
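A small sketch of that pattern, using only the standard library so it runs anywhere: `hashlib` stands in here for NumPy or PyTorch, since its documentation says the GIL is released while hashing inputs larger than 2047 bytes. The helper names and chunk sizes are invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(chunk: bytes) -> str:
    # The hashing happens in native code; for inputs larger than 2047 bytes
    # hashlib releases the GIL, so other threads can run in the meantime.
    return hashlib.sha256(chunk).hexdigest()


def hash_chunks_threaded(chunks, workers: int = 4) -> list[str]:
    # Plain threads are enough here because the heavy part runs without the GIL.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, chunks))


if __name__ == "__main__":
    # Eight 8 MiB buffers, each filled with a different byte value.
    chunks = [bytes([i]) * (8 * 1024 * 1024) for i in range(8)]
    for hexdigest in hash_chunks_threaded(chunks):
        print(hexdigest[:16])
```

Only the time spent inside the native call overlaps between threads. Any Python-level work around the call is still serialized by the GIL on a regular build.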
Even then the GIL can cause issues. PyTorch's concerns are specifically one of the motivations of the PEP, and one of the reasons Meta / FB really, really wants this:
> In PyTorch, Python is commonly used to orchestrate ~8 GPUs and ~64 CPU threads, growing to 4k GPUs and 32k CPU threads for big models. While the heavy lifting is done outside of Python, the speed of GPUs makes even just the orchestration in Python not scalable. We often end up with 72 processes in place of one because of the GIL. Logging, debugging, and performance tuning are orders-of-magnitude more difficult in this regime, continuously causing lower developer productivity.
I feel like orchestrating thousands of GPUs is such a niche use case that it’s fair to expect the people wanting to do it to learn a better-suited language, rather than ruining Python for everyone else.
I notice you used the strong emotional word "ruining" when talking about the effect on Python of this change.
Why do you believe an obscure runtime concurrency detail which will make more things possible will "ruin" the language?
Now match and :=? Those definitely ruin the language. ;-)
But seriously, relax, nothing bad is happening here. It's not just people who have to use the torch launcher who have been bitten by Python's currently-terrible multicore story. I've been a Python programmer for 15 years and I think this is a wonderful change.
Yes, like bash, Python is a language that exists mainly as glue for code written in other languages. Do you think we need to add multithreading to bash?
It's (likely) much less expensive (in many ways, not just financially) to employ a larger number of Python programmers than a smaller number of programmers skilled in a language more appropriate for the use case. Engineer flexibility, salary costs, maintenance/correctness concerns with implications for development time, etc., are all factors here. The technical choice of "Python or not Python" is rarely the only (or even the most important) choice to make.
You are completely right. Why don't they write their stuff in another language? They've got the resources. Now the rest of the world will suffer the consequences, one of which may be that the devs of native libs will simply abandon the work, or that those libs will become too difficult to use for the casual or starting programmer, completely defeating the purpose.
I'm fine with two builds, but not a single non-GIL build.
> Now the rest of the world will suffer the consequences, one of which may be that the devs of native libs will simply abandon the work, or that those libs will become too difficult to use for the casual or starting programmer, completely defeating the purpose.
Or get the benefits, so casual or starting programmers won't be wondering why their Python program refuses to go above 100% CPU, or have to deal with the bullshit of multiprocessing.
> I'm fine with two builds, but not a single non-GIL build.
That might have been the original intention. The latest notice from the SC says:
> Our base assumptions are:
> * Long-term (probably 5+ years), the no-GIL build should be the only build. We do not want to create a permanent split between with-GIL and no-GIL builds (and extension modules).
They repeat it later. It looks as if they really want to remove it.
> have to deal with the bullshit of multiprocessing
The problems multi-threading introduces outweigh that by far.
That only works in some cases, if the boundary between Python and native code is absolute. In many cases users want to extend/configure the behavior of that native code, e.g. through callbacks or subclassing, and the GIL makes the behavior prohibitively slow (needing to lock/unlock to serialize at any of these potential Python<->native boundaries) or unsafe (deadlocks/corruption if the GIL isn't handled).
There's a lot of C++ code bound to Python (e.g. via pybind11) where the GIL currently imposes a hard bound on how users can employ parallelism, even in "nominally" native code.