> You have a single big data structure that can't be shared easily between multiple processes. Can't you use multiprocessing with that? Maybe mapping the data structure to a file and mmapping that in multiple processes? Maybe wrapping the whole thing in a database instead of just using one huge nested dictionary?
That's a ton of additional complexity, not worth it for many use cases, and anything along the lines of "using multiple processes or threads to increase Python performance" does have (or at least did have) quite a few additional foot guns in Python.
In that context, porting a very trivial ad-hoc application to Java (or C# or Rust, depending on what know-how exists in the team) would be faster, or at least not much slower, to do. But it would be more reliably estimable, because it reduces the chance of unexpected issues such as less perf than expected.
Basically, the moment "use mmap" or "use multi-processing" is a reasonable recommendation for something ad-hocish, there is something really wrong with the tools you use, IMHO.
How good is support for numpy / scipy / pandas or equivalents, if they exist, outside Python?
Actually the resulting structure should of course be dumped into an RDBMS or a graph DB and served from there more readily. Doing that takes skill and time though, which often are worth applying elsewhere.
The use case I'm thinking about is very simple: one big data structure that is mostly read from and sometimes written to. Use a single readers-writer lock: a shared lock for reading and an exclusive lock for writing. Then the readers are safe and would only block while a writer is active. Everything else besides the data structure can be per-thread and wouldn't interfere.
The reason we wouldn't want to port this application to another language is that there are 100k lines of existing code that are best written in Python, and no resources to rewrite all that.
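To make the locking scheme concrete, here is a rough sketch of it. Python's standard library has no readers-writer lock, so the `ReadWriteLock` class, `big_structure`, and `run_demo` below are all made up for illustration, and this simple version lets a steady stream of readers starve writers. Note it only shows the locking discipline: under CPython's GIL the readers still don't execute Python bytecode in parallel.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Hypothetical minimal readers-writer lock (no writer priority)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of readers currently inside
        self._writing = False   # True while a writer holds the lock

    @contextmanager
    def read_locked(self):
        # Shared access: wait only while a writer is active.
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()  # wake a waiting writer

    @contextmanager
    def write_locked(self):
        # Exclusive access: wait until there are no readers and no writer.
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()  # wake readers and writers


# Stand-in for the "one huge nested dictionary".
big_structure = {"counter": 0}
rw = ReadWriteLock()


def reader(results):
    with rw.read_locked():
        results.append(big_structure["counter"])


def writer():
    with rw.write_locked():
        big_structure["counter"] += 1


def run_demo(n_readers=8, n_writes=3):
    """Start a few writers and readers, wait for all, return what readers saw."""
    results = []
    threads = [threading.Thread(target=writer) for _ in range(n_writes)]
    threads += [threading.Thread(target=reader, args=(results,))
                for _ in range(n_readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_demo()` leaves `big_structure["counter"]` at the number of writes; which intermediate values the readers observe depends on thread scheduling.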
> Basically, the moment "use mmap" or "use multi-processing" is a reasonable recommendation for something ad-hocish, there is something really wrong with the tools you use, IMHO.
Hmm. So you're saying only languages which bury the lock and mutex handling over shared data are appropriate to use for async parallelism over shared data? Because calling explicit lock() and release() isn't that hard. However, it does incur function call overhead. I suppose some explicit in-language support could partially minimise that.
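For reference, this is roughly what the explicit calls look like in Python next to the `with` form. The names `shared`, `bump_explicit`, and `bump_with` are invented for the example; `threading.Lock` and its `acquire()`/`release()` methods are the real standard-library API.

```python
import threading

lock = threading.Lock()
shared = {"hits": 0}  # hypothetical shared data


def bump_explicit():
    # Explicit acquire/release; try/finally makes sure the lock is
    # released even if the update raises.
    lock.acquire()
    try:
        shared["hits"] += 1
    finally:
        lock.release()


def bump_with():
    # Same locking, but the context manager does acquire/release for us.
    with lock:
        shared["hits"] += 1
```

Both variants do the same thing; the `with` form is mostly a matter of not forgetting the release.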