As one of the authors of Happy, your post got me very interested.
We wrote the paper over a year ago, using a version of PyPy from back then. It's possible that we would get a bigger speedup with a newer PyPy.
It's interesting that you mention the advanced features. I looked at Hippy, and the most interesting JIT feature it uses is _virtualizable2_, which virtualizes all function locals and unboxes them. We tried using it ourselves, but it forces each function to have a static list of variables and no dynamic variable accesses (like $$x). It looks like Hippy falls back to the regular implementation for dynamic variable accesses, where the entire list is stored in a dictionary. Now I'm wondering how often this happens in real-world code; we assumed it happens often enough to matter.
Also, I'm working on posting a publicly-available version of the paper. I'll post a link when I do that.
For dynamic variable access, you do what PyPy (the Python interpreter) does for globals. You do fall back to a dict, but it is a dict of cells, that is, indexes into the list, not a dict of the variables themselves. That adds an extra indirection, but all static accesses stay efficient even if there is a dynamic access somewhere (which does need a lookup, but too bad). I fear this is not the best place to discuss this further, though; you have my mail.
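To make the idea above concrete, here is a minimal sketch in plain Python, not taken from Hippy or PyPy. The class `Frame` and all of its method names are hypothetical, and the sketch ignores JIT details such as how a virtualizable list could grow. It only shows that static accesses use a slot index directly, while dynamic accesses (like $$x) go through a name-to-index dict that points into the same list.

```python
class Frame:
    """Hypothetical function frame: locals live in a flat list of slots."""

    def __init__(self, static_names):
        # Names known at compile time get fixed slot indexes.
        self.name_to_index = {name: i for i, name in enumerate(static_names)}
        self.slots = [None] * len(static_names)

    # Static access: the compiler already resolved the name to an index,
    # so no dictionary lookup happens here.
    def load_static(self, index):
        return self.slots[index]

    def store_static(self, index, value):
        self.slots[index] = value

    # Dynamic access: the name is only known at run time, so we pay for
    # one dict lookup, then read or write the very same slot.
    def load_dynamic(self, name):
        index = self.name_to_index.get(name)
        return None if index is None else self.slots[index]

    def store_dynamic(self, name, value):
        index = self.name_to_index.get(name)
        if index is None:
            # A variable created dynamically gets a fresh slot (assumption:
            # the slot list is allowed to grow in this simplified model).
            index = len(self.slots)
            self.slots.append(None)
            self.name_to_index[name] = index
        self.slots[index] = value


frame = Frame(["x", "y"])
frame.store_static(0, 10)        # $x = 10, resolved at compile time
frame.store_dynamic("x", 42)     # $$name = 42 where $name == "x"
print(frame.load_static(0))      # static read sees the dynamic write
```

Because both paths share one list, a dynamic write is visible to later static reads without the static path ever touching the dict.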
Please note that some of us find the technical aspects of this conversation fascinating. It's nice to get inside the mind of another developer and see how they approach problems.