Can you use the CPython approach: an open-addressed hashtable mapping keys to indices into a dynamic array holding (pointers to) the actual keys and values? (Reference stability of course precludes implementing deletes by moving the last key/value entry into the vacated slot; instead you must track empty slots for reuse.)
As soon as the hashmap needs to resize the array will be realloc’d, so it won’t be stable unless you add virtual memory tricks or an indirection via some sort of segmented array.