Since Doom renders the image in vertical columns of pixels (floor, lower wall, then, if there is a portal, it continues rendering the sector behind it, then upper wall, then ceiling), and since browsers are very good at drawing sprites out of larger textures... you could send vertical divs, each shaded with the sector light level and pointing at the correct texture. Instead of hundreds of divs per column you would have something like 5 on average, and they would be textured, shaded, and scaled by the browser?
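To make the idea concrete, here is a rough sketch of how one column's strips might be turned into divs. Everything in it is an assumption for illustration: the `Strip` shape, the field names, and the choice of CSS `background-size` for scaling and `filter: brightness()` for the light level are mine, not anything the engine defines. It also only really fits walls, which scale linearly along a column; floors and ceilings would need something else (flat shading, say).

```typescript
// Hypothetical description of one vertical strip inside a screen column.
interface Strip {
  top: number;      // y of the strip's top edge, in screen pixels
  height: number;   // on-screen height of the strip, in pixels
  texture: string;  // URL of the texture image (assumed name)
  texX: number;     // which texel column of the texture to show
  texWidth: number; // width of the source texture, in texels
  light: number;    // sector light level, assumed to be in 0..255
}

// Build one absolutely positioned, 1px-wide div for a strip.
// The browser does the vertical scaling (background-size) and
// the shading (filter: brightness).
function stripToDiv(x: number, s: Strip): string {
  const brightness = (s.light / 255).toFixed(2);
  return (
    `<div style="position:absolute;left:${x}px;top:${s.top}px;` +
    `width:1px;height:${s.height}px;` +
    `background-image:url(${s.texture});` +
    `background-position:-${s.texX}px 0;` +
    `background-size:${s.texWidth}px ${s.height}px;` +
    `image-rendering:pixelated;` +
    `filter:brightness(${brightness})"></div>`
  );
}

// A column is just its strips, top to bottom.
function columnToHtml(x: number, strips: Strip[]): string {
  return strips.map((s) => stripToDiv(x, s)).join("");
}
```

A column at x = 10 with a ceiling strip, a wall strip, and a floor strip would then be three divs instead of one element per pixel. Texture tiling and vertical texture offsets are ignored here.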
I think the proposal here is to optimize for bandwidth by minimizing the number of divs sent per column per frame. It might actually turn out to be more work for the browser, because it has to lay out columns of divs that are not uniformly sized.
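For a sense of scale on the bandwidth side, a back-of-the-envelope count, assuming Doom's native 320x200 resolution, one element per pixel in the naive case, and the "5 on average" figure from the parent comment. The function name is made up for the example, and none of this says anything about the layout cost.

```typescript
// Elements the server has to describe per frame:
// number of screen columns times elements per column.
function divsPerFrame(columns: number, divsPerColumn: number): number {
  return columns * divsPerColumn;
}

const perPixel = divsPerFrame(320, 200); // 64000: one element per pixel
const perStrip = divsPerFrame(320, 5);   // 1600: about five strips per column
const reduction = perPixel / perStrip;   // 40 times fewer elements
```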
IIRC someone did exactly that around 15 years ago: a game renderer using div strips, first with Wolfenstein and then Doom. It may have been "Jacob Seidelin", who was very active experimenting with early HTML5 tech, but I've lost all the links or they've vanished from the web. All I have left are two screenshots I used in a lecture back then.