The only thing I dislike is the blending across the scanlines (i.e., in the vertical dimension). This is highly inaccurate to how displays of the time handled the image. Scanlines were completely discrete in the signal, and the only vertical blending that happened was inside the CRT itself, as the electrons excited the phosphors around the desired point on the screen. In fact, with regard to 320x200 mode, in the VGA era this was almost always scandoubled to 400 lines, meaning that sharp, unblended scanlines were actually more prominent (especially considering the finer dot pitch of the average VGA CRT versus the average consumer CRT TV). Moreover, prior to VGA's introduction, scanlines were even more physically separated in 15kHz 200-line modes, and black gaps were often present (simulating those is beyond the scope of mere aspect correction and gets into actual CRT emulation).
The 5:6 nearest neighbor idea is the ideal, as it requires no interpolation. Beyond that, I feel the best approach is to scale vertically in a nearest neighbor fashion, and then scale horizontally to the proper aspect-correct width using a suitable interpolation algorithm (I'm partial to simple bilinear at this step, as it gives a conceptual representation of the electron gun rising or falling in energy to match the target signal, which effectively provides a bit of horizontal blending). To adjust the sharpness of those transitions, you can first integer scale horizontally to some multiple of the stored width, and then scale to the desired target width. Going directly from stored to target resolution may produce quite a soft image (which may be desired, depending on the characteristics of the period display that would have shown the image). The returns diminish quickly above 3x width in the interim step, but it does provide a small degree of control.
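To make the pipeline above concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions: the image is a list of rows of grayscale numbers, the function names (`scale_rows_nearest`, `prescale_width`, `resize_row_linear`, `aspect_correct`) are made up for this post, and the horizontal step is a simple 1D linear interpolation with pixel-center alignment rather than any particular library's resampler.

```python
def scale_rows_nearest(image, out_height):
    """Vertical nearest neighbor: each output row is a copy of exactly one source row."""
    in_height = len(image)
    return [list(image[(y * in_height) // out_height]) for y in range(out_height)]


def prescale_width(image, factor):
    """Optional interim step: repeat every pixel `factor` times horizontally."""
    return [[px for px in row for _ in range(factor)] for row in image]


def resize_row_linear(row, out_width):
    """Resample one row to out_width with linear interpolation (horizontal only)."""
    in_width = len(row)
    scale = in_width / out_width
    out = []
    for x in range(out_width):
        # Map the center of the output pixel back into source coordinates,
        # clamping so the edges reuse the outermost source pixels.
        src = (x + 0.5) * scale - 0.5
        src = min(max(src, 0.0), in_width - 1)
        left = int(src)
        right = min(left + 1, in_width - 1)
        t = src - left
        out.append(row[left] * (1 - t) + row[right] * t)
    return out


def aspect_correct(image, out_width, out_height, prescale=1):
    """Nearest neighbor vertically, then (optionally prescaled) linear horizontally."""
    tall = scale_rows_nearest(image, out_height)
    wide = prescale_width(tall, prescale)
    return [resize_row_linear(row, out_width) for row in wide]


if __name__ == "__main__":
    tiny = [[0, 100], [200, 50]]
    for row in aspect_correct(tiny, 4, 4, prescale=1):
        print(row)
```

Because rows are only ever copied, no output row mixes two source scanlines. Raising `prescale` makes the horizontal transitions steeper before the final resample. If I'm reading the 5:6 case right (width x5, height x6, e.g. 320x200 to 1600x1200), then calling this with `prescale=5` and `out_width=1600` turns the horizontal step into plain pixel repetition as well.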