CNN models being "flaky" on GeForce hardware isn't something I've heard of. NVIDIA has made some deliberate decisions that make GeForce cards less attractive for deep learning performance-wise, but I don't think making them produce incorrect results is in their best interest. What hardware did you test this on?