Please see my response there. It's not as simple as you're implying: throughput often suffers if you can't handle requests fast enough. MICA, for instance, injects packets directly into L3 to avoid cache misses, so it has to handle those requests as fast as possible. Otherwise the L3 cache fills up, the queues start to back up, and eventually they put backpressure on incoming packets, which slows down the whole system and greatly decreases throughput. Tail latencies measured in microseconds are good on their own, but the real benefit is the effect on throughput!
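To make the latency/throughput link concrete, here's a toy sketch. It is not how MICA works internally; it's just a single server with a bounded queue and evenly spaced arrivals. The function name `simulate`, its parameters, and all the numbers are made up for illustration, and "dropped" stands in for whatever backpressure does to incoming packets once the buffer is full.

```python
from collections import deque


def simulate(arrival_interval_us, service_time_us, queue_capacity, duration_us):
    """Toy model: one server, bounded buffer, one arrival every arrival_interval_us.

    queue_capacity counts every request in the system, including the one
    currently being served. Returns (completed, dropped).
    """
    in_flight = deque()   # completion times of accepted, unfinished requests
    server_free_at = 0    # time at which the server finishes its current backlog
    completed = 0
    dropped = 0

    t = 0
    while t < duration_us:
        # Retire everything that has finished by the time this request arrives.
        while in_flight and in_flight[0] <= t:
            in_flight.popleft()
            completed += 1

        if len(in_flight) >= queue_capacity:
            # Buffer is full: this models backpressure on incoming packets.
            dropped += 1
        else:
            start = max(t, server_free_at)
            server_free_at = start + service_time_us
            in_flight.append(server_free_at)

        t += arrival_interval_us

    # Count accepted requests that still finish within the measured window.
    completed += sum(1 for done in in_flight if done <= duration_us)
    return completed, dropped


if __name__ == "__main__":
    # Same offered load in both runs; only the per-request service time changes.
    for service_time_us in (5, 20):
        done, lost = simulate(arrival_interval_us=10,
                              service_time_us=service_time_us,
                              queue_capacity=4,
                              duration_us=1000)
        print(f"service={service_time_us}us completed={done} dropped={lost}")
```

With the same arrival rate, the fast server completes everything it is offered, while the slow one fills its buffer within a few arrivals and from then on rejects about every other request. The per-request latency is what caps the throughput.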