If you want better latency and throughput, you wouldn't be using the kernel network stack; you'd be opting for a userspace networking stack like DPDK or Onload.
Obviously that depends on where your application's bottlenecks are, as well as on your NIC and the characteristics of your hardware.
True, and the Linux kernel has zero-copy AF_XDP, which lets packet memory be shared with userspace. However, low-latency networking is a lot more than just simple kernel bypass.
It's things like pinning CPU cores dedicated to networking, disabling C-states, epolling, and being able to use the bespoke firmware interfaces designed for SmartNICs. Also the protocol side, i.e. using NIC offloads like TCP checksum offload and TSO.
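To make the pinning and polling part concrete, here is a minimal Linux-only sketch. It assumes "epolling" means spinning on epoll with a zero timeout rather than sleeping in it; `pin_to_core` and `spin_recv` are hypothetical helper names, and the core choice and spin limit are arbitrary.

```python
import os
import select
import socket
from typing import Optional


def pin_to_core(core: int) -> None:
    """Restrict the calling thread to one CPU core (pid 0 = caller)."""
    os.sched_setaffinity(0, {core})


def spin_recv(sock: socket.socket, max_spins: int = 1_000_000) -> Optional[bytes]:
    """Poll the socket in a tight loop instead of blocking in epoll_wait."""
    sock.setblocking(False)
    ep = select.epoll()
    ep.register(sock.fileno(), select.EPOLLIN)
    try:
        for _ in range(max_spins):
            # timeout=0 returns immediately, so the core never goes to sleep
            if ep.poll(0):
                return sock.recv(2048)
        return None  # gave up after max_spins empty polls
    finally:
        ep.close()


if __name__ == "__main__":
    # Assumption: just take the lowest core we are allowed to run on.
    core = min(os.sched_getaffinity(0))
    pin_to_core(core)
    rx, tx = socket.socketpair()
    tx.send(b"tick")
    print(spin_recv(rx))
```

In a real deployment you would presumably also isolate that core from the scheduler and steer the NIC's interrupts away from it; the spin loop burns 100% of the core by design.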
Heck, the application would also need to be adjusted for a low-latency environment, probably via a custom JVM and tricks like touching data structures/variables ahead of time so they are already in CPU cache.
Frankly, I would recommend trying OpenOnload, which unlike DPDK is at least compatible with native Linux socket programming.
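What "compatible with native socket programming" buys you is that ordinary socket code stays as it is. A rough sketch, assuming nothing beyond the standard library (`udp_echo_once` is a made-up name, and loopback is used only to keep it self-contained; any acceleration would apply to traffic over a supported NIC):

```python
import socket


def udp_echo_once(host: str = "127.0.0.1") -> bytes:
    """Send one datagram to a local echo socket and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind((host, 0))  # port 0: let the OS pick a free port
        server.settimeout(1.0)
        client.settimeout(1.0)
        client.sendto(b"ping", server.getsockname())
        data, addr = server.recvfrom(2048)
        server.sendto(data, addr)  # echo it straight back
        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print(udp_echo_once())
```

As far as I know you would start the same unmodified script through the `onload` launcher (something like `onload python3 echo.py`), which intercepts the libc socket calls. With DPDK the send/receive path would have to be rewritten against its own API.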