Interesting! Do you think RLHF would be a necessity for smaller models to perform on par with state-of-the-art LLMs? In my view, instruction tuning will resolve any issues related to output structure, tone, or domain understanding, but will it be enough to improve the reasoning capabilities of a smaller model?