I only heard about this today and thought those seriously interested in ML might want to see it too.
You can use small LLMs as proxies to steer the outputs of large LLMs, while only bearing the cost of fine-tuning the small LLMs. Quite a brilliant insight, actually. (This is a blog post that anyone can read, not a research paper.)
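For anyone wondering what this looks like mechanically, here's a minimal sketch of the logit-offset trick that proxy-tuning-style methods use (my assumption about what the post describes, not a claim about its exact recipe). The GPT-2 model names are placeholders for a large frozen model plus a small base/fine-tuned pair that share the same tokenizer:

```python
# Sketch: steer a large frozen LM with the logit delta from a small fine-tuned LM.
# Assumptions: all three models share a vocabulary; gpt2-xl / gpt2 are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

large = AutoModelForCausalLM.from_pretrained("gpt2-xl")     # big model, never fine-tuned
small_base = AutoModelForCausalLM.from_pretrained("gpt2")   # small model, untuned
small_tuned = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in for your fine-tuned small model
tok = AutoTokenizer.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(10):  # greedy decoding for a few tokens
        logits_large = large(ids).logits[:, -1, :]
        logits_base = small_base(ids).logits[:, -1, :]
        logits_tuned = small_tuned(ids).logits[:, -1, :]
        # The large model is nudged by whatever the fine-tune changed in the small one.
        combined = logits_large + (logits_tuned - logits_base)
        next_id = combined.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```

The appeal is exactly what the post says: the expensive model's weights never change, so you only pay to fine-tune (and store) the small pair.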