Optimising cost, quality, latency
There is a trifecta of variables that just about everyone building generative AI systems cares about: cost, quality, and latency. As is often the case in computing, these properties trade off against one another, and all three are bought with the currency of performance.