Optimising cost, quality, latency

There is a trifecta of variables that just about everyone building generative AI systems cares about — cost, quality, and latency. As is often the case in computing, there is a trade-off between these properties. And all three are bought with the currency of performance.

31 Oct 2023 · 2 min · Gianluca Truda