Caching Input with Google Gemini

A little over a month ago, Google announced multiple updates to their GenAI platform. I made a note of it for research later and finally got time to look at one aspect – context caching.
When you send prompts to a GenAI system, your input is tokenized for analysis. While not a "one token per word" relation, basically the bigger the input (context) the more the cost (tokens). The process of converting your input into tokens takes time, especially when dealing with large media, for example, a video. Google introduced a "Context caching" system that helps improve the performance of your queries. As the docs suggest, this is really suited for cases where you’ve got... more →
Posted in: JavaScript