Caching Input with Google Gemini

A little over a month ago, Google announced multiple updates to their GenAI platform. I made a note of it for later research and finally got time to look at one aspect – context caching. When you send prompts to a GenAI system, your input is tokenized for analysis. While it's not a "one token per word" relation, basically the bigger the input (context), the higher the cost (tokens). The process of converting your input into tokens takes time, especially when dealing with large media such as a video. Google introduced a "Context caching" system that helps improve the performance of your queries. As the docs suggest, this is really suited for cases where you’ve got... more →
Posted in: JavaScript

Adding Caching to a Cloudflare Worker

Last week I blogged about my first experience building a Cloudflare Worker serverless function. In that post, I built a simple serverless function that wrapped calls to the Pirate Weather API, a free and simple-to-use API for getting weather information. For today’s post, I thought I’d show how easy it is to add a bit of caching to the worker to help improve its performance. As with my last post, I’ve also got a video walkthrough of everything that you can watch instead. (Or read and watch, go crazy!)

The Application

In the last post, I shared the complete code of the Worker, but let me share it again:

// Lafayette, LA
const LAT = 30.22;
const LNG = -92.02;
export default { async fetch(request,... more →
Posted in: JavaScript
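
The excerpt above is cut off before it reaches the caching code, so here is a rough sketch of the general idea only: keep a response around for a fixed time and reuse it instead of calling the upstream API again. This is not the approach from the post. `createTtlCache`, `ttlMs`, `now`, and `loader` are hypothetical names made up for illustration, and the cache lives in plain memory rather than in any Cloudflare feature.

```javascript
// A minimal time-based (TTL) cache. All names here are illustrative
// assumptions, not taken from the original post.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expires }

  return {
    // Return the cached value for `key` if it has not expired yet;
    // otherwise call `loader`, store its result, and return it.
    get(key, loader) {
      const hit = entries.get(key);
      if (hit && hit.expires > now()) {
        return hit.value;
      }
      const value = loader();
      entries.set(key, { value, expires: now() + ttlMs });
      return value;
    },
  };
}
```

In a weather wrapper like the one described, the key could be the coordinates and the loader could be the call to the weather API. If the loader returns a promise, the promise itself is what gets stored, so a failed request would also be remembered until it expires. In-memory state in a serverless function is not guaranteed to stick around between requests, so treat this as a way to reason about the pattern rather than a drop-in solution.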