Using the Gemini File API for Prompts with Media

Using media in your prompts (what’s called ‘multimodal’) with the Gemini API is fairly simple in small cases. You can encode your input with base64 and pass it along with your prompt. While this works well, it’s got limitations that may be quickly hit – most specifically a file size limit of 20 megs. A few months ago, I shared a demo of using your device’s camera to detect cat breeds. With today’s cameras taking incredibly detailed pictures, I hit that limit right away and had to write some code to resize the image to a smaller size. Luckily, the Gemini API has a better way of handling that, the File API.
The File API #
This API provides a separate method... more →
Posted in: JavaScript