Building a UI for Gemini File Stores

Back in November of last year I wrote up a blog post talking about a new (at the time) Google Gemini feature, File Stores: "Gemini File Search and File Stores for Easy RAG". In that post I discussed what it was, how it worked, and built up a simple example. You should definitely read that post first, but if you want the TLDR, here ya go: File Stores (referred to as "File Search") expands on Gemini’s previous ability to work on files in a temporary fashion by allowing you to create a permanent "store" of folders. You can use this for RAG systems and use flexible metadata filters for complex queries. This feature has been out for a few months now and I’ve... more →
Posted in: JavaScript

Gemini File Search and File Stores for Easy RAG

I am really excited about this post as it’s one of the most powerful changes I’ve seen to Google’s Gemini APIs in quite some time. For a while now it’s been really easy to perform searches against a document, or a group of documents. You would upload the file (or files), ask your questions, and that was all you needed. However, the files you uploaded were only there temporarily. This was fine for processes like summarization or categorization where you could automate the process and be done with it. This was also fine for basic chat uses. I blogged an example of this last month: "Building a Document Q&A System with Google Gemini". The new features I’m... more →
Posted in: JavaScript

Building a Document Q&A System with Google Gemini

Document summarization is a powerful and pretty darn useful feature of generative AI, but a proper "question and answer" system can really enable users to interact with a document. This is why you see various document viewing apps, like Acrobat, adding these features to their programs. I thought I’d take a look at building such a system via a simple web app to see how difficult it would be, and honestly, it wasn’t that bad. Having this in your own web app, versus an external vendor, gives you more control over the experience as well. Here’s what I built. The Stack The web app lets you drag and drop a PDF into the page, then renders a preview of the PDF on the left... more →
Posted in: JavaScript

Generative Images with Gemini (New Updates)

Back in January of this year, I wrote up my experience testing out Google’s Imagen 3 APIs to generate dynamic images. A few days ago, Google updated their support with new experimental support in Flash. I’ve been playing with this the last few days and have some code and samples to share with you, but before that, what exactly changed? Gemini and Imagen 3 There are now two different models, and different APIs, to generate images with Google’s AI platform. The new one is Gemini 2.0 Flash Experimental and the previous one (the one covered in my blog post) is Imagen 3. Of course the next question is, why two, and what do you pick? The docs do a great job of explaining the differences,... more →
Posted in: JavaScript

Parsing Uploaded Resumes into Form Fields with Google Gemini

As I’ve recently become somewhat familiar with job application sites (sigh, thanks Adobe), I’ve noticed an interesting feature some sites use. After selecting your resume to upload, they will parse the resume and either offer to, or automatically, fill in some of the form fields of the application for you. I thought it would be interesting to try this myself making use of Google’s Gemini APIs. Here’s what I discovered. The Test Script As always, I began with a script that would take a hard-coded resume and attempt to parse it. For the most part, this is basic "upload a file and ask the AI to talk about it", but in my case, I wanted a very particular set of data... more →
Posted in: JavaScript

Generative AI Images with Gemini and Imagen – an Introduction

I’ve been waiting for this to launch for a few days now, and while technically this isn’t quite yet available in Gemini, only Vertex, it should be testable in Gemini in the very short term. You can now use Google’s APIs to generate really high quality images via their Imagen 3 technology. I’ve got a few blog posts planned that will demonstrate these features (and from what I’ve been told, even more powerful stuff is coming), but I thought I’d start off today with a simple short example. To begin, and remember this may not be available just yet, take a look at the docs, Imagen 3 in the Gemini API. First, let’s consider the sample code, that I’m going... more →
Posted in: JavaScript

Automating Object Detection with Google Gemini GenAI and Pipedream

For my last technical post of the year (although I can’t promise I’ll stop blogging!), I wanted to share an interesting workflow I built using Google Gemini and Pipedream. The idea was somewhat simple – how difficult would it be to build a "general purpose" workflow to look for objects in images and trigger an alert if certain things were found. Here’s what I was able to build. Step One – Image Input In my mind, I imagined this workflow would be tied to some service that was either streaming in video or generating still images. You could imagine a security camera posting new pictures every 30 seconds or so, or some other system that takes a picture at a regular... more →
Posted in: JavaScript

Adding AI Insights to Data with Google Gemini

Yesterday, Elizabeth Siegle, a developer advocate for Cloudflare, showed off a really freaking cool demo making use of Cloudflare’s Workers AI support. Her demo made use of WNBA stats to create a beautiful dashboard that’s then enhanced with AI. You can find the demo here: https://wnba-analytics-ai-insights.streamlit.app/ I found this incredibly exciting. I last looked at Cloudflare’s AI stuff almost an entire year ago ("Using Cloudflare’s AI Workers to Add Translations to PDFs"), and I haven’t quite had a chance to try it again, mostly because I’ve been focused on Google Gemini for my Generative AI work. From an API/usage perspective, Cloudflare’s... more →
Posted in: JavaScript