Using Generative AI to Improve Image Filenames

Last night I had an interesting thought. Many times I work with images that have vague filenames. For example, screenshot_1_24_12_23.jpg. Given that there are many APIs out there that can look at an image and provide a summary, what if we could use that to provide a better file name based on the content of the image? Here’s what I was able to find. As always, I began by prototyping in Google AI Studio. I apologize for stating this in basically every post on the topic, but I really want to stress how useful that is for development. I used a very simple prompt: Write a one sentence short summary of this image. The sentence should be no more than five words. And then did a quick test: If... more →
Posted in: JavaScript
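
As a rough sketch of the idea in the excerpt above: once an API returns a short summary, it can be turned into a safe filename. The helper name `summaryToFilename` and the sample summary are hypothetical and not taken from the post. The sketch also assumes the summary arrives as a plain string and that the original file extension should be kept.

```javascript
// Hypothetical helper (not from the post): converts a short AI-generated
// summary into a filesystem-friendly name, keeping the original extension.
function summaryToFilename(summary, originalName) {
  // Keep the original extension (e.g. ".jpg"), if there is one.
  const dot = originalName.lastIndexOf(".");
  const ext = dot > 0 ? originalName.slice(dot).toLowerCase() : "";

  const slug = summary
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accent marks
    .replace(/[^a-z0-9]+/g, "-")     // runs of anything else become one dash
    .replace(/^-+|-+$/g, "");        // trim leading and trailing dashes

  // Fall back to a generic name if nothing usable is left.
  return (slug || "image") + ext;
}

// Assumed sample summary, for illustration only.
console.log(
  summaryToFilename("A cat sleeping on a couch.", "screenshot_1_24_12_23.jpg")
);
// prints "a-cat-sleeping-on-a-couch.jpg"
```

Lowercasing and using dashes keeps the result portable across operating systems. A real version would likely also need to handle name collisions.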

Using Generative AI to Detect Cat Breeds

Let’s be honest, what other use is there for generative AI than working with cats? If you read my previous post on Google’s Gemini AI launch, you may have seen my test prompts asking it to identify the kind of cat shown in a picture. I decided to turn this into a proper web application as a real example of the API in action. Here’s what I came up with. The Front End: For the front end, I decided to make use of a native web platform feature to access the user’s camera via a simple HTML form field. By using capture="camera" on an input tag, you directly get access to the device camera. There are more advanced ways of doing this, but for quick and simple, it works... more →
Posted in: JavaScript
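
For illustration, here is a minimal sketch of the kind of form field the excerpt above describes. The helper name `cameraInputHtml`, the `id` value, and the `accept="image/*"` filter are assumptions, not the post's actual markup. The post uses `capture="camera"`; the sketch defaults to `environment`, one of the two values (`user` and `environment`) defined by the HTML Media Capture specification, but passes through whatever value it is given.

```javascript
// Hypothetical helper (not from the post): builds the markup for a file
// input that asks the browser to open the device camera.
function cameraInputHtml(id, facing = "environment") {
  // "user" requests the front camera, "environment" the rear one.
  // Browsers without capture support fall back to a normal file picker.
  return `<input type="file" id="${id}" accept="image/*" capture="${facing}">`;
}

console.log(cameraInputHtml("catPhoto"));
```

In a page, the returned string would be placed in the form's HTML, and a `change` listener on the input would receive the captured photo as a `File`.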

The Twelve (Generative) Days of Christmas

I tend to have a lot of silly ideas. Not useful ideas. Not good ideas. Silly ideas. Randomly yesterday I was thinking about the Twelve Days of Christmas song. If you aren’t familiar with it, it starts off with a gift for one day, then repeats and adds a second day, and so on and so on. The gifts are: partridge in a pear tree, two turtle doves, three French hens, four calling birds, five gold rings, six geese a-laying, seven swans a-swimming, eight maids a-milking, nine ladies dancing, ten lords a-leaping, eleven pipers piping, twelve drummers drumming. I thought – what if I took each of these phrases and dropped them into an AI image generator? I did, and the results were… kinda fun. Before I show... more →
Posted in: JavaScript

Creating Human-Readable Summaries of Data with Google PaLM Generative AI

Like a lot of folks, I’ve been spending a lot of time thinking about generative AI, and AI in general, and oddly (well, for me), trying to focus on productive uses for it when working with APIs. A few weeks ago I shared my initial impressions of Google’s PaLM 2 API, and today I came up with an interesting use case for it. I’ve seen text summarization as a fairly common use case for gen AI, and I agree, it can be incredibly helpful when working with lots of text. However, I got to thinking today: would it be possible to use this as a way to summarize numerical or other data? So given some process that returns a set of information, can we use gen AI to summarize it? Here’s... more →
Posted in: JavaScript
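
One way to picture the idea in the excerpt above is as a prompt-building step: the data returned by some process is serialized and wrapped in an instruction before being sent to a text model. The sketch below only builds the prompt string. The function name `buildSummaryPrompt`, the prompt wording, and the sample data are all hypothetical, and the actual call to the PaLM API is deliberately left out.

```javascript
// Hypothetical helper (not from the post): wraps structured data in a
// plain-language instruction suitable for a text-generation model.
function buildSummaryPrompt(description, rows) {
  return [
    `The following JSON contains ${description}.`,
    "Write a short, human-readable summary of the data, mentioning any notable highs, lows, or trends.",
    "",
    JSON.stringify(rows, null, 2), // pretty-printed so each record stays readable
  ].join("\n");
}

// Made-up sample data, for illustration only.
const traffic = [
  { day: "Mon", visits: 120 },
  { day: "Tue", visits: 340 },
  { day: "Wed", visits: 95 },
];

console.log(buildSummaryPrompt("daily website visits", traffic));
```

Keeping prompt construction in its own function makes it easy to adjust the wording without touching whatever code sends the request.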