Using Generative AI to Improve Image Filenames

Last night I had an interesting thought. Many times I work with images that have vague filenames. For example, screenshot_1_24_12_23.jpg. Given that there are many APIs out there that can look at an image and provide a summary, what if we could use that to provide a better file name based on the content of the image? Here’s what I was able to find. As always, I began by prototyping in Google AI Studio. I apologize for stating this in basically every post on the topic, but I really want to stress how useful that is for development. I used a very simple prompt: Write a one sentence short summary of this image. The sentence should be no more than five words. And then did a quick test: If... more →
Posted in: JavaScript

Using GenAI to Classify an Image as a Photo, Screenshot, or Meme

File this under the "I wasn’t sure if it would work and it did" category. Recently, a friend on Facebook wondered if there was some way to take a collection of photos and figure out which were ‘real’ photos versus memes. I thought it could possibly be a good exercise for GenAI and decided to take a shot at it. As usual, I opened up Google’s AI Studio and did a few initial tests: I then simply removed that image and pasted more info to test. From what I could see, it worked well enough. I then took the source code from AI Studio and began working. The Code: First, I grabbed some pictures from my collection, eleven of them, and tried to get a few photos, memes,... more →
Posted in: JavaScript

Using AI to Beat TimeGuessr

I am currently working on a project which requires me to identify the locations depicted in works of art (more about this, hopefully, very soon). In order to narrow down the exact locations shown in the paintings I have begun to use two AI image identification tools, GeoSpy and Bard. The sketch shown at the top of this post is the ‘Tour de Montelban, Amsterdam’ by Maxime Lalanne. I downloaded Maps Mania… more →
Posted in: Interactive Maps

Using AI and PDF Services to Automate Document Summaries

I first discovered Diffbot way back in 2021 when I built a demo of their APIs for the Adobe Developer blog ("Natural Language Processing, Adobe PDF Extract, and Deep PDF Intelligence"). At that time, I was impressed with how easy Diffbot’s API was to use and also how quickly it responded. I had not looked at their API in a while, but a few days ago they announced new support for summarizing text. I thought this would be a great thing to combine with the Adobe PDF Extract API. Here’s what I found. First off, if you want to try this yourself, you’ll need: Adobe PDF Services credentials. These are free and include 500 transactions per month. For folks who may not know,... more →
Posted in: JavaScript

Using Generative AI to Detect Cat Breeds

Let’s be honest, what other use is there for generative AI than working with cats? If you read my previous post on Google’s Gemini AI launch, you may have seen my test prompts asking it to identify the kind of cat shown in a picture. I decided to turn this into a proper web application as a real example of the API in action. Here’s what I came up with. The Front End: For the front end, I decided to make use of a native web platform feature to access the user’s camera via a simple HTML form field. By using capture="camera" on an input tag, you directly get access to the device camera. There are more advanced ways of doing this, but for quick and simple, it works... more →
Posted in: JavaScript

Using IndexedDB with Alpine.js

A lot of my "x with Alpine" blog posts end up being, well, nothing special. That’s a good thing I suppose as it really helps highlight how simple Alpine.js is. (Note, I go back and forth between including the ".js" when referring to Alpine. I should be more consistent I suppose. On one hand, Alpine.js is the formal name, but Alpine just feels simpler.) That being said, the impetus for this post was to get something basic done before I built something a bit more complex. So if you wish to TLDR – it just works, visit my CodePen for the full source, and come back for the next post. If you’re still curious, keep on reading. IndexedDB – Vanilla or Library?... more →
Posted in: JavaScript

Using Cloudflare’s AI Workers to Add Translations to PDFs

Late last month, Cloudflare announced new AI features in their (already quite stellar) Workers platform. I’ve been a big fan of their serverless feature (see my earlier posts) so I was quite excited to give this a try myself. Before I begin, I’ll repeat what the Cloudflare folks said in their announcement: "Usage is not currently recommended for production apps". So with that in mind, remember that what I’m sharing today may change in the future. The Demo: Before I get into the code, let me share what I’ve built. Now, at the time I wrote this, Cloudflare’s AI stuff was still in beta and there is no cost yet for using the features. This is, obviously, going... more →
Posted in: JavaScript

Using Google PaLM to Gather Sentiment Analysis on a Forum

I’ve really been enjoying working with Google’s PaLM 2 AI API and this week I used it to build a pretty interesting demo I think. What if we could use the generative AI features of PaLM to determine the ‘sentiment’ or general health of a forum? I was able to do so and I think the results are pretty interesting. I’ll remind my readers I’m still fairly new to this, so please reach out if you’ve got suggestions on how to do this better, or found any big mistakes in my implementation. Ok, let’s get started! Sentiment Analysis: In my first post on Google’s PaLM API, I talked about how their "MakerSuite" was a really cool web-based UI... more →
Posted in: JavaScript