Creating Images with Generative AI via Conversation

Last week, I blogged about updates to Google’s Gemini APIs regarding image generation. That post detailed how there are now two models for generating images, with the experimental Gemini Flash model having a nice free tier. One of the interesting features of the API is the ability to edit existing images: in other words, you pass an image to Gemini and, via a prompt, have Gemini update it. I thought it would be kind of fun to see if I could build a ‘chat’ interface for this model, one where you could simply talk to Gemini and have it work on your image along with you. Now to be clear, this is no different than what you can do now at the Gemini website, but I figured it would give... more →
Posted in: JavaScript

Generative Images with Gemini (New Updates)

Back in January of this year, I wrote up my experience testing out Google’s Imagen 3 APIs to generate dynamic images. A few days ago, Google updated their support with new experimental support in Flash. I’ve been playing with this the last few days and have some code and samples to share with you, but before that, what exactly changed? Gemini and Imagen 3 There are now two different models, and different APIs, to generate images with Google’s AI platform. The new one is Gemini 2.0 Flash Experimental and the previous one (the one covered in my blog post) is Imagen 3. Of course the next question is, why two, and what do you pick? The docs do a great job of explaining the differences,... more →
Posted in: JavaScript

Building a Resume Review and Revise System with Generative AI and Flask

The last two sessions of my live stream, Code Break, have been really interesting, at least to me anyway. I’ve been discussing generative AI with Google Gemini and building a relatively simple example while doing so – a resume review and revision system. This started off pretty simply with a Python script and then iterated into a proper Flask app. I thought it would be fun to document the code here a bit and share it with those who couldn’t make the streams. If you would rather just watch the recordings, I’ve got them embedded at the bottom. Feel free to skip to that. Step One – The Script For my first iteration, I built a simple Python script that: Uploaded the... more →
Posted in: JavaScript

Doing Evil Things with Generative AI and Recipes

Let me preface this blog post with a very clear and direct message. Do not do what I did. This is a bad use of generative AI. This is pure silliness with no real practical value whatsoever. This is a really, really bad idea. But it was fun as hell, so here goes. Last year I did two investigations into recipe parsing on the web. As we all know, most recipe sites go out of their way to make the actual recipe, you know, the thing you want to read, obfuscated and buried beneath a lot of stuff that is… well, not the actual recipe. I first investigated JSON-LD and using that to parse web recipes into data: Scraping Recipes Using Node.js, Pipedream, and JSON-LD. This worked really well. I then... more →
Posted in: JavaScript

Generative AI Images with Gemini and Imagen – an Introduction

I’ve been waiting for this to launch for a few days now, and while technically this isn’t quite yet available in Gemini, only Vertex, it should be testable in Gemini in the very short term. You can now use Google’s APIs to generate really high-quality images via their Imagen 3 technology. I’ve got a few blog posts planned that will demonstrate these features (and from what I’ve been told, even more powerful stuff is coming), but I thought I’d start off today with a simple short example. To begin, and remember this may not be available just yet, take a look at the docs, Imagen 3 in the Gemini API. First, let’s consider the sample code, that I’m going... more →
Posted in: JavaScript

Using Generative AI to Help in Customer Service

Ok, before I begin, let me be absolutely clear. I do not think AI can replace customer support. I do think it can supplement and help customer support, however, and I’d like to share an example of what this could look like. Imagine your service has a customer service form or email address. Typically, you type in your question, send it off, and wait. But what if you could provide an AI-generated answer immediately while the person waits? At worst, it doesn’t help. At best, it could be exactly what they need and the request could be terminated, saving everyone time. Let’s consider an example of this. Setting up the AI/RAG System Let’s start with the most complex part, the AI/RAG... more →
Posted in: JavaScript

Classifying Documents with Generative AI

Generative AI and documents is a fairly common topic these days, typically in the form of creating summaries or asking questions about the documents. I was curious how generative AI could help in terms of classification. Way back in January of this year, I blogged about using Google’s Gemini API to classify images based on whether they were a photo, screenshot, or meme: "Using GenAI to Classify an Image as a Photo, Screenshot, or Meme". That actually worked well and I thought perhaps it could work with text as well. Specifically: your organization gets an influx of documents, let’s say many per day, and you would like to categorize them for sorting/processing later. Before... more →
Posted in: JavaScript

The Twelve (Generative) Days of Christmas – 2024 Edition

Last year I did a fun little experiment where I asked a few different generative models to generate images based on the classic Twelve Days of Christmas song. For those unfamiliar, the song is about a series of gifts given over twelve days: partridge in a pear tree, two turtle doves, three French hens, four calling birds, five golden rings, six geese a-laying, seven swans a-swimming, eight maids a-milking, nine ladies dancing, ten lords a-leaping, eleven pipers piping, twelve drummers drumming. To be clear, this was done for fun, nothing more. Also, the prompts were literally just the lyrics, nothing more (with some exceptions, see the details below). In a ‘real world’ example, if you wanted to generate images... more →
Posted in: JavaScript