JSON Results with Google Gemini Generative AI API Calls

Forgive the somewhat alliterative title there, but today’s post covers something that’s been on my mind since I started playing with Google Gemini, specifically, how to get the results of your API calls in JSON. To be clear, the REST API returns a result in JSON, but I’m talking about the content of the result itself. Before I continue, a quick shot out to Allen Firstenberg who has been helping me off and on with Google Gemini stuff. Anything I get wrong though is entirely my fault. 😜 Ok, so before I go on, let’s look at a typical result. Take a prompt like so: "What is the nature of light". Pass this to Gemini via the API, and the result you get, once you... more →
Posted in: JavaScript

Using PDF Content with Google Gemini

Back in February Google announced Gemini 1.5, their latest, most powerful language model, and while access has been open via AI Studio, API access has only been available in the past few days. I thought I’d try out the new model and specifically make use of the larger context window to do prompts on PDF documents. I discussed something similar earlier this year(("Using AI and PDF Services to Automate Document Summaries")[https://www.raymondcamden.com/2024/01/08/using-ai-and-pdf-services-to-automate-document-summaries]) which made use of Diffbot, so I thought it would be interesting to build a similar experience with the Gemini API. At a high level, it’s not too difficult: Begin... more →
Posted in: JavaScript

Google Gemini 1.5 Announced (and more new features)

In general I don’t tend to blog about stuff that isn’t quite out yet, but as I’ve got early access (and permission to share), and as it’s pretty darn cool, I thought I’d share. Plus, some of the new stuff is available to everyone, so you can try it out as well! Today, Google introduced its newest language model, Gemini 1.5. You can, and probably should, read the marketing/nicely polished intro by Google here, but I thought I’d share some highlights and examples here. I’ve had access to this for a grand total of four hours so please consider this my first initial impressions. As the title says, this is not yet released, but you can sign up for the waitlist... more →
Posted in: JavaScript

Google Gemini as Your Dungeon Master

So this is absolutely just another example of me playing around too much, but I had to share. As I mentioned in my post yesterday, Google’s AI Studio now supports uploading files and working with them in your prompt. Today I decided to give the Chat interface a try as I hadn’t yet played with it. On a whim, I googled for "dungeons and dragons rules PDF" and… well, you won’t believe what happened next. (Sorry, I couldn’t resist.) First off, the most important thing to note if you want to test with PDFs, ensure that they are OCRed. Right now AI Studio does not handle that well, but it should be corrected in the future. My Google search turned up the PDF here,... more →
Posted in: JavaScript

Google Gemini and AI Studio Launch

While it feels like just yesterday I first blogged about Google’s PaLM APIs and MakerSuite, it was actually over two months ago, and of course, GenAI offerings are iterating and improving at lightning speed. In the past week, Google has announced Gemini, their new generative AI model. Naturally, I was curious about the API aspect of this and took a quick look. MakerSuite rebranded as AI Studio # First off, the web UI (which I reviewed back in my first post) has been renamed to the generic and boring, but probably more enterprise and appropriate, AI Studio. Along with that, when creating new prompts, it will default to use Gemini models. (You can still select PaLM if you want.) Another change…... more →
Posted in: JavaScript