Summarizing Docs with Built-in AI

Back in January of this year, I blogged about on-device summarization of PDFs: Summarizing PDFs with On-Device AI. In that post, I made use of Chrome’s Summarizer API and PDF.js to create summaries of PDFs completely within the browser. I thought I’d take a look at extending that demo into more document types, specifically Office. And even more specifically – Word, Excel, and PowerPoint. Here’s what I came up with. officeParser FTW: So here comes the fun part. Last weekend I had this demo completely done using a few different libraries. Then – earlier this week one of the developer newsletters I subscribe to shared officeParser. This nifty library handles Office, PDF,... more →
Posted in: JavaScript

Testing OCR with Chrome Built-in AI

Sorry for the lack of posting this month. I’m on the way back home from speaking at CodeStock so I’ve been on the road a bit, and work has been incredibly busy (which is good!) so my usual blog cadence has slipped. Luckily I had a great question in my session on Chrome’s Built-in AI which led to a bit of investigating last night. The question involved how well Chrome’s AI could do OCR on an image. I had a demo in my presentation that used AI to describe an image and another that generated a list of tags, but not one specifically for OCR. Here’s what I found. Oh, before I get into the code – remember that as of the time I’m writing this, the Prompt... more →
Posted in: JavaScript

Using Chrome’s Built-in AI to Improve AI Prompts

Props for this article go to my best friend, Todd Sharp, who yesterday said something along the lines of, "Hey Ray, you should blog a demo of …" which is pretty much akin to bringing out a laser pointer in front of a cat. Not only do I love getting ideas for new demos, his idea was actually pretty freaking brilliant, which means I get to pretend I’m brilliant as well. His idea was this: Given a user-created prompt meant to be shipped off to a "proper" (i.e. maybe expensive) Generative AI API, can we use tools to help improve the prompt and make it "cheaper" before it’s used? Given we’ve got AI in the browser via Chrome (ok, we will have it soon), this seemed... more →
Posted in: JavaScript

Adding Programming Language Detection with Built-in Chrome AI

As I’ve been playing, and thinking, more and more about how to best add Chrome AI support to web apps, I came across an interesting use case that I think could be helpful, and like in my previous examples, be completely ok if it didn’t actually work. When I write on the developer blog at Foxit, I make use of a WordPress plugin for code samples. This editor has a place for you to both paste in your code and select the language so the proper highlighter is used. This works well enough, but it gets a bit annoying to have to constantly keep selecting Python in the dropdown. Ideally the form would use the last language (simple enough via LocalStorage), but I was curious how well Chrome’s... more →
Posted in: JavaScript

Getting Image Insights with Built-in Chrome AI and EXIF Data

It’s been a busy few weeks for Chrome’s Built-in AI support. Since the last time I blogged about it, four features have gone GA (which still means they are Chrome only but not behind a flag anymore): Translator, Summarizer, Language Detector, and Prompt API (for extensions only). And while announced back at the end of May, Gemma 3n as a model is available in Canary, Dev, and Beta Chrome builds. To be clear, the percentage of folks who can use these new features is still really low, but all of these features also work really well in progressive enhancement, and can be backed up by server calls to an API if need be. I continue to be really excited about the possibilities these APIs unlock,... more →
Posted in: JavaScript

Multimodal Support in Chrome’s Built-in AI

It’s been a few weeks since I blogged about Chrome’s built-in AI efforts, but with Google I/O going on this week there have been a lot of announcements and updates. You can find a great writeup of recent changes on the Chrome blog: "AI APIs are in stable and origin trials, with new Early Preview Program APIs". One feature that I’ve been the most excited about has finally been made available: multimodal prompting. This lets you use both image and audio data for prompts. Now, remember, this is all still early preview and will likely change before release, but it’s pretty promising. As I’ve mentioned before, the Chrome team is asking folks to join the EPP (early... more →
Posted in: JavaScript