Google’s AI Edge Gallery will let developers deploy offline AI models — here’s how it works
A curated hub for on-device AI

Google’s AI Edge Gallery is built on LiteRT (formerly TensorFlow Lite) and MediaPipe, optimized for running AI on resource-constrained devices. It supports open-source models from Hugging Face, including Google’s Gemma 3n — a small, multimodal language model that handles text and images, with audio and video support in the pipeline.

The 529MB Gemma 3 1B model delivers up to 2,585 tokens per second during prefill inference on mobile GPUs, enabling sub-second response times for tasks like text generation and image analysis. Models run fully offline on CPUs, GPUs, or NPUs, so data never leaves the device.
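To put the prefill figure in perspective, here is a back-of-envelope estimate of prompt-ingestion latency. This is a sketch using only the quoted 2,585 tokens/s number; `prefill_seconds` is a hypothetical helper, not part of any Google API.

```python
# Rough prefill-latency estimate, assuming the quoted throughput of
# 2,585 tokens/s for Gemma 3 1B on a mobile GPU (hypothetical helper).
PREFILL_TOKENS_PER_SEC = 2585

def prefill_seconds(prompt_tokens: int) -> float:
    """Estimated seconds to ingest a prompt during the prefill phase."""
    return prompt_tokens / PREFILL_TOKENS_PER_SEC

# A 1,000-token prompt would be ingested in roughly 0.39 seconds,
# which is what makes sub-second on-device responses plausible.
print(round(prefill_seconds(1000), 2))
```

Decode-phase generation is typically slower than prefill, so end-to-end latency also depends on how many tokens the model produces.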

The app includes a Prompt Lab for single-turn tasks such as summarization, code generation, and image queries, with templates and tunable settings (e.g., temperature, top-k). The RAG library lets models reference local documents or images without fine-tuning, while a Function Calling library enables automation with API calls or form filling.
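The temperature and top-k settings exposed in the Prompt Lab control how the next token is sampled. The sketch below shows the standard technique those knobs implement: scale the logits by 1/temperature, keep only the k highest-scoring tokens, renormalize, and sample. This is a generic illustration in plain Python, not code from the AI Edge Gallery itself.

```python
import math
import random

def sample_top_k(logits, k=3, temperature=0.8, rng=None):
    """Temperature + top-k sampling, the decoding scheme behind the
    Prompt Lab's tunable settings (illustrative, not Google's code).

    Lower temperature sharpens the distribution; smaller k restricts
    sampling to the most likely tokens.
    """
    rng = rng or random.Random(0)
    # Scale logits: temperature < 1 sharpens, > 1 flattens.
    scaled = [v / temperature for v in logits]
    # Keep indices of the k largest scaled logits.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Softmax over the surviving logits (subtract max for stability).
    m = max(scaled[i] for i in top)
    exps = {i: math.exp(scaled[i] - m) for i in top}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}
    # Draw one token id proportionally to its renormalized probability.
    r, acc = rng.random(), 0.0
    for i, p in probs.items():
        acc += p
        if r <= acc:
            return i, probs
    return top[-1], probs
```

With `k=1` this degenerates to greedy decoding; raising temperature spreads probability mass over the surviving top-k tokens.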
