Google AI Edge Gallery: How to Run AI Models on your Phone Locally

Artificial intelligence has become more accessible, and it now runs on your mobile phone. Google’s latest experimental offering, the AI Edge Gallery, lets you run generative AI models locally on your Android device. All you need to do is download a supported AI model and load it in the new Google AI Edge Gallery app. Here is how to download and install the AI Edge Gallery APK and run generative AI locally.

On Windows, Linux and macOS, tools such as Ollama let you run many open AI models locally. Running generative AI on a phone, however, has been quite difficult because it required technical knowledge. That changed with AI Edge Gallery, which lets you download and run AI models directly on a mobile device. Currently, only an Android APK is available. For iOS, you may need to wait a while.

What is Google AI Edge Gallery?

Google AI Edge Gallery is a mobile app that democratizes access to cutting-edge generative AI by running sophisticated language models entirely on your device. Unlike traditional AI applications that require constant internet connectivity, cloud processing, an account, and a website, this platform brings the power of large language models directly to your smartphone, enabling fully offline AI interactions once the models are downloaded.

The app serves multiple purposes within the AI ecosystem. For developers, it acts as an inspiration hub showcasing the possibilities of Google AI Edge technology. For researchers and AI enthusiasts, it provides a hands-on platform to experiment with different models and compare their performance. Most importantly, for everyday users, it offers an accessible gateway to explore advanced AI capabilities without technical barriers.

You can also run the app, and with it local AI, in a virtual machine or an Android emulator. Here are some of the key features of AI Edge Gallery:

Run Locally, Fully Offline: Experience the magic of GenAI without an internet connection. All processing happens directly on your device.
Choose Your Model: Easily switch between different models from Hugging Face and compare their performance.
Ask Image: Upload an image and ask questions about it. Get descriptions, solve problems, or identify objects.
Prompt Lab: Summarize, rewrite, generate code, or use free form prompts to explore single-turn LLM use cases.
AI Chat: Engage in multi-turn conversations.
Performance Insights: Real-time benchmarks (TTFT, decode speed, latency).
Bring Your Own Model: Test your local LiteRT .task models.
Developer Resources: Quick links to model cards and source code.

Core Features That Set It Apart

Fully Offline Operation

The most compelling feature of Google AI Edge Gallery is its ability to function completely offline. Once you’ve downloaded your preferred AI models, all processing occurs locally on your device. This approach offers several advantages including enhanced privacy, reduced data usage, and consistent performance regardless of internet connectivity. Your conversations, image analyses, and prompt interactions remain entirely private, never leaving your device.

Flexible Model Selection

The app integrates seamlessly with Hugging Face’s extensive model repository, allowing users to browse, download, and experiment with various AI models. This flexibility enables direct performance comparisons between different models, helping users identify which works best for their specific use cases. The ability to switch between models provides insights into how different AI architectures handle various tasks.

Diverse AI Capabilities

Google AI Edge Gallery organizes its functionality into distinct modules, each designed for specific AI interactions:

Ask Image transforms your device into a powerful visual AI assistant. Upload any image and engage in natural language conversations about its contents. Whether you need detailed descriptions, want to solve visual problems, or require object identification, this feature demonstrates the impressive capabilities of multimodal AI models running locally.

Prompt Lab serves as your creative AI workspace, perfect for exploring single-turn interactions with language models. Use it to summarize lengthy documents, rewrite content for different audiences, generate code snippets, or experiment with custom prompts. This feature particularly appeals to content creators, writers, and developers seeking AI assistance for specific tasks.

AI Chat enables extended conversations with AI models, supporting multi-turn dialogues that maintain context throughout the interaction. This feature showcases how on-device AI can provide sophisticated conversational experiences without relying on cloud infrastructure.
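
To illustrate what "maintaining context" usually means in a multi-turn chat, here is a minimal sketch that folds earlier turns into the next prompt. It does not show the app's actual implementation: the `build_prompt` helper, the `User:`/`Model:` labels, and the `max_turns` limit are all assumptions made for illustration, and real chat templates are model-specific.

```python
def build_prompt(history: list[tuple[str, str]], user_message: str, max_turns: int = 4) -> str:
    """Fold the most recent turns into a single prompt string.

    Hypothetical format for illustration only; each model defines
    its own chat template.
    """
    # Keep only the newest turns so the prompt stays within a small budget.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_text, model_text in recent:
        lines.append(f"User: {user_text}")
        lines.append(f"Model: {model_text}")
    # The new message goes last, followed by an open slot for the reply.
    lines.append(f"User: {user_message}")
    lines.append("Model:")
    return "\n".join(lines)


history = [("What is LiteRT?", "A runtime for on-device models.")]
prompt = build_prompt(history, "Does it need internet?")
print(prompt)
```

Dropping older turns once the limit is reached is one simple way to stay within the limited context window that small on-device models tend to have.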

Performance Monitoring

The app includes comprehensive performance insights, displaying real-time metrics such as Time to First Token (TTFT), decode speed, and overall latency. These benchmarks help users understand how different models perform on their specific device hardware, enabling informed decisions about model selection based on performance requirements.
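
To make these metrics concrete, here is a minimal sketch of how they could be derived from token arrival timestamps. The `summarize_generation` helper and the exact definitions below are assumptions based on common conventions; the app may calculate its figures differently.

```python
from dataclasses import dataclass


@dataclass
class GenerationStats:
    ttft_s: float               # time to first token, in seconds
    decode_tokens_per_s: float  # tokens per second after the first token
    latency_s: float            # time from request to last token, in seconds


def summarize_generation(request_time: float, token_times: list[float]) -> GenerationStats:
    """Derive TTFT, decode speed, and latency from timestamps in seconds.

    Hypothetical helper: these are common definitions, not necessarily
    the ones AI Edge Gallery uses.
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")

    ttft = token_times[0] - request_time
    latency = token_times[-1] - request_time

    # Decode speed counts only the tokens produced after the first one.
    decode_time = token_times[-1] - token_times[0]
    decoded = len(token_times) - 1
    speed = decoded / decode_time if decode_time > 0 else 0.0

    return GenerationStats(ttft, speed, latency)


# Example: request at t=0, five tokens arriving every 0.25 s from t=0.5.
stats = summarize_generation(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(stats)
```

Separating TTFT from decode speed matters because the first token includes prompt processing, while later tokens reflect steady-state generation.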

Custom Model Support

Advanced users can import their own LiteRT-compatible models in the .task file format, extending the app’s capabilities beyond the curated model selection. This feature appeals to researchers and developers working on specialized AI applications who want to test their custom models in a user-friendly environment.

Technical Foundation

The Google AI Edge Gallery builds upon several sophisticated technologies that enable its impressive capabilities:

Google AI Edge provides the core APIs and development tools necessary for on-device machine learning. This framework optimizes AI models for mobile hardware while maintaining performance and accuracy.

LiteRT serves as the lightweight runtime engine that executes AI models efficiently on mobile devices. This technology ensures optimal resource utilization while maintaining responsive performance across different Android devices.

LLM Inference API powers the on-device large language model capabilities, enabling sophisticated natural language processing without cloud dependencies.

Hugging Face Integration facilitates seamless model discovery, download, and management, connecting users to the broader AI community’s model ecosystem.

Getting Started: Download Google AI Edge Gallery

Beginning your journey with Google AI Edge Gallery requires minimal effort. Simply download the latest APK from the project’s GitHub releases page and install it on your Android device. The app features an intuitive interface designed for users of all technical backgrounds.

Download Google AI Edge Gallery APK (direct link) (121MB)
Upon launching the app, you’ll see a clean grid interface showcasing the available AI capabilities (e.g., “Prompt Lab”, “AI Chat”, “Ask Image”).
Selecting any capability presents you with compatible models. At the time of writing, the default options are Google’s own Gemma 3 and Qwen 2.5.
Download whichever AI model suits you.
For models requiring special permissions, such as Gemma 3, the app seamlessly guides you through Hugging Face authentication, ensuring you comply with model-specific license requirements while maintaining security.

How to Run Any Generative AI Model Locally on your Phone using AI Edge Gallery?

In order to run any generative AI model locally, follow the steps below.

Download any LiteRT .task model from Hugging Face or Kaggle Models.
Transfer the .task model file from your computer to the Download folder on your phone.
Open the Google AI Edge Gallery app.
On the main screen, look for a “+” (plus) icon located at the bottom-right corner of the screen. Tap it.
A file picker should appear. Select the .task file you transferred.
An “Import Model” dialog will appear. Configure default model parameters, and if it’s a multimodal model, check “Support image” and specify CPU/GPU preference.
Tap “Import”.
The model will then appear in your model list and can be used like any other model.
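
The transfer step above can be done in several ways (USB file transfer, cloud storage, and so on). As one possible approach, the sketch below builds an `adb push` command. It assumes adb is installed, USB debugging is enabled on the phone, and the Download folder is at `/sdcard/Download`, which is typical but not guaranteed on every device. The helper names and the file name `my-model.task` are placeholders.

```python
import shutil
import subprocess


def build_push_command(task_file: str, device_dir: str = "/sdcard/Download") -> list[str]:
    """Build the adb command that copies a .task file to the phone."""
    if not task_file.endswith(".task"):
        raise ValueError("expected a LiteRT .task file")
    return ["adb", "push", task_file, device_dir + "/"]


def push_model(task_file: str) -> None:
    """Run the push; needs adb on PATH and a connected device."""
    if shutil.which("adb") is None:
        raise RuntimeError("adb not found on PATH")
    subprocess.run(build_push_command(task_file), check=True)


# Only build and print the command here; call push_model() to actually run it.
cmd = build_push_command("my-model.task")  # placeholder file name
print(" ".join(cmd))
```

Once the file is in the Download folder, it should show up in the app's file picker during the import step.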

Google AI Edge Gallery is making advanced AI accessible, private, and practical for mobile users. The upcoming iOS version will extend these capabilities to Apple users, further expanding the reach of on-device AI. As mobile hardware continues to improve, we can expect even more sophisticated models to run efficiently on smartphones and tablets.
