Google AI Edge Gallery: On-Device AI on Your Phone (iOS & Android)

On April 2, 2026, alongside Gemma 4, Google updated AI Edge Gallery, its app for running open-source Large Language Models (LLMs) directly on your mobile device. The iOS app was released in February 2026, expanding from Android.

All model inference happens directly on your device hardware. Inference itself needs no internet connection, so your prompts, images, and sensitive data stay on the phone.

Your phone is no longer just an AI terminal connecting to cloud services—it's becoming an autonomous AI device.

What is AI Edge Gallery?

AI Edge Gallery is Google's experimental app (released March 2025, 500K+ downloads) that lets you run open-source LLMs on mobile devices. The April 2026 update brought official Gemma 4 support, making it the most capable on-device AI experience available.

Core Features

Feature	Description
Agent Skills	Extend LLMs with tools: Wikipedia, maps, web search, custom skills
Thinking Mode	Visualize the model's reasoning process
Prompt Lab	Test prompts with granular control over temperature, top-k
Model Management	Download, benchmark, and manage models locally
100% Offline	All inference runs on-device; only network-backed skills such as web search need a connection
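
The temperature and top-k controls in Prompt Lab correspond to standard sampling parameters. The following is a minimal sketch of how the two interact, assuming the model exposes raw per-token scores (logits). The helper name `sample_top_k` is illustrative and is not part of AI Edge Gallery.

```python
import math
import random

def sample_top_k(logits, temperature=1.0, top_k=40, rng=None):
    """Sample a token index from raw logits using temperature and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if rng is None:
        rng = random.Random()
    # Keep only the k highest-scoring token indices.
    k = max(1, min(top_k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide by temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [logits[i] / temperature for i in top]
    # Softmax weights over the surviving candidates (subtract the max for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

# Toy three-token vocabulary; only the two best candidates can be chosen.
print(sample_top_k([0.1, 2.0, -1.0], temperature=0.7, top_k=2))
```

Setting `top_k=1` reduces this to greedy decoding, while a larger `top_k` with a higher temperature gives more varied output.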

Supported Models (April 2026)

Gemma 4 family (E2B, E4B, 26B A4B, 31B)
FunctionGemma variants
Community models via Hugging Face integration

Why On-Device AI Matters

The Privacy Case

Every cloud AI interaction involves data leaving your device. With AI Edge Gallery:

Zero data transmission: Your prompts never leave your phone
Fully offline: Works in airplane mode or with no signal
Sensitive data: Analyze documents, contracts, medical info—all local

The Latency Case

Cloud AI: Input → Upload → Process → Download → Output
On-Device: Input → Process → Output
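
The two pipelines above can be written as a simple latency model. This is a sketch with hypothetical numbers, not measurements. It shows that removing the network legs helps only as long as local processing is not much slower than the cloud's.

```python
def cloud_latency_ms(upload_ms, process_ms, download_ms):
    """Total latency when the request makes a network round trip."""
    return upload_ms + process_ms + download_ms

def on_device_latency_ms(process_ms):
    """Total latency when inference runs locally: there are no network legs."""
    return process_ms

# Hypothetical timings for illustration only.
cloud = cloud_latency_ms(upload_ms=120, process_ms=300, download_ms=80)
local = on_device_latency_ms(process_ms=450)
print(f"cloud: {cloud} ms, on-device: {local} ms")
```

With these assumed values the phone is slower at raw processing yet still finishes first, because it skips the upload and download steps.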

The Cost Case

Cloud API usage is metered, and the charges add up. On-device inference needs a one-time model download, after which there are no per-call fees.

Agent Skills: Extending the LLM

The Agent Skills system transforms conversational LLMs into capable agents:

Built-in Skills

- Wikipedia: Fact-grounded responses
- Interactive Maps: Location-aware AI
- Web Search: Real-time information
- Custom Skills: Load from URL or community repos

How Skills Work

When you enable a skill, the LLM gains function-calling capabilities automatically:

User: "What's the population of Tokyo?"
Agent Skill activates:
1. Recognizes question requires factual data
2. Calls Wikipedia skill
3. Retrieves current population
4. Formats response with citation
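
The four steps above can be sketched as a small dispatch loop. This assumes the model signals a tool call by emitting JSON; the call format, the `SKILLS` table, and `wikipedia_lookup` are all hypothetical stand-ins rather than the app's actual interface.

```python
import json

def wikipedia_lookup(topic):
    """Stand-in for a Wikipedia skill: returns placeholder text, not a live lookup."""
    return {"source": "Wikipedia", "summary": f"(summary of the '{topic}' article)"}

# Hypothetical table mapping skill names to callables.
SKILLS = {"wikipedia": wikipedia_lookup}

def handle_model_output(raw_output):
    """Dispatch a skill call if the model emitted one, else pass the text through."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError:
        return raw_output  # Plain-text answer, no skill needed.
    if not isinstance(call, dict):
        return raw_output
    name = call.get("skill")
    if not isinstance(name, str) or name not in SKILLS:
        return raw_output
    # Step 3: run the skill and retrieve its result.
    result = SKILLS[name](**call.get("arguments", {}))
    # Step 4: format the response with a citation to the skill's source.
    return f"{result['summary']} [source: {result['source']}]"

# Steps 1-2: the model decides a lookup is needed and emits a structured call.
model_output = '{"skill": "wikipedia", "arguments": {"topic": "Tokyo"}}'
print(handle_model_output(model_output))
```

In a real agent loop the skill result would typically be fed back to the model so it can phrase the final answer; here it is formatted directly to keep the sketch short.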

Thinking Mode: Seeing Inside the Model

One of the most compelling features is Thinking Mode: tap the toggle to watch the model reason in real time.

This shows:

How the model breaks down the problem
Intermediate reasoning steps
Confidence adjustments

Performance and Hardware

What You Need

Model	Recommended Device	Performance
E2B	Any modern phone	~30 tokens/sec
E4B	iPhone 15+, Pixel 7+	~20 tokens/sec
26B A4B	iPhone 15 Pro, high-end Android	~8 tokens/sec
31B	Not recommended	Too demanding
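
To turn the throughput figures in the table into wait times, divide the response length by tokens per second. The sketch below does this for a 200-token reply. It covers generation only and ignores prompt processing and model load time, which are not given here.

```python
# Approximate generation speeds from the table above, in tokens per second.
TOKENS_PER_SEC = {"E2B": 30, "E4B": 20, "26B A4B": 8}

def generation_seconds(model, output_tokens):
    """Estimate time to generate `output_tokens` tokens at the listed speed."""
    return output_tokens / TOKENS_PER_SEC[model]

for model in TOKENS_PER_SEC:
    print(f"{model}: {generation_seconds(model, 200):.1f} s for 200 tokens")
```

At these speeds a 200-token answer takes roughly 7 seconds on E2B, 10 seconds on E4B, and 25 seconds on 26B A4B.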

Use Cases That Work

1. Personal AI Assistant

Keep a lightweight model (E2B/E4B) always ready:

Quick Q&A without internet
Draft emails, review documents

2. Developer Sandbox

Test prompts and model behaviors in isolation
Prototype before cloud deployment

3. Privacy-First Workflows

Legal document review
Medical record summarization
Financial analysis

4. Offline Development

Airplane coding sessions
Security-sensitive environments

The Developer Opportunity

AI Edge Gallery isn't just an app—it's a proving ground for on-device AI development.

Building Custom Skills

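# Illustrative sketch: `skill_runtime` and the `skill` decorator are placeholder
# names, not a documented AI Edge Gallery API.
from skill_runtime import skill

@skill("analyze_image")
def analyze_image(image_path):
    """Run a custom ML pipeline on the image and return structured results."""
    # Custom ML pipeline goes here
    return {"objects": [...], "description": ...}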
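
The `skill` decorator in the snippet above is not defined in this article. One plausible shape, assuming a simple name-to-function registry, is sketched below. `SKILL_REGISTRY` and `word_count` are hypothetical names chosen for illustration.

```python
# Hypothetical registry mapping skill names to the functions that implement them.
SKILL_REGISTRY = {}

def skill(name):
    """Register the decorated function under `name` so a runtime can look it up."""
    def decorator(func):
        SKILL_REGISTRY[name] = func
        return func  # Return the function unchanged so it stays directly callable.
    return decorator

@skill("word_count")
def word_count(text):
    """Toy skill: count whitespace-separated words."""
    return {"words": len(text.split())}

print(SKILL_REGISTRY["word_count"]("run models on device"))
```

A registry like this lets a runtime find a skill by the name the model requests, without the caller needing to import each skill function directly.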

AI Edge Gallery is available on Google Play and the iOS App Store (iOS 17+).