Google AI Edge Gallery: On-Device AI on Your Phone (iOS & Android)

February 15, 2026

On April 2, 2026, alongside Gemma 4, Google updated AI Edge Gallery, its app for running frontier open-source Large Language Models on your mobile device. The app first launched on Android, and the iOS version followed in February 2026.

All model inference happens directly on your device hardware. No internet connection is required for inference, so your prompts, images, and sensitive data stay on the device.

Your phone is no longer just an AI terminal connecting to cloud services—it's becoming an autonomous AI device.

What is AI Edge Gallery?

AI Edge Gallery is Google's experimental app (released March 2025, 500K+ downloads) that lets you run open-source LLMs on mobile devices. The April 2026 update brought official Gemma 4 support, making it the most capable on-device AI experience available.

Core Features

Feature | Description
Agent Skills | Extend LLMs with tools: Wikipedia, maps, web search, custom skills
Thinking Mode | Visualize the model's reasoning process
Prompt Lab | Test prompts with granular control over temperature, top-k
Model Management | Download, benchmark, and manage models locally
100% Offline | All inference on-device, no internet required
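
Prompt Lab's temperature and top-k controls correspond to standard sampling parameters. The sketch below shows how these two settings are commonly applied when choosing the next token. It is an illustration only: the function name `sample_top_k` and the example logits are assumptions, and the app's actual sampler is not described in this post.

```python
import math
import random

def sample_top_k(logits, temperature=1.0, top_k=40, rng=None):
    """Pick a token index from raw logits using temperature and top-k.

    Hypothetical helper for illustration; not the app's implementation.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    # Keep only the k highest-scoring token indices.
    k = max(1, min(top_k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature rescales logits: lower values sharpen the distribution.
    scaled = [logits[i] / temperature for i in top]
    # Softmax weights over the surviving candidates (subtract max for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(sample_top_k(logits, temperature=0.7, top_k=2, rng=random.Random(0)))
```

With `top_k=1` the function always returns the highest-scoring token, while larger values of `top_k` and `temperature` admit more variety.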

Supported Models (April 2026)

  • Gemma 4 family (E2B, E4B, 26B A4B, 31B)
  • FunctionGemma variants
  • Community models via Hugging Face integration

Why On-Device AI Matters

The Privacy Case

Every cloud AI interaction involves data leaving your device. With AI Edge Gallery:

  • Zero data transmission: Your prompts never leave your phone
  • Completely offline: Works in airplane mode or with no signal at all
  • Sensitive data: Analyze documents, contracts, medical info—all local

The Latency Case

Cloud AI: Input → Upload → Process → Download → Output
On-Device: Input → Process → Output

The Cost Case

Cloud API calls add up. On-device inference needs only a one-time model download, after which there are no per-request fees.

Agent Skills: Extending the LLM

The Agent Skills system transforms conversational LLMs into capable agents:

Built-in Skills

  • Wikipedia: Fact-grounded responses
  • Interactive Maps: Location-aware AI
  • Web Search: Real-time information
  • Custom Skills: Load from URL or community repos

How Skills Work

When you enable a skill, the LLM gains function-calling capabilities automatically:

User: "What's the population of Tokyo?"

Agent Skill activates:

1. Recognizes question requires factual data
2. Calls Wikipedia skill
3. Retrieves current population
4. Formats response with citation
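
The flow above can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions: the JSON call format, the `SKILLS` table, and the stubbed `wikipedia_lookup` function are all hypothetical, and the stub returns placeholder text rather than real data.

```python
import json

def wikipedia_lookup(query):
    # Stand-in for a real lookup; returns placeholder text, not real data.
    return {"summary": f"(stub summary for {query})", "source": "Wikipedia"}

# Hypothetical table mapping skill names to the functions that implement them.
SKILLS = {"wikipedia": wikipedia_lookup}

def run_skill_call(model_output):
    """Execute a tool call that the model emitted as JSON text."""
    call = json.loads(model_output)
    handler = SKILLS.get(call["name"])
    if handler is None:
        return {"error": f"unknown skill: {call['name']}"}
    # Pass the model-supplied arguments to the skill and wrap the result.
    result = handler(**call["arguments"])
    return {"skill": call["name"], "result": result}

# Assumed shape of a function call produced by the model.
model_output = '{"name": "wikipedia", "arguments": {"query": "Tokyo population"}}'
print(run_skill_call(model_output))
```

The returned dictionary would then be fed back to the model so it can write the final answer with a citation.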

Thinking Mode: Seeing Inside the Model

One of the most compelling features is Thinking Mode: tap the toggle to watch the model reason in real time.

This shows:

  • How the model breaks down the problem
  • Intermediate reasoning steps
  • Confidence adjustments

Performance and Hardware

What You Need

Model | Recommended Device | Performance
E2B | Any modern phone | ~30 tokens/sec
E4B | iPhone 15+, Pixel 7+ | ~20 tokens/sec
26B A4B | iPhone 15 Pro, high-end Android | ~8 tokens/sec
31B | Not recommended | Too demanding
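
To make the throughput figures concrete, the sketch below converts them into approximate wait times for a response. It uses only the approximate tokens/sec values from the table and ignores prompt processing time, so treat the results as rough estimates. The 300-token response length is an arbitrary example.

```python
# Approximate decode speeds (tokens/sec) taken from the table above.
THROUGHPUT = {"E2B": 30, "E4B": 20, "26B A4B": 8}

def seconds_to_generate(model, num_tokens):
    """Estimate generation time, ignoring prompt processing overhead."""
    return num_tokens / THROUGHPUT[model]

for model in THROUGHPUT:
    print(f"{model}: {seconds_to_generate(model, 300):.1f} s for 300 tokens")
# E2B: 10.0 s for 300 tokens
# E4B: 15.0 s for 300 tokens
# 26B A4B: 37.5 s for 300 tokens
```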

Use Cases That Work

1. Personal AI Assistant

Keep a lightweight model (E2B/E4B) always ready:

  • Quick Q&A without internet
  • Draft emails, review documents

2. Developer Sandbox

  • Test prompts and model behaviors in isolation
  • Prototype before cloud deployment

3. Privacy-First Workflows

  • Legal document review
  • Medical record summarization
  • Financial analysis

4. Offline Development

  • Airplane coding sessions
  • Security-sensitive environments

The Developer Opportunity

AI Edge Gallery isn't just an app—it's a proving ground for on-device AI development.

Building Custom Skills

from skill_runtime import skill

@skill("analyze_image")
def analyze_image(image_path):
    # Custom ML pipeline
    return {"objects": [...], "description": ...}
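
The `skill_runtime` module is not documented in this post, so the following is only a guess at how a `skill` decorator of this shape could work: it records each decorated function in a registry under the given name. `SKILL_REGISTRY` and the `word_count` example are hypothetical and are not part of any published API.

```python
# Hypothetical registry mapping skill names to Python callables.
SKILL_REGISTRY = {}

def skill(name):
    """Return a decorator that registers the function under `name`."""
    def decorator(func):
        SKILL_REGISTRY[name] = func
        return func  # leave the function itself unchanged
    return decorator

@skill("word_count")
def word_count(text):
    # Toy skill: count whitespace-separated words.
    return {"words": len(text.split())}

# Look the skill up by name, as a runtime might after a model tool call.
print(SKILL_REGISTRY["word_count"]("run models on device"))  # {'words': 4}
```

Registering by name keeps the model-facing identifier separate from the Python function, so a runtime can resolve a requested skill with a plain dictionary lookup.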

AI Edge Gallery is available on Google Play and the iOS App Store (iOS 17+).
