On-device AI has crossed the threshold from research project to production feature. In 2026, Gemini Nano is available on hundreds of millions of Android devices through AICore, and ML Kit provides specialized models for vision, text, and audio tasks — all running locally, privately, and without internet connectivity.
Why On-Device AI?
- Privacy — user data never leaves the device.
- Latency — no round-trip to a server; responses in milliseconds.
- Offline — works without connectivity.
- Cost — no per-request API charges.
Gemini Nano via AICore
Gemini Nano is Google's smallest Gemini model, optimized for on-device inference on modern Android devices. Access it through the android.ai.llm API (API 36+) or the Jetpack AI client library for broader compatibility.
```kotlin
// Check whether Gemini Nano is available on this device
val aiManager = context.getSystemService(AI_MANAGER_SERVICE) as AiManager

if (aiManager.isGeminiNanoAvailable()) {
    // Create a session and send a prompt
    val session = aiManager.createSession(GeminiNanoOptions.defaults())
    val response = session.generate(
        prompt = "Summarize this article in two sentences: $articleText"
    )
    Log.d("AI", response.text)
}
```
Good use cases for Gemini Nano: text summarization, smart reply suggestions, translation, and classification tasks where a large model would be overkill.
ML Kit for Specialized Tasks
ML Kit provides pre-trained, optimized models for common AI tasks. All run on-device with no API key required.
- Text Recognition v2 — reads text from images in 50+ languages.
- Face Detection/Mesh — 468 facial landmarks for AR and filter apps.
- Pose Detection — full-body pose estimation for fitness apps.
- Object Detection and Tracking — real-time object detection in camera feeds.
- Smart Reply — context-aware reply suggestions for messaging apps.
- Document Scanner — perspective correction and enhancement for scanned documents.
```kotlin
// Text recognition example: extract text from a Bitmap
val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
val image = InputImage.fromBitmap(bitmap, 0) // 0 = image is already upright

recognizer.process(image)
    .addOnSuccessListener { visionText ->
        val fullText = visionText.text
        processRecognizedText(fullText)
    }
    .addOnFailureListener { e -> Log.e("MLKit", "Recognition failed", e) }
```
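Smart Reply follows the same Task-based pattern: you pass recent conversation history and receive up to three suggested responses. A minimal sketch using the standard ML Kit Smart Reply client (the message text and user id here are illustrative):

```kotlin
// Build the conversation history, oldest message first
val conversation = listOf(
    TextMessage.createForRemoteUser(
        "Are we still on for lunch tomorrow?",
        System.currentTimeMillis(),
        "remote-user-id" // illustrative id for the other participant
    )
)

SmartReply.getClient().suggestReplies(conversation)
    .addOnSuccessListener { result ->
        // Suggestions are only produced for supported languages
        if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
            result.suggestions.forEach { suggestion ->
                Log.d("MLKit", "Suggested reply: ${suggestion.text}")
            }
        }
    }
    .addOnFailureListener { e -> Log.e("MLKit", "Smart Reply failed", e) }
```

Note that Smart Reply returns `STATUS_NOT_SUPPORTED_LANGUAGE` rather than failing when the conversation language is unsupported, so checking the status field is part of the graceful-degradation story.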
MediaPipe for Custom Models
When ML Kit does not have the model you need, MediaPipe lets you run custom TFLite models with hardware acceleration (GPU/NPU). The MediaPipe Tasks API provides a clean interface for image, audio, and text models.
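As a sketch of what that looks like in practice, here is the MediaPipe Tasks image-classification path with the GPU delegate enabled. The model file name is an assumption; you would bundle your own TFLite model in `assets/`:

```kotlin
// Point MediaPipe at a custom TFLite model bundled in assets/
// and request GPU acceleration (falls back per-device support)
val baseOptions = BaseOptions.builder()
    .setModelAssetPath("model.tflite") // assumed asset name
    .setDelegate(Delegate.GPU)
    .build()

val options = ImageClassifier.ImageClassifierOptions.builder()
    .setBaseOptions(baseOptions)
    .setMaxResults(3)
    .build()

val classifier = ImageClassifier.createFromOptions(context, options)

// Wrap an Android Bitmap as a MediaPipe image and run inference
val mpImage = BitmapImageBuilder(bitmap).build()
val result = classifier.classify(mpImage)
result.classificationResult().classifications().firstOrNull()
    ?.categories()?.forEach { category ->
        Log.d("MediaPipe", "${category.categoryName()}: ${category.score()}")
    }
```

The same `BaseOptions`/`createFromOptions` pattern applies across the other Tasks APIs (audio, text, gesture recognition), so swapping models rarely means restructuring your code.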
Best Practices
- Check device capability first — not all devices support all models. Degrade gracefully.
- Run inference off the main thread — use coroutines or a background thread even for fast models.
- Warm up models before first use — first inference is slower due to model loading.
- Always provide a non-AI fallback — the AI path should enhance, not block.
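Putting those practices together, here is one way to structure a summarizer that warms up once, runs off the main thread, and always has a non-AI path. It reuses the `AiManager` API from the Gemini Nano snippet above; the `AiSession` type and the naive fallback are assumptions for illustration:

```kotlin
class Summarizer(private val context: Context) {
    private var session: AiSession? = null

    // Warm up once (e.g. from a ViewModel) so the first real request is fast
    suspend fun warmUp() = withContext(Dispatchers.Default) {
        val aiManager = context.getSystemService(AI_MANAGER_SERVICE) as AiManager
        if (aiManager.isGeminiNanoAvailable()) {
            session = aiManager.createSession(GeminiNanoOptions.defaults())
        }
    }

    // Inference runs on a background dispatcher; any failure
    // drops to the non-AI fallback instead of surfacing an error
    suspend fun summarize(text: String): String = withContext(Dispatchers.Default) {
        try {
            session?.generate(prompt = "Summarize in two sentences: $text")?.text
                ?: fallbackSummary(text)
        } catch (e: Exception) {
            fallbackSummary(text)
        }
    }

    // Non-AI fallback: just take the first two sentences
    private fun fallbackSummary(text: String): String =
        text.split(". ").take(2).joinToString(". ")
}
```

The key property is that callers never see the difference between the AI and non-AI paths; the model only ever improves the result.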