Apple Core AI, in plain terms

Apple Core AI is the local AI runtime for models you bring yourself. coreai-models helps export supported models into .aimodel resources for Apple apps.

Apple Core AI is Apple’s framework for running your own AI models locally on Apple devices.

That is the most important idea. Foundation Models gives developers access to Apple’s on-device language model. Core AI is different: it is the runtime for models you bring yourself. Those models might come from Hugging Face, PyTorch, a research project, or Apple’s own coreai-models repository.

In simple terms:

  • Foundation Models is for Apple’s system language model.
  • Core AI is for your own local models.
  • coreai-models helps you export supported models into Core AI.

The result is a more open path for local AI on Apple platforms. Instead of sending every request to a cloud API, an app can bundle or download a model, load it with Core AI, and run inference on the user’s device.

Why it matters

Most AI features still start in the cloud.

That is often practical. Cloud models are powerful, easy to update, and do not make your app bundle huge. But they also introduce tradeoffs: user data leaves the device, every request has latency, every request can have cost, and the feature depends on network access.

Core AI points in the other direction. It makes local models a more native part of Apple app development.

That changes the shape of an app feature:

  • a transcription model can run without uploading audio
  • a document classifier can work on private local files
  • an image model can process camera or photo data on device
  • a small language model can power focused app-specific assistance
  • a model can keep working even when the network is poor

This is not a replacement for every cloud model. Local models still have limits: they are smaller, they use memory and battery, and they need careful testing on real hardware. But for private, focused tasks that work on data already on the device, Core AI makes running the model locally a practical option.

What Apple actually built

Core AI is a native inference framework for Apple platforms.

Apple describes it as a Swift-first, memory-safe API for loading and running AI models locally. It is designed around Apple silicon and the kinds of models developers now want to ship: language models, diffusion models, audio models, vision models, and multimodal models.

The framework focuses on the runtime side of local AI:

  • loading .aimodel resources
  • specializing models for Apple hardware
  • managing inference memory
  • supporting stateful execution
  • reducing unnecessary copies with zero-copy data paths
  • making models visible to Xcode, Instruments, and the Core AI Debugger

That last part matters. Core AI is not only a file format. It is part of a deployment pipeline: export the model, optimize it, add it to an app, run it locally, then profile and debug it like any other serious piece of app infrastructure.

Core AI vs Foundation Models

The easiest way to understand Core AI is to compare it with the Foundation Models framework. I covered that framework separately in “Apple’s Foundation Models, in plain terms”; the short version is that Foundation Models is the system-provided path, while Core AI is the bring-your-own-model path.

Foundation Models is the high-level API for Apple’s own on-device language model. If your app needs summarization, rewriting, classification, structured generation, or tool calling, Foundation Models may be the right fit. You do not choose the model. Apple provides it through the system.

Core AI is the lower-level path for custom models. You choose a model, export it, include the required resources, and run it yourself.

The split looks like this:

  • Foundation Models: Apple’s local language model as a developer API.
  • Core AI: Apple’s runtime for your own local models.
  • coreai-models: Apple’s model recipes and utilities for getting models into Core AI.

That means Core AI is more flexible, but also more work. With flexibility comes responsibility for model size, quality, memory, battery, fallbacks, and updates.

Core AI vs Core ML

Core ML is still Apple’s established framework for machine learning in apps.

For many classic use cases, Core ML remains the right tool: image classification, object detection, tabular models, smaller regressors, and many task-specific neural networks.

Core AI is aimed at newer and often larger model families. Think LLMs, diffusion pipelines, vision-language models, audio models, and models with more complex execution needs such as KV caches, tokenizers, multi-part pipelines, and specialized operations.

So Core AI is not simply “Core ML, renamed.” It is a runtime layer for the current generation of models, the ones shaped by PyTorch, Hugging Face, and modern generative AI.

What coreai-models is

apple/coreai-models is the practical starting point.

It is Apple’s GitHub repository for model export recipes, model registry entries, Python tooling, and Swift runtime helpers. The goal is to make it easier to take a supported model and turn it into resources an Apple app can use with Core AI.

The important parts are:

  • models/: supported model families and export recipes
  • python/: Python utilities for preparing and exporting models
  • swift/: Swift packages that help load and use exported models
  • skills/: coding-agent skills for working with Core AI projects

The workflow is roughly:

  1. Clone coreai-models.
  2. Pick a supported model from the registry.
  3. Export it into Core AI resources.
  4. Add the full exported resource folder to your app.
  5. Load the model from Swift and run it locally.

At the command line, that starts like this:

bash
git clone https://github.com/apple/coreai-models.git
cd coreai-models
uv run coreai.model.registry --list-models

For a small language model, the export can look like this:

bash
uv run coreai.llm.export Qwen/Qwen3-0.6B

For iOS, you can make the target platform explicit:

bash
uv run coreai.llm.export Qwen/Qwen3-0.6B --platform iOS --max-context-length 4096

That platform flag matters. A model exported for macOS can have different assumptions from a model exported for iOS. For example, an iOS language model export may need fixed shapes and a fixed maximum context length.
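
One reason a fixed maximum context length matters is memory: a transformer's key/value cache grows linearly with the context length. The sketch below is a rough estimate only. The function name is made up for illustration, the architecture numbers (28 layers, 8 KV heads, head dimension 128) are assumed values for a Qwen3-0.6B-sized model, and the real memory layout Core AI uses may differ.

```bash
# Rough KV-cache size estimate for a decoder-only transformer.
# bytes = 2 (keys and values) * layers * kv_heads * head_dim
#         * context_len * bytes_per_value
kv_cache_bytes() {
  local layers=$1 kv_heads=$2 head_dim=$3 context_len=$4 bytes_per_value=$5
  echo $(( 2 * layers * kv_heads * head_dim * context_len * bytes_per_value ))
}

# Assumed architecture values with a 16-bit cache; check them against the
# configuration of the model you actually export.
for ctx in 1024 4096 8192; do
  bytes=$(kv_cache_bytes 28 8 128 "$ctx" 2)
  echo "context $ctx: $(( bytes / 1024 / 1024 )) MiB"
done
```

Under these assumptions the cache alone is about 448 MiB at a 4096-token context, which is why the context length is worth choosing deliberately for a phone.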

What models it supports

coreai-models is not a universal Hugging Face converter.

It supports specific model families and presets. Apple’s current model catalog includes examples across several categories:

  • LLMs: Gemma 3, GPT-OSS, Mistral, Mixtral, Qwen2.5, Qwen3, Qwen3 MoE
  • Diffusion: Stable Diffusion 1.5, Stable Diffusion 2.1, Stable Diffusion 3.5 Medium, FLUX.2
  • Vision-language: Qwen3-VL
  • Vision: CLIP, Depth Anything v3, EDSR, EfficientSAM, PVT v2, SAM 3, YOLOS
  • Audio: CLAP, Wav2Vec 2.0, Whisper
  • Text: RoBERTa, T5

That is an important detail. If a model family is supported, the repository can give you a practical path. If it is not supported, you may need to write your own export code, add a registry entry, or work at a lower level with related tools such as coreai-torch.

The exported model is more than one file

The most common mistake is to look only for the .aimodel file.

Sometimes that is enough. A simple model can be represented as a single model file. But many modern AI models are not that simple.

A language model usually needs a tokenizer. A diffusion pipeline can include several model components. A vision-language model may need a text decoder, token embedding model, vision encoder, tokenizer files, and metadata. The app needs all of those pieces to reconstruct the pipeline at runtime.

So the practical rule is simple:

Add the full exported model folder to Xcode, not just the .aimodel file.

If the tokenizer or metadata is missing, the model might compile, but the app will fail when it tries to run real inference.
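
A small pre-build check can catch a missing resource before the app does. This is a minimal sketch, not part of coreai-models: the function name is hypothetical, and the list of required files is passed in by the caller because the real folder layout depends on the model family and export recipe.

```bash
# Report which expected resources are missing from an exported model folder.
# Prints one "missing: NAME" line per absent entry and returns non-zero if
# anything is missing. Works for files and directories alike.
missing_resources() {
  local dir=$1; shift
  local name missing=0
  for name in "$@"; do
    if [ ! -e "$dir/$name" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example call with hypothetical file names:
# missing_resources ./MyModelExport model.aimodel tokenizer.json
```

Running something like this as a build phase or CI step turns a runtime failure into a build-time message.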

Using it from Swift

Once the resources are in your app, the Swift side depends on the model type.

For language models, coreai-models includes Swift helpers that can connect an exported Core AI language model to a LanguageModelSession. That is interesting because the app can use the familiar Foundation Models session style while running a model you exported yourself.

At a high level, it looks like this:

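swift
import FoundationModels
import CoreAILanguageModels

// modelURL is assumed to point at the full exported resource folder,
// not only the .aimodel file.
let model = try await CoreAILanguageModel(resourcesAt: modelURL)
let session = LanguageModelSession(model: model)

let response = try await session.respond(
    to: "Explain quantum computing in simple terms."
)

// The generated text is in the response's content property.
print(response.content)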

The important part is the model source. This is not Apple’s system model. It is a model loaded from your local Core AI resources.

That gives you more control. You can choose a model family, choose a compression strategy, ship different models for different app features, or download resources separately from the main app bundle.

If you want to see this idea in a real app, my project CoreAIChat is a small SwiftUI example for experimenting with Core AI language models locally.

Compression matters

Local AI is mostly a resource management problem.

A model that looks small compared with cloud-scale models can still be huge for an app. It affects download size, install size, memory pressure, launch time, inference latency, and battery use.

That is why coreai-models puts real weight on compression and sensible defaults. For known model families, the export tool can choose platform-specific defaults for precision, compression, and context length. Some macOS LLM exports use 4-bit quantization. Some iOS exports use palettization presets such as 4bit_weight_palettized_group32 or 4bit_weight_palettized_group8.

You can override the default compression:

bash
uv run coreai.llm.export Qwen/Qwen3-0.6B --compression none

Or point the export at a custom coreai-optimization recipe:

bash
uv run coreai.llm.export Qwen/Qwen3-0.6B \
  --platform iOS \
  --compression-config my_custom_recipe.yaml

This is where the work becomes technical. Exporting the model is only one step. You still need to decide what tradeoff makes sense for your app: quality, size, memory, latency, and supported devices.
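
To make the size side of that tradeoff concrete, weight storage is roughly parameters times bits per weight, divided by eight. The sketch below uses a hypothetical helper and treats "0.6B" as a nominal 600 million parameters. It ignores overhead such as per-group lookup tables or layers kept at higher precision, so real exports will be somewhat larger.

```bash
# Approximate weight storage in MiB: parameters * bits_per_weight / 8 bytes.
weights_mib() {
  local params=$1 bits=$2
  echo $(( params * bits / 8 / 1024 / 1024 ))
}

# Nominal 600 million parameters at three precisions.
for bits in 16 8 4; do
  echo "$bits-bit: ~$(weights_mib 600000000 "$bits") MiB"
done
```

With these numbers, moving from 16-bit to 4-bit weights takes the estimate from about 1144 MiB to about 286 MiB, which is the difference between an awkward download and a plausible one.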

Ahead-of-time compilation

Core AI models can also be compiled ahead of time.

Apple exposes that through coreai-build:

bash
xcrun coreai-build compile --help

Ahead-of-time compilation can reduce work at runtime. Instead of making the app prepare everything when the user first opens a feature, you can prepare the model resource earlier in the build or packaging process.

The tradeoff is operational. You need a clean asset pipeline. If you update the model, change the target platform, adjust compression, or change metadata, the compiled resources need to stay in sync.
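
One way to keep compiled resources in sync is to fingerprint the export folder together with the settings that affect compilation, and recompile only when the fingerprint changes. This is a sketch under assumptions: the function names and the stamp file are hypothetical, it uses the POSIX cksum utility for portability (a stronger hash would work the same way), and it deliberately does not invoke coreai-build, whose arguments are not covered here.

```bash
# Fingerprint an export folder plus a free-form settings string
# (for example the target platform and compression preset).
export_fingerprint() {
  local dir=$1 settings=$2
  {
    echo "$settings"
    # Checksum every file, visiting paths in a stable order.
    (cd "$dir" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      echo "$f $(cksum < "$f")"
    done)
  } | cksum
}

# Succeeds (exit 0) when there is no stamp yet or the stamp no longer
# matches, meaning the compiled resources should be rebuilt.
needs_recompile() {
  local dir=$1 settings=$2 stamp_file=$3
  [ ! -f "$stamp_file" ] ||
    [ "$(cat "$stamp_file")" != "$(export_fingerprint "$dir" "$settings")" ]
}
```

After a successful compile, write the new fingerprint to the stamp file. Keep the stamp file outside the export folder so it does not change the fingerprint itself.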

Debugging and profiling

Core AI also has tooling for the part people often underestimate: debugging model behavior.

The Core AI Debugger can inspect model structure, visualize data flow, run models on connected devices, and compare results against reference runs. Instruments can help you understand performance, memory, and runtime behavior.

That matters because local model bugs often do not look like normal app bugs.

The model might run, but produce worse output after compression. It might allocate too much memory. It might be fast on a Mac and unusable on a phone. It might fail only for a specific context length or input shape.
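
A quick manual check for quality drift after compression is to compare numeric outputs from a reference run and a compressed run. The sketch below is independent of the Core AI Debugger and only illustrates the idea. It assumes you have already dumped both outputs as text files with one number per line; the function and file names are hypothetical.

```bash
# Print the largest absolute difference between two files of numbers.
max_abs_diff() {
  local reference=$1 candidate=$2
  # Both files must hold one number per line, in the same order.
  if [ "$(wc -l < "$reference")" -ne "$(wc -l < "$candidate")" ]; then
    echo "line counts differ" >&2
    return 1
  fi
  paste "$reference" "$candidate" | awk -F'\t' '
    { d = $1 - $2; if (d < 0) d = -d; if (d > max) max = d }
    END { printf "%.6f\n", max + 0 }'
}

# Example call with hypothetical dump files:
# max_abs_diff reference_logits.txt compressed_logits.txt
```

A single number will not tell you whether the output is still good, but a sudden jump between two exports is a useful signal to look closer.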

For production apps, Core AI should be treated like a full deployment pipeline:

  1. export
  2. optimize
  3. compile
  4. integrate
  5. test on real devices
  6. profile
  7. update carefully

Platform requirements

Core AI is new, so availability matters.

Apple presents Core AI as spanning iPhone, iPad, Mac, and Apple Vision Pro. As of July 2026, the coreai-models documentation lists macOS and iOS 27.0 or newer, with Xcode 27.0 or newer, for its running and app-integration workflow. The Core AI Debugger beta requires macOS 27 or later to view Core AI models, and paired devices need iOS 27, iPadOS 27, or macOS 27 or later to specialize and run them.

That makes Core AI an early-adopter technology. It is promising, but it is not a broad backward-compatibility layer. If you build with it now, design the feature around platform checks, clear fallbacks, and real-device testing.
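
Inside the app, a platform check is normally done with Swift's availability checks. On the tooling side, a build script can make the same kind of check before it tries to use Core AI tools. The following is a minimal sketch with a hypothetical helper name. It compares only the major version, and the minimum of 27 is taken from the requirements listed above.

```bash
# True when the major component of a version string (for example "27.1")
# is at least the required major version.
major_at_least() {
  local version=$1 required=$2
  local major=${version%%.*}
  [ "$major" -ge "$required" ]
}

# On a Mac, sw_vers reports the OS version; on other systems this is skipped.
if command -v sw_vers >/dev/null 2>&1; then
  if major_at_least "$(sw_vers -productVersion)" 27; then
    echo "macOS meets the documented minimum"
  else
    echo "macOS is older than 27; skipping Core AI steps"
  fi
fi
```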

When to use it

Use Core AI when you want a model that is local, private, and specific to your app.

Good candidates include:

  • local speech transcription with Whisper
  • document classification without a server
  • image search with CLIP embeddings
  • segmentation or object detection on device
  • a focused local assistant with Qwen, Gemma, or Mistral
  • app-specific AI features that should not upload user data

Do not use Core AI just because local AI sounds good. If your feature needs current web knowledge, very strong reasoning, centralized model updates, large retrieval systems, or broad device support today, a server model may still be the better product choice.

Core AI is best understood as Apple’s native path for custom local models. coreai-models is the bridge that makes that path approachable.

Together, they make local AI on Apple platforms feel less like a research demo and more like something app developers can actually ship.