Quickstart
Installation
pip install coreai-onnx
To enable numerical verification against ONNX Runtime (requires onnxruntime):
pip install "coreai-onnx[verify]"
Convert a model
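from pathlib import Path
import coreai_onnx

# Convert the ONNX graph, naming its input and output tensors.
program = coreai_onnx.convert("model.onnx", input_names=["x"], output_names=["y"])
# Optimize the converted program, then write it out as an .aimodel asset.
program.optimize()
program.save_asset(Path("model.aimodel"))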
That is the entire API for the common case. convert() accepts either a path string or
an onnx.ModelProto loaded in memory.
CLI examples
Inspect coverage before converting:
coreai-onnx inspect model.onnx
Convert to .aimodel:
coreai-onnx convert model.onnx -o model.aimodel
Verify numerical parity against ONNX Runtime:
coreai-onnx verify model.onnx model.aimodel
Attention fusion
Raw scaled-dot-product attention chains (MatMul → scale → Softmax → MatMul,
as exported by PyTorch, YOLO, BERT, etc.) crash the on-device GPU compiler —
the symptom is Xcode’s Performance report aborting with
MPSGraphExecutable: "MLIR pass manager failed". The converter therefore
fuses these chains automatically into the same scaled_dot_product_attention
composite op that coreai-torch emits, which the OS compilers replace with
their fused attention kernel. The rewrite is numerically equivalent; chains
that do not match the conservative pattern (dynamic shapes, masked softmax on
a non-final axis, multi-consumer intermediates, …) are left untouched and
convert exactly as before.
You can also instantiate the composite directly from a custom ONNX graph with
a coreai::ScaledDotProductAttention node — query [.., L, E],
key [.., S, E], value [.., S, E] inputs (rank 3 or 4), an optional float
additive mask, and a float scale attribute (default E**-0.5).
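As a point of reference, the computation that the composite stands for can be sketched in NumPy. This is not the converter's implementation. It assumes the conventional definition softmax(scale * Q Kᵀ + mask) V, with the mask added after scaling and before the softmax; the function name sdpa_reference is hypothetical and exists only for this illustration.

```python
import numpy as np


def sdpa_reference(query, key, value, mask=None, scale=None):
    """Illustrative reference: softmax(scale * Q @ K^T + mask) @ V.

    query: [.., L, E], key: [.., S, E], value: [.., S, E].
    Assumes the mask is additive and applied before the softmax.
    """
    q = np.asarray(query, dtype=np.float64)
    k = np.asarray(key, dtype=np.float64)
    v = np.asarray(value, dtype=np.float64)
    if scale is None:
        scale = q.shape[-1] ** -0.5  # documented default: E**-0.5
    # First MatMul followed by the scale step: scores have shape [.., L, S].
    scores = (q @ np.swapaxes(k, -1, -2)) * scale
    if mask is not None:
        scores = scores + np.asarray(mask, dtype=np.float64)
    # Softmax over the last (key) axis, shifted by the row max for stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Second MatMul: result has shape [.., L, E].
    return weights @ v


rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 5, 8))  # [batch, heads, L, E]
k = rng.standard_normal((2, 4, 7, 8))  # [batch, heads, S, E]
v = rng.standard_normal((2, 4, 7, 8))  # [batch, heads, S, E]
out = sdpa_reference(q, k, v)
print(out.shape)  # (2, 4, 5, 8)
```

A reference like this can be handy when checking a hand-built graph: feed the same arrays through the reference and compare against the model's output within a tolerance.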
Runtime note
Conversion (producing .aimodel files) works on any platform: macOS, Linux,
Windows. The converter does not invoke the Core AI runtime.
Execution of .aimodel files requires macOS 27+ or iOS 27+ with the Core
AI framework present. Parity tests and the verify command require the Core AI runtime
and are skipped automatically on unsupported platforms.
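If your own scripts need to decide whether execution is worth attempting, a rough OS-version gate might look like the sketch below. It uses only the standard library and covers macOS only; the helper name can_execute_on_macos is hypothetical, and the check cannot tell whether the Core AI framework is actually present.

```python
import platform

MIN_MACOS_MAJOR = 27  # minimum macOS major version stated in this section


def can_execute_on_macos(system: str, mac_release: str) -> bool:
    """Return True if the OS name and version alone permit execution.

    Hypothetical helper: it checks the macOS major version only and does
    not detect whether the Core AI framework is installed.
    """
    if system != "Darwin" or not mac_release:
        return False
    try:
        major = int(mac_release.split(".")[0])
    except ValueError:
        return False
    return major >= MIN_MACOS_MAJOR


# platform.mac_ver()[0] is an empty string on non-macOS systems.
runnable_here = can_execute_on_macos(platform.system(), platform.mac_ver()[0])
print("OS check passed:", runnable_here)
```

Keeping the decision in a pure function of two strings makes it easy to unit-test on Linux or Windows build machines, where conversion still works but execution does not.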
Next steps
Supported ops — 143 built-in op lowerings supported today
CLI reference — all subcommands and exit codes
Custom lowerings — extend coverage for your own ops