Migration Guide
Step-by-step migration paths from TFLite Micro, TensorRT, and NMC 0.8.x to NMC 0.9.x on neuromorphic silicon.
Migrating from TFLite Micro
TFLite Micro runs int8-quantized dense inference on Cortex-M. NMC replaces the quantized model with a spike-coded equivalent and replaces the TFLite Micro runtime with nrm_runtime.h. Accuracy typically drops by 1 to 2% in exchange for a 47–83× improvement in inferences per watt.
NMC accepts PyTorch TorchScript or ONNX, not .tflite. If your current pipeline starts from a TFLite model, go back to the float32 PyTorch source model and export it with torch.jit.trace before compiling with NMC.
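The export step can be sketched as follows. This is a minimal example, assuming PyTorch is installed; TinyClassifier, its layer sizes, and the output filename are hypothetical stand-ins for your own float32 source model, and the NMC compile step itself is not shown.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for your float32 PyTorch source model."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    def forward(self, x):
        return self.net(x)


def export_torchscript(model, example_input, out_path):
    """Trace `model` with one example input and save it as TorchScript."""
    model.eval()  # freeze dropout / batch-norm behaviour before tracing
    with torch.no_grad():
        traced = torch.jit.trace(model, example_input)
    traced.save(out_path)
    return traced


if __name__ == "__main__":
    model = TinyClassifier()
    example = torch.randn(1, 16)  # must match the real input shape and dtype
    export_torchscript(model, example, "tiny_classifier.pt")
```

Tracing records only the operations executed for the example input, so data-dependent control flow in forward is not captured. If your model branches on its input, verify the traced output against the original model on several inputs before compiling.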
Remove the TFLite Micro include headers, arena allocator, and interpreter setup, and replace them with nrm_arena_init, nrm_load, and nrm_step. The event loop structure is similar: both run one inference per timestep.
TFLite Micro reads and writes tensors directly; NMC uses the HAL spike bus. Convert your sensor data to a spike-encoded format using neurmorph.encode in Python, or the bundled C encoder nrm_encode.h.
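This guide does not state which encoding scheme neurmorph.encode or nrm_encode.h applies. As a generic illustration of what spike encoding means, here is a minimal deterministic rate-coding sketch in plain Python. The function rate_encode and its parameters are hypothetical and not part of the NMC SDK.

```python
def rate_encode(values, timesteps, v_min=0.0, v_max=1.0):
    """Rate-code analog samples into binary spike frames.

    Returns `timesteps` frames; each frame holds one 0/1 spike per input
    channel. A channel's spike count grows with its normalized value.
    """
    span = v_max - v_min
    if span <= 0:
        raise ValueError("v_max must be greater than v_min")

    # Normalize each value to a firing rate in [0, 1], clamping outliers.
    rates = [min(1.0, max(0.0, (v - v_min) / span)) for v in values]

    accumulators = [0.0] * len(rates)
    frames = []
    for _ in range(timesteps):
        frame = []
        for i, rate in enumerate(rates):
            # Integrate the rate; emit a spike each time it crosses 1.0.
            accumulators[i] += rate
            if accumulators[i] >= 1.0:
                accumulators[i] -= 1.0
                frame.append(1)
            else:
                frame.append(0)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    for frame in rate_encode([0.0, 0.5, 1.0], timesteps=4):
        print(frame)
```

Under this scheme a channel at half scale fires on every second timestep and a channel at full scale fires on every timestep. The encoder shipped with NMC may use a different scheme, so treat this only as a mental model.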
Migrating from TensorRT
TensorRT runs FP16/INT8 dense inference on NVIDIA GPUs. NMC targets neuromorphic silicon at a fraction of the power draw. The migration path is similar to the TFLite Micro path, but model sizes and deployment architecture differ significantly.
Key differences to plan for:
- TensorRT model size is typically 5–15 MB; NMC .snn binaries are 80–200 KB for equivalent models.
- TensorRT requires a CUDA environment; the NMC runtime is C99 with no OS dependency.
- Latency characteristics invert: TensorRT is fast with batching, while NMC is faster at batch size 1 (a single sensor event).
Migrating from NMC 0.8.x
NMC 0.9.x introduces breaking changes to the Python SDK and the runtime C API. The .snn binary format is not backward compatible with 0.8.x binaries — recompile all models.
| 0.8.x | 0.9.x | Notes |
|---|---|---|
| nmc.Compiler() | neurmorph.compile() | Functional style replaces class-based |
| nrm_run() | nrm_step() | Renamed for clarity |
| --target neuromorphic | --target nt3000 | Must specify exact chip now |
| .nmbin | .snn | Binary format and extension changed |
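To find leftover 0.8.x spellings in a codebase, a small scanner built from the table above can help. This is a sketch using only the Python standard library; the helper find_deprecated and the regular expressions are illustrative assumptions and are not shipped with NMC.

```python
import re

# (0.8.x spelling, regex that matches it, 0.9.x replacement from the table).
DEPRECATED = [
    ("nmc.Compiler()", re.compile(r"\bnmc\.Compiler\("), "neurmorph.compile()"),
    ("nrm_run()", re.compile(r"\bnrm_run\("), "nrm_step()"),
    ("--target neuromorphic", re.compile(r"--target\s+neuromorphic\b"),
     "--target <exact chip>, e.g. nt3000"),
    (".nmbin", re.compile(r"\.nmbin\b"), ".snn"),
]


def find_deprecated(text):
    """Return (line_number, old_spelling, replacement) for each 0.8.x usage."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for old, pattern, new in DEPRECATED:
            if pattern.search(line):
                hits.append((lineno, old, new))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_deprecated.py file1 file2 ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, old, new in find_deprecated(handle.read()):
                print(f"{path}:{lineno}: {old} -> {new}")
```

The scanner only reports matches and does not rewrite files. Moving from nmc.Compiler() to neurmorph.compile() is a change of API style, not a plain rename, so those call sites need manual review.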
Common migration issues
- Accuracy below expected: Run nmc calibrate with a larger representative dataset. The default 512-sample calibration may be insufficient for models trained on highly varied input distributions.
- SRAM overflow on deploy: Check nmc inspect output for the total SRAM estimate. If it exceeds the target's internal SRAM, apply dead-spike elimination with a stricter threshold using --dse-threshold 0.05.
- HAL spike bus not draining: Check that hal_spike_read returns the actual number of spikes read, not always 0. A return value of 0 causes the runtime to stall indefinitely.
- Latency higher than benchmarks: Confirm hal_power_mode(ACTIVE) is called before the first nrm_step. Many default HAL implementations leave the chip in SLEEP at startup.