Neuromorphic Compiler Stack

PyTorch to neuromorphic silicon.
No data center required.

Neurmorph compiles your existing PyTorch inference model into a spike-coded executable that runs on neuromorphic silicon at 50× lower power than CPU inference. Continuous, always-on edge inference on a coin-cell budget.

50× lower power vs CPU inference
<2ms deterministic inference latency
3+ yr coin-cell runtime at 0.8 mW

The Problem

The power wall at the edge

GPU-accelerated inference consumes 150–400W. Even quantized CNN models on Cortex-M microcontrollers draw 20–80mW continuously — draining a 250mAh coin cell in under 48 hours. For always-on industrial sensors, autonomous navigation systems, and perimeter monitoring nodes, that arithmetic is unworkable.
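The coin-cell arithmetic above can be checked with a short calculation. This is a minimal sketch that assumes a 3.0 V nominal cell voltage and fully usable capacity (neither is stated in the text); `runtime_hours` is a hypothetical helper name used only for illustration.

```python
def runtime_hours(capacity_mah: float, draw_mw: float, cell_voltage: float = 3.0) -> float:
    """Ideal runtime in hours for a constant power draw.

    Assumes a 3.0 V nominal cell (an assumption, not a figure from the text)
    and ignores self-discharge, voltage sag, and converter losses.
    """
    if draw_mw <= 0:
        raise ValueError("draw_mw must be positive")
    energy_mwh = capacity_mah * cell_voltage  # mAh * V = mWh
    return energy_mwh / draw_mw


if __name__ == "__main__":
    # The 20-80 mW range quoted for quantized CNNs on Cortex-M parts.
    for draw in (20.0, 80.0):
        print(f"{draw:.0f} mW on 250 mAh: {runtime_hours(250, draw):.1f} h")
```

Under these assumptions a 20 mW draw lasts about 37.5 hours and an 80 mW draw about 9.4 hours, both inside the 48-hour figure.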

The constraint isn't the silicon. It's the execution model. Dense matrix arithmetic fires every neuron on every cycle regardless of signal. Spiking neural networks fire only when input changes. Neurmorph makes that compute model accessible from standard PyTorch.
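To make the difference between the two execution models concrete, the sketch below counts multiply-accumulate operations for one layer under each model. It is an illustration only: it uses a simple delta update rather than a spiking neuron model, and the function names `dense_step` and `event_step` are hypothetical, not part of any Neurmorph API.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def dense_step(weights: Matrix, x: List[float]) -> Tuple[List[float], int]:
    """Recompute every output from scratch; returns (outputs, MAC count)."""
    outputs = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    macs = len(weights) * len(x)
    return outputs, macs


def event_step(weights: Matrix, prev_x: List[float], x: List[float],
               outputs: List[float]) -> Tuple[List[float], int]:
    """Update outputs only for inputs that changed; returns (outputs, MAC count)."""
    new_outputs = list(outputs)
    macs = 0
    for j, (old, new) in enumerate(zip(prev_x, x)):
        delta = new - old
        if delta == 0.0:
            continue  # no event on this input, so no work is done
        for i, row in enumerate(weights):
            new_outputs[i] += row[j] * delta
            macs += 1
    return new_outputs, macs


if __name__ == "__main__":
    weights = [[0.5, -1.0, 2.0, 0.0], [1.0, 1.0, 0.0, -0.5]]
    x0 = [1.0, 2.0, 3.0, 4.0]
    x1 = [1.0, 2.0, 3.5, 4.0]  # only one of four inputs changes

    y0, _ = dense_step(weights, x0)
    _, dense_macs = dense_step(weights, x1)
    _, event_macs = event_step(weights, x0, x1, y0)
    print("dense MACs:", dense_macs, "event-driven MACs:", event_macs)
```

The dense path does the same amount of work on every step, while the event-driven path does work in proportion to how many inputs changed, which is the property the paragraph above relies on.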

Execution Model Comparison

TRADITIONAL (dense, always-on): continuous power, ~45mW
SPIKE-CODED (event-driven): event-driven power, ~0.8mW
50× power reduction · same PyTorch model · Neurmorph compiler

Platform

Three layers. One toolchain.

NMC Compiler, Edge Runtime, and Hardware Abstraction Layer work together as a single integrated stack — compile once, deploy to any supported neuromorphic target.

NMC Compiler

Takes a trained PyTorch model, runs graph optimization passes, spike-encodes activations, and emits a binary executable targeting any supported neuromorphic ISA.

Compiler docs →

Edge Runtime

Event-driven execution engine. Manages spike queues, memory pools, and timestep windowing on-chip — zero OS dependency, deterministic latency, power-gated idle states.

Runtime docs →
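As a rough picture of what spike queues and timestep windowing mean, the sketch below drains a time-ordered queue in fixed windows and skips windows with no events. It is a host-side Python illustration under assumed names (`SpikeQueue`, `run_windows`) and an assumed microsecond timestamp unit; it does not describe the actual Edge Runtime implementation.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple

Spike = Tuple[int, int]  # (timestamp in microseconds, neuron id)


class SpikeQueue:
    """Hypothetical time-ordered spike queue drained in fixed windows."""

    def __init__(self, window_us: int) -> None:
        if window_us <= 0:
            raise ValueError("window_us must be positive")
        self.window_us = window_us
        self._heap: List[Spike] = []

    def push(self, timestamp_us: int, neuron_id: int) -> None:
        heapq.heappush(self._heap, (timestamp_us, neuron_id))

    def pop_window(self, window_start_us: int) -> List[Spike]:
        """Return all queued spikes earlier than the end of this window."""
        window_end = window_start_us + self.window_us
        batch: List[Spike] = []
        while self._heap and self._heap[0][0] < window_end:
            batch.append(heapq.heappop(self._heap))
        return batch


def run_windows(queue: SpikeQueue, n_windows: int) -> Tuple[Dict[int, int], int]:
    """Process n_windows windows; returns (spike count per neuron, idle windows)."""
    counts: Dict[int, int] = defaultdict(int)
    idle_windows = 0
    for k in range(n_windows):
        batch = queue.pop_window(k * queue.window_us)
        if not batch:
            idle_windows += 1  # stand-in for a power-gated idle state
            continue
        for _, neuron_id in batch:
            counts[neuron_id] += 1
    return dict(counts), idle_windows


if __name__ == "__main__":
    q = SpikeQueue(window_us=1000)
    for ts, nid in [(100, 0), (250, 1), (2100, 0), (2900, 2)]:
        q.push(ts, nid)
    print(run_windows(q, n_windows=4))
```

In this toy run two of the four windows contain no spikes, so the loop does no per-spike work in them.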

Hardware Abstraction Layer

Isolates compiler output from silicon-specific instruction quirks. HAL drivers exist for NT3000, BrainPulse-2, and SynCore-V targets. New targets added without recompilation.

Supported hardware →
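One way to picture a hardware abstraction layer is a common driver interface with a registry of targets, so the same target-independent program can be lowered for different chips. The sketch below is purely hypothetical: the class names, the `lower` method, the op names, and the two fake targets are invented for illustration and are not the Neurmorph HAL API or real instruction sets.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class NeuromorphicTarget(ABC):
    """Hypothetical driver interface (not the actual Neurmorph HAL API)."""

    name: str

    @abstractmethod
    def lower(self, op: str) -> List[str]:
        """Translate one target-independent op into target instructions."""


class FakeTargetA(NeuromorphicTarget):
    name = "fake-a"

    def lower(self, op: str) -> List[str]:
        return [f"A.{op.upper()}"]


class FakeTargetB(NeuromorphicTarget):
    name = "fake-b"

    def lower(self, op: str) -> List[str]:
        # This made-up target needs an explicit load before every op.
        return ["B.LOAD", f"B.{op.upper()}"]


REGISTRY: Dict[str, NeuromorphicTarget] = {}


def register(target: NeuromorphicTarget) -> None:
    """Adding a driver only touches the registry, not the program."""
    REGISTRY[target.name] = target


def lower_program(ops: List[str], target_name: str) -> List[str]:
    target = REGISTRY[target_name]
    instructions: List[str] = []
    for op in ops:
        instructions.extend(target.lower(op))
    return instructions


if __name__ == "__main__":
    register(FakeTargetA())
    register(FakeTargetB())
    program = ["integrate", "fire"]
    for name in REGISTRY:
        print(name, lower_program(program, name))
```

The design point is that the program is written once against the interface, and target-specific quirks live entirely inside each driver.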

Workflow

From PyTorch to silicon in four steps

STEP 01
PyTorch Model
Standard trained model — ResNet, MobileNet, custom CNN. No SNN-specific training required.
STEP 02
NMC Compiler
Graph optimizer fuses ops, spike-encodes activations, applies threshold calibration and dead-spike elimination.
STEP 03
Spike Executable
Hardware-targeted binary with spike rate encoding, HAL mapping, and power profiling metadata embedded.
STEP 04
Neuromorphic Silicon
Event-driven runtime executes on-chip. Sub-milliwatt idle, coin-cell survivable for years of continuous inference.
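The spike-encoding and threshold-calibration steps can be illustrated with a textbook rate-coding scheme. The sketch below uses an integrate-and-fire neuron with subtractive reset and sets the threshold to the largest activation seen in a calibration sample, which is one common approach in ANN-to-SNN conversion. It is an assumption that NMC works this way; the function names are hypothetical, and dead-spike elimination is not modeled.

```python
from typing import List, Sequence


def calibrate_threshold(activations: Sequence[float]) -> float:
    """Pick the firing threshold as the peak activation in a calibration sample."""
    peak = max(activations)
    if peak <= 0:
        raise ValueError("calibration sample has no positive activation")
    return peak


def encode_rate(activation: float, threshold: float, timesteps: int) -> List[int]:
    """Integrate-and-fire rate coding: larger activations spike more often."""
    membrane = 0.0
    spikes: List[int] = []
    for _ in range(timesteps):
        membrane += max(activation, 0.0)  # ReLU-style: negatives never spike
        if membrane >= threshold:
            spikes.append(1)
            membrane -= threshold  # subtractive reset keeps the remainder
        else:
            spikes.append(0)
    return spikes


def decode_rate(spikes: Sequence[int], threshold: float) -> float:
    """Recover an approximate activation from the observed spike rate."""
    return sum(spikes) * threshold / len(spikes)


if __name__ == "__main__":
    sample = [0.0, 0.25, 0.5, 1.0]
    theta = calibrate_threshold(sample)
    for a in sample:
        train = encode_rate(a, theta, timesteps=8)
        print(a, train, decode_rate(train, theta))
```

With more timesteps the decoded value tracks the original activation more closely, at the cost of more spikes per inference.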

Benchmarks

Power and efficiency numbers

Measured on a ResNet-18 image classification workload. Neurmorph on NT3000 vs conventional inference targets. All results are reproducible; the methodology is documented at benchmark-methodology-v1.

Platform               Power (mW)   Latency (ms)   Efficiency (TOPS/W)   Runtime (coin cell)
ARM Cortex-M7 (int8)   38.0         4.2            0.9                   18 hours
RISC-V MCU (fp32)      82.0         12.1           0.3                   8 hours
NXP i.MX RT (int8)     55.0         3.8            0.6                   12 hours
View full benchmark methodology →
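The power column can be related to the headline reduction with a simple ratio. The sketch below takes the three baseline power figures from the table and, as an assumption, the ~0.8 mW spike-coded figure quoted elsewhere on this page as the reference; `power_ratio` is a hypothetical helper, not part of the benchmark methodology.

```python
from typing import Dict

# Power figures (mW) copied from the benchmark table.
BASELINE_POWER_MW: Dict[str, float] = {
    "ARM Cortex-M7 (int8)": 38.0,
    "RISC-V MCU (fp32)": 82.0,
    "NXP i.MX RT (int8)": 55.0,
}

# Assumption: the ~0.8 mW spike-coded figure quoted elsewhere on this page.
SPIKE_CODED_POWER_MW = 0.8


def power_ratio(baseline_mw: float, reference_mw: float = SPIKE_CODED_POWER_MW) -> float:
    """How many times more power the baseline draws than the reference."""
    if reference_mw <= 0:
        raise ValueError("reference_mw must be positive")
    return baseline_mw / reference_mw


if __name__ == "__main__":
    for platform, mw in BASELINE_POWER_MW.items():
        print(f"{platform}: {power_ratio(mw):.1f}x")
```

Under that assumption the ratios come out to about 47.5× for the Cortex-M7, 102.5× for the RISC-V MCU, and 68.75× for the i.MX RT.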

Applications

Where sub-milliwatt inference matters

Industrial IoT

Factory sensor nodes

  • Vibration anomaly detection at 5-year battery life
  • Gas / particulate classification at <1mW continuous draw
  • No wired power or frequent battery swaps needed in inaccessible fixtures

Autonomous Robotics

Navigation inference

  • Obstacle avoidance and path planning on a 200mAh LiPo
  • Deterministic <2ms latency required for reactive navigation
  • Compact form factor, no thermal management needed

Defense / Perimeter

Always-on monitoring

  • Acoustic event classification without radio uplink
  • Air-gap deployable, no external connectivity required
  • Extreme environment tolerance with no fan or active cooling

Engineering Pilot Program

Join the engineering pilot

We work with a small number of teams directly. Pilot access includes: NMC Compiler CLI, Edge Runtime SDK, evaluation hardware allocation, and weekly engineering office hours. Bring your PyTorch model. We ship working silicon within 90 days.