Technology

Scale AI on any Host Processor

  • All MemryX chips work together as one logical unit, with a single connection to the host using a standard interface (PCIe or USB).
  • Fully offload AI models, using the host only for pre- and post-processing.
  • Supports x86, ARM, and RISC-V processors, and the Windows and Linux operating systems.

At-Memory AI Processing

  • High bandwidth at-memory computing eliminates memory bottlenecks.
  • Innovative, highly configurable native dataflow architecture adapts to your AI models.
  • Memory is the only interconnect between compute engines; no control plane or network-on-chip (NoC) is needed to manage data movement.
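The idea of memory as the only interconnect can be sketched with a toy pipeline: each "compute engine" below is a worker that fires whenever data appears in its input buffer, with bounded queues standing in for the small memories between engines. The stage names and toy ops are illustrative assumptions, not the actual MX3 architecture.

```python
import queue
import threading

def engine(op, inbuf, outbuf):
    """A compute engine: driven purely by data availability, no scheduler."""
    while True:
        x = inbuf.get()
        if x is None:            # end-of-stream sentinel
            outbuf.put(None)
            return
        outbuf.put(op(x))

def run_dataflow(inputs):
    # Three chained engines (toy stand-ins for conv -> activation -> pool).
    # The bounded queues model the memories that connect the engines.
    b0, b1, b2, b3 = (queue.Queue(maxsize=2) for _ in range(4))
    stages = [
        threading.Thread(target=engine, args=(lambda x: x * 2, b0, b1)),
        threading.Thread(target=engine, args=(lambda x: max(x, 0), b1, b2)),
        threading.Thread(target=engine, args=(lambda x: x + 1, b2, b3)),
    ]
    for t in stages:
        t.start()
    for x in inputs:
        b0.put(x)
    b0.put(None)
    results = []
    while (y := b3.get()) is not None:
        results.append(y)
    for t in stages:
        t.join()
    return results

print(run_dataflow([3, -1, 5]))   # -> [7, 1, 11]
```

Note that nothing orchestrates the stages: each engine runs as soon as its input memory holds data, which is the essence of a native dataflow design.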

Performance and Accuracy without a Model Zoo

  • Over 2X higher compute utilization than competing accelerators lets the MX3 outperform chips with higher peak TOPS/TFLOPS ratings.
  • BF16 activations (stored in an efficient Block Floating Point format) preserve high accuracy without requiring calibration images or retraining.
  • The Compiler and Mapper allocate compute resources for any model more efficiently than hand tuning.
  • The result: users at any level of expertise can use our online software tools to run their own models efficiently.
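Block Floating Point keeps accuracy high at low cost by letting a block of activations share one exponent while each value keeps its own mantissa. The sketch below illustrates the general technique; the 8-bit mantissa width is an assumption for illustration, not the MX3's actual encoding.

```python
import math

MANT_BITS = 8  # illustrative mantissa width, not the MX3's actual format

def bfp_encode(block):
    """Encode a block of floats as one shared exponent + integer mantissas."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0:
        return 0, [0] * len(block)
    # Shared exponent chosen so the largest value fills the mantissa range.
    exp = math.ceil(math.log2(max_abs)) - (MANT_BITS - 1)
    scale = 2.0 ** exp
    return exp, [round(v / scale) for v in block]

def bfp_decode(exp, mants):
    scale = 2.0 ** exp
    return [m * scale for m in mants]

block = [0.91, -0.13, 0.02, 0.47]
exp, mants = bfp_encode(block)
approx = bfp_decode(exp, mants)
max_err = max(abs(a - b) for a, b in zip(block, approx))
print(exp, mants, max_err)   # rounding error stays below half a quantum
```

Because the exponent is shared per block, the hardware does integer-like arithmetic on the mantissas while retaining floating-point dynamic range, which is why no calibration dataset is needed.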

Pipelined Execution

  • Dataflow enables pipelined operation, which is ideal for streaming inputs (such as cameras).
  • Unlike control-flow architectures such as CPUs, GPUs, and other AI accelerators, ours minimizes data movement for maximum efficiency.
  • Data is seamlessly streamed within a chip and across any number of chips.
  • Every input is processed identically, providing deterministic performance.
  • All data is processed with batch = 1.
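The throughput behavior of a pipeline at batch size 1 can be worked out with simple timing arithmetic. The sketch below uses three hypothetical stage latencies (not measured MX3 figures): once the pipeline fills, outputs emerge at a steady interval equal to the slowest stage.

```python
# Illustrative stage latencies in ms -- assumptions, not MX3 measurements.
STAGE_MS = [2.0, 3.0, 1.5]

def completion_times(n_frames, stage_ms):
    """When each frame exits a pipeline where frames enter back-to-back
    and each stage processes one frame at a time."""
    n_stages = len(stage_ms)
    done = [[0.0] * n_stages for _ in range(n_frames)]
    for f in range(n_frames):
        for s in range(n_stages):
            # A stage starts when both the previous stage's output for this
            # frame and its own previous frame are finished.
            start = max(done[f][s - 1] if s else 0.0,
                        done[f - 1][s] if f else 0.0)
            done[f][s] = start + stage_ms[s]
    return [done[f][-1] for f in range(n_frames)]

times = completion_times(5, STAGE_MS)
gaps = [round(b - a, 6) for a, b in zip(times, times[1:])]
# In steady state, one frame completes per slowest-stage latency (3 ms),
# and the output spacing is perfectly regular -- deterministic performance.
print(times, gaps)
```

Every frame traverses the same fixed sequence of stages, which is why each input is processed identically and no batching is needed to reach full throughput.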