
IOTA POW hardware accelerator


Developing IOTA POW hardware accelerator

  • curl
  • driver
  • arm
  • fpga
  • verilog
  • accelerator
  • pow


To benefit from IOTA features, a device must be able to send IOTA transactions. The transaction rate is limited by the proof-of-work (POW) operation that each transaction requires. Because the calculation is slow, a single IOTA POW operation on an embedded device can take up to 50 minutes!

So we created an IOTA POW hardware accelerator on a Cyclone V FPGA with an embedded ARM processor, written in SystemVerilog. The device uses DMA to read the transaction from an input buffer and to write the nonce to an output buffer. When the POW finishes, the device raises an interrupt to the ARM. To simplify communication between user-space programs and the hardware accelerator, we developed a Linux driver. With the proposed accelerator, the POW computing time for MWM=15 (minimum weight magnitude) is reduced from 10-50 minutes to 0.01-2 sec (0.42 sec on average).
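The MWM criterion can be illustrated with a short sketch. In IOTA's POW, a nonce is accepted when the transaction hash ends in at least MWM zero trits, so a random attempt succeeds with probability 3^-MWM. The function names below are hypothetical and the hash function itself is not implemented; the sketch only shows the acceptance test and the expected number of attempts that follows from it.

```python
def trailing_zero_trits(hash_trits):
    """Count the zero trits at the end of a hash given as a list of -1/0/1."""
    count = 0
    for trit in reversed(hash_trits):
        if trit != 0:
            break
        count += 1
    return count


def meets_mwm(hash_trits, mwm):
    """Return True if the hash ends in at least `mwm` zero trits."""
    return trailing_zero_trits(hash_trits) >= mwm


def expected_attempts(mwm):
    """Mean number of nonces to try, assuming uniformly distributed hash trits."""
    return 3 ** mwm


if __name__ == "__main__":
    for mwm in (14, 15):
        print(f"MWM={mwm}: about {expected_attempts(mwm):,} attempts on average")
```

Raising MWM by one triples the expected work, which is why the MWM=14 and MWM=15 timings differ by roughly a factor of three.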

Performance & Resources:

- Parameterized design. Parameter CL_NUM specifies the number of POW clusters. Parameter CU_NUM defines the number of POW computing units per cluster

- Hardware resources: 1 200 ALMs, 2 400 flip-flops per POW computing unit

- Hashrate: 1 204 819 hash/sec per POW computing unit at 100 MHz

- Fmax: 130-140 MHz for Cyclone V, depending on the number of POW computing units
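The figures above can be tied together with a small calculation. The per-unit hashrate of 1 204 819 hash/sec at 100 MHz equals 100 MHz / 83, which suggests that one computing unit spends about 83 clock cycles per hash. That cycle count is an inference from the published numbers, not a documented property of the design, and the function names are hypothetical.

```python
# Assumption: inferred from 100 MHz / 1 204 819 hash/sec, not stated by the design.
CYCLES_PER_HASH = 83


def unit_hashrate(f_hz, cycles_per_hash=CYCLES_PER_HASH):
    """Hashes per second produced by one POW computing unit."""
    return f_hz / cycles_per_hash


def total_hashrate(cl_num, cu_num, f_hz, cycles_per_hash=CYCLES_PER_HASH):
    """Hashes per second for CL_NUM clusters of CU_NUM units each."""
    return cl_num * cu_num * unit_hashrate(f_hz, cycles_per_hash)


if __name__ == "__main__":
    print(f"one unit at 100 MHz: {unit_hashrate(100e6):,.0f} hash/sec")
    # CL_NUM = 7, CU_NUM = 4 is the proof-of-concept configuration.
    print(f"7 x 4 units at 100 MHz: {total_hashrate(7, 4, 100e6):,.0f} hash/sec")
```

Under this model the hashrate scales linearly with both the unit count and the clock frequency, so running closer to Fmax would raise throughput proportionally.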

The proof of concept was run on a DE10-Nano board (Cyclone V 5CSEBA6U23I7 FPGA device), which costs $110-130.

PoC parameters:

- 28 POW computing units (CL_NUM = 7, CU_NUM = 4)

- Operating frequency: 100 MHz (Fmax = 131 MHz)

- Hashrate: 33 734 940 hash/sec

- Resources: 33 239 ALMs, 68 019 flip-flops (79% of 5CSEBA6U23I7 FPGA)

- POW calculation time (for MWM=15): 0.01-2 sec, 0.42 sec on average

- POW calculation time (for MWM=14): 0.001-0.8 sec, 0.14 sec on average
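The average times listed above are consistent with a simple probabilistic model, sketched here. It assumes that every nonce attempt succeeds independently with probability 3^-MWM and that the accelerator sustains the stated 33 734 940 hash/sec; both the model and the function names are illustrative assumptions, not part of the project.

```python
import math

POC_HASHRATE = 33_734_940  # hash/sec, from the PoC parameters


def mean_pow_time(mwm, hashrate=POC_HASHRATE):
    """Expected POW time in seconds: 3**mwm attempts on average."""
    return 3 ** mwm / hashrate


def prob_done_within(seconds, mwm, hashrate=POC_HASHRATE):
    """Probability that a valid nonce is found within `seconds`."""
    p = 3.0 ** -mwm                 # success probability of one attempt
    attempts = seconds * hashrate   # attempts made in the time window
    # 1 - (1 - p)**attempts, written with log1p/expm1 for numerical accuracy
    return -math.expm1(attempts * math.log1p(-p))


if __name__ == "__main__":
    for mwm in (14, 15):
        print(f"MWM={mwm}: mean time {mean_pow_time(mwm):.3f} sec")
    print(f"MWM=15 done within 2 sec: {prob_done_within(2.0, 15):.1%}")
```

Because the search is random, individual runs vary widely around the mean, which matches the wide 0.01-2 sec range reported for MWM=15.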

Future plans:

- Increase Fmax and number of computing cores

- Unroll the 81 rounds of the transform function into a pipeline and feed the midstate with a new nonce into this pipeline on each clock cycle. This should significantly increase POW performance

- Add a PCIe interface and synthesize the system for a board with a large FPGA and PCIe (e.g. DE5-Net). It could be a great hardware accelerator for IOTA full nodes in cloud data centers
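A rough estimate of what the unrolled pipeline could deliver is sketched below. It assumes that a fully unrolled pipeline accepts one nonce per clock cycle and still runs at 100 MHz, and that a current iterative unit needs about 83 cycles per hash (inferred from the per-unit hashrate, not documented). None of this has been measured, the resource cost of the pipeline is not modelled, and the function names are hypothetical.

```python
def pipelined_hashrate(f_hz, pipelines=1):
    """Hash/sec if each pipeline completes one hash per clock cycle (assumption)."""
    return f_hz * pipelines


def iterative_hashrate(f_hz, units, cycles_per_hash=83):
    """Hash/sec of the current design; 83 cycles per hash is an inferred value."""
    return f_hz * units / cycles_per_hash


def mean_pow_time(mwm, hashrate):
    """Expected POW time in seconds, assuming 3**mwm attempts on average."""
    return 3 ** mwm / hashrate


if __name__ == "__main__":
    f_hz = 100e6
    current = iterative_hashrate(f_hz, units=28)
    unrolled = pipelined_hashrate(f_hz, pipelines=1)
    print(f"28 iterative units: {current:,.0f} hash/sec")
    print(f"1 unrolled pipeline: {unrolled:,.0f} hash/sec")
    print(f"estimated speedup: {unrolled / current:.2f}x")
    print(f"MWM=15 mean time: {mean_pow_time(15, unrolled):.3f} sec")
```

Under these assumptions a single pipeline matches about 83 iterative units, so the real gain depends on how many ALMs one unrolled pipeline consumes compared with 83 computing units.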