Table 4 Performance and resource usage comparison for the different OpenCL kernel implementations

From: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

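| Kernel | int_bw256 | int_bw512 | int_bw1024 | int_bw1152 | short_bw512 | short_bw1024 | short_bw1536 | char_bw512 | char_bw1024 | char_bw1536 |
|---|---|---|---|---|---|---|---|---|---|---|
| Integer type | int (32 bits) | int (32 bits) | int (32 bits) | int (32 bits) | short (16 bits) | short (16 bits) | short (16 bits) | char (8 bits) | char (8 bits) | char (8 bits) |
| Maximum value | 2147483647 | 2147483647 | 2147483647 | 2147483647 | 32767 | 32767 | 32767 | 127 | 127 | 127 |
| BW | 256 | 512 | 1024 | 1152 | 512 | 1024 | 1536 | 512 | 1024 | 1536 |
| Resource usage: ALMs | 29% | 49% | 87% | 94% | 32% | 52% | 73% | 21% | 31% | 41% |
| Resource usage: Regs | 3% | 3% | 4% | 4% | 3% | 4% | 5% | 3% | 4% | 4% |
| Resource usage: RAM | 8% | 8% | 20% | 22% | 7% | 18% | 27% | 7% | 18% | 23% |
| Resource usage: DSPs | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Performance (GCUPS), 100K cells | 24.15 | 31.57 | 44.99 | 49.81 | 48.00 | 52.35 | 56.92 | - | - | - |
| Performance (GCUPS), 3M cells | 34.94 | 61.59 | 101.89 | 105.14 | 80.71 | 122.72 | 160.44 | 93.03 | 152.75 | 223.1 |
| Performance (GCUPS), 28M cells | 36.70 | 68.11 | 119.15 | 122.91 | 85.96 | 146.80 | 186.74 | 102.50 | 173.23 | 255.49 |
| Performance (GCUPS), 291M cells | 37.32 | 69.23 | 122.32 | 126.95 | 87.18 | 149.90 | 195.17 | 105.14 | 181.16 | 268.83 |
| Performance (GCUPS), 1G cells | 37.42 | 70.13 | 124.93 | 129.44 | - | - | - | - | - | - |
| Performance (GCUPS), 9G cells | 37.84 | 70.80 | 126.96 | 131.45 | 88.40 | 155.85 | 202.56 | - | - | - |
| Performance (GCUPS), 25G cells | 37.91 | 70.92 | 127.49 | 131.96 | - | - | - | - | - | - |
| Performance (GCUPS), 35G cells | 37.93 | 70.94 | 127.47 | 131.98 | 88.71 | 156.43 | 203.51 | - | - | - |
| Performance (GCUPS), 100G cells | 37.98 | 70.99 | 127.68 | 132.15 | - | - | - | - | - | - |
| Performance (GCUPS), 575G cells | 38.03 | 71.09 | 127.85 | 132.33 | 88.87 | 156.83 | 204.06 | - | - | - |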
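
For readers unfamiliar with the metric, GCUPS (giga cell updates per second) is the number of alignment-matrix cells computed per second, divided by 10^9. The sketch below shows that relationship and how a runtime could be estimated from a table entry. It is illustrative only: the function names `gcups` and `implied_seconds` are hypothetical, and the row label "3M" is treated as exactly 3,000,000 cells, which is an assumption since the table gives only rounded matrix sizes.

```python
def gcups(cells: int, seconds: float) -> float:
    """Return giga cell updates per second for a matrix of `cells` cells."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return cells / seconds / 1e9


def implied_seconds(cells: int, rate_gcups: float) -> float:
    """Return the runtime implied by a reported GCUPS figure."""
    if rate_gcups <= 0:
        raise ValueError("rate_gcups must be positive")
    return cells / (rate_gcups * 1e9)


if __name__ == "__main__":
    # Assumption: the "3M" row corresponds to exactly 3,000,000 cells.
    cells_3m = 3_000_000
    # 223.1 GCUPS is the char_bw1536 entry for the 3M row in the table.
    t = implied_seconds(cells_3m, 223.1)
    print(f"approximate runtime: {t * 1e6:.1f} microseconds")
```

Because the matrix sizes are rounded, any runtime derived this way is only an order-of-magnitude estimate, not a figure reported by the authors.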