Table 5 Performance comparison among SW implementations using the small and medium sets

From: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

Performance is given in GCUPS for each matrix size (cells).

| Matrix size (cells) | SWIFOLD (Intel Arria 10 GX) | SWAPHI-LS (Intel Xeon Phi 3120P) | SW# (NVIDIA GTX980) | CUDAlign (NVIDIA GTX980) | SW# (NVIDIA GTX1080) | CUDAlign (NVIDIA GTX1080) |
|---|---|---|---|---|---|---|
| 100K | 49.81 (56.92) | 0.42 | 0.3 | 0.03 | 0.23 | 0.03 |
| 3M | 105.14 (223.1) | 7.69 | 7.62 | 1.08 | 7.55 | 1.08 |
| 28M | 122.91 (255.49) | 21.24 | 33.33 | 8.18 | 41.47 | 8.63 |
| 291M | 126.95 (268.83) | 30.67 | 64.53 | 45.89 | 111.60 | 58.24 |
| 1G | 129.44 | 32.84 | 75.24 | 79.21 | 144.97 | 117.97 |
| 9G | 131.45 (202.56) | 33.9 | 69.54 | 84.05 | 143.50 | 152.63 |
| 25G | 131.96 | 34.16 | 120.92 | 160.79 | 255.89 | 295.43 |
| 35G | 131.98 (203.51) | 34.38 | 68.84 | 84.43 | 142.12 | 155.19 |
| 100G | 132.15 | 33.19 | 118.81 | 163.77 | 253.13 | 297.05 |
| 575G | 132.33 (204.06) | 30.36 | 67.55 | 84.84 | 143.51 | 158.13 |
  1. SWIFOLD performance rates correspond to the best 32-bit kernel version; faster rates obtained with smaller data types are also reported (in parentheses) where applicable.
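
For readers unfamiliar with the metric, GCUPS (giga cell updates per second) is conventionally the number of dynamic-programming matrix cells divided by the runtime in seconds, scaled by 10^9. The following is a minimal sketch of that arithmetic, not code from the paper. The function names `gcups` and `implied_seconds` and the input values are hypothetical, chosen only for illustration.

```python
def gcups(query_len: int, subject_len: int, seconds: float) -> float:
    """Giga cell updates per second for one pairwise alignment."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # The Smith-Waterman matrix has one cell per (query, subject) residue pair.
    cells = query_len * subject_len
    return cells / (seconds * 1e9)


def implied_seconds(cells: float, rate_gcups: float) -> float:
    """Runtime implied by a matrix size (in cells) and a GCUPS rate."""
    if rate_gcups <= 0:
        raise ValueError("rate_gcups must be positive")
    return cells / (rate_gcups * 1e9)


if __name__ == "__main__":
    # Hypothetical input (not from the table): two 1,000,000-base
    # sequences aligned in 10 seconds.
    print(gcups(1_000_000, 1_000_000, 10.0))
```

The matrix sizes in the table are rounded labels (for example, 575G), so runtimes derived from them with `implied_seconds` would only be approximate.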