Table 5 Performance comparison among SW implementations using the small and medium sets

From: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

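| Implementation | SWIFOLD | SWAPHI-LS | SW# | CUDAlign | SW# | CUDAlign |
|---|---|---|---|---|---|---|
| Accelerator | Intel Arria 10 GX | Intel Xeon Phi 3120P | NVIDIA GTX980 | NVIDIA GTX980 | NVIDIA GTX1080 | NVIDIA GTX1080 |
| Matrix size (cells) | Performance (GCUPS) | Performance (GCUPS) | Performance (GCUPS) | Performance (GCUPS) | Performance (GCUPS) | Performance (GCUPS) |
| 100K | 49.81 (56.92) | 0.42 | 0.3 | 0.03 | 0.23 | 0.03 |
| 3M | 105.14 (223.1) | 7.69 | 7.62 | 1.08 | 7.55 | 1.08 |
| 28M | 122.91 (255.49) | 21.24 | 33.33 | 8.18 | 41.47 | 8.63 |
| 291M | 126.95 (268.83) | 30.67 | 64.53 | 45.89 | 111.60 | 58.24 |
| 1G | 129.44 | 32.84 | 75.24 | 79.21 | 144.97 | 117.97 |
| 9G | 131.45 (202.56) | 33.9 | 69.54 | 84.05 | 143.50 | 152.63 |
| 25G | 131.96 | 34.16 | 120.92 | 160.79 | 255.89 | 295.43 |
| 35G | 131.98 (203.51) | 34.38 | 68.84 | 84.43 | 142.12 | 155.19 |
| 100G | 132.15 | 33.19 | 118.81 | 163.77 | 253.13 | 297.05 |
| 575G | 132.33 (204.06) | 30.36 | 67.55 | 84.84 | 143.51 | 158.13 |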

  1. SWIFOLD performance rates correspond to the best 32-bit kernel version; faster rates obtained with smaller data types are also reported (in parentheses) where applicable.
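
For readers unfamiliar with the unit, GCUPS (giga cell updates per second) is conventionally the number of alignment-matrix cells divided by the runtime in seconds, scaled by 10^9. The sketch below illustrates that arithmetic and how two table entries can be compared. The function names `gcups` and `speedup` are hypothetical helpers introduced here for illustration and do not come from the article; the two rates are copied from the 575G row of the table.

```python
def gcups(rows: int, cols: int, seconds: float) -> float:
    """Giga cell updates per second for a rows x cols alignment matrix.

    Assumes every cell of the matrix is computed exactly once.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (rows * cols) / (seconds * 1e9)


def speedup(rate_a: float, rate_b: float) -> float:
    """Ratio of two GCUPS rates measured on the same matrix size."""
    return rate_a / rate_b


# Rates from the 575G row of the table (32-bit SWIFOLD kernel vs SWAPHI-LS).
SWIFOLD_575G = 132.33
SWAPHI_LS_575G = 30.36

if __name__ == "__main__":
    # Illustrative input only: a 100,000 x 10,000 matrix filled in one second.
    print(f"{gcups(100_000, 10_000, 1.0):.2f} GCUPS")
    print(f"SWIFOLD / SWAPHI-LS at 575G: {speedup(SWIFOLD_575G, SWAPHI_LS_575G):.2f}x")
```

Because the matrix sizes in the first column are rounded labels, runtimes derived from them would only be approximate; ratios between columns of the same row do not depend on the exact cell count.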