SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

BMC Systems Biology

Table 5 Performance comparison among SW implementations using the small and medium sets

Implementation	SWIFOLD	SWAPHI-LS	SW#	CUDAlign	SW#	CUDAlign
Accelerator	Intel Arria 10 GX	Intel Xeon Phi 3120P	NVIDIA GTX980	NVIDIA GTX980	NVIDIA GTX1080	NVIDIA GTX1080
Matrix size (cells)	Performance (GCUPS)
100K	49.81 (56.92)	0.42	0.3	0.03	0.23	0.03
3M	105.14 (223.1)	7.69	7.62	1.08	7.55	1.08
28M	122.91 (255.49)	21.24	33.33	8.18	41.47	8.63
291M	126.95 (268.83)	30.67	64.53	45.89	111.60	58.24
1G	129.44	32.84	75.24	79.21	144.97	117.97
9G	131.45 (202.56)	33.9	69.54	84.05	143.50	152.63
25G	131.96	34.16	120.92	160.79	255.89	295.43
35G	131.98 (203.51)	34.38	68.84	84.43	142.12	155.19
100G	132.15	33.19	118.81	163.77	253.13	297.05
575G	132.33 (204.06)	30.36	67.55	84.84	143.51	158.13

SWIFOLD performance rates correspond to the best 32-bit kernel version; faster rates obtained with smaller data types are also reported (in parentheses) where applicable.
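For readers unfamiliar with the metric, GCUPS (giga cell updates per second) is the number of alignment-matrix cells computed divided by the runtime, in billions. The following is a minimal sketch of that arithmetic and of a speedup ratio between two table entries; the function names and the sample runtime are illustrative assumptions, not part of the published work.

```python
def gcups(cells: float, seconds: float) -> float:
    """Billions of alignment-matrix cell updates per second.

    `cells` is the matrix size (query length x target length).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return cells / (seconds * 1e9)


def speedup(rate_a: float, rate_b: float) -> float:
    """How many times faster rate_a is than rate_b (both in GCUPS)."""
    return rate_a / rate_b


if __name__ == "__main__":
    # Hypothetical run: a 1e9-cell matrix finished in 8 seconds.
    print(f"{gcups(1e9, 8.0):.3f} GCUPS")

    # Ratio of two values taken from the 575G row of Table 5
    # (SWIFOLD 32-bit kernel vs. SWAPHI-LS).
    print(f"{speedup(132.33, 30.36):.2f}x")
```

Because the matrix size enters only as a cell count, the same helper applies to any row of the table once a runtime is known.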

ISSN: 1752-0509