Table 2 Quality comparison of different clustering algorithms on bioinformatics datasets

From: A multiple kernel density clustering algorithm for incomplete datasets in bioinformatics

| Algorithm | Metric | PBC-A | PBC-R | MFCCs | DLBCL-B | Wine | WDBC | MPE-A | MPE-R | ESR |
|---|---|---|---|---|---|---|---|---|---|---|
| MKDCI | F-m | 0.351 | 0.360 | 0.728 | 0.749 | 0.652 | 0.858 | 0.470 | 0.482 | 0.491 |
| | aRe | 0.956 | 0.953 | 0.406 | 0.526 | 0.704 | 0.382 | 0.693 | 0.689 | 0.852 |
| | NMI | 0.351 | 0.362 | 0.692 | 0.532 | 0.414 | 0.495 | 0.538 | 0.554 | 0.446 |
| | AMI | 0.070 | 0.076 | 0.615 | 0.496 | 0.379 | 0.453 | 0.429 | 0.438 | 0.219 |
| DBSCAN (MinPts=4, ε1) | F-m | 0.660 | 0.665 | 0.509 | 0.510 | 0.576 | 0.811 | 0.448 | 0.452 | 0.350 |
| | aRe | 0.999 | 0.998 | 0.858 | 0.956 | 0.772 | 0.602 | 0.796 | 0.794 | 0.967 |
| | NMI | 0.023 | 0.026 | 0.221 | 0.054 | 0.361 | 0.395 | 0.492 | 0.499 | 0.060 |
| | AMI | 0.005 | 0.005 | 0.124 | 0.039 | 0.269 | 0.295 | 0.347 | 0.347 | 0.003 |
| HDBSCAN (MinPts=4) | F-m | 0.623 | 0.627 | 0.785 | 0.565 | 0.620 | 0.853 | 0.265 | 0.271 | 0.332 |
| | aRe | 0.998 | 0.998 | 0.260 | 0.985 | 0.715 | 0.386 | 0.926 | 0.923 | 0.989 |
| | NMI | 0.029 | 0.032 | 0.686 | 0.174 | 0.386 | 0.469 | 0.518 | 0.523 | 0.082 |
| | AMI | 0.019 | 0.020 | 0.613 | 0.115 | 0.353 | 0.373 | 0.335 | 0.337 | 0.020 |
| DENCLUE2.0 (ε2, h=std(X)/5) | F-m | 0.023 | 0.025 | 0.415 | 0.493 | 0.372 | 0.007 | 0.304 | 0.308 | 0.650 |
| | aRe | 0.997 | 0.996 | 0.983 | 0.987 | 0.908 | 0.998 | 0.708 | 0.699 | 0.685 |
| | NMI | 0.344 | 0.347 | 0.105 | 0.184 | 0.385 | 0.322 | 0.472 | 0.478 | 0.472 |
| | AMI | 0.061 | 0.064 | 0.018 | 0.114 | 0.122 | 0.002 | 0.392 | 0.396 | 0.201 |
| PFClust | F-m | 0.315 | 0.320 | 0.375 | 0.442 | 0.373 | 0.432 | 0.202 | 0.207 | 0.271 |
| | aRe | 0.981 | 0.978 | 0.887 | 0.993 | 0.971 | 0.988 | 0.998 | 0.998 | 0.872 |
| | NMI | 0.002 | 0.002 | 0.123 | 0.043 | 0.033 | 0.019 | 0.024 | 0.028 | 0.135 |
| | AMI | 0.001 | 0.001 | 0.094 | 0.001 | 0.001 | 0.007 | 0.006 | 0.007 | 0.111 |
| Parameters | ε1 | 24.657 | 24.657 | 0.306 | 19.819 | 3.626 | 20.413 | 2.221 | 2.221 | 1.426 |
| | ε2 | 19.591 | 19.591 | 0.306 | 0.413 | 6.552 | 1.426 | 0.432 | 0.432 | 1.853 |

  1. MinPts is the minimum number of data samples required to form a cluster; ε1 is the maximum distance between two data samples for them to be considered part of the same neighborhood; ε2 is the convergence threshold for density attractors; and h is the parameter of the Gaussian kernel. The reported ε1 and ε2 are the values that gave the best F-m score among ten random parameter values drawn between 0.0 and 50.0.
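
The parameter selection described in the footnote (ten random values between 0.0 and 50.0, keeping the one with the best F-m) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `select_threshold` and `toy_f_measure` are hypothetical names, the uniform sampling and fixed seed are assumptions, and the toy scoring function merely stands in for running a clustering algorithm and computing its F-m.

```python
import random
from typing import Callable, List, Tuple


def select_threshold(
    score_fn: Callable[[float], float],
    n_trials: int = 10,
    low: float = 0.0,
    high: float = 50.0,
    seed: int = 0,
) -> Tuple[float, float, List[Tuple[float, float]]]:
    """Draw n_trials random values in [low, high] and keep the best-scoring one.

    score_fn is assumed to run the clustering with the given threshold
    and return its F-m (higher is better).
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    trials: List[Tuple[float, float]] = []
    for _ in range(n_trials):
        eps = rng.uniform(low, high)
        trials.append((eps, score_fn(eps)))
    # Pick the (value, score) pair with the highest score.
    best_eps, best_score = max(trials, key=lambda t: t[1])
    return best_eps, best_score, trials


def toy_f_measure(eps: float) -> float:
    # Hypothetical stand-in for "cluster with eps, then compute F-m":
    # a score that peaks at eps = 20 and decays linearly on either side.
    return max(0.0, 1.0 - abs(eps - 20.0) / 50.0)


best_eps, best_score, trials = select_threshold(toy_f_measure)
print(f"best eps = {best_eps:.3f}, F-m = {best_score:.3f}")
```

In practice `score_fn` would wrap a real clustering run (for example DBSCAN with ε1, or DENCLUE2.0 with ε2) followed by the F-m computation against the ground-truth labels. With only ten draws over a range this wide, the selected value is a coarse estimate rather than a tuned optimum.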