Distribution of disease/gene associations and regression analyses of gene overlap as a function of random gene set size. (A) Distribution of disease/gene associations. The plot shows how many genes (y-axis) are linked to a given number of diseases (x-axis). The majority of genes (about 86%) in the BKL data set are associated with at most 5 disorders, whereas other genes are much more strongly connected. To account for these differences in disease links per gene, random gene sets were generated so that genes occur with the same frequency as in the original data. (B-D) Regression functions obtained for calcinosis (B), T2DM (C), and prostatic neoplasms (D). Given a causal gene set, the mean overlap with a random gene set is a linear function of the random gene set size. Deviations from the regression lines occur for small and large random gene sets, which are supported by fewer samples, as shown by the corresponding data point weights and prediction intervals.
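The relationship between random gene set size and mean overlap can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "same frequency as the original data" means sampling each gene with a weight equal to its number of disease links, and it uses a toy gene universe in place of the BKL data. All names (`weighted_sample`, `mean_overlap`, `fit_line`) and all numbers are hypothetical. The line is fitted by weighted least squares, with the number of random draws per set size as the data point weight.

```python
import random

def weighted_sample(genes, weights, k, rng):
    """Draw k distinct genes, favouring genes with larger weights.

    Uses Efraimidis-Spirakis keys (u ** (1 / w)); weights must be positive.
    """
    keyed = [(rng.random() ** (1.0 / w), g) for g, w in zip(genes, weights)]
    keyed.sort(reverse=True)
    return {g for _, g in keyed[:k]}

def mean_overlap(causal, genes, weights, size, n_draws, rng):
    """Average overlap between the causal set and n_draws random sets."""
    total = 0
    for _ in range(n_draws):
        total += len(causal & weighted_sample(genes, weights, size, rng))
    return total / n_draws

def fit_line(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(0)

# Toy universe (assumption): 500 genes, most with one disease link,
# every 50th gene with 21 links.
genes = [f"g{i}" for i in range(500)]
weights = [21 if i % 50 == 0 else 1 for i in range(500)]

# Hypothetical causal gene set for one disease.
causal = set(genes[:40])

sizes = [10, 25, 50, 100, 200]      # random gene set sizes
draws = [200, 200, 200, 200, 200]   # samples per size, used as fit weights

means = [mean_overlap(causal, genes, weights, s, n, rng)
         for s, n in zip(sizes, draws)]
intercept, slope = fit_line(sizes, means, draws)
print(f"mean overlap ~ {intercept:.3f} + {slope:.4f} * set size")
```

Under uniform sampling without replacement the expected overlap is exactly proportional to the set size. With frequency-weighted sampling the relationship is only approximately linear in this sketch, so the fitted slope should be read as an illustration rather than a reproduction of the regressions in panels B to D.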