We have evaluated the performance of the proposed biclustering algorithms for additive models using the cell cycle expression data of S. cerevisiae [2], which consist of 2884 genes and 17 conditions. Our performance measure is functional enrichment: the percentage of biclusters that are overrepresented in one or more Gene Ontology (GO) annotations. A bicluster is said to be overrepresented in a functional category if its p-value is below a certain threshold. The p-value for a bicluster B is defined in [6] as follows:

$$\text{p-value}(B) = \frac{{}_{G}C_{r} \times {}_{N-G}C_{k-r}}{{}_{N}C_{k}}$$

where G and r are the numbers of genes associated with the given category in the gene expression matrix and in the bicluster, respectively, while N and k are the total numbers of genes in the gene expression matrix and in the bicluster, respectively. The p-value is the probability that the genes were selected into the bicluster at random. A small p-value implies that the bicluster is highly unlikely to have been found by chance. The gene annotations for five ontologies, namely biological process (BP), cellular component (CC), molecular function (MF), deletion viability (DEL) and regulatory pathway (PATH), are obtained using GeneMerge [6]. The parameters of the proposed algorithm are set as follows: min. % of rows = 0.68, noise threshold = 5, min. no. of columns = 6 and max. overlapping allowed = 80. Table I summarizes the results at various ranges of p-value. We have also tested the Cheng and Church algorithm and included its results in Table II for comparison.
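The p-value formula and the enrichment percentage described above can be sketched in Python as follows. This is a minimal illustration, not the GeneMerge implementation: the function names `bicluster_p_value` and `enrichment_percentage` are hypothetical, the formula is taken exactly as written in the text, and the assumption that a bicluster counts as enriched when its smallest category p-value falls below the threshold is ours. The toy numbers are not taken from the yeast data.

```python
from math import comb


def bicluster_p_value(N: int, G: int, k: int, r: int) -> float:
    """Probability from the formula above: C(G, r) * C(N-G, k-r) / C(N, k).

    N: genes in the expression matrix
    G: genes in the matrix annotated with the category
    k: genes in the bicluster
    r: annotated genes in the bicluster
    """
    if not (0 <= G <= N and 0 <= k <= N):
        raise ValueError("require 0 <= G <= N and 0 <= k <= N")
    # Impossible configurations have probability zero.
    if r < 0 or r > min(G, k) or (k - r) > (N - G):
        return 0.0
    # Exact integer arithmetic; only the final division produces a float.
    return comb(G, r) * comb(N - G, k - r) / comb(N, k)


def enrichment_percentage(best_p_values, threshold):
    """Percentage of biclusters whose smallest category p-value is below threshold.

    Assumption: each entry of best_p_values is the minimum p-value over all
    categories of one ontology for one bicluster.
    """
    if not best_p_values:
        return 0.0
    hits = sum(1 for p in best_p_values if p < threshold)
    return 100.0 * hits / len(best_p_values)


# Toy example: 10 genes, 4 annotated, bicluster of 5 genes with 2 annotated.
p = bicluster_p_value(N=10, G=4, k=5, r=2)
print(round(p, 4))  # 0.4762

# Toy example: 2 of 4 biclusters fall below the 0.05 threshold.
print(enrichment_percentage([0.04, 0.2, 0.0004, 0.6], 0.05))  # 50.0
```

Using `math.comb` keeps the binomial coefficients as exact integers, which avoids overflow for a matrix of 2884 genes; only the final ratio is converted to a float.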

Table I. Functional enrichment of biclusters found by our proposed algorithm for additive models

p-value | BP | CC | MF | DEL | PATH |
---|---|---|---|---|---|
<0.00001 | 0% | 0% | 0% | 0% | 0% |
<0.00005 | 0% | 4% | 0% | 0% | 0% |
<0.0001 | 1% | 4% | 0% | 0% | 0% |
<0.0005 | 9% | 5% | 6% | 0% | 0% |
<0.001 | 9% | 13% | 6% | 0% | 0% |
<0.005 | 15% | 34% | 8% | 4% | 1% |
<0.01 | 33% | 44% | 18% | 4% | 9% |
<0.05 | 65% | 81% | 37% | 33% | 11% |

Table II. Functional enrichment of biclusters found by the Cheng & Church algorithm for additive models

p-value | BP | CC | MF | DEL | PATH |
---|---|---|---|---|---|
<0.00001 | 2% | 1% | 2% | 0% | 0% |
<0.00005 | 3% | 1% | 3% | 0% | 1% |
<0.0001 | 4% | 3% | 4% | 0% | 2% |
<0.0005 | 10% | 9% | 6% | 0% | 3% |
<0.001 | 15% | 12% | 9% | 0% | 4% |
<0.005 | 33% | 24% | 21% | 3% | 7% |
<0.01 | 40% | 32% | 25% | 5% | 11% |
<0.05 | 54% | 55% | 35% | 11% | 17% |

Comparing Tables I and II shows that, at the p-value < 0.05 threshold, the proposed algorithm outperforms the Cheng and Church algorithm in four of the five categories: biological process, cellular component, molecular function and deletion viability. For the cellular component ontology, both methods give their highest percentage of enriched biclusters, but the percentage for the proposed algorithm is higher than that of the Cheng & Church algorithm by 26 percentage points (81% versus 55%). When stricter p-value thresholds are considered, the percentage of enriched biclusters drops. The proposed algorithm still shows higher performance for the cellular component ontology at p-value < 0.001 (13% versus 12%). The details of the annotation results for the three ontologies biological process, cellular component and molecular function with p-value < 0.05 have been given in this file. Note that the annotation results for the other cases are subsets of the results for p-value < 0.05.
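The comparison at p-value < 0.05 can be checked with a few lines of Python. This is only an arithmetic sketch over the last rows of Tables I and II; the variable names are hypothetical, and the differences are in percentage points.

```python
# Last row (p-value < 0.05) of Tables I and II, in percent.
ONTOLOGIES = ["BP", "CC", "MF", "DEL", "PATH"]
proposed_pct = [65, 81, 37, 33, 11]      # Table I
cheng_church_pct = [54, 55, 35, 11, 17]  # Table II

# Difference in percentage points (proposed minus Cheng & Church).
differences = {
    name: ours - theirs
    for name, ours, theirs in zip(ONTOLOGIES, proposed_pct, cheng_church_pct)
}

# Ontologies where the proposed algorithm has the higher percentage.
wins = [name for name, diff in differences.items() if diff > 0]

print(differences)  # {'BP': 11, 'CC': 26, 'MF': 2, 'DEL': 22, 'PATH': -6}
print(wins)         # ['BP', 'CC', 'MF', 'DEL']
```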