Supplementary Information
BiVisu can detect biclusters of constant, constant rows, constant columns, additive and multiplicative
types as defined in [5]. Fig. 1 shows examples of these biclusters. The additive type biclusters
can be described by the following equation
a_{ij} = m
+ a_{i}
+ b_{j}
+ e_{ij}
where a_{ij} is an expression value at row i and column j.
m is a constant,
a_{i} is a parameter varying with row,
b_{j} is a parameter varying with column and
e_{ij} is the error term of the model.
It can be easily verified that the additive models are able to describe constant, constant rows and
constant columns but not multiplicative models. For multiplicative models, the expression values
are given by
a_{ij} = m
´ a_{i}
´ b_{j}
+ e_{ij}
where the symbols are defined in the same way as in the additive model. Apart from the additiverelated
biclusters, the biclusters of constant, constant rows and constant columns types can be formulated as
the multiplicative models. In both additive and multiplicative models, the error term
e_{ij} is used to compensate for the noise in the data. Besides
noise, the real gene expression data usually has very high dimension (especially in rows). A good
biclustering algorithm should be able to detect biclusters with minimum errors under noisy environment
within an acceptable time period.
Fig.1. Different types of biclusters: (a) constant (b) constant rows (c) constant Columns (d) additive and
(e) multiplicative.


(d)  (e) 
Fig.2. Parallel Coordinate (PC) plots of biclusters: (a) constant (b) constant rows (c) constant columns
(d) additive and (e) multiplicative.
Parallel coordinate (PC) plots allow visualization of high dimensional data in a 2D plane.
Besides visualization, it has been exploited to reformulate the biclustering problem [1]. The PC
plots of constant, constant rows, constant columns, additive and multiplicative types are illustrated
in Fig. 2. The coherence of the first four bicluster types can be easily seen in their PC plots. The
formulation can be further simplified if a difference matrix is considered. Fig. 3 shows the PC plot
of the elementwise difference between each column and the first column (reference column) for the
additiverelated bicluster in Fig. 1(d). As seen in Fig. 2(e), the multiplicativerelated bicluster
is usually harder to observe. With similar formulation for additive biclusters, the observation of
coherence of multiplicativerelated biclusters can be facilitated by considering the elementwise
ratio between each column and the first column (reference column) as demonstrated in Fig. 4. These
observations form the basic idea of clustering operations in the biclustering algorithm as described
in [1].
Fig. 3. The PC plot of the elementwise difference between each column and the first column
(reference column) for the additiverelated bicluster in Fig. 2(d).
Fig. 4. The PC plot of the elementwise ratio between each column and the first column (reference
column) for the multiplicativerelated bicluster in Fig. 2(e).
As seen in Fig.3 and Fig.4, coherent genes in a bicluster are clustered along the
modified axes in the PC plot.
In the problem of
biclustering, the conditions at which a subset of genes coexpresses are not known in advance. In
order to cluster the genes (rows) and conditions (columns) simultaneously, clustering of rows is
first performed for each pair of columns in our proposed algorithm [1]. Further columns are then
merged to form a big bicluster.
Here we demonstrate the idea using an
example with one constant bicluster, one constant column bicluster and one additiverelated
bicluster as given in Fig.5. These biclusers will be referred as biclusters 1, 2 and 3 respectively
in the subsequent discussion.

C1  C2  C3 
C4  C5  C6 
R1 
2  4  6 
1  8  8 
R2 
2  5  3 
6  1  2 
R3 
2  4  6 
3  8  8 
R4 
1  0  2 
5  0  1 
R5 
0  4  4 
6  2  3 
R6 
2  4  6 
2  2  8 
R7 
6  3  5 
1  3  4 
R8 
2  4  6 
0  1  8 
R9 
4  1  9 
1  1  0 
R10 
2  1  7 
1  1  8 
R11 
3  7  0 
8  4  6 
R12 
5  1  8 
1  1  9 
Fig.5. An expresson matrix which consists of three biclusters: a constant bicluster (bicluster 1
highlighted in light blue color), a constant column bicluster (bicluster 2 highlighted in green
color) and an additiverelated bicluster (bicluster 3 highlighted in yellow color).
As the constant bicluster and constant column bicluster can be described using the additive model,
difference matrix given in Fig.6 is thus computed. In the seventh column
of the difference matrix, which contains the elementwise differences between C2 and C4, there are
eight distinct values. According to these eight distinct values, the elements can be clustered into
eight groups. For example, elements in R9, R10 and R12 are grouped as their entries are all zero.
The overall clustering results are summarized in Table.I.
As seen, besides group 4, the counts for all other groups are at most two.
These groups are too small to be considered as valid biclusters. Thus
only group 4 which consists of R9, R10 and R12 is retained. In this subset of rows, C2 and C4 can be
merged together as in Fig. 7(a). Also, C2
can be selected to compare with other columns. From the difference matrix given in Fig.7(b), only the
column differences between C2 and C5 can form a sufficient large cluster.
Therefore, C5 is joined with C2 and C4. The subset of rows then becomes {R9, R10, R12} while the
subset of columns is {C2, C4, C5}. Thus,
bicluster 1 is detected.

C1C2  C1C3  C1C4 
C1C5  C1C6  C2C3 
C2C4  C2C5  C2C6 
C3C4  C3C5  C3C6 
C4C5  C4C6  C5C6 
R1 
2  4  1 
6  6  2 
3  4  4 
5  2  2 
7  7  0 
R2 
3  1  4 
1  0  2 
1  4  3 
3  2  1 
5  4  1 
R3 
2  4  1 
6  6  2 
1  4  4 
3  2  2 
5  5  0 
R4 
1  1  4 
1  0  2 
5  0  1 
3  2  1 
5  4  1 
R5 
4  4  6 
2  3  0 
2  2  1 
2  2  1 
4  3  1 
R6 
2  4  0 
0  6  2 
2  2  4 
4  4  2 
0  6  6 
R7 
3  1  5 
3  2  2 
2  0  1 
4  2  1 
2  3  1 
R8 
2  4  2 
1  6  2 
4  3  4 
6  5  2 
1  8  7 
R9 
3  5  3 
3  4  8 
0  0  1 
8  8  9 
0  1  1 
R10 
1  5  1 
1  6  6 
0  0  7 
6  6  1 
0  7  7 
R11 
4  3  5 
1  3  7 
1  3  1 
8  4  6 
4  2  2 
R12 
4  3  4 
4  4  7 
0  0  8 
7  7  1 
0  8  8 
Fig.6. The difference matrix of the whole expression matrix in Fig. 5.
Table I. Results of clustering based on distinct difference values between C2 and C4.
group  1  2 
3  4  5 
6  7  8 
values  5  2 
1  0  1 
2  3  4 
counts  1  1 
2  3  1 
2  1  1 

{C2,C4}  C1  C3 
C5  C6 
R9 
{1,1}  4  9 
1  0 
R10 
{1,1}  2  7 
1  8 
R12 
{1,1}  5  8 
1  9 


C1C2  C3C2  C5C2 
C6C2 
R9 
3  8  0 
1 
R10 
1  6  0 
7 
R12 
4  7  0 
8 

(a)  (b) 
Fig.7. Comparison of columns C1, C3, C5 and C6 with {C2, C4}: (a) in expression values and (b)
in a difference matrix.
The other two biclusters can be found similarly by considering the twelfth column of the difference
matrix (Fig. 6), which consists of the elementwise differences between C3 and C6 . The twelfth column of the
difference matrix has five distinct values so five groups can be formed as shown in Table.II. Among these
five groups, only groups 2 and 4 are large enough to be considered as bicluster candidates. In this
way, the rows are split into two groups; group 2 consisting of R1, R3, R6 and R8 and group 4
consisting of R2, R4, R5 and R7.
After that, C3 is compared with columns other than C3 and C6 over the rows in group 2. A difference matrix
shown in Fig.8(b) is calculated. As seen from Fig. 8(b), the difference values in C1C3 are all equal to 4
Therefore,
{C3, C6} and {C1} are merged to a single group for rows {R1, R3, R6 and R8}.
This “merge and split process”, i.e. merging
paired columns and splitting in rows is performed for other columns. It can be found that C2 can be further
merged into the column cluster while the row cluster remains unchanged.
The final clustering in rows and columns gives bicluster 2. In the other row group split at the
beginning, i.e. group 4, the “merge and split” process is applied. As demonstrated in Fig.9, this results
in bicluster 3. Actually, the "merge and split" process is repeated for each columnpair.
Since there is either no significantly large bicluster found or the same biclusters are
detected, only the three biclusters are obtained from BiVisu. That is, the same set of bicluster
embedded in the
dataset are detected.
Table II. The distinct difference values between C3 and C6 and the corresponding number of counts.
group  1  2 
3  4  5 
values  6  2 
1  1  9 
counts  1  4 
2  4  1 

{C3,C6}  C1  C2 
C4  C5 
R1 
{6,8}  2  4 
1  8 
R3 
{6,8}  2  4 
3  8 
R6 
{6,8}  2  4 
2  2 
R8 
{6,8}  2  4 
0  1 


C1C3  C2C3  C4C3 
C5C3 
R1 
4  2  5 
2 
R3 
4  2  3 
2 
R6 
4  2  4 
4 
R8 
4  2  6 
5 

(a)  (b) 
Fig.8. Comparison of columns C1, C2, C4 and C5 with {C3, C6}: (a) in expression values and (b)
in a difference matrix.

{C3,C6}  C1  C2 
C4  C5 
R2 
{3,2}  2  5 
6  1 
R4 
{2,1}  1  0 
5  0 
R5 
{4,3}  0  4 
6  2 
R7 
{5,4}  6  3 
1  3 


C1C3  C2C3  C4C3 
C5C3 
R2 
1  2  3 
2 
R4 
1  2  3 
2 
R5 
4  0  2 
2 
R7 
1  2  4 
2 

(a)  (b) 
Fig.9. Comparison of columns C1, C2, C4 and C5 with {C3, C6}: (a) in expression values and (b)
in a difference matrix.