<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (c) 2014 Stijn van Dongen -->
<head>
<meta name="keywords" content="manual">
<style type="text/css">
/* START aephea.base.css */
body
{ text-align: justify;
margin-left: 0%;
margin-right: 0%;
}
a:link { text-decoration: none; }
a:active { text-decoration: none; }
a:visited { text-decoration: none; }
a:link { color: #1111aa; }
a:active { color: #1111aa; }
a:visited { color: #111166; }
a.local:link { color: #11aa11; }
a.local:active { color: #11aa11; }
a.local:visited { color: #116611; }
a.intern:link { color: #1111aa; }
a.intern:active { color: #1111aa; }
a.intern:visited { color: #111166; }
a.extern:link { color: #aa1111; }
a.extern:active { color: #aa1111; }
a.extern:visited { color: #661111; }
a.quiet:link { color: black; }
a.quiet:active { color: black; }
a.quiet:visited { color: black; }
div.verbatim
{ font-family: monospace;
margin-top: 1em;
margin-bottom: 1em;
font-size: 10pt;
margin-left: 2em;
white-space: pre;
}
div.indent
{ margin-left: 8%;
margin-right: 0%;
}
.right { text-align: right; }
.left { text-align: left; }
.nowrap { white-space: nowrap; }
.item_leader
{ position: relative;
margin-left: 8%;
}
.item_compact { position: absolute; vertical-align: baseline; }
.item_cascade { position: relative; }
.item_leftalign { text-align: left; }
.item_rightalign
{ width: 2em;
text-align: right;
}
.item_compact .item_rightalign
{ position: absolute;
width: 52em;
right: -2em;
text-align: right;
}
.item_text
{ position: relative;
margin-left: 3em;
}
.smallcaps { font-size: smaller; text-transform: uppercase }
/* END aephea.base.css */
body { font-family: "Garamond", "Gill Sans", "Verdana", sans-serif; }
body
{ text-align: justify;
margin-left: 8%;
margin-right: 8%;
}
</style>
<title>The mcl manual</title>
</head>
<body>
<p style="text-align:right">
16 May 2014&nbsp;&nbsp;&nbsp;
<a class="local" href="mcl.ps"><b>mcl</b></a>
14-137
</p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">1.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#name">NAME</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">2.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#started">GETTING STARTED</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">3.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#synopsis">SYNOPSIS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">4.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#description">DESCRIPTION</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">5.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#options">OPTIONS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">6.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#pruneoptions">PRUNING OPTIONS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">7.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#impl">IMPLEMENTATION OPTIONS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">8.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#examples">EXAMPLES</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">9.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#applicability">APPLICABILITY</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">10.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#files">FILES</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">11.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#environment">ENVIRONMENT</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">12.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#diagnostics">DIAGNOSTICS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">13.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#bugs">BUGS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">14.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#author">AUTHOR</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">15.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#history">HISTORY/CREDITS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">16.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#seealso">SEE ALSO</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">17.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#references">REFERENCES</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">18.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#notes">NOTES</a>
</div>
</div>

<a name="name"></a>
<h2>NAME</h2>
<p style="margin-bottom:0" class="asd_par">
mcl &mdash; The Markov Cluster Algorithm, aka the MCL algorithm.</p>
<p style="margin-bottom:0" class="asd_par">
This program implements <b>mcl</b>, a cluster algorithm for graphs. A single
parameter controls the granularity of the output clustering, namely the
<a class="intern" href="#opt-I"><b>-I</b>&nbsp;<i>inflation</i></a> option described further below.
In standard usage of the program this parameter is the only one that may
require changing. By default it is set to&nbsp;2.0 and this is a good way to
start. If you want to explore cluster structure in graphs with MCL, vary
this parameter to obtain clusterings at different levels of granularity. A
good set of starting values is 1.4, 2, 4, and 6.
</p>
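<p style="margin-bottom:0" class="asd_par">
A sweep over the suggested starting values can be scripted. The sketch below
only builds the command lines and does not run them. It assumes label input
(hence <b>--abc</b>); the function name and the output file naming scheme are
hypothetical choices made for illustration.
</p>

```python
def inflation_sweep(graph_file, inflations=(1.4, 2, 4, 6)):
    """Build one mcl command line per inflation value (nothing is executed)."""
    commands = []
    for inflation in inflations:
        # Hypothetical naming scheme: out.<input>.I<value without the dot>.
        tag = str(inflation).replace(".", "")
        out_file = f"out.{graph_file}.I{tag}"
        commands.append(
            ["mcl", graph_file, "--abc", "-I", str(inflation), "-o", out_file]
        )
    return commands


if __name__ == "__main__":
    for command in inflation_sweep("graph.abc"):
        print(" ".join(command))
```

<p style="margin-bottom:0" class="asd_par">
Each returned list can be handed to <tt>subprocess.run</tt> on a system where
mcl is installed.
</p>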
<p style="margin-bottom:0" class="asd_par">
The program has a rather large set of options. Except for <a class="intern" href="#opt-I"><b>-I</b></a>
none affects the clustering method itself. The other options are for a
variety of aspects, such as study of the underlying <span class="smallcaps">MCL</span> process (i.e.
dumping of iterands), network preprocessing (incorporated for efficiency),
resource allocation options (for large-scale analyses), output naming
and placement, output formatting, setting of verbosity levels, and so on.
</p>
<p style="margin-bottom:0" class="asd_par">
Network construction and reduction techniques should not be considered as
part of a clustering algorithm. Nevertheless particular techniques may
benefit particular methods or applications. In mcl many transformations are
accessible through the <a class="intern" href="#opt-tf"><b>-tf</b></a> option. It can be used for edge
weight transformations and selection, as well as transformations that act on
a graph as a whole.
It is for example possible to remove edges with weight below 0.7 by issuing
<b>-tf</b>&nbsp;<b>'gq(0.7)'</b>, where the quotes are necessary to prevent the shell
from interpreting the parentheses. The option accepts more complicated
sequences, such as <b>-tf</b>&nbsp;<b>'gq(0.7),add(-0.7)'</b>. This causes all
remaining edge weights to be shifted to the range [0, 0.3], assuming that the
input contains correlations. Many more transformations are supported, as
documented in <a class="local sibling" href="mcxio.html">mcxio</a>. Examples of graph-wide transformations are
<tt>'#knn(&lt;num&gt;)'</tt> and <tt>'#ceilnb(&lt;num&gt;)'</tt>. The first only keeps those edges
that occur in the list of top-<tt>&lt;num&gt;</tt> edges of highest weight for both of
their incident nodes. The second removes edges from nodes of highest degree
first, proceeding until all node degrees satisfy the given threshold.
The <a class="intern" href="#opt-pi"><b>-pi</b></a> (pre-inflation) option can be used to increase the
contrast in edge weights. This may be useful when clusterings are coarse and
fine-grained clusterings are difficult to obtain.
</p>
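<p style="margin-bottom:0" class="asd_par">
The effect of the transformations described above can be mimicked on a small
edge list. The sketch below is an illustration only and is not mcl's
implementation: the function names mirror the transformation names, edges are
assumed to be (label, label, weight) tuples, and the way ties are broken in
<tt>mutual_knn</tt> is an assumption.
</p>

```python
from collections import defaultdict


def gq(edges, threshold):
    """Keep edges whose weight is greater than or equal to the threshold."""
    return [(a, b, w) for a, b, w in edges if w >= threshold]


def add(edges, delta):
    """Shift every edge weight by delta."""
    return [(a, b, w + delta) for a, b, w in edges]


def mutual_knn(edges, k):
    """Keep an edge only if it is among the k heaviest edges of both endpoints."""
    incident = defaultdict(list)
    for a, b, w in edges:
        incident[a].append((w, a, b))
        incident[b].append((w, a, b))
    # Assumption: ties are broken by label order, via tuple comparison.
    top = {
        node: set(sorted(items, reverse=True)[:k])
        for node, items in incident.items()
    }
    return [
        (a, b, w)
        for a, b, w in edges
        if (w, a, b) in top[a] and (w, a, b) in top[b]
    ]


if __name__ == "__main__":
    correlations = [("a", "b", 0.9), ("a", "c", 0.5), ("b", "c", 0.7)]
    # Equivalent in spirit to -tf 'gq(0.7),add(-0.7)'.
    print(add(gq(correlations, 0.7), -0.7))
```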

<a name="started"></a>
<h2>GETTING STARTED</h2>
<p style="margin-top:0em; margin-bottom:0em">There are two main modes of invocation. The most accessible is
<i>label mode</i>
which assumes a format alternatively called label input or <span class="smallcaps">ABC</span>-format.
The input is then a file or stream in which each
line encodes an edge in terms of two labels (the 'A' and the 'B')
and a numerical value (the 'C'), all separated
by white space. The most basic example of usage is this:
</p>
<div class="verbatim">   <b>mcl</b> &lt;-|fname&gt; <a class="intern" href="#opt--abc"><b>--abc</b></a> <a class="intern" href="#opt-o"><b>-o</b>&nbsp;<i>fname-out</i></a></div>
<p style="margin-top:0em; margin-bottom:0em">
The output is then a file where each line is a cluster of tab-separated
labels.
If clustering is part of a larger workflow where it
is desirable to analyse and compare multiple clusterings,
then it is a good idea to use native mode rather than <span class="smallcaps">ABC</span>&nbsp;mode.
The reason for this is that native mode is understood
by all programs in the mcl suite. It is a more stringent
and unambiguous format, and hence more suitable for data exchange.
The reader is referred to <a class="local sibling" href="clmprotocols.html">clmprotocols</a> for more information.
</p>
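<p style="margin-top:0em; margin-bottom:0em">
The two text formats involved in label mode are simple enough to handle with a
few lines of code. The sketch below reads <span class="smallcaps">ABC</span> lines and a clustering
written by mcl in label mode. The function names are hypothetical, and the
sketch assumes that labels contain no white space.
</p>

```python
def parse_abc(text):
    """Parse ABC-format lines into (label_a, label_b, value) tuples."""
    edges = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label_a, label_b, value = line.split()
        edges.append((label_a, label_b, float(value)))
    return edges


def parse_clusters(text):
    """Parse label-mode output: one cluster per line, labels tab-separated."""
    return [line.split("\t") for line in text.splitlines() if line]


if __name__ == "__main__":
    print(parse_abc("cat hat 0.2\nhat bat 0.16\n"))
    print(parse_clusters("cat\that\tbat\nbit\tfit\n"))
```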

<a name="synopsis"></a>
<h2>SYNOPSIS</h2>
<p style="margin-top:0em; margin-bottom:0em">
The example invocation below assumes matrix input, as explained above
and described in the <a class="local sibling" href="mcxio.html">mcxio</a> section. Switching to label mode requires
the input file to be in <span class="smallcaps">ABC</span>-format and the addition of the <a class="intern" href="#opt--abc"><b>--abc</b></a>
option.</p>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> &lt;-|fname&gt;
<a class="intern" href="#opt-I"><b>[-I</b> &lt;num&gt; (<i>inflation</i>)<b>]</b></a>
<a class="intern" href="#opt-o"><b>[-o</b> &lt;str&gt; (<i>fname</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0" class="asd_par">
These options are sufficient in 95 percent of the cases or more. The first
argument must be the name of a file containing a graph/matrix in the mcl
input format, or a hyphen to read from STDIN. With respect to clustering,
the <a class="intern" href="#opt-I"><b>-I</b> option</a> is the most relevant.
</p>
<p style="margin-bottom:0" class="asd_par">
The full listing of <b>mcl</b> options is shown further below, separated
into parts corresponding with functional aspects such
as clustering, threading, verbosity, network preprocessing, pruning and resource management,
automatic output naming, and dumping.
</p>
<p style="margin-bottom:0"><b>Baseline clustering options</b><br>
<a class="intern" href="#opt-I"><b>[-I</b> &lt;num&gt; (<i>inflation</i>)<b>]</b></a>
<a class="intern" href="#opt-o"><b>[-o</b> &lt;fname&gt; (<i>fname</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Output options</b><br>
<a class="intern" href="#opt-odir"><b>[-odir</b> &lt;dname&gt; (<i>directory</i>)<b>]</b></a>
<a class="intern" href="#opt--d"><b>[--d</b> (<i>use input directory for output</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Input options</b><br>
<a class="intern" href="#opt--abc"><b>[--abc</b> (<i>expect/write labels</i>)<b>]</b></a>
<a class="intern" href="#opt--sif"><b>[--sif</b> (<i>expect/write labels</i>)<b>]</b></a>
<a class="intern" href="#opt--etc"><b>[--etc</b> (<i>expect/write labels</i>)<b>]</b></a>
<a class="intern" href="#opt--expect-values"><b>[--expect-values</b> (<i>sif or etc stream contains values</i>)<b>]</b></a>
<a class="intern" href="#opt-use-tab"><b>[-use-tab</b> &lt;fname&gt; (<i>use mapping to write</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Transform options</b><br>
<a class="intern" href="#opt-tf"><b>[-tf</b> &lt;tf-spec&gt; (<i>transform input matrix values</i>)<b>]</b></a>
<a class="intern" href="#opt-abc-tf"><b>[-abc-tf</b> &lt;tf-spec&gt; (<i>transform input stream values</i>)<b>]</b></a>
<a class="intern" href="#opt--abc-neg-log10"><b>[--abc-neg-log10</b> (<i>take log10 of stream values, negate sign</i>)<b>]</b></a>
<a class="intern" href="#opt--abc-neg-log"><b>[--abc-neg-log</b> (<i>take log of stream values, negate sign</i>)<b>]</b></a>
<a class="intern" href="#opt-icl"><b>[-icl</b> &lt;fname&gt; (<i>create subgraph on clustering</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Cache options</b><br>
<a class="intern" href="#opt-write-graph"><b>[-write-graph</b> &lt;fname&gt; (<i>write graph</i>)<b>]</b></a>
<a class="intern" href="#opt-write-graphx"><b>[-write-graphx</b> &lt;fname&gt; (<i>write transformed graph</i>)<b>]</b></a>
<a class="intern" href="#opt-write-expanded"><b>[-write-expanded</b> &lt;fname&gt; (<i>write expanded graph</i>)<b>]</b></a>
<a class="intern" href="#opt--write-limit"><b>[--write-limit</b> (<i>write mcl process limit</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Input manipulation options</b><br>
<a class="intern" href="#opt-pi"><b>[-pi</b> &lt;num&gt; (<i>pre-inflation</i>)<b>]</b></a>
<a class="intern" href="#opt-ph"><b>[-ph</b> &lt;num&gt; (<i>pre-inflation, max-bound</i>)<b>]</b></a>
<a class="intern" href="#opt-if"><b>[-if</b> &lt;num&gt; (<i>start-inflation</i>)<b>]</b></a>
<a class="intern" href="#opt--discard-loops"><b>[--discard-loops=</b>&lt;y/n&gt; (<i>discard y/n loops in input</i>)<b>]</b></a>
<a class="intern" href="#opt--sum-loops"><b>[--sum-loops</b> (<i>set loops to sum of other arcs weights</i>)<b>]</b></a>
<a class="intern" href="#opt-c"><b>[-c</b> &lt;num&gt; (<i>reweight loops</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Clustering processing options</b><br>
<a class="intern" href="#opt-sort"><b>[-sort</b> &lt;str&gt; (<i>sort mode</i>)<b>]</b></a>
<a class="intern" href="#opt-overlap"><b>[-overlap</b> &lt;str&gt; (<i>overlap mode</i>)<b>]</b></a>
<a class="intern" href="#opt--force-connected"><b>[--force-connected=</b>&lt;y/n&gt; (<i>analyze components</i>)<b>]</b></a>
<a class="intern" href="#opt--check-connected"><b>[--check-connected=</b>&lt;y/n&gt; (<i>analyze components</i>)<b>]</b></a>
<a class="intern" href="#opt--analyze"><b>[--analyze=</b>&lt;y/n&gt; (<i>performance criteria</i>)<b>]</b></a>
<a class="intern" href="#opt--show-log"><b>[--show-log=</b>&lt;y/n&gt; (<i>show log</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Verbosity options</b><br>
<a class="intern" href="#opt-q"><b>[-q</b> &lt;spec&gt; (<i>log levels</i>)<b>]</b></a>
<a class="intern" href="#opt-v"><b>[-v</b> &lt;str&gt; (<i>verbosity type on</i>)<b>]</b></a>
<a class="intern" href="#opt-V"><b>[-V</b> &lt;str&gt; (<i>verbosity type off</i>)<b>]</b></a>
<a class="intern" href="#opt--show"><b>[--show</b> (<i>print (small) matrices to screen</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Thread options</b><br>
<a class="intern" href="#opt-te"><b>[-te</b> &lt;int&gt; (<i>#expansion threads</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Output file name and annotation options</b><br>
<a class="intern" href="#opt-o"><b>[-o</b> &lt;str&gt; (<i>fname</i>)<b>]</b></a>
<a class="intern" href="#opt-ap"><b>[-ap</b> &lt;str&gt; (<i>use str as file name prefix</i>)<b>]</b></a>
<a class="intern" href="#opt-aa"><b>[-aa</b> &lt;str&gt; (<i>append str to suffix</i>)<b>]</b></a>
<a class="intern" href="#opt-az"><b>[-az</b> (<i>show output file name and exit</i>)<b>]</b></a>
<a class="intern" href="#opt-ax"><b>[-ax</b> (<i>show output suffix and exit</i>)<b>]</b></a>
<a class="intern" href="#opt-annot"><b>[-annot</b> &lt;str&gt; (<i>dummy annotation option</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Dump options</b><br>
<a class="intern" href="#opt-dump-interval"><b>[-dump-interval</b> &lt;i:j&gt; (<i>dump interval</i>)<b>]</b></a>
<a class="intern" href="#opt-dump-modulo"><b>[-dump-modulo</b> &lt;int&gt; (<i>dump modulo</i>)<b>]</b></a>
<a class="intern" href="#opt-dump-stem"><b>[-dump-stem</b> &lt;stem&gt; (<i>dump file stem</i>)<b>]</b></a>
<a class="intern" href="#opt-dump"><b>[-dump</b> &lt;str&gt; (<i>type</i>)<b>]</b></a>
<a class="intern" href="#opt-digits"><b>[-digits</b> &lt;int&gt; (<i>printing precision</i>)<b>]</b></a>
<a class="intern" href="#opt--write-binary"><b>[--write-binary</b> (<i>write matrices in binary format</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Info options</b><br>
<a class="intern" href="#opt--jury-charter"><b>[--jury-charter</b> (<i>explains jury</i>)<b>]</b></a>
<a class="intern" href="#opt--version"><b>[--version</b> (<i>show version</i>)<b>]</b></a>
<a class="intern" href="#opt-how-much-ram"><b>[-how-much-ram</b> k (<i>RAM upper bound</i>)<b>]</b></a>
<a class="intern" href="#opt-h"><b>[-h</b> (<i>most important options</i>)<b>]</b></a>
<a class="intern" href="#opt--help"><b>[--help</b> (<i>one-line description for all options</i>)<b>]</b></a>
<a class="intern" href="#opt-z"><b>[-z</b> (<i>show current settings</i>)<b>]</b></a>
<a class="intern" href="#opt-az"><b>[-az</b> (<i>show output file name and exit</i>)<b>]</b></a>
<a class="intern" href="#opt-ax"><b>[-ax</b> (<i>show output suffix and exit</i>)<b>]</b></a>
<a class="intern" href="#opt--show-schemes"><b>[--show-schemes</b> (<i>show resource schemes</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Implementation options</b><br>
<a class="intern" href="#opt-sparse"><b>[-sparse</b> &lt;int&gt; (<i>sparse matrix multiplication threshold</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0"><b>Pruning options</b><br>
The following options all pertain to the various pruning strategies that can
be employed by <b>mcl</b>. They are described in the <a class="intern" href="#pruneoptions">PRUNING OPTIONS</a>
section, accompanied by a description of the mcl pruning strategy.
If your graphs are huge
and you have an appetite for tuning, have a look at the following:</p>
<p style="margin-bottom:0" class="asd_par">
<a class="intern" href="#opt-scheme"><b>[-scheme</b> &lt;int&gt; (<i>resource scheme</i>)<b>]</b></a>
<a class="intern" href="#opt-resource"><b>[-resource</b> &lt;int&gt; (<i>per-node resource maximum</i>)<b>]</b></a>
<a class="intern" href="#opt-p"><b>[-p</b> &lt;num&gt; (<i>cutoff</i>)<b>]</b></a>
<a class="intern" href="#opt-P"><b>[-P</b> &lt;int&gt; (<i>1/cutoff</i>)<b>]</b></a>
<a class="intern" href="#opt-S"><b>[-S</b> &lt;int&gt; (<i>selection number</i>)<b>]</b></a>
<a class="intern" href="#opt-R"><b>[-R</b> &lt;int&gt; (<i>recovery number</i>)<b>]</b></a>
<a class="intern" href="#opt-pct"><b>[-pct</b> &lt;int&gt; (<i>recover percentage</i>)<b>]</b></a>
<a class="intern" href="#opt-warn-pct"><b>[-warn-pct</b> &lt;int&gt; (<i>prune warn percentage</i>)<b>]</b></a>
<a class="intern" href="#opt-warn-factor"><b>[-warn-factor</b> &lt;int&gt; (<i>prune warn factor</i>)<b>]</b></a>
</p>
<p style="margin-bottom:0" class="asd_par">
The first argument of <b>mcl</b> must be a file name, but some options are allowed
to appear as the first argument instead. These are the options that cause
mcl to print out information of some kind, after which it will gracefully
exit. The full list of these options is</p>
<p style="margin-bottom:0" class="asd_par">
<a class="intern" href="#opt-z"><b>-z</b></a>,
<a class="intern" href="#opt-h"><b>-h</b></a>,
<a class="intern" href="#opt--help"><b>--help</b></a>,
<a class="intern" href="#opt--version"><b>--version</b></a>,
<a class="intern" href="#opt--show-settings"><b>--show-settings</b></a>,
<a class="intern" href="#opt--show-schemes"><b>--show-schemes</b></a>,
<a class="intern" href="#opt--jury-charter"><b>--jury-charter</b></a>.
</p>

<a name="description"></a>
<h2>DESCRIPTION</h2>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> implements the <b>MCL algorithm</b>, short for the <b>Markov cluster
algorithm</b>, a cluster algorithm for graphs developed by Stijn van Dongen at
the Centre for Mathematics and Computer Science in Amsterdam, the
Netherlands. The algorithm simulates flow using two simple algebraic
operations on matrices.
The inception of this flow process and the theory behind it are
described elsewhere (see <a class="intern" href="#references">REFERENCES</a>). Frequently asked questions
are answered in the <a class="local sibling" href="mclfaq.html">mclfaq</a> section.
The program described here is a fast threaded implementation written by the
algorithm's creator with contributions by several others. Anton Enright
co-implemented threading; see the <a class="intern" href="#history">HISTORY/CREDITS</a> section for a complete
account.
See the <a class="intern" href="#applicability">APPLICABILITY</a> section for a description of the type of
graph mcl likes best, and for a qualitative assessment of its speed.
<b>mcl</b> is accompanied by several other utilities for analyzing clusterings and
performing matrix and graph operations; see the <a class="intern" href="#seealso">SEE ALSO</a> section.</p>
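<p style="margin-bottom:0" class="asd_par">
The two matrix operations are known as expansion (matrix squaring) and
inflation (entry-wise powering followed by column rescaling). The sketch below
alternates them on a small dense matrix to show the idea. It is not how
<b>mcl</b> is implemented: it uses no pruning or sparse storage, adds loops of
weight 1 as an assumption, runs a fixed number of iterations, and uses a
simplified interpretation step that ignores overlap. All function names are
hypothetical.
</p>

```python
def normalize_columns(matrix):
    """Rescale every column so that its entries sum to 1."""
    size = len(matrix)
    result = [row[:] for row in matrix]
    for j in range(size):
        total = sum(matrix[i][j] for i in range(size))
        for i in range(size):
            result[i][j] = matrix[i][j] / total if total else 0.0
    return result


def expand(matrix):
    """Expansion: square the matrix, letting flow spread along longer walks."""
    size = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def inflate(matrix, power):
    """Inflation: raise entries to a power, then rescale the columns."""
    return normalize_columns([[value ** power for value in row] for row in matrix])


def mcl_sketch(weights, inflation=2.0, iterations=50, eps=1e-6):
    """Return clusters (lists of node indices) for a symmetric weight matrix."""
    size = len(weights)
    # Assumption: give every node a loop of weight 1 before normalizing.
    matrix = [
        [weights[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
        for i in range(size)
    ]
    matrix = normalize_columns(matrix)
    for _ in range(iterations):
        matrix = inflate(expand(matrix), inflation)

    # Simplified interpretation: nodes still joined by flow share a cluster.
    parent = list(range(size))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for i in range(size):
        for j in range(size):
            if matrix[i][j] > eps:
                parent[find(i)] = find(j)

    groups = {}
    for node in range(size):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())
```

<p style="margin-bottom:0" class="asd_par">
On a graph made of two triangles joined by a single weak edge, the flow across
the weak edge dies out and the sketch returns the two triangles as separate
clusters, even though the input graph is connected.
</p>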
<p style="margin-bottom:0" class="asd_par">
The first argument is the input file name,
or a single hyphen to read from stdin. The rationale for
making the name of the input file a fixed parameter is that you typically do
several runs with different parameters. In command line mode it is
pleasant if you do not have to skip over an immutable parameter all the
time.</p>
<p style="margin-bottom:0" class="asd_par">
The <a class="intern" href="#opt-I"><b>-I</b>&nbsp;<i>f</i> option</a> is the main control,
affecting cluster granularity.
In finding good <b>mcl</b> parameter settings for a particular domain,
or in finding cluster structure at different levels of granularity,
one typically runs mcl multiple times for varying values of f (refer
to the <a class="intern" href="#opt-I"><b>-I</b>&nbsp;<i>inflation</i></a> option for further information).</p>
<p style="margin-bottom:0" class="asd_par"><b>NOTE</b> <span class="smallcaps">MCL</span> interprets the matrix
entries or graph edge weights as <b>similarities</b>, and it likes
<b>undirected input graphs</b> best. It can handle directed graphs, but any
node pair (i,j) for which w(i,j) is much smaller than w(j,i) or vice versa
will presumably have a slightly negative effect on the clusterings output by
mcl. Many such node pairs will have a distinctly negative effect, so try to
make your input graphs undirected. How your edge weights are computed may
affect mcl's performance. In protein clustering, one way to go is to
choose the negated logarithm of the BLAST probabilities (see
<a class="intern" href="#references">REFERENCES</a>).</p>
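<p style="margin-bottom:0" class="asd_par">
As an illustration of the two recommendations above, the sketch below turns
probabilities into similarities by taking the negated base-10 logarithm and
then makes the result undirected. The cap of 200 for very small or zero
probabilities and the rule of keeping the larger of the two arc weights are
assumptions made for this example, not documented mcl behaviour. The function
names are hypothetical.
</p>

```python
import math


def neg_log10(probability, cap=200.0):
    """Map a probability to a similarity; smaller probability, larger weight."""
    if probability <= 0.0:
        return cap  # assumption: treat zero as the strongest possible hit
    return min(-math.log10(probability), cap)


def symmetrize_max(arcs):
    """Merge arcs (a, b, w) and (b, a, w') into one undirected edge.

    Assumption: the larger of the two weights is kept.
    """
    best = {}
    for a, b, w in arcs:
        key = tuple(sorted((a, b)))
        best[key] = max(w, best.get(key, w))
    return sorted((a, b, w) for (a, b), w in best.items())


if __name__ == "__main__":
    hits = [("p1", "p2", 1e-50), ("p2", "p1", 1e-30), ("p2", "p3", 1e-10)]
    weighted = [(a, b, neg_log10(p)) for a, b, p in hits]
    print(symmetrize_max(weighted))
```

<p style="margin-bottom:0" class="asd_par">
Within mcl itself the logarithm step is available for label input through
<a class="intern" href="#opt--abc-neg-log10"><b>--abc-neg-log10</b></a>.
</p>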
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b>'s default parameters should make it quite fast under almost all
circumstances. Taking default parameters, mcl has been used to generate
good protein clusters on 133k proteins, taking 10 minutes running time on a
Compaq ES40 system with four alpha EV6.7 processors. It has been applied
(with good results) to graphs with two million nodes, and if you have the memory
(and preferably CPUs as well) nothing should stop you from going further.</p>
<p style="margin-bottom:0" class="asd_par">
For large graphs, there are several groups of parameters available for
tuning the mcl computing process, should it be necessary. The easiest thing
to do is just vary the <a class="intern" href="#opt-scheme"><b>-scheme</b> option</a>. This
triggers different settings for the group of pruning parameters
<a class="intern" href="#opt-P"><b>-p/-P</b>, <b>-R</b>, <b>-S</b>, and
<b>-pct</b></a>. The default setting corresponds with
<b>-scheme</b>&nbsp;<b>6</b>.
When doing multiple mcl runs for the same graphs with different
<b>-I</b> settings (for obtaining clusterings at different levels
of granularity), it can be useful to factor out the first bit
of computation that is common to all runs, by using
the <a class="intern" href="#opt-write-expanded"><b>-write-expanded</b></a> option one time
and then using <a class="intern" href="#opt-if"><b>-if</b>&nbsp;<i>inflation</i></a> for each run in the set.
Whether mcl considers a graph large depends mainly on the graph
connectivity; a highly connected graph on 50,000 nodes is large to
mcl (so that you might want to tune resources) whereas a sparsely
connected graph on 500,000 nodes may be business as usual.</p>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> is a memory munger. Its precise appetite depends on the resource
settings. You can get a rough (and usually much too pessimistic) upper
bound for the amount of RAM that is needed by using the
<a class="intern" href="#opt-how-much-ram"><b>-how-much-ram</b> option</a>. The corresponding
entry in this manual page contains the simple formula via which the upper
bound is computed.</p>
<p style="margin-bottom:0" class="asd_par">
Other options of interest are the option to specify threads
<a class="intern" href="#opt-te"><b>-te</b></a>, and the verbosity-related options
<a class="intern" href="#opt-v"><b>-v</b> and <b>-V</b></a>.
The actual settings are shown with <b>-z</b>, and for graphs with
at most 12 nodes or so you can view the MCL matrix iterands on screen
by supplying <a class="intern" href="#opt--show"><b>--show</b></a> (this may give you more
of a feel for the process).</p>
<p style="margin-bottom:0" class="asd_par">
MCL iterands allow a generic interpretation as clusterings as well. The
clusterings associated with early iterands may contain a fair amount of
overlap. Refer to the <a class="intern" href="#opt-dump"><b>-dump</b> option</a>, the <a class="local sibling" href="mclfaq.html">mclfaq</a>
manual, and the <b>clm imac</b> utility (Interpret Matrices As Clusterings).
Use <b>clm imac</b> only if you have a special reason; the normal usage of <b>mcl</b> is
to do multiple runs for varying <b>-I</b> parameters and use the
clusterings output by mcl itself.</p>
<p style="margin-bottom:0" class="asd_par">
Under very rare circumstances, <b>mcl</b> might get stuck in a seemingly infinite
loop. If the number of iterations exceeds a hundred and the <i>chaos</i>
indicator remains nearly constant (presumably around value 0.37), you can
force mcl to stop by sending it the ALRM signal (usually done
by <b>kill -s ALRM</b> <i>pid</i>). It will finish the current
iteration and interpret the last iterand as a clustering. Alternatively, you
can wait: mcl may still converge by itself, and in any case it stops after
10,000 iterations. The
most probable explanation for such an infinite loop is that the input graph
contains the flip-flop graph of node size three as a subgraph.</p>
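<p style="margin-bottom:0" class="asd_par">
For example, assuming the stuck process has process id 12345 (a made-up value;
look up the real one with a tool such as <b>ps</b>), the signal could be sent
as follows.</p>
<div class="verbatim">kill -s ALRM 12345</div>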
<p style="margin-bottom:0" class="asd_par">
The creator of this page feels that manual pages are a valuable resource,
that online html documentation is also a good thing to have, and
that info pages are way <i>way</i> ahead of their time. The
<a class="intern" href="#notes">NOTES</a> section explains how this page was created.</p>
<p style="margin-bottom:0" class="asd_par">
In the <a class="intern" href="#options">OPTIONS</a> section options are listed in order of
importance, with related options grouped together.</p>

<a name="options"></a>
<h2>OPTIONS</h2>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-I"></a><b>-I</b> &lt;num&gt; (<i>inflation</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Sets the main inflation value to <i>&lt;num&gt;</i>. This value is the main handle
for affecting cluster granularity. It is usually chosen somewhere
in the range [1.2-5.0]. <b>-I</b>&nbsp;<b>5.0</b> will tend to result
in fine-grained clusterings, and <b>-I</b>&nbsp;<b>1.2</b> will tend to
result in very coarse-grained clusterings. Your mileage will vary
depending on the characteristics of your data. That is why it is
a good idea to test the quality and coherency of your clusterings
using <b>clm dist</b> and <b>clm info</b>. This will most likely reveal that
certain values of <b>-I</b> are simply not right for your data. The
<b>clm dist</b> section contains a discussion of how to use the cluster
validation tools shipped with <b>mcl</b> (see the <a class="intern" href="#seealso">SEE ALSO</a> section).</p>
<p style="margin-bottom:0" class="asd_par">
With low values for <b>-I</b>, like <b>-I</b>&nbsp;<b>1.2</b>, you should be
prepared to use more resources in order to maintain quality of
clusterings, i.e. increase the argument to the
<a class="intern" href="#opt-scheme"><b>-scheme</b> option</a>.</p>
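<p style="margin-bottom:0" class="asd_par">
A possible workflow, sketched under assumptions: cluster the same graph at two
inflation values, then compare the results. The file names are hypothetical,
and the argument order shown for <b>clm info</b> (graph first, then clusterings)
and <b>clm dist</b> (clusterings only) should be checked against their own
manual pages.</p>
<div class="verbatim">mcl data.mci -I 1.4 -o clus.I14
mcl data.mci -I 4.0 -o clus.I40
clm info data.mci clus.I14 clus.I40
clm dist clus.I14 clus.I40</div>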
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-o"></a><b>-o</b> &lt;fname&gt; (<i>output file name</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-odir"></a><b>-odir</b> &lt;dname&gt; (<i>output directory name</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--d"></a><b>--d</b> (<i>use input directory for output</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The default mode of output creation for <b>mcl</b> is to create a file name that
uses the input file name stripped of any leading path components, augmented
with a prefix '<tt>out.</tt>' and a suffix encoding pivotal <b>mcl</b> parameters.
This will usually be the inflation value which is the argument to the <b>-I</b>
option. By default the output file is written in the current directory.
For example, if the input is named <tt>data/small.mci</tt> and
inflation is set to three, the output file will be named
<tt>out.small.mci.I30</tt>.
</p>
<p style="margin-bottom:0" class="asd_par">
This behaviour can be overridden in various ways. The <b>-o</b> option simply
specifies the output file name. The name may include path components, provided
the directories exist. It is possible to send the clustering to STDOUT by supplying
<b>-o</b>&nbsp;<b>-</b>. With the <b>-odir</b>&nbsp;<i>&lt;dname&gt;</i> option <b>mcl</b> constructs the
output file name as before, but writes the file in the directory <i>&lt;dname&gt;</i>.
Finally, the option <b>--d</b> is similar but more specific in that <b>mcl</b>
will write the output in the directory specified by the path component of
the input file, that is, the directory in which the input file resides.
</p>
<p style="margin-bottom:0" class="asd_par">
If any one of
<a class="intern" href="#opt--abc"><b>--abc</b></a>, <a class="intern" href="#opt--sif"><b>--sif</b></a>, <a class="intern" href="#opt--etc"><b>--etc</b></a>, or
<a class="intern" href="#opt-use-tab"><b>-use-tab</b>&nbsp;<i>tab-file</i></a> is used, the output will be in label format.
Otherwise the clustering is output in the mcl matrix format; see the <a class="local sibling" href="mcxio.html">mcxio</a>
section for more information on this. Refer also to the group of options
discussed at <a class="intern" href="#opt--abc"><b>--abc</b></a>.
</p>
<p style="margin-bottom:0" class="asd_par">
Look at the <a class="intern" href="#opt-ap"><b>-ap</b>&nbsp;<i>prefix</i></a> option and its siblings for the
automatic naming constructions employed by <b>mcl</b> if the <b>-o</b> option is
not used.</p>
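<p style="margin-bottom:0" class="asd_par">
The invocations below sketch the output modes described above for the input
<tt>data/small.mci</tt>. The names <tt>small.clusters</tt> and <tt>results</tt>
are hypothetical, and the directory <tt>results</tt> is assumed to exist.</p>
<div class="verbatim"># default: writes out.small.mci.I30 in the current directory
mcl data/small.mci -I 3
# explicit output file name
mcl data/small.mci -I 3 -o small.clusters
# clustering sent to STDOUT
mcl data/small.mci -I 3 -o -
# automatic name, written in the directory results
mcl data/small.mci -I 3 -odir results
# automatic name, written in data, next to the input file
mcl data/small.mci -I 3 --d</div>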
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-c"></a><b>-c</b> &lt;num&gt; (<i>reweight loops</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--sum-loops"></a><b>--sum-loops</b> (<i>set loops to sum of other arcs weights</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
With the <b>-c</b>&nbsp;<i>&lt;num&gt;</i> option, as the final step of loop computation
(i.e. after initialization and shadowing) all loop weights are multiplied by
<b>&lt;num&gt;</b>, if supplied.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--discard-loops"></a><b>--discard-loops</b>=&lt;y/n&gt; (<i>discard loops in input</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
By default <b>mcl</b> will remove any loops that are present in the input. Use
<b>--discard-loops</b>=<b>n</b> to turn this off. Bear in mind that loops will
still be modified in all cases where the loop weight is not maximal among
the list of edge weights for a given node.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--abc"></a><b>--abc</b> (<i>expect/write labels</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--sif"></a><b>--sif</b> (<i>expect/write labels</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--etc"></a><b>--etc</b> (<i>expect/write labels</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--expect-values"></a><b>--expect-values</b> (<i>expect label:weight format</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-use-tab"></a><b>-use-tab</b> &lt;fname&gt; (<i>use mapping to write</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
These items all relate to label input and/or label output.
<b>--abc</b> tells <b>mcl</b> to expect label input and output clusters
in terms of those labels. This simple format expects two or
three fields separated by white space on each line.
The first and second fields are interpreted as labels specifying
source and destination node respectively. The third field, if present,
specifies the weight of the arc connecting the two nodes.
</p>
<p style="margin-bottom:0" class="asd_par">
The option <b>--sif</b> tells <b>mcl</b> to expect <span class="smallcaps">SIF</span> (Simple Interaction File)
format. This format is line based. The first two fields
specify the source node (as a label) and the relationship type. An arbitrary number
of fields may follow, each containing a label identifying a destination node.
The second field is simply ignored by <b>mcl</b>.
As an extension to the SIF format
weights may optionally follow the labels, separated from them with a colon character.
It is in this case necessary to use the <b>--expect-values</b> option.
The <b>--etc</b> option expects a format identical in all respects except
that the relationship type is not present, so that all fields after the first
are interpreted as destination labels.
</p>
<p style="margin-bottom:0" class="asd_par">
<b>-use-tab</b> is only useful when matrix input is used.
It will use the tab file to convert the output to labels; it does
not fail on indices missing from the tab file, but will bind
these to generated dummy labels.
</p>
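<p style="margin-bottom:0" class="asd_par">
As an illustration, the same two weighted arcs could be written in the three
label formats as shown below. The left column names the file and the rest of
each line is file content. The node labels, the relationship type <tt>pp</tt>,
and the file names are made up for this sketch.</p>
<div class="verbatim">data.abc    cat dog 0.8
            cat mouse 0.2

data.sif    cat pp dog:0.8 mouse:0.2

data.etc    cat dog:0.8 mouse:0.2</div>
<p style="margin-top:0em; margin-bottom:0em">
Under these assumptions the files would be clustered with the following
commands, where <b>--expect-values</b> is needed because of the weights
that follow the colons.</p>
<div class="verbatim">mcl data.abc --abc -I 2.0
mcl data.sif --sif --expect-values -I 2.0
mcl data.etc --etc --expect-values -I 2.0</div>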
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-tf"></a><b>-tf</b> &lt;tf-spec&gt; (<i>transform input matrix values</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-abc-tf"></a><b>-abc-tf</b> &lt;tf-spec&gt; (<i>transform input stream values</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--abc-neg-log10"></a><b>--abc-neg-log10</b> (<i>take log10 of stream values, negate sign</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--abc-neg-log"></a><b>--abc-neg-log</b> (<i>take log of stream values, negate sign</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
<b>-tf</b> transforms the values of the input matrix according
to <b>&lt;tf-spec&gt;</b>. <b>-abc-tf</b> transforms the stream values
(when <a class="intern" href="#opt--abc"><b>--abc</b></a> is used) according to <b>&lt;tf-spec&gt;</b>.
<b>--abc-neg-log</b> and <b>--abc-neg-log10</b>
imply that the stream input values are
replaced by the negation of their log or log10 values, respectively.
The reason for their existence is documented in <a class="local sibling" href="mcxio.html">mcxio</a>.
For a description of the transform language accepted
in <b>&lt;tf-spec&gt;</b>, refer to the same manual page.</p>
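<p style="margin-bottom:0" class="asd_par">
A minimal sketch, assuming a label file <tt>blast.abc</tt> (a hypothetical name)
whose third field holds BLAST probabilities: with <b>--abc-neg-log10</b> a
value such as 1e-50 is replaced by 50, so that stronger hits receive larger
similarities.</p>
<div class="verbatim">mcl blast.abc --abc --abc-neg-log10 -I 2.0</div>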
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-icl"></a><b>-icl</b> &lt;fname&gt; (<i>create subgraph on clustering</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-bottom:0" class="asd_par">
With this option <b>mcl</b> will subcluster the provided clustering.
It does so by removing, first of all, all edges from the input
graph that connect different clusters.
The resulting graph consists of different
components, at least as many as there are clusters in
the input clustering. This graph is then subjected to transformations,
if any are specified, and then clustered.
The output name is constructed by appending the normal mcl-created
file name suffix to the name of the input clustering.
</p>
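<p style="margin-bottom:0" class="asd_par">
For instance, a coarse clustering could first be computed and then refined
within its clusters, as sketched below. The names <tt>data.mci</tt> and
<tt>clus.coarse</tt> are assumptions; the name of the second output file is
derived from <tt>clus.coarse</tt> as described above.</p>
<div class="verbatim">mcl data.mci -I 1.4 -o clus.coarse
mcl data.mci -icl clus.coarse -I 4.0</div>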
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-write-graph"></a><b>-write-graph</b> &lt;fname&gt; (<i>write graph</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-write-graphx"></a><b>-write-graphx</b> &lt;fname&gt; (<i>write transformed graph</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-write-expanded"></a><b>-write-expanded</b> &lt;fname&gt; (<i>write expanded graph</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--write-limit"></a><b>--write-limit</b> (<i>write mcl process limit</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The first two options are somewhat outdated, in that the preferred way of
loading networks is by using <a class="local sibling" href="mcxload.html">mcxload</a>. The option
<b>-write-expanded</b> can be useful for exploring more complicated input
transformations that incorporate an expansion step, but is not really
relevant for production use. The last option is mainly educational and for
analyzing the <b>mcl</b> process itself.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-scheme"></a><b>-scheme</b> &lt;num&gt; (<i>use a preset resource scheme</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-resource"></a><b>-resource</b> &lt;num&gt; (<i>allow n neighbours throughout</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
There are currently seven different resource schemes, indexed 1..7.
High schemes result in more expensive computations that may possibly be
more accurate. The default scheme is 4. When <b>mcl</b> is done, it will give a
grade (the so called <i>jury synopsis</i>) to the appropriateness of the
scheme used. <i>A low grade does not necessarily imply that the
resulting clustering is bad</i> - but anyway, a low grade should be reason
to try for a higher scheme.
</p>
<p style="margin-bottom:0" class="asd_par">
Use the <b>-resource</b>&nbsp;<i>&lt;num&gt;</i> option to cap the number of
neighbours tracked for each node during computation at <i>&lt;num&gt;</i>.
</p>
<p style="margin-bottom:0" class="asd_par">
The <a class="intern" href="#pruneoptions">PRUNING OPTIONS</a> section contains an elaborate description
of the way <b>mcl</b> manages resources, should you be interested.
In case you are worried about the validation of the resulting
clusterings, the <a class="local sibling" href="mclfaq.html">mclfaq</a> section
has several entries discussing this issue. The bottom line is
that you have to compare the clusterings resulting from different
schemes (and otherwise identical parameters) using utilities
such as <b>clm dist</b>, <b>clm info</b> on the one hand, and your
own sound judgment on the other hand.
</p>
<p style="margin-bottom:0" class="asd_par">
If your input graph is extremely dense, with an average node degree
(i.e. the number of neighbours per node) that is somewhere above
500, you may need to filter the input graph by removing edges,
for example by using one of <b>-tf</b>&nbsp;<b>'#ceilnb()'</b>
or <b>-tf</b>&nbsp;<b>'#knn()'</b>.
</p>
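<p style="margin-bottom:0" class="asd_par">
The commands below sketch how these options might be combined. The file names
are hypothetical, and the argument <tt>200</tt> passed to <tt>#knn()</tt> is an
assumed value; refer to <a class="local sibling" href="mcxio.html">mcxio</a> for
the transform syntax.</p>
<div class="verbatim"># list the settings behind each preset scheme
mcl --show-schemes
# low inflation, so allow more resources
mcl data.mci -I 1.2 -scheme 7
# very dense input: reduce the graph before clustering
mcl dense.mci -I 2.0 -tf '#knn(200)'</div>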
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--show-schemes"></a><b>--show-schemes</b> (<i>show preset resource schemes</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Shows the explicit settings to which the different preset schemes
correspond.</p>
<p style="margin-bottom:0" class="asd_par">
The characteristics are written in the same format (more or less) as
the output triggered by <a class="intern" href="#opt-v-pruning"><b>-v</b>&nbsp;<b>pruning</b></a>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-V"></a><b>-V</b> &lt;str&gt; (<i>verbosity type off</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
See the <b>-v</b> option below.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-v"></a><b>-v</b> &lt;str&gt; (<i>verbosity type on</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
These are the different verbosity modes:</p>
<p style="margin-bottom:0" class="asd_par">
<b>pruning</b><br>
<b>explain</b><br>
<b>cls</b><br>
<b>all</b></p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-q"></a><b>-q</b> &lt;spec&gt; (<i>log levels</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-bottom:0" class="asd_par">
To make mcl as quiet as can be, add
<b>-q</b>&nbsp;<b>x</b> <b>-V</b>&nbsp;<b>all</b> to the command line.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>-q</b> option governs a general logging mechanism.
The format accepted is described in the <a class="local sibling" href="tingea.log.html">tingea.log</a> manual page.</p>
<p style="margin-bottom:0" class="asd_par">
The other options govern verbosity levels specific to mcl. <b>-v</b>&nbsp;<b>all</b>
turns them all on, <b>-V</b>&nbsp;<b>all</b> turns them all off. <b>-v</b>&nbsp;<i>str</i> and
<b>-V</b>&nbsp;<i>str</i> turn on/off the single mode <i>str</i> (for <i>str</i>
equal to one of <b>pruning</b>, <b>cls</b>, or <b>explain</b>). Each verbosity
mode is given its own entry below.</p>
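<p style="margin-bottom:0" class="asd_par">
Two example invocations, with <tt>data.mci</tt> as an assumed input name: the
first makes mcl as quiet as possible, the second turns on the pruning output
together with its explanatory headers.</p>
<div class="verbatim">mcl data.mci -I 2.0 -q x -V all
mcl data.mci -I 2.0 -v pruning -v explain</div>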
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><b>-v</b>&nbsp;<b>explain</b></div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This mode causes the output of explanatory headers illuminating the
output generated with the <b>pruning</b> verbosity mode.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><b>-v</b>&nbsp;<b>pruning</b></div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This mode causes output of resource-related quantities. It has
<a class="intern" href="#opt-v-pruning">a separate entry in the PRUNING OPTIONS section</a>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><b>-v</b>&nbsp;<b>cls</b></div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This mode (on by default) prints a terse list of characteristics of the
clusterings associated with intermediate iterands. The characteristics are
<b>E/V</b>, <b>cls</b>, <b>olap</b>, and <b>dd</b>. They respectively stand for the
number of outgoing arcs per node (as an average), the number of clusters in
the overlapping clustering associated with the iterand, the number of nodes
in overlap, and the <i>dag depth</i> of the DAG (directed acyclic
graph) associated with the iterand. For more information on this DAG refer
to the <a class="intern" href="#opt-dump"><b>-dump</b></a> option description in this manual and also
<a class="local sibling" href="mclfaq.html">mclfaq</a>.</p>
<p style="margin-bottom:0"><b>Standard log information</b><br></p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_leftalign nowrap " >m-ie</div></div>
<div class=" item_text " style="margin-left:5em">
This gives the ratio of (1) the number of edges after initial expansion, before pruning, to
(2) the number of edges of the current iterand.
</div>
<div class=" item_compact"><div class=" item_leftalign nowrap " >m-ex</div></div>
<div class=" item_text " style="margin-left:5em">
This gives the ratio of (1) the number of edges after expansion (including pruning), to
(2) the number of edges of the current iterand.
</div>
<div class=" item_compact"><div class=" item_leftalign nowrap " >i-ex</div></div>
<div class=" item_text " style="margin-left:5em">
This gives the ratio of (1) the number of edges after expansion (including pruning), to
(2) the number of edges of the original input graph.
</div>
<div class=" item_compact"><div class=" item_leftalign nowrap " >fmv</div></div>
<div class=" item_text " style="margin-left:5em">
This gives the percentage of nodes (matrix columns) for which full matrix/vector
computation was used (as opposed to using a sparse technique).
</div>
</div>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-aa"></a><b>-aa</b> &lt;str&gt; (<i>append &lt;str&gt; to suffix</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
See the <b>-ap</b> option below.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-ap"></a><b>-ap</b> &lt;str&gt; (<i>use &lt;str&gt; as file name prefix</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
If the <a class="intern" href="#opt-o"><b>-o</b>&nbsp;<i>fname</i></a> option is not used,
<b>mcl</b> will create a file name (for writing output to) that
should uniquely characterize the important parameters used in the
current invocation of mcl. The default format is <b>out.fname.suf</b>,
where <b>out</b> is simply the literal string <tt>out</tt>, <b>fname</b> is the
first argument containing the name of the file (with the graph) to be
clustered, and where <b>suf</b> is the suffix encoding a set of parameters
(described further below).</p>
<p style="margin-bottom:0" class="asd_par">
The <b>-ap</b>&nbsp;<i>str</i> option specifies a prefix to use
rather than <b>out.fname</b> as sketched above.
However, <b>mcl</b> will interpret the character '=', if present
in <i>str</i>, as a placeholder for the input file name.</p>
<p style="margin-bottom:0" class="asd_par">
If the <b>-aa</b>&nbsp;<i>str</i> option is used, <b>mcl</b> will append
<b>str</b> to the suffix <b>suf</b> created by itself.
You can use this if you need to encode some extra information in the
file name suffix.</p>
<p style="margin-bottom:0" class="asd_par">
The suffix is constructed as follows. The <b>-I</b>&nbsp;<i>f</i>
and <b>-scheme</b> parameters are always encoded.
Other options, such as <b>-pi</b>&nbsp;<i>f</i> and <b>-knn</b>
are only encoded if they are used. Any real argument <i>f</i>
is encoded using <i>exactly one</i> trailing digit behind the decimal
separator (which itself is not written). The setting <b>-I</b>&nbsp;<b>3.14</b>
is thus encoded as I31. The <b>-scheme</b> option is encoded using the
letter 's', all other options mentioned here are encoded as themselves
(stripped of the hyphen). For example</p>
<div class="verbatim">mcl small.mci -I 3 -c 2.5 -pi 0.8 -scheme 5</div>
<p style="margin-top:0em; margin-bottom:0em">
results in the file name <tt>out.small.mci.I30s5c25pi08</tt>.
If you want to know beforehand what file name will be produced,
use the <b>-az</b> option.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-az"></a><b>-az</b> (<i>show output file name and exit</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-ax"></a><b>-ax</b> (<i>show output suffix and exit</i>)</div>
<div class=" item_text " style="margin-left:2em">
If <b>mcl</b> automatically constructs a file name, it can be helpful to know
beforehand what that file name will be. Use <b>-az</b> and mcl will
write the file name to STDOUT and exit. This can be used if mcl is
integrated into other software for which the automatic creation of
unique file names is convenient.
<p style="margin-bottom:0" class="asd_par">
By default mcl incorporates the input file name into the output file
name and appends a short suffix describing the most important
option settings. Use <b>-ax</b> to find out what that suffix is.
This can be useful in wrapper pipeline scripts such as clxcoarse.</p>
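<p style="margin-bottom:0" class="asd_par">
A wrapper script might use <b>-az</b> along the following lines. This is a
sketch that assumes <b>-az</b> prints exactly the name mcl uses when it is run
again with otherwise identical options; <tt>small.mci</tt> is a hypothetical
input.</p>
<div class="verbatim"># ask mcl which file name it would use, then run it and inspect the result
out=$(mcl small.mci -I 3 -az)
mcl small.mci -I 3
ls -l "$out"</div>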
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-annot"></a><b>-annot</b> &lt;str&gt; (<i>dummy annotation option</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
<b>mcl</b> writes the command line with which it was invoked to the output
clustering file. Use this option to include any additional
information. MCL does nothing with this option except copying
it as just described.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-te"></a><b>-te</b> &lt;int&gt; (<i>#expansion threads</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Threading is useful if you have a multi-processor system. <b>mcl</b> will
spawn <i>&lt;int&gt;</i> threads of computation. If these are computed
in parallel (this depends on the number of CPUs available to the
mcl process) it will speed up the process accordingly.</p>
<p style="margin-bottom:0" class="asd_par">
When threading, it is best not to turn on pruning verbosity
mode if you are letting mcl run unattended, unless you want to
scrutinize its output later. This is because it makes <b>mcl</b> run
somewhat slower, although the difference is not dramatic.</p>
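<p style="margin-bottom:0" class="asd_par">
For example, on a machine assumed to have at least four CPUs available to
mcl, four expansion threads could be requested like this (<tt>data.mci</tt> is
a hypothetical input).</p>
<div class="verbatim">mcl data.mci -I 2.0 -te 4</div>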
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-pi"></a><b>-pi</b> &lt;num&gt; (<i>pre-inflation</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-ph"></a><b>-ph</b> &lt;num&gt; (<i>pre-inflation, max-bound</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
If used, <b>mcl</b> will apply inflation one time to the input graph
before entering the main process. This can be useful for
making the edge weights in a graph either more homogeneous (which
may result in less granular clusterings) or more heterogeneous
(which may result in more granular clusterings).
Homogeneity is achieved for values <i>&lt;num&gt;</i> less than one,
heterogeneity for values larger than one.
Values to try are normally in the range <tt>[2.0,10.0]</tt>.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>-ph</b> option is special in that it does not rescale
columns to be stochastic. Instead, it rescales columns so that
the maximum value found in the column stays the same after
inflation was applied. There is little significance to this,
and what little there is is undocumented.
</p>
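<p style="margin-bottom:0" class="asd_par">
The effect can be illustrated on a single made-up column, assuming that
inflation raises every entry to the power <i>&lt;num&gt;</i> and then rescales
the column. With <b>-pi</b> the rescaled column sums to one; with <b>-ph</b>
the largest entry keeps its original value. Results are rounded to three
decimals.</p>
<div class="verbatim">input column         0.500   0.300   0.200

-pi 2.0   powers     0.250   0.090   0.040   (sum 0.380)
          result     0.658   0.237   0.105   (more heterogeneous)

-pi 0.5   powers     0.707   0.548   0.447   (sum 1.702)
          result     0.415   0.322   0.263   (more homogeneous)

-ph 2.0   powers     0.250   0.090   0.040
          result     0.500   0.180   0.080   (maximum unchanged)</div>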
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-if"></a><b>-if</b> &lt;num&gt; (<i>start-inflation</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
If used, <b>mcl</b> will apply inflation one time to the input graph
before entering the main process. The difference with
<b>-pi</b> is that with the latter option mcl may apply
certain transformations after reading in the matrix such
as adding or modifying loops. The purpose of
the <b>-if</b> (mnemonic for <i>inflation-first</i>)
option is to use it on graphs saved
with the <a class="intern" href="#opt-write-expanded"><b>-write-expanded</b></a> option and convey
to mcl that it should not apply those transformations.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-dump-interval"></a><b>-dump-interval</b> &lt;i:j&gt; (<i>dump interval</i>)</div><div class=" item_cascade item_leftalign nowrap" ><b>-dump-interval</b>&nbsp;<i>all</i></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Dump during iterations i..j-1. Use <i>all</i> to dump in all
iterations. See the <b>-dump</b>&nbsp;<i>str</i> option below.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-dump-modulo"></a><b>-dump-modulo</b> &lt;int&gt; (<i>dump i+0..i+&lt;int&gt;..</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Sampling rate: select only these iterations in the dump interval.
See the <b>-dump</b>&nbsp;<i>str</i> option below.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-dump-stem"></a><b>-dump-stem</b> &lt;stem&gt; (<i>file stem</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Set the stem for file names of dumped
objects (default <i>mcl</i>). See the <b>-dump</b>&nbsp;<i>str</i> option below.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-dump"></a><b>-dump</b> &lt;str&gt; (<i>type</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
<i>str</i> is checked for substring occurrences of the following entries.
Repeated use of <b>-dump</b> is also allowed.</p>
<p style="margin-bottom:0" class="asd_par">
<b>ite</b><br>
<b>dag</b><br>
<b>cls</b><br>
<b>chr</b><br>
<b>lines</b><br>
<b>cat</b></p>
<p style="margin-bottom:0" class="asd_par">
<b>lines</b> and <b>cat</b> change the mode of dumping. The first
changes the dump format to a line based pairwise format rather
than the default mcl matrix format. The second causes all
dumped items to be dumped to the default stream used for the
output clustering, which is appended at the end.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>ite</b> option writes <b>mcl</b> iterands to file. The <b>cls</b>
option writes clusterings associated with mcl iterands to file.
These clusters are obtained from a particular directed acyclic graph
(abbreviated as DAG) associated with each iterand. The <b>dag</b> option
writes that DAG to file. The DAG can optionally be further
pruned and then again be interpreted as a
clustering using <b>clm imac</b>, and <b>clm imac</b> can also
work with the matrices written using the <b>ite</b> option.
It should be noted that clusterings associated with intermediate
iterands may contain overlap, which is interesting in
many applications. For more information
refer to <a class="local sibling" href="mclfaq.html">mclfaq</a> and the <a class="intern" href="#references">REFERENCES</a> section below.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>result</b> option dumps the usual MCL clustering.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>chr</b> option causes mcl to output, for each iterand I, a matrix C with
characteristics of I. C has the same number of columns as I. For each
column k in C, row entry 0 is the diagonal or 'loop' value of column k in
I <i>after</i> expansion and pruning, and <i>before</i> inflation and
rescaling. Entry 1 is the loop value <i>after</i> inflation and rescaling.
Entry 2 is the center of column k (the sum of its entries squared)
computed <i>after</i> expansion and <i>before</i> pruning. Entry 3 is the
maximum value found in that column at that same stage. Entry 4 is the
amount of mass kept for that column <i>after pruning</i>.</p>
<p style="margin-bottom:0" class="asd_par">
The <b>-ds</b> option sets the stem for file names of dumped
objects (default <i>mcl</i>). The <b>-di</b> and <b>-dm</b>
options allow a selection of iterands to be made.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-digits"></a><b>-digits</b> &lt;int&gt; (<i>printing precision</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This has two completely different uses. It sets
the number of decimals used for pretty-printing <b>mcl</b> iterands
when using the <a class="intern" href="#opt--show"><b>--show</b> option</a> (see below),
and it sets the number of decimals used for writing
the expanded matrix when using the <a class="intern" href="#opt-write-expanded"><b>-write-expanded</b></a> option.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--show"></a><b>--show</b> (<i>print matrices to screen</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Print matrices to screen. The number of significant digits to be
printed can be tuned with <b>-digits</b>&nbsp;<i>n</i>. An 80-column screen
allows graphs (matrices) of size up to 12(x12) to be printed with
three digits of precision (after the decimal point), and of size up to 14(x14)
with two digits. This can give you an idea of how <b>mcl</b> operates,
and what the effect of pruning is.
Use e.g. <b>-S</b>&nbsp;<b>6</b> for such
a small graph and view the MCL matrix iterands with <b>--show</b>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--write-binary"></a><b>--write-binary</b> (<i>output format</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Write matrix dump output in binary mcl format rather
than interchange mcl format (the default). Note that <b>mcxconvert</b>
can be used to convert each one into the other.
See <a class="local sibling" href="mcxio.html">mcxio</a> and <a class="local sibling" href="mcx.html">mcx</a> for more information.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-sort"></a><b>-sort</b> &lt;str&gt; (<i>sort mode</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
<i>str</i> can be one of <b>lex</b>, <b>size</b>, <b>revsize</b>,
or <b>none</b>. The default is 'revsize', in which the largest
clusters come first. If the mode is 'size', smallest clusters
come first; if the mode is 'lex', clusters are ordered
lexicographically, and if the mode is 'none', the order
is the same as produced by the procedure used by mcl to
map matrices onto clusterings.</p>
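<p style="margin-bottom:0" class="asd_par">
The four modes can be pictured with a small Python sketch. This is an
illustration only, not mcl's code: the function name <tt>sort_clusters</tt> is
hypothetical, clusters are modelled as lists of labels, and the comparison used
for 'lex' (comparing sorted member lists) is an assumption about what
lexicographic ordering means here.</p>

```python
def sort_clusters(clusters, mode="revsize"):
    """Order clusters (lists of node labels) according to a sort mode.

    Hypothetical helper for illustration; not part of mcl.
    """
    if mode == "none":
        # Keep the order in which the clusters were produced.
        return list(clusters)
    if mode == "revsize":
        # Largest clusters first (the documented default).
        return sorted(clusters, key=len, reverse=True)
    if mode == "size":
        # Smallest clusters first.
        return sorted(clusters, key=len)
    if mode == "lex":
        # Assumption: compare clusters by their sorted member lists.
        return sorted(clusters, key=lambda cluster: sorted(cluster))
    raise ValueError("unknown sort mode: " + mode)
```

<p style="margin-bottom:0" class="asd_par">
How mcl breaks ties between clusters of equal size is not specified above; the
sketch simply keeps their original relative order.</p>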
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-overlap"></a><b>-overlap</b> &lt;str&gt; (<i>overlap mode</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Mode <i>keep</i> causes mcl to retain overlap should this improbable event
occur. In theory, <b>mcl</b> may generate a clustering that contains overlap,
although this almost never happens in practice, as it requires some
particular type of symmetry to be present in the input graph (not just any
symmetry will do). Mathematically speaking, this is a conjecture and not a
theorem, but the present author will eat his shoe if it fails to be true (for
marzipan values of shoe). It is easy though to construct an input graph for
which certain mcl settings result in overlap - for example a line graph on
an odd number of nodes. The default is to excise overlapping parts and
introduce them as clusters in their own right. It is possible to allocate
nodes in overlap to the first cluster in which they occur (i.e. rather
arbitrarily), corresponding with mode <i>cut</i>.
</p>
<p style="margin-bottom:0" class="asd_par">
In mode <i>split</i> mcl will put all nodes in overlap into
separate clusters. These clusters are chosen such that
two nodes are put in the same new cluster if and only if
they always occur paired in the clusters of the
overlapping clustering.</p>
<p style="margin-bottom:0" class="asd_par">
This option has no effect on the clusterings that are
output when using <a class="intern" href="#opt-dump"><b>-dump</b>&nbsp;<i>cls</i></a> -
the default for those is that overlap is not touched,
and this default can not yet be overridden.</p>
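<p style="margin-bottom:0" class="asd_par">
The modes <i>keep</i>, <i>cut</i> and <i>split</i> can be sketched in Python as
follows. This is a reading of the descriptions above, not mcl's implementation:
the function name <tt>resolve_overlap</tt> is hypothetical, clusters are modelled
as lists of labels in which each node appears at most once per cluster, and the
position of the new clusters in the output of <i>split</i> is an arbitrary choice.</p>

```python
def resolve_overlap(clusters, mode):
    """Resolve overlap in a clustering given as a list of node lists.

    Hypothetical helper for illustration; not part of mcl.
    """
    if mode == "keep":
        # Overlap is retained as is.
        return [list(cluster) for cluster in clusters]

    # membership[node] lists the indices of all clusters containing node.
    membership = {}
    for index, cluster in enumerate(clusters):
        for node in cluster:
            membership.setdefault(node, []).append(index)

    if mode == "cut":
        # A node in overlap stays only in the first cluster in which it occurs.
        result = [
            [node for node in cluster if membership[node][0] == index]
            for index, cluster in enumerate(clusters)
        ]
        return [cluster for cluster in result if cluster]

    if mode == "split":
        # Nodes in overlap leave their clusters and are regrouped: two nodes
        # share a new cluster exactly when they occur in the same clusters.
        result = [
            [node for node in cluster if len(membership[node]) == 1]
            for cluster in clusters
        ]
        groups = {}
        for node, indices in membership.items():
            if len(indices) > 1:
                groups.setdefault(tuple(indices), []).append(node)
        return [cluster for cluster in result if cluster] + list(groups.values())

    raise ValueError("unknown overlap mode: " + mode)
```

<p style="margin-bottom:0" class="asd_par">
For the overlapping clustering <tt>{a,b,c,d}</tt>, <tt>{c,d,e}</tt> the sketch
leaves <tt>c</tt> and <tt>d</tt> in the first cluster under <i>cut</i>, and
moves them into a new cluster of their own under <i>split</i>.</p>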
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--force-connected"></a><b>--force-connected</b>=&lt;y/n&gt; (<i>analyze components</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--check-connected"></a><b>--check-connected</b>=&lt;y/n&gt; (<i>analyze components</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
If the input graph has strong bipartite characteristics,
mcl may yield clusters that do not correspond to connected
components in the input graph. Turn one of these modes on to
analyze the resultant clustering.</p>
<p style="margin-bottom:0" class="asd_par">
If loose clusters are found
they will be split into subclusters corresponding to
connected components.
With <b>--force-connected</b>=<i>y</i> mcl will write the
corrected clustering to the normal output file, and the old clustering
to the same file with suffix <tt>orig</tt>.
With <b>--check-connected</b>=<i>y</i> mcl will write the
loose clustering to the normal output file, and the corrected clustering
to the same file with suffix <tt>coco</tt>.</p>
<p style="margin-bottom:0" class="asd_par">
These options are not on by default, as the analysis
is currently (overly) time-consuming
and mcl's behaviour actually makes some sense
(when taking bipartite characteristics into account).</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--analyze"></a><b>--analyze</b>=&lt;y/n&gt; (<i>performance criteria</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
With this mode turned on, <b>mcl</b> will reread the input matrix
and compute a few performance criteria and attach them to
the output file. Off by default.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--show-log"></a><b>--show-log</b>=&lt;y/n&gt; (<i>show log</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Shows the log with process characteristics on STDOUT.
By default, this mode is off.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--jury-charter"></a><b>--jury-charter</b> (<i>explains jury</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Explains how the jury synopsis is computed from the jury marks.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--version"></a><b>--version</b> (<i>show version</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Show version.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-how-much-ram"></a><b>-how-much-ram</b> &lt;int&gt; (<i>RAM upper bound</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
<b>&lt;int&gt;</b> is interpreted as the number of nodes of an input graph.
mcl will print the maximum amount of RAM it needs for its computations.
The formula for this number in bytes is:</p>
<div class="verbatim">   2 * c * k * &lt;int&gt;

   2  :  two matrices are concurrently held in memory.
   c  :  mcl cell size (as shown by -z).
 &lt;int&gt;:  graph cardinality (number of nodes).
   k  :  MAX(s, r).
   s  :  select number (-S, -scheme options).
   r  :  recover number (-R, -scheme options).</div>
<p style="margin-top:0em; margin-bottom:0em">
This estimate will usually be too pessimistic. It does assume though that
the average node degree of the input graph does not exceed k. The
<b>-how-much-ram</b> option takes other command-line arguments into
account (such as <b>-S</b> and <b>-R</b>), and it expresses the
amount of RAM in megabyte units.</p>
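<p style="margin-top:0em; margin-bottom:0em">
As a worked illustration of the formula, the following Python sketch evaluates
it directly. The function name <tt>ram_upper_bound_mb</tt> is hypothetical, the
cell size of 8 bytes is an assumption (use the value reported by <b>-z</b>), and
one megabyte is taken to be 1024*1024 bytes, which may differ from the unit mcl
uses.</p>

```python
def ram_upper_bound_mb(n_nodes, select=500, recover=600, cell_size=8):
    """Evaluate 2 * c * k * <int> and express the result in megabytes.

    Hypothetical helper for illustration; not part of mcl.
    cell_size=8 is an assumed value, check `mcl -z` for the real one.
    """
    k = max(select, recover)                 # k : MAX(s, r)
    n_bytes = 2 * cell_size * k * n_nodes    # two matrices held concurrently
    return n_bytes / (1024 * 1024)           # assumption: 1 MB = 2**20 bytes
```

<p style="margin-top:0em; margin-bottom:0em">
Under these assumptions a graph with 1048576 nodes, <b>-S</b>&nbsp;<b>500</b> and
<b>-R</b>&nbsp;<b>600</b> yields an upper bound of 9600 megabytes.</p>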
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-h"></a><b>-h</b> (<i>show help</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Shows a selection of the most important <b>mcl</b> options.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--help"></a><b>--help</b> (<i>show help</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Gives a one-line description for all options.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-z"></a><b>-z</b> (<i>show settings</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Show current settings for tunable parameters.
<b>--show-settings</b> is a synonym.</p>
</div>
</div>

<a name="pruneoptions"></a>
<h2>PRUNING OPTIONS</h2>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_cascade item_leftalign nowrap" ><a name="opt-p"></a><b>-p</b> &lt;num&gt; (<i>cutoff</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-P"></a><b>-P</b> &lt;int&gt; (<i>1/cutoff</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-S"></a><b>-S</b> &lt;int&gt; (<i>selection number</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-R"></a><b>-R</b> &lt;int&gt; (<i>recover number</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-pct"></a><b>-pct</b> &lt;pct&gt; (<i>recover percentage</i>)</div>
<div class=" item_text " style="margin-left:2em">
After computing a new (column stochastic) matrix vector during expansion
(which is matrix multiplication, in this case squaring), the vector is
successively exposed to different pruning strategies. The intent of
pruning is that many small entries are removed while retaining much of
the stochastic mass of the original vector. After pruning, vectors are
rescaled to be stochastic again. MCL iterands are theoretically known to
be sparse in a weighted sense, and this maneuver effectively perturbs the
MCL process a little in order to obtain matrices that are genuinely
sparse, thus keeping the computation tractable. An example of monitoring
pruning can be found in the discussion of
<a class="intern" href="#opt-v-pruning"><b>-v</b>&nbsp;<b>pruning</b></a>
at the end of this section.
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> proceeds as follows. First, entries that are smaller than
<i>cutoff</i> are removed, resulting in a vector with at most
<i>1/cutoff</i> entries. The cutoff can be supplied either by
<b>-p</b>, or as the inverse value by <b>-P</b>. The latter is more
intuitive, if your intuition is like mine (P stands for precision or pruning).
The cutoff just described is rigid; it is the same for all vectors. The
<a class="intern" href="#opt--adapt"><b>--adapt</b> option</a> causes the computation of a
cutoff that depends on a vector's homogeneity properties, and this option
may or may not speed up mcl.
</p>
<p style="margin-bottom:0" class="asd_par">
Second, if the remaining stochastic mass (i.e. the sum of all remaining
entries) is less than <i>&lt;pct&gt;</i>/100 and the number of remaining
entries is less than <i>&lt;r&gt;</i> (as specified by the <b>-R</b> flag),
<b>mcl</b> will try to regain ground by recovering the largest discarded
entries. The total number of entries is not allowed to grow larger than
<i>&lt;r&gt;</i>.
If recovery was not necessary, mcl tries to prune the vector further
down to at most <i>s</i> entries (if applicable), as specified by the
<b>-S</b> flag. If this results in a vector that satisfies the recovery
condition then recovery is attempted, exactly as described above. The
latter will not occur of course if <i>&lt;r&gt;</i> &lt;= <i>&lt;s&gt;</i>.
</p>
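<p style="margin-bottom:0" class="asd_par">
The sequence of cutoff, recovery and selection described above can be sketched
in Python as follows. This is a reading of the text, not mcl's implementation.
The function name <tt>prune_column</tt> is hypothetical, a vector is modelled as
a dict from node to value, the default <tt>pct=90</tt> is an assumption, and so
is the stopping rule for recovery (stop as soon as the mass reaches
<i>&lt;pct&gt;</i>/100 or <i>&lt;r&gt;</i> entries are present).</p>

```python
def prune_column(column, cutoff=1.0 / 4000, select=500, recover=600, pct=90):
    """Prune one stochastic column (dict node -> value) and rescale it.

    Hypothetical sketch of the documented steps; not mcl's actual code.
    """
    # Largest entries first, so every kept set is a prefix of this list.
    ranked = sorted(column.items(), key=lambda item: item[1], reverse=True)

    def mass(n):
        # Stochastic mass of the n largest entries.
        return sum(value for _, value in ranked[:n])

    def recover_from(n):
        # Assumed rule: re-admit the largest discarded entries while the mass
        # is below pct/100 and fewer than `recover` entries are present.
        limit = min(recover, len(ranked))
        while n < limit and 100 * mass(n) < pct:
            n += 1
        return n

    # Step 1: rigid cutoff, entries smaller than cutoff are removed.
    n_cut = sum(1 for _, value in ranked if value >= cutoff)
    # Step 2: recovery, if the recovery condition holds.
    n = recover_from(n_cut)
    # Step 3: if nothing was recovered, select the largest `select` entries,
    # then attempt recovery again if the condition now holds.
    if n == n_cut and n_cut > select:
        n = recover_from(select)

    total = mass(n)
    if total == 0:
        return {}
    # Rescale so that the pruned column is stochastic again.
    return {node: value / total for node, value in ranked[:n]}
```

<p style="margin-bottom:0" class="asd_par">
Because the kept entries always form a prefix of the ranked list, recovery in
this sketch only needs to extend that prefix. With <tt>recover</tt> not larger
than <tt>select</tt> the second recovery never adds anything, in line with the
remark above.</p>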
<p style="margin-bottom:0" class="asd_par">
The default setting is something like <b>-P</b>&nbsp;<b>4000</b> <b>-S</b>&nbsp;<b>500</b>
<b>-R</b>&nbsp;<b>600</b>. Check the <b>-z</b> flag to be sure. There is a set
of precomposed settings, which can be triggered with the
<a class="intern" href="#opt-scheme"><b>-scheme</b>&nbsp;<i>k</i> option</a>. <i>k</i>=4 is the default
scheme; higher values for <i>k</i> result in costlier and more accurate
computations (vice versa for lower, cheaper, and less accurate).
The schemes are listed using the <b>--show-schemes</b> option. It is
advisable to use the <b>-scheme</b> option only in interactive mode,
and to use the explicit expressions when doing batch processing. The
reason is that there is <i>no guarantee whatsoever</i> that the schemes
will not change between different releases. This is because the scheme
options should reflect good general purpose settings, and it may become
apparent that other schemes are better.
</p>
<p style="margin-bottom:0" class="asd_par">
Note that 'less accurate' or 'more accurate' computations may still
generate the same output clusterings. Use <b>clm dist</b> to compare output
clusterings for different resource parameters. Refer to <a class="local sibling" href="clmdist.html">clm&nbsp;dist</a>
for a discussion of this issue.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-warn-pct"></a><b>-warn-pct</b> &lt;int&gt; (<i>prune warn percentage</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-warn-factor"></a><b>-warn-factor</b> &lt;int&gt; (<i>prune warn factor</i>)</div>
<div class=" item_text " style="margin-left:2em">
The two options <b>-warn-pct</b> and <b>-warn-factor</b> relate to
warnings that may be triggered once the <i>initial</i> pruning of a vector
is completed. The idea is to issue warnings if initial pruning almost
completely destroys a computed vector, as this may be a sign that the
pruning parameters should be changed. Whether a warning is issued depends
on the outcome of initial pruning: <b>mcl</b> will issue a warning if the
remaining mass is less than <i>warn-pct</i>, or if the number of remaining
entries is smaller by a factor <i>warn-factor</i> than both the number of
entries originally computed <i>and</i> the recovery number.
<p style="margin-bottom:0" class="asd_par">
<b>-warn-pct</b> takes an integer between 0 and 100 as parameter,
<b>-warn-factor</b> takes a real positive number. They default to
something like 30 and 50.0. If you want to see fewer warnings, decrease
<i>warn-pct</i> and increase <i>warn-factor</i>. Set <i>warn-factor</i> to zero
if you want no warnings.
</p>
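<p style="margin-bottom:0" class="asd_par">
The warning condition can be written down as a small Python predicate. This is
one possible reading of the description, not mcl's code: the function name
<tt>should_warn</tt> and its argument names are hypothetical, the mass is taken
as a fraction between 0 and 1, and treating a zero <i>warn-factor</i> as
switching off all warnings is an interpretation of the last sentence above.</p>

```python
def should_warn(mass_after, n_after, n_computed, recover,
                warn_pct=30, warn_factor=50.0):
    """Decide whether initial pruning of a vector should trigger a warning.

    Hypothetical sketch; not mcl's actual code.
    mass_after : mass remaining after initial pruning, between 0 and 1
    n_after    : number of entries remaining after initial pruning
    n_computed : number of entries originally computed
    recover    : the recover number (-R)
    """
    if warn_factor == 0:
        # Interpretation: a zero factor disables warnings altogether.
        return False
    low_mass = 100 * mass_after < warn_pct
    few_entries = (n_after * warn_factor < n_computed
                   and n_after * warn_factor < recover)
    return low_mass or few_entries
```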
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-v-pruning"></a><b>-v</b>&nbsp;<b>pruning</b></div></div>
<div class=" item_text " style="margin-left:2em">
Pruning verbosity mode causes <b>mcl</b> to emit several statistics related to
the pruning process, each of which is described below. Use
<b>-v</b>&nbsp;<b>explain</b> to get explanatory headers in the output as well
(or simply use <b>-v</b>&nbsp;<b>all</b>).
</div>
</div>

<a name="impl"></a>
<h2>IMPLEMENTATION OPTIONS</h2>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-sparse"></a><b>-sparse</b> &lt;int&gt; (<i>sparse matrix multiplication threshold</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This value (by default set to 10) determines when mcl switches to sparse matrix/vector multiplication.
For a given column stochastic vector (corresponding with all the neighbours
of a given node <tt>v</tt> according to the current mcl iterand) the sum <tt>S</tt> of neighbour counts
of all neighbours of <tt>v</tt> is computed, counting duplicates. This is exactly the
number of matrix entries involved in the computation of the new column vector
for the matrix product. If <tt>S</tt> times <i>&lt;int&gt;</i> does not exceed the
number of nodes in the graph (equal to both column and row dimension of
the matrices used) then a sparse implementation is used. Otherwise
an optimized regular implementation is used. Intuitively, this option can
be thought of as the estimated overhead per matrix floating point operation incurred
by the sparse implementation compared with the regular implementation.
MCL uses this estimated overhead to determine which implementation is likely
to be quicker. Testing has shown this strategy works very well for graphs of a wide
range of sizes, including graphs with up to 3 million nodes and 500 million edges.
</p>
<p style="margin-bottom:0"><b>NOTE</b><br>
The effectiveness of this option is influenced by hardware-specific properties
such as the CPU L2 cache size. The default value should work reasonably well
across a wide variety of scenarios, but it may be possible to squeeze
faster run times out of mcl by tuning this parameter to the graphs that are
specific for your application domain.
</p>
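<p style="margin-bottom:0" class="asd_par">
The decision rule described above can be sketched in Python. This only
illustrates the comparison, not the multiplication routines themselves; the
function name <tt>use_sparse_product</tt> is hypothetical, and the current
iterand is modelled as a dict mapping every node to the list of its neighbours.</p>

```python
def use_sparse_product(neighbours, v, sparse=10):
    """Return True if the sparse implementation would be chosen for node v.

    Hypothetical sketch; not mcl's actual code.
    neighbours : dict mapping every node to the list of its neighbours
    sparse     : the value given to -sparse
    """
    # S: sum of the neighbour counts of all neighbours of v, with duplicates.
    s = sum(len(neighbours[u]) for u in neighbours[v])
    # Sparse is used if S times <int> does not exceed the number of nodes.
    return s * sparse <= len(neighbours)
```

<p style="margin-bottom:0" class="asd_par">
For a graph on 100 nodes in which the two neighbours of <tt>v</tt> have three
and two neighbours respectively, <tt>S</tt> is 5, so the default value of 10
selects the sparse implementation while a value of 30 does not.</p>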
</div>
</div>

<a name="examples"></a>
<h2>EXAMPLES</h2>
<p style="margin-bottom:0" class="asd_par">The following is an example of label input</p>
<div class="verbatim">---8&lt;------8&lt;------8&lt;------8&lt;------8&lt;---
cat hat  0.2
hat bat  0.16
bat cat  1.0
bat bit  0.125
bit fit  0.25
fit hit  0.5
hit bit  0.16
---&gt;8------&gt;8------&gt;8------&gt;8------&gt;8---</div>
<p style="margin-top:0em; margin-bottom:0em">It can be clustered like this:</p>
<div class="verbatim">mcl cathat --abc -o out.cathat</div>
<p style="margin-top:0em; margin-bottom:0em">The file out.cathat should now look like this</p>
<div class="verbatim">---8&lt;------8&lt;------8&lt;------8&lt;------8&lt;---
cat hat bat
bit fit hit
---&gt;8------&gt;8------&gt;8------&gt;8------&gt;8---</div>
<p style="margin-top:0em; margin-bottom:0em">A few things to note. First, MCL will symmetrize any
arrow it finds. If it sees <tt>bat cat 1.0</tt> it will act as if it also
saw <tt>cat bat 1.0</tt>. If you explicitly specify <tt>cat bat 1.0</tt>
as well, mcl will simply end up with duplicate entries in the first
parse stage. Second, MCL deduplicates repeated edges by taking the
one with the maximum value. So,</p>
<div class="verbatim">---8&lt;------8&lt;------8&lt;------8&lt;------8&lt;---
cat hat  0.2
hat cat  0.16
hat cat  0.8
---&gt;8------&gt;8------&gt;8------&gt;8------&gt;8---</div>
<p style="margin-top:0em; margin-bottom:0em">will result in two arrows <tt>cat-hat</tt> and <tt>hat-cat</tt> both
with value 0.8.</p>
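<p style="margin-top:0em; margin-bottom:0em">The two rules (symmetrize every
arrow, keep the maximum of repeated edges) can be mimicked with a few lines of
Python. This is a sketch of the described behaviour, not mcl's parser; the
function name <tt>parse_label_edges</tt> is hypothetical and the input is
assumed to be well-formed lines of two labels and a value.</p>

```python
def parse_label_edges(lines):
    """Read 'source target value' lines into a dict (source, target) -> weight.

    Hypothetical sketch; not mcl's actual parser.
    """
    weights = {}
    for line in lines:
        source, target, value = line.split()
        weight = float(value)
        # Symmetrize: every arrow is also recorded in the reverse direction.
        for arc in ((source, target), (target, source)):
            # Deduplicate: keep the maximum value seen for this arc.
            weights[arc] = max(weight, weights.get(arc, weight))
    return weights
```

<p style="margin-top:0em; margin-bottom:0em">Applied to the three lines above,
the sketch yields exactly two arcs, <tt>cat-hat</tt> and <tt>hat-cat</tt>, both
with value 0.8.</p>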

<a name="applicability"></a>
<h2>APPLICABILITY</h2>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> will work very well for graphs in which the diameter of the natural
clusters is not too large. The presence of many edges between different
clusters is not problematic; as long as there is cluster structure, mcl
will find it. It is less likely to work well for graphs with clusters
(inducing subgraphs) of large diameter, e.g. grid-like graphs derived from
Euclidean data. So mcl in its canonical form is certainly not fit for
boundary detection or image segmentation. I experimented with a modified
mcl and boundary detection in the thesis pointed to below (see
<a class="intern" href="#references">REFERENCES</a>). This was fun and not entirely unsuccessful, but not
something to be pursued further.
</p>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> likes <i>undirected input graphs best</i>, and it really dislikes graphs
with node pairs (i,j) for which an arc going from i to j is present and the
counter-arc from j to i is absent. Try to make your input graph undirected.
Furthermore, mcl interprets edge weights in graphs as similarities. If you
are used to working with dissimilarities, you will have to convert those to
similarities using some conversion formula. The most important thing is
that you feel confident that the similarities are reasonable, i.e. if X is
similar to Y with weight 2, and X is similar to Z with weight 200, then this
should mean that the similarity of Y (to X) is negligible compared with the
similarity of Z (to X).
</p>
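<p style="margin-bottom:0" class="asd_par">
No particular conversion formula is prescribed here. As one possible example
only, the Python sketch below maps a dissimilarity to a similarity with an
exponential decay. Both the formula and the names
<tt>similarity_from_distance</tt> and <tt>scale</tt> are assumptions made for
illustration; whether the resulting weights are reasonable for your data is
something you have to judge yourself, as described above.</p>

```python
import math


def similarity_from_distance(distance, scale=1.0):
    """Map a non-negative dissimilarity to a similarity in (0, 1].

    Hypothetical example formula; mcl does not prescribe any conversion.
    scale controls how quickly similarity decays with distance.
    """
    return math.exp(-distance / scale)
```

<p style="margin-bottom:0" class="asd_par">
A distance of zero gives similarity 1, and larger distances give smaller
similarities; a smaller <tt>scale</tt> makes distant pairs more negligible
relative to close pairs.</p>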
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> is probably not suited for clustering <i>tree graphs</i>. This is because
mcl works best if there are multiple paths between different nodes in the
natural clusters, but in tree graphs there is only one path between any pair
of nodes. Trees are too sparse a structure for mcl to work on.
</p>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> may well be suited for clustering <i>lattices</i>. It will depend
on the density characteristics of the lattice, and the conditions for
success are the same as those for clustering graphs in general: The
diameter of the natural clusters should not be too large.
<b>NOTE</b> when clustering a lattice, you <i>have</i> to cluster
the underlying undirected graph, and not the directed graph that represents
the lattice itself. The reason is that one has to allow mcl (or any other
cluster algorithm) to 'look back in time', so to speak. Clustering and
directionality do not go well together (long discussion omitted).
</p>
<p style="margin-bottom:0" class="asd_par">
<b>mcl</b> has a worst-case time complexity O(N*k^2), where N is the number of
nodes in the graph, and k is the maximum number of neighbours tracked during
computations. k depends on the <b>-P</b> and <b>-S</b> options. If the
<b>-S</b> option is used (which is the default setting) then k equals the
value corresponding with this option. Typical values for k are in the range
500..1000. The average case is much better than the worst case though, as
cluster structure itself has the effect of helping mcl's pruning schemes,
certainly if the diameter of natural clusters is not large.
</p>

<a name="files"></a>
<h2>FILES</h2>
<p style="margin-bottom:0" class="asd_par">
There are currently no resource or configuration files.
The mcl matrix format is described in the <a class="local sibling" href="mcxio.html">mcxio</a> section.
</p>

<a name="environment"></a>
<h2>ENVIRONMENT</h2>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_cascade"><div class=" item_leftalign nowrap " >MCLXIODIGITS</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
When writing matrices in interchange format, mcl will use this variable (if
present) as the precision (number of digits) for printing the fractional
part of values.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " >MCLXIOVERBOSITY</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
MCL and its sibling applications will usually report about matrix
input/output from/to disk. The verbosity level can be regulated
via MCLXIOVERBOSITY. These are the levels it can currently be set to.</p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">1</div></div>
<div class=" item_text " style="margin-left:2em">
Silent but applications may alter this.
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">2</div></div>
<div class=" item_text " style="margin-left:2em">
Silent and applications can not alter this.
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">4</div></div>
<div class=" item_text " style="margin-left:2em">
Verbose but applications may alter this.
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">8</div></div>
<div class=" item_text " style="margin-left:2em">
Verbose and applications can not alter this (default).
</div>
</div>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " >MCLXIOFORMAT</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
MCL and its sibling applications will by default output matrices
in interchange format rather than binary format (cf. <a class="local sibling" href="mcxio.html">mcxio</a>).
The desired format can be controlled via the variable
MCLXIOFORMAT. These are the levels it can currently be set to.</p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">1</div></div>
<div class=" item_text " style="margin-left:2em">
Interchange format but applications may alter this.
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">2</div></div>
<div class=" item_text " style="margin-left:2em">
Interchange format and applications can not alter this (default).
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">4</div></div>
<div class=" item_text " style="margin-left:2em">
Binary format but applications may alter this.
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-1em">8</div></div>
<div class=" item_text " style="margin-left:2em">
Binary format and applications can not alter this.
</div>
</div>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " >MCLXICFLAGS</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
If matrices are output in interchange format, by default empty vectors
will not be listed. Equivalently (during input time),
vectors for which no listing is present are understood to be empty -
note that the <i>presence</i> of a vector is established using
the domain information found in the header part.
It is possible to enforce listing of empty vectors by
setting bit '1' in the variable MCLXICFLAGS.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " >MCLXIOUNCHECKED</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
MCL and its sibling applications will always check a matrix for consistency
while it is being read. If this variable is set, the consistency check is
omitted. For large graphs the speedup can be considerable. However, if the
input graph is not conforming, it will likely crash the application that
is using it.</p>
</div>
</div>
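<p style="margin-bottom:0" class="asd_par">
When mcl is driven from a script, these variables can be collected in an
environment mapping before the program is started. The Python sketch below only
builds such a mapping and does not run mcl. The function name
<tt>mcl_environment</tt> and its parameters are hypothetical, and setting
MCLXIOUNCHECKED to the value <tt>1</tt> is an assumption, since the text above
only requires the variable to be set.</p>

```python
import os


def mcl_environment(digits=None, verbosity=None, unchecked=False, base=None):
    """Build an environment mapping carrying mcl's I/O variables.

    Hypothetical helper for illustration; not part of mcl.
    base defaults to a copy of the current process environment.
    """
    env = dict(os.environ if base is None else base)
    if digits is not None:
        # Precision for the fractional part in interchange format.
        env["MCLXIODIGITS"] = str(digits)
    if verbosity is not None:
        # Only the documented levels are accepted.
        if verbosity not in (1, 2, 4, 8):
            raise ValueError("MCLXIOVERBOSITY must be 1, 2, 4 or 8")
        env["MCLXIOVERBOSITY"] = str(verbosity)
    if unchecked:
        # Assumption: any value counts as "set"; "1" is used here.
        env["MCLXIOUNCHECKED"] = "1"
    return env
```

<p style="margin-bottom:0" class="asd_par">
The resulting mapping could then be handed to a process launcher, for example
as the <tt>env</tt> argument of Python's <tt>subprocess.run</tt> together with a
command line such as the one shown in <a class="intern" href="#examples">EXAMPLES</a>.</p>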

<a name="diagnostics"></a>
<h2>DIAGNOSTICS</h2>
<p style="margin-bottom:0" class="asd_par">
If <b>mcl</b> issues a diagnostic error, it will most likely be
because the input matrix could not be parsed successfully.
<b>mcl</b> tries to be helpful in describing the kind of parse error.
The mcl matrix format is described in the <a class="local sibling" href="mcxio.html">mcxio</a> section.</p>

<a name="bugs"></a>
<h2>BUGS</h2>
<p style="margin-bottom:0" class="asd_par">
No known bugs at this time.</p>

<a name="author"></a>
<h2>AUTHOR</h2>
<p style="margin-bottom:0" class="asd_par">
Stijn van Dongen.</p>

<a name="history"></a>
<h2>HISTORY/CREDITS</h2>
<p style="margin-bottom:0" class="asd_par">
The MCL algorithm was conceived in spring 1996 by the present author.
The first implementation of the MCL algorithm followed that spring
and summer. It was written in Perl and proved the viability of
the algorithm. The implementation described here began its life in
autumn 1997. The first versions of the vital matrix library
were designed jointly by Stijn van Dongen and Annius Groenink in
the period October 1997 - May 1999. The efficient matrix-vector
multiplication routine was written by Annius. This routine is
without significant changes still one of the cornerstones of this
MCL implementation.</p>
<p style="margin-bottom:0" class="asd_par">
Since May 1999 all MCL libraries have seen much development and
redesign by the present author. Matrix-matrix multiplication has been
rewritten several times to take full advantage of the sparseness
properties of the stochastic matrices brought forth by the MCL
algorithm. This mostly concerns the issue of pruning &mdash; removal of
small elements in a stochastic column in order to keep matrices
sparse.</p>
<p style="margin-bottom:0" class="asd_par">
Very instructive was that around April 2001 Rob Koopman pointed out
that selecting the k largest elements out of a collection of n is
best done using a min-heap. This was the key to the second major
rewrite (now counting three) of the MCL pruning schemes, resulting in
much faster code, generally producing a more accurate computation of
the MCL process.</p>
<p style="margin-bottom:0" class="asd_par">
In May 2001 Anton Enright initiated the parallelization of the
<b>mcl</b> code and threaded inflation. From this example, Stijn threaded
expansion. This was great, as the MCL data structures and operands
(normal matrix multiplication and Hadamard multiplication) just beg
for parallelization.</p>
<p style="margin-bottom:0" class="asd_par">
Onwards.
The January 2003 03-010 release introduced support for sparsely
enumerated (i.e. indices need not be sequential) graphs and matrices, the
result of a major overhaul of the matrix library and most higher layers.
Conceptually, the library now sees matrices as infinite quadrants
of which only finite subsections happen to have nonzero entries.</p>
<p style="margin-bottom:0" class="asd_par">
The June 2003 03-154 release introduced unix-type pipelines for clustering,
including the BLAST parser mcxdeblast and the mclblastline script.
The April 2004 04-105 release revived binary format, which has been a first
class citizen ever since.</p>
<p style="margin-bottom:0" class="asd_par">
With the March 2005 05-090 release mcxsubs finally acquired a sane
specification syntax. The November 2005 05-314 release brought the ability
to stream label input directly into mcl. The subsequent release introduced a
transformation language shared by various mcl siblings that allows arbitrary
progressions of transformations to be applied to either stream values or
matrix values.</p>
<p style="margin-bottom:0" class="asd_par">
Joost van Baal set up the mcl CVS tree and packaged mcl for Debian
GNU/Linux. He completely autotooled the sources, so much so that at first I
found it hard to locate them amidst bootstrap, aclocal.m4, depcomp, and
other beauties.</p>
<p style="margin-bottom:0" class="asd_par">
Jan van der Steen shared his elegant mempool code. Philip Lijnzaad gave
useful comments. Philip, Shawn Hoon, Abel Ureta-Vidal,
and Martin Mokrejs sent helpful bug reports.</p>
<p style="margin-bottom:0" class="asd_par">
Abel Ureta-Vidal and Dinakarpandian Deendayal commented on
and contributed to mcxdeblast and mcxassemble.</p>
<p style="margin-bottom:0" class="asd_par">
Tim Hughes contributed several good bug reports for mcxassemble,
mcxdeblast and zoem (a workhorse for <b>clm format</b>).</p>

<a name="seealso"></a>
<h2>SEE ALSO</h2>
<p style="margin-bottom:0" class="asd_par">
<a class="local sibling" href="mclfaq.html">mclfaq</a> - Frequently Asked Questions.</p>
<p style="margin-bottom:0" class="asd_par">
<a class="local sibling" href="mcxio.html">mcxio</a> - a description of the mcl matrix format.</p>
<p style="margin-bottom:0" class="asd_par">
There are many more utilities. Consult
<a class="local sibling" href="mclfamily.html">mclfamily</a> for an overview of and links to all the documentation
and the utilities in the mcl family.</p>
<p style="margin-bottom:0" class="asd_par">
<b>minimcl</b> is a 200-line perl implementation of mcl. It is shipped
in the mcl distribution and can be found online at
<a class="extern" href="http://micans.org/mcl">http://micans.org/mcl</a>.</p>
<p style="margin-bottom:0" class="asd_par">
mcl's home is at <a class="extern" href="http://micans.org/mcl/">http://micans.org/mcl/</a>.</p>

<a name="references"></a>
<h2>REFERENCES</h2>
<p style="margin-bottom:0" class="asd_par">
<a name="gcbfs">[1]</a>
Stijn van Dongen. <i>Graph Clustering by Flow Simulation</i>.
PhD thesis, University of Utrecht, May 2000.<br>
<a class="extern" href="http://www.library.uu.nl/digiarchief/dip/diss/1895620/inhoud.htm">http://www.library.uu.nl/digiarchief/dip/diss/1895620/inhoud.htm</a></p>
<p style="margin-bottom:0" class="asd_par">
<a name="gcvdup">[2]</a>
Stijn van Dongen. <i>Graph Clustering Via a Discrete Uncoupling Process</i>.
SIAM Journal on Matrix Analysis and Applications, 30(1):121-141, 2008.<br>
<a class="extern" href="http://link.aip.org/link/?SJMAEL/30/121/1">http://link.aip.org/link/?SJMAEL/30/121/1</a>
</p>
<p style="margin-bottom:0" class="asd_par">
<a name="cafg">[3]</a>
Stijn van Dongen. <i>A cluster algorithm for graphs</i>.
Technical Report INS-R0010, National Research Institute for Mathematics and
Computer Science in the Netherlands, Amsterdam, May 2000.<br>
<a class="extern" href="http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z">http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z</a></p>
<p style="margin-bottom:0" class="asd_par">
<a name="supfg">[4]</a>
Stijn van Dongen. <i>A stochastic uncoupling process for graphs</i>.
Technical Report INS-R0011, National Research Institute for Mathematics and
Computer Science in the Netherlands, Amsterdam, May 2000.<br>
<a class="extern" href="http://www.cwi.nl/ftp/CWIreports/INS/INS-R0011.ps.Z">http://www.cwi.nl/ftp/CWIreports/INS/INS-R0011.ps.Z</a></p>
<p style="margin-bottom:0" class="asd_par">
<a name="pcfgcmce">[5]</a>
Stijn van Dongen. <i>Performance criteria for graph clustering and Markov
cluster experiments</i>. Technical Report INS-R0012, National Research
Institute for Mathematics and Computer Science in the Netherlands,
Amsterdam, May 2000.<br>
<a class="extern" href="http://www.cwi.nl/ftp/CWIreports/INS/INS-R0012.ps.Z">http://www.cwi.nl/ftp/CWIreports/INS/INS-R0012.ps.Z</a></p>
<p style="margin-bottom:0" class="asd_par">
<a name="eaflsdopf">[6]</a>
Enright A.J., Van Dongen S., Ouzounis C.A.
<i>An efficient algorithm for large-scale detection of protein families</i>,
Nucleic Acids Research 30(7):1575-1584 (2002).</p>

<a name="notes"></a>
<h2>NOTES</h2>
<p style="margin-bottom:0" class="asd_par">
This page was generated from <b>ZOEM</b> manual macros,
<a class="extern" href="http://micans.org/zoem">http://micans.org/zoem</a>. Both HTML and roff pages can be created
from the same source without the usual conversion problems,
while keeping some level of sophistication in the typesetting.
<a class="local" href="mcl.ps">This</a> is the PostScript derived from the zoem troff
output.</p>
</body>
</html>