This file is indexed.

/usr/share/doc/javacc4/doc/javaccgrm.html is in javacc4-doc 4.0-2.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<HTML>
<!--

Copyright © 2002 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
California 95054, U.S.A. All rights reserved.  Sun Microsystems, Inc. has
intellectual property rights relating to technology embodied in the product
that is described in this document. In particular, and without limitation,
these intellectual property rights may include one or more of the U.S.
patents listed at http://www.sun.com/patents and one or more additional
patents or pending patent applications in the U.S. and in other countries.
U.S. Government Rights - Commercial software. Government users are subject
to the Sun Microsystems, Inc. standard license agreement and applicable
provisions of the FAR and its supplements.  Use is subject to license terms.
Sun,  Sun Microsystems,  the Sun logo and  Java are trademarks or registered
trademarks of Sun Microsystems, Inc. in the U.S. and other countries.  This
product is covered and controlled by U.S. Export Control laws and may be
subject to the export or import laws in other countries.  Nuclear, missile,
chemical biological weapons or nuclear maritime end uses or end users, whether
direct or indirect, are strictly prohibited.  Export or reexport to countries
subject to U.S. embargo or to entities identified on U.S. export exclusion
lists, including, but not limited to, the denied persons and specially
designated nationals lists is strictly prohibited.

-->
<HEAD>
<title>JavaCC Grammar Files</title>
<!-- Changed by: Michael Van De Vanter, 14-Jan-2003 -->
</HEAD>
<BODY bgcolor="#FFFFFF" >

<H1>JavaCC [tm]: Grammar Files</H1>

This page contains the complete syntax of Java Compiler Compiler [tm]
grammar files with detailed explanations of each construct.
<P>
Tokens in the grammar files follow the same conventions as for the Java programming language.
Hence identifiers, strings, characters, etc. used in the grammars are
the same as Java identifiers, Java strings, Java characters, etc.
<P>
<em>White space</em> in the grammar files also follows the same conventions as
for the Java programming language.  This includes the syntax for comments.  Most comments present in
the grammar files are copied into the generated parser and lexical analyzer.
<P>
Grammar files are preprocessed for Unicode escapes just as Java files
are (i.e., occurrences of strings such as <code>\uxxxx</code> - where <code>xxxx</code> is a hex value -
are converted to the corresponding Unicode character before lexical analysis).
<P>
<em>Exceptions to the above rules:</em>
The Java operators "<code>&lt;&lt;</code>", "<code>&gt;&gt;</code>", "<code>&gt;&gt;&gt;</code>", "<code>&lt;&lt;=</code>",
"<code>&gt;&gt;=</code>", and "<code>&gt;&gt;&gt;=</code>" are left out of JavaCC's input token list
in order to allow convenient nested use of token specifications.
Finally, the following are the additional reserved words in the Java Compiler
Compiler [tm] grammar files.
<P>

<TABLE CELLPADDING="3">
<TR>
<TD ALIGN=LEFT><strong>EOF</strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#IGNORE_CASE">IGNORE_CASE</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#JAVACODE">JAVACODE</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#LOOKAHEAD">LOOKAHEAD</A></strong></TD>
</TR>
<TR>
<TD ALIGN=LEFT><strong><A HREF="#MORE">MORE</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#options">options</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#PARSER_BEGIN">PARSER_BEGIN</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#PARSER_END">PARSER_END</A></strong></TD>
</TR>
<TR>
<TD ALIGN=LEFT><strong><A HREF="#SKIP">SKIP</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#SPECIAL_TOKEN">SPECIAL_TOKEN</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#TOKEN">TOKEN</A></strong></TD>
<TD ALIGN=LEFT><strong><A HREF="#TOKEN_MGR_DECLS">TOKEN_MGR_DECLS</A></strong></TD>
</TR>
</TABLE>

<P>
Any Java entities used in the grammar rules that follow appear italicized
with the prefix <EM>java_</EM> (<EM>e.g.</EM>, <EM>java_compilation_unit</EM>).

<P>
<HR>
<P>

<A NAME="PARSER_BEGIN"></A><A NAME="PARSER_END"></A>
<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod1">javacc_input</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod2">javacc_options</A></TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"PARSER_BEGIN" "(" &lt;IDENTIFIER&gt; ")"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_compilation_unit</EM></TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"PARSER_END" "(" &lt;IDENTIFIER&gt; ")"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod5">production</A> )*</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>&lt;EOF&gt;</TD>
</TR>
</TABLE>
<P>

The grammar file starts with a list of options (which is optional).
This is then followed by a Java compilation unit enclosed between
"PARSER_BEGIN(name)" and "PARSER_END(name)".  After this is a list
of grammar productions.  <A HREF="#prod2">Options</A> and
<A HREF="#prod5">productions</A> are described later.
<P>
The <EM>name</EM> that follows "PARSER_BEGIN" and "PARSER_END" must
be the same and this identifies the name of the generated parser.
For example, if <EM>name</EM> is "MyParser", then the following files
are generated:
<P>
<STRONG>MyParser.java:</STRONG>
The generated parser.
<BR>
<STRONG>MyParserTokenManager.java:</STRONG>
The generated token manager (or scanner/lexical analyzer).
<BR>
<STRONG>MyParserConstants.java:</STRONG>
Constant definitions (such as token kinds) used by the parser and token manager.
<P>
Other files such as "Token.java", "ParseException.java", etc. are also
generated.  These files contain boilerplate code that is
the same for any grammar, so they may be reused across grammars.
<P>
Between the PARSER_BEGIN and PARSER_END constructs is a regular
Java compilation unit (a compilation unit in Java lingo is the entire
contents of a Java file).  This may be any arbitrary
Java compilation unit so long as it contains a class declaration
whose name is the same as the name of the generated parser ("MyParser"
in the above example).  Hence, in general, this part of the grammar
file looks like:
<P>
<PRE>
    PARSER_BEGIN(parser_name)
    . . .
    class parser_name . . . {
      . . .
    }
    . . .
    PARSER_END(parser_name)
</PRE>
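<P>
As a rough illustration of this layout, here is a minimal sketch of a
complete grammar file.  The parser name "Adder", the token names, and the
non-terminal "Start" are hypothetical and chosen only for this example;
the STATIC option is set to false so that the parser can be constructed
with "new":
<P>
<PRE>
    options {
      STATIC = false;
    }

    PARSER_BEGIN(Adder)

    import java.io.StringReader;

    public class Adder {
      // Hypothetical entry point: parses a fixed input string.
      public static void main(String[] args) throws ParseException {
        Adder parser = new Adder(new StringReader("1 + 2 + 3"));
        parser.Start();
      }
    }

    PARSER_END(Adder)

    SKIP : { " " | "\t" | "\n" | "\r" }

    TOKEN :
    {
      &lt; NUMBER : ( ["0"-"9"] )+ &gt;
    | &lt; PLUS : "+" &gt;
    }

    // One or more numbers separated by "+", followed by end of input.
    void Start() :
    {}
    {
      &lt;NUMBER&gt; ( &lt;PLUS&gt; &lt;NUMBER&gt; )* &lt;EOF&gt;
    }
</PRE>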
<P>
JavaCC does not perform detailed checks on the compilation unit, so
it is possible for a grammar file to pass through JavaCC and generate
Java files that produce errors when they are compiled.
<P>
If the compilation unit includes a package declaration, this is
included in all the generated files.  If the compilation unit includes
import declarations, these are included in the generated parser and
token manager files.
<P>
The generated parser file contains everything in the compilation unit
and, in addition, contains the generated parser code that is included at
the end of the parser class.  For the above example, the generated
parser will look like:
<P>
<PRE>
    . . .
    class parser_name . . . {
      . . .
      // generated parser is inserted here.
    }
    . . .
</PRE>
<P>
The generated parser includes a public method declaration corresponding
to each non-terminal (see <A HREF="#prod9">javacode_production</A> and
<A HREF="#prod11">bnf_production</A>) in the grammar file.  Parsing with
respect to a non-terminal is achieved by calling the method corresponding
to that non-terminal.  Unlike yacc, there is no single start symbol in
JavaCC - one can parse with respect to any non-terminal in the grammar.
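<P>
For instance, assuming a grammar that defines two hypothetical non-terminals
"Expression" and "Statement" in a non-static parser named "MyParser", a
caller might start parsing from either one (a sketch, not generated output):
<P>
<PRE>
    // "input" is assumed to be a String holding the text to parse.
    // Both calls may throw ParseException if the input does not match.
    MyParser parser = new MyParser(new java.io.StringReader(input));
    parser.Expression();    // parse the input as an Expression ...

    parser.ReInit(new java.io.StringReader(input));
    parser.Statement();     // ... or parse it again as a Statement
</PRE>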
<P>
The generated token manager provides one public method:
<P>
<PRE>
    Token getNextToken() throws ParseError;
</PRE>
<P>
For more details on how this method may be used, please read
<A HREF="apiroutines.html">the description of the Java Compiler Compiler
API</A>.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod2">javacc_options</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>[ "<A NAME="options">options</A>" "{" ( <A HREF="#prod6">option_binding</A> )* "}" ]</TD>
</TR>
</TABLE>
<P>

The options section, if present, starts with the reserved word "options" followed
by a list of option bindings within braces.  Each option
binding specifies the setting of one option.  The same option may not be
set multiple times.
<P>
Options may be specified either here in the grammar file, or from
<A HREF="commandline.html">the command line</A>.  If the option is set
from <A HREF="commandline.html">the command line</A>, that takes precedence.
<P>
Option names are not case-sensitive.
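<P>
A sketch of an options section follows; the particular settings and the
directory name "generated" are arbitrary examples rather than recommendations:
<P>
<PRE>
    options {
      LOOKAHEAD = 2;
      STATIC = false;
      IGNORE_CASE = true;
      OUTPUT_DIRECTORY = "generated";
    }
</PRE>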

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod6">option_binding</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"LOOKAHEAD" "=" <EM>java_integer_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"CHOICE_AMBIGUITY_CHECK" "=" <EM>java_integer_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"OTHER_AMBIGUITY_CHECK" "=" <EM>java_integer_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"STATIC" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"DEBUG_PARSER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"DEBUG_LOOKAHEAD" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"DEBUG_TOKEN_MANAGER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"OPTIMIZE_TOKEN_MANAGER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"ERROR_REPORTING" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"JAVA_UNICODE_ESCAPE" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"UNICODE_INPUT" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"IGNORE_CASE" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"USER_TOKEN_MANAGER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"USER_CHAR_STREAM" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"BUILD_PARSER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"BUILD_TOKEN_MANAGER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"TOKEN_MANAGER_USES_PARSER" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"SANITY_CHECK" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"FORCE_LA_CHECK" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"COMMON_TOKEN_ACTION" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"CACHE_TOKENS" "=" <EM>java_boolean_literal</EM> ";"</TD>
</TR>
<TR>
<TD></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"OUTPUT_DIRECTORY" "=" <EM>java_string_literal</EM> ";"</TD>
</TR>
</TABLE>

<UL>
<LI>
<STRONG><A NAME="LOOKAHEAD">LOOKAHEAD</A>:</STRONG>
The number of tokens to look ahead before making a
decision at a choice point during parsing.  The default value is 1.
The smaller this number, the faster the parser.  This number may be
overridden for specific productions within the grammar as described
later.  See the description of
<A HREF="lookahead.html">the lookahead algorithm</A> for complete
details on how lookahead works.
<LI>
<STRONG>CHOICE_AMBIGUITY_CHECK:</STRONG>
This is an integer option whose default value is 2.
This is the number of tokens considered in checking choices of the
form "A | B | ..." for ambiguity.  For example, if there is a common
two token prefix for both A and B, but no common three token prefix,
(assume this option is set to 3) then JavaCC can tell you to use a
lookahead of 3 for disambiguation purposes.  If A and B have a
common three token prefix, however, JavaCC can only tell you that you need
a lookahead of 3 <EM>or more</EM>.  Increasing this can give you more
comprehensive ambiguity information at the cost of more processing
time.  For large grammars such as the Java grammar, increasing this number
any further causes the checking to take too much time.
<LI>
<STRONG>OTHER_AMBIGUITY_CHECK:</STRONG>
This is an integer option whose default value is 1.
This is the number of tokens considered in checking all other kinds of
choices (i.e., of the forms "(A)*", "(A)+", and "(A)?") for ambiguity.
This takes more time to do than the choice checking, and hence the
default value is set to 1 rather than 2.
<LI>
<STRONG>STATIC:</STRONG>
This is a boolean option whose default value is true.  If
true, all methods and class variables are specified as static in the
generated parser and token manager.  This allows only one parser object to be present,
but it improves the performance of the parser.  To perform multiple
parses during one run of your Java program, you will have to call the
<A HREF="apiroutines.html">ReInit()</A>
method to reinitialize your parser if it is static.
If the parser is non-static, you may use the "new" operator to
construct as many parsers as you wish.  These can all be used
simultaneously from different threads.
<LI>
<STRONG>DEBUG_PARSER:</STRONG>
This is a boolean option whose default value is false.  This
option is used to obtain debugging information from the generated
parser.  Setting this option to true causes the parser to generate
a trace of its actions.  Tracing may be disabled by
calling the method <A HREF="apiroutines.html">disable_tracing()</A>
in the generated parser class.  Tracing may be subsequently enabled
by calling the method <A HREF="apiroutines.html">enable_tracing()</A>
in the generated parser class.
<LI>
<STRONG>DEBUG_LOOKAHEAD:</STRONG>
This is a boolean option whose default value is false.  Setting this
option to true causes the parser to generate all the tracing information
it does when the option DEBUG_PARSER is true, and in addition
causes it to generate a trace of actions performed during
<A HREF="lookahead.html">lookahead operation</A>.
<LI>
<STRONG>DEBUG_TOKEN_MANAGER:</STRONG>
This is a boolean option whose default value is false.  This
option is used to obtain debugging information from the generated
token manager.  Setting this option to true causes the token manager to generate
a trace of its actions.  This trace is rather large and should only
be used when you have a lexical error that has been reported to you
and you cannot understand why.  Typically, in this situation, you
can determine the problem by looking at the last few lines of this trace.
<LI>
<STRONG>ERROR_REPORTING:</STRONG>
This is a boolean option whose default value is
true.  Setting it to false causes parse errors to be
reported in somewhat less detail.  The only reason to set this
option to false is to improve performance.
<LI>
<STRONG>JAVA_UNICODE_ESCAPE:</STRONG>
This is a boolean option whose default value is
false.  When set to true, the generated parser uses
an input stream object that processes Java Unicode escapes
(\u...) before sending characters to the token manager.  By
default, Java Unicode escapes are not processed.
<BR>
This option is ignored if either of options USER_TOKEN_MANAGER,
USER_CHAR_STREAM is set to true.
<LI>
<STRONG>UNICODE_INPUT:</STRONG>
This is a boolean option whose default value is
false.  When set to true, the generated parser
uses an input stream object that reads Unicode files.  By default,
ASCII files are assumed.
<BR>
This option is ignored if either of
options USER_TOKEN_MANAGER, USER_CHAR_STREAM is set to true.
<LI>
<STRONG><A NAME="IGNORE_CASE">IGNORE_CASE:</A></STRONG>
This is a boolean option whose default value is false.
Setting this option to true causes the generated token manager to ignore
case in the token specifications and the input files.  This is useful
for writing grammars for languages such as HTML.  It is also possible
to localize the effect of IGNORE_CASE by using
<A HREF="#prod10">an alternate mechanism described later</A>.
<LI>
<STRONG>USER_TOKEN_MANAGER:</STRONG>
This is a boolean option whose default value is
false.  The default action is to generate a token manager
that works on the specified grammar tokens.  If this
option is set to true, then the parser is generated to accept tokens
from any token manager of type "TokenManager" - this interface
is generated into the generated parser directory.
<LI>
<STRONG>USER_CHAR_STREAM:</STRONG>
This is a boolean option whose default value is
false.  The default action is to generate a character stream reader
as specified by the options JAVA_UNICODE_ESCAPE and UNICODE_INPUT.
The generated token manager receives characters
from this stream reader.  If this option is set to true, then the
token manager is generated to read characters from any character
stream reader of type "CharStream".  The file "CharStream.java" is generated
into the generated parser directory.
<BR>
This option is ignored if USER_TOKEN_MANAGER is set to true.
<LI>
<STRONG>BUILD_PARSER:</STRONG>
This is a boolean option whose default value is true.
The default action is to generate the parser file ("MyParser.java"
in the above example).  When set to false, the parser file is
not generated.  Typically, this option is set to false when
you wish to generate only the token manager and use it without
the associated parser.
<LI>
<STRONG>BUILD_TOKEN_MANAGER:</STRONG>
This is a boolean option whose default value is true.
The default action is to generate the token manager file
("MyParserTokenManager.java" in the above example).  When set to
false, the token manager file is not generated.  The only reason
to set this option to false is to save some time during parser
generation when you fix problems in the parser part of the grammar
file and leave the lexical specifications untouched.
<LI>
<STRONG>TOKEN_MANAGER_USES_PARSER:</STRONG>
This is a boolean option whose default value is false.
When set to true, the generated token manager will include a field 
called <CODE>parser</CODE> that references the instantiating parser 
instance (of type <CODE>MyParser</CODE> in the above example).
The main reason for having a parser in a token manager is using
some of its logic in lexical actions.
This option has no effect if the STATIC option is set to true.
<LI>
<STRONG>SANITY_CHECK:</STRONG>
This is a boolean option whose default value is true.
JavaCC performs many syntactic and semantic checks on the grammar
file during parser generation.  Some checks such as detection of
left recursion, detection of ambiguity, and bad usage of empty
expansions may be suppressed for faster parser generation by
setting this option to false.  Note that the presence of these
errors (even if they are not detected and reported by setting this
option to false) can cause unexpected behavior from the generated
parser.
<LI>
<STRONG>FORCE_LA_CHECK:</STRONG>
This is a boolean option whose default value is false.
This option setting controls lookahead ambiguity checking performed
by JavaCC.  By default (when this option is false), lookahead
ambiguity checking is performed for all choice points where the
default lookahead of 1 is used.  Lookahead ambiguity checking is
not performed at choice points where there is an
<A HREF="lookahead.html">explicit lookahead specification</A>,
or if the option LOOKAHEAD is set to something other than 1.
Setting this option to true performs lookahead ambiguity checking
at <EM>all</EM> choice points regardless of the lookahead specifications
in the grammar file.
<LI>
<STRONG>COMMON_TOKEN_ACTION:</STRONG>
This is a boolean option whose default value is false.
When set to true, every call to the token manager's method
"getNextToken" (<A HREF="apiroutines.html">see the description of the
Java Compiler Compiler API</A>) will cause a call to a user-defined
method "CommonTokenAction" after the token has been scanned in by the
token manager.  The user must define this method within the
<A HREF="#prod12">TOKEN_MGR_DECLS</A> section.
The signature of this method is:
<P>
<PRE>
    void CommonTokenAction(Token t)
</PRE>
<P>
<LI>
<STRONG>CACHE_TOKENS:</STRONG>
This is a boolean option whose default value is false.
Setting this option to true causes the generated parser to look ahead for
extra tokens ahead of time.  This facilitates some performance improvements.
However, in this case (when the option is true), interactive
applications may not work since the parser needs to work synchronously
with the availability of tokens from the input stream.  In such cases,
it's best to leave this option at its default value.
<LI>
<STRONG>OUTPUT_DIRECTORY:</STRONG>
This is a string valued option whose default value is the current
directory.  This controls where output files are generated.
</UL>
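<P>
To illustrate COMMON_TOKEN_ACTION, the following sketch counts tokens as
they are scanned.  The field "tokenCount" is hypothetical, and STATIC is
set to false here on the assumption that the method can then be declared as
an ordinary instance method:
<P>
<PRE>
    options {
      STATIC = false;
      COMMON_TOKEN_ACTION = true;
    }

    PARSER_BEGIN(MyParser)
    public class MyParser {}
    PARSER_END(MyParser)

    TOKEN_MGR_DECLS :
    {
      int tokenCount = 0;   // hypothetical counter

      // Called after each token is scanned by getNextToken().
      void CommonTokenAction(Token t) {
        tokenCount++;
      }
    }
</PRE>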

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod5">production</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod9">javacode_production</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod10">regular_expr_production</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod11">bnf_production</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod12">token_manager_decls</A></TD>
</TR>
</TABLE>
<P>

There are four kinds of productions in JavaCC.
<A HREF="#prod9">javacode_production</A> and <A HREF="#prod11">bnf_production</A>
are used to define the grammar from which the parser is generated.
<A HREF="#prod10">regular_expr_production</A> is used to define the grammar
tokens - the token manager is generated from this information (as well as from
inline token specifications in the parser grammar).
<A HREF="#prod12">token_manager_decls</A> is used to introduce declarations
that get inserted into the generated token manager.
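<P>
The following fragment sketches one production of each kind; all names in
it are hypothetical:
<P>
<PRE>
    // token_manager_decls: declarations for the generated token manager
    TOKEN_MGR_DECLS :
    {
      int commentDepth = 0;
    }

    // regular_expr_production: defines a grammar token
    TOKEN :
    {
      &lt; ID : ( ["a"-"z"] )+ &gt;
    }

    // bnf_production: a non-terminal defined by an expansion
    void IdList() :
    {}
    {
      ( &lt;ID&gt; )+
    }

    // javacode_production: a non-terminal implemented in Java
    JAVACODE
    void SkipOneToken() {
      getNextToken();
    }
</PRE>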

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod9">javacode_production</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"<A NAME="JAVACODE">JAVACODE</A>"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_access_modifier</EM> <EM>java_return_type</EM> <EM>java_identifier</EM> "(" <EM>java_parameter_list</EM> ")"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_block</EM></TD>
</TR>
</TABLE>
<P>

The JAVACODE production is a way to write Java code for some
productions instead of the usual EBNF expansion.  This is useful when
there is the need to recognize something that is not context-free
or for whatever reason is very difficult to write a grammar for.
An example of the use of JAVACODE is shown below.  In this example,
the non-terminal "skip_to_matching_brace" consumes tokens in the input
stream all the way up to a matching closing brace (the opening brace
is assumed to have been just scanned):
<P>
<PRE>
    JAVACODE
    void skip_to_matching_brace() {
      <A HREF="apiroutines.html">Token</A> tok;
      int nesting = 1;
      while (true) {
        tok = <A HREF="apiroutines.html">getToken</A>(1);
        if (tok.kind == LBRACE) nesting++;
        if (tok.kind == RBRACE) {
          nesting--;
          if (nesting == 0) break;
        }
        tok = <A HREF="apiroutines.html">getNextToken</A>();
      }
    }
</PRE>
<P>
Care must be taken when using JAVACODE productions.  While you can
say pretty much what you want with these productions, JavaCC simply
considers it a black box (that somehow performs its parsing task).
This becomes a problem when JAVACODE productions appear at
<A HREF="lookahead.html">choice points</A>.  For example, if the
above JAVACODE production was referred to from the following production:
<P>
<PRE>
  void NT() :
  {}
  {
    skip_to_matching_brace()
  |
    some_other_production()
  }
</PRE>
<P>
Then JavaCC would not know how to choose between the two choices.
On the other hand, if the JAVACODE production is used at a non-choice
point as in the following example, there is no problem:
<P>
<PRE>
  void NT() :
  {}
  {
    "{" skip_to_matching_brace()
  |
    "(" parameter_list() ")"
  }
</PRE>
<P>
When JAVACODE productions are used at choice points, JavaCC will
print a warning message stating this fact.  You will then have to
insert some explicit LOOKAHEAD specifications to help JavaCC.  See
<A HREF="lookahead.html">the minitutorial on LOOKAHEAD</A> for a
detailed guide on such issues.
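<P>
As a rough sketch of such a specification (not taken from the JavaCC
examples), a semantic LOOKAHEAD can tell the parser when to enter the
JAVACODE production.  The boolean method "inside_block" is hypothetical;
it stands for whatever test your parser can make to decide that a "{" has
just been scanned:
<P>
<PRE>
  void NT() :
  {}
  {
    // inside_block() is an assumed helper declared in the parser class.
    LOOKAHEAD( { inside_block() } ) skip_to_matching_brace()
  |
    some_other_production()
  }
</PRE>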
<P>
The default access modifier for JAVACODE productions is package private.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod11">bnf_production</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_access_modifier</EM> <EM>java_return_type</EM> <EM>java_identifier</EM> "(" <EM>java_parameter_list</EM> ")" ":"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_block</EM></TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"{" <A HREF="#prod16">expansion_choices</A> "}"</TD>
</TR>
</TABLE>
<P>

The BNF production is the standard production used
in specifying JavaCC grammars.  Each BNF production has a left hand
side which is a non-terminal specification.  The BNF production then
defines this non-terminal in terms of BNF expansions on the right hand
side.  The non-terminal is written exactly like a declared Java method.
Since each non-terminal is translated into a method
in the generated parser, this style of writing the non-terminal makes
this association obvious.  The name of the non-terminal is the name of
the method, and the parameters and return value declared are the means
to pass values up and down the parse tree.  As will be seen later,
non-terminals on the right hand sides of productions are written as
method calls, so the passing of values up and down the tree is done
using exactly the same paradigm as method call and return.
The default access modifier for BNF productions is public.
<P>
There are two parts on the right hand side of a BNF production.  The
first part is a set of arbitrary Java declarations and code (the Java
block).  This code is generated at the beginning
of the method generated for the non-terminal.  Hence, every time
this non-terminal is used in the parsing process, these declarations and
code are executed.  The declarations in this part are visible to all Java
code in actions in the BNF expansions.  JavaCC does not do any processing
of these declarations and code, except to skip to the matching ending
brace, collecting all text encountered on the way.  Hence, errors in this
code are not reported by JavaCC; they are detected only later, when a Java
compiler processes the generated parser.
<P>
The second part of the right hand side is the BNF expansions.  These
are described <A HREF="#prod16">later</A>.
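<P>
The following sketch shows both parts in one production.  It assumes that a
token labeled NUMBER is defined elsewhere in the grammar; the production
name "Sum" is made up for illustration:
<P>
<PRE>
  int Sum() :
  {
    // First part: declarations visible to all actions below.
    Token t;
    int total;
  }
  {
    // Second part: the BNF expansions, with parser actions in braces.
    t = &lt;NUMBER&gt; { total = Integer.parseInt(t.image); }
    ( "+" t = &lt;NUMBER&gt; { total += Integer.parseInt(t.image); } )*
    { return total; }
  }
</PRE>
<P>
Another production could then write "result = Sum()" in its own expansion,
which is how a value travels up the parse tree.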

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod10">regular_expr_production</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>[ <A HREF="#newprod1">lexical_state_list</A> ]</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod17">regexpr_kind</A> [ "[" "IGNORE_CASE" "]" ] ":"</TD>
</TR>
<TR>
<TD></TD><TD></TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"{" <A HREF="#prod18">regexpr_spec</A> ( "|" <A HREF="#prod18">regexpr_spec</A> )* "}"</TD>
</TR>
</TABLE>
<P>

A regular expression production is used to define lexical entities
that get processed by the generated token manager.  A detailed description
of how the token manager works is provided in
<A HREF="tokenmanager.html">the minitutorial on the token manager</A>.  This
page describes the syntactic aspects of specifying lexical entities,
while <A HREF="tokenmanager.html">the minitutorial</A> describes how
these syntactic constructs tie in with how the token manager actually
works.
<P>
A regular expression production starts with a specification of the
lexical states for which it applies (the
<A HREF="#newprod1">lexical state list</A>).
There is a standard lexical state called "DEFAULT".  If the
<A HREF="#newprod1">lexical state list</A> is omitted, the regular
expression production applies to the lexical state "DEFAULT".
<P>
Following this is a description of what kind of regular expression
production this is (<A HREF="#prod17">see below for what this means</A>).
<P>
After this is an optional "[IGNORE_CASE]".  If this is present, the
regular expression production is case insensitive - it has the same
effect as the
<A HREF="#prod6">IGNORE_CASE</A>
option, except that in this case it applies locally to this regular
expression production.
<P>
This is then followed by a list of regular expression specifications
that describe in more detail the lexical entities of this regular
expression production.
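<P>
A small sketch of these parts put together is shown below.  The lexical
state name IN_QUERY and the token labels are invented for the example:
<P>
<PRE>
  &lt;DEFAULT, IN_QUERY&gt; TOKEN [IGNORE_CASE] :
  {
    &lt;SELECT: "select"&gt;
  | &lt;FROM: "from"&gt;
  }
</PRE>
<P>
Here the production applies in the lexical states DEFAULT and IN_QUERY, its
kind is TOKEN, and because of the "[IGNORE_CASE]" the inputs "SELECT" and
"Select" both match the regular expression labeled SELECT.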

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod12">token_manager_decls</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"<A NAME="TOKEN_MGR_DECLS">TOKEN_MGR_DECLS</A>" ":" <EM>java_block</EM></TD>
</TR>
</TABLE>
<P>

The token manager declarations start with the reserved word
"TOKEN_MGR_DECLS" followed by a ":" and then a set of Java declarations
and statements (the Java block).  These declarations and statements are
written into the generated token manager and are accessible from within
<A HREF="#prod18">lexical actions</A>.  See
<A HREF="tokenmanager.html">the minitutorial on the token manager</A>
for more details.
<P>
There can only be one token manager declaration in a JavaCC grammar file.
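<P>
As an illustrative sketch, the declaration below adds a counter to the
token manager and updates it from a lexical action.  The field name
"identifierCount" and the token IDENTIFIER are assumptions made for this
example.  The field is declared static here on the assumption that the
STATIC option is left at its default of true; the declaration also
compiles when STATIC is false:
<P>
<PRE>
  TOKEN_MGR_DECLS :
  {
    // Hypothetical counter, accessible from every lexical action.
    static int identifierCount = 0;
  }

  TOKEN :
  {
    &lt;IDENTIFIER: ["a"-"z","A"-"Z"] ( ["a"-"z","A"-"Z","0"-"9"] )* &gt;
      { identifierCount++; }
  }
</PRE>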

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="newprod1">lexical_state_list</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" "*" "&gt;"</TD>
</TR>
<TR>
<TD></TD><TD>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" <EM>java_identifier</EM> ( "," <EM>java_identifier</EM> )* "&gt;"</TD>
</TR>
</TABLE>
<P>

The lexical state list describes the set of lexical states to which
the corresponding <A HREF="#prod10">regular expression production</A>
applies.  If this is written as "&lt;*&gt;", the regular expression production
applies to all lexical states.  Otherwise, it applies to all the lexical
states in the identifier list within the angle brackets.
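<P>
The two forms might be used as in the following sketch, where the state
names IN_TAG and IN_ATTR and the label EQUALS are hypothetical:
<P>
<PRE>
  // Applies in every lexical state.
  &lt;*&gt; SKIP :
  {
    " " | "\t"
  }

  // Applies only in the states IN_TAG and IN_ATTR.
  &lt;IN_TAG, IN_ATTR&gt; TOKEN :
  {
    &lt;EQUALS: "="&gt;
  }
</PRE>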

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod17">regexpr_kind</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"TOKEN"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"SPECIAL_TOKEN"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"SKIP"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"MORE"</TD>
</TR>
</TABLE>
<P>

This specifies the kind of
<A HREF="#prod10">regular expression production</A>.
There are four kinds:
<P>
<UL>
<LI>
<STRONG><A NAME="TOKEN">TOKEN</A></STRONG>:
The regular expressions in this regular expression production describe
<EM>tokens</EM> in the grammar.  The token manager creates a
<A HREF="apiroutines.html">Token</A> object for each match of such
a regular expression and returns it to the parser.
<P>
<LI>
<STRONG><A NAME="SPECIAL_TOKEN">SPECIAL_TOKEN</A></STRONG>:
The regular expressions in this regular expression production describe
<EM>special tokens</EM>.  Special tokens are like tokens, except that
they do not have significance during parsing - that is, the BNF productions
ignore them.  Special tokens are, however, still passed on to the parser
so that parser actions can access them.  Special tokens are passed
to the parser by linking them to neighboring real tokens using the
field "specialToken" in the <A HREF="apiroutines.html">Token</A>
class.  Special tokens are useful in the processing of lexical entities
such as comments which have no significance to parsing, but still
are an important part of the input file.  See
<A HREF="tokenmanager.html">the minitutorial on the token manager</A>
for more details of special token handling.
<P>
<LI>
<STRONG><A NAME="SKIP">SKIP</A></STRONG>:
Matches to regular expressions in this regular expression production
are simply skipped (ignored) by the token manager.
<P>
<LI>
<STRONG><A NAME="MORE">MORE</A></STRONG>:
Sometimes it is useful to gradually build up a token to be passed on
to the parser.  Matches to this kind of regular expression are stored
in a buffer until the next TOKEN or SPECIAL_TOKEN match.  Then all
the matches in the buffer and the final TOKEN/SPECIAL_TOKEN match
are concatenated together to form one TOKEN/SPECIAL_TOKEN that is
passed on to the parser.  If a match to a SKIP regular expression
follows a sequence of MORE matches, the contents of the buffer are
discarded.
</UL>
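<P>
The four kinds often appear together when handling white space and
comments.  The following is a sketch along the lines of the comment handling
commonly used in Java-like grammars; the state name IN_COMMENT and the token
labels are chosen for this example:
<P>
<PRE>
  // White space is discarded by the token manager.
  SKIP :
  {
    " " | "\t" | "\n" | "\r"
  }

  // A single line comment is kept as a special token.
  SPECIAL_TOKEN :
  {
    &lt;SINGLE_LINE_COMMENT: "//" (~["\n","\r"])* &gt;
  }

  // "/*" starts building up a token and switches to IN_COMMENT.
  MORE :
  {
    "/*" : IN_COMMENT
  }

  // Every character inside the comment is appended to the buffer.
  &lt;IN_COMMENT&gt; MORE :
  {
    &lt; ~[] &gt;
  }

  // "*/" completes the special token and returns to DEFAULT.
  &lt;IN_COMMENT&gt; SPECIAL_TOKEN :
  {
    &lt;MULTI_LINE_COMMENT: "*/"&gt; : DEFAULT
  }
</PRE>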

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod18">regexpr_spec</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod19">regular_expression</A> [ <EM>java_block</EM> ] [ ":" <EM>java_identifier</EM> ]</TD>
</TR>
</TABLE>
<P>

The regular expression specification begins the actual description
of the lexical entities that are part of this
<A HREF="#prod10">regular expression production</A>.
Each regular expression production may contain any number of
regular expression specifications.
<P>
Each regular expression specification contains a regular expression
followed by a Java block (the lexical action) which is optional.
This is then followed by an identifier of a lexical state (which
is also optional).  Whenever this regular expression is matched,
the lexical action (if any) gets executed, followed by any
<A HREF="#prod6">common token actions</A>.  Then the action depending
on the
<A HREF="#prod17">regular expression production kind</A>
is taken.  Finally, if a lexical state is specified, the token
manager moves to that lexical state for further processing (the
token manager starts initially in the state "DEFAULT").
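<P>
A minimal sketch of a lexical action and of lexical state transitions is
given below.  The state IN_TAG and the token labels are invented for the
example; "matchedToken" and "image" are the variables that the token manager
makes available to lexical actions of TOKEN regular expressions:
<P>
<PRE>
  TOKEN :
  {
    // No action; after the match, move to the state IN_TAG.
    &lt;TAG_OPEN: "&lt;"&gt; : IN_TAG
  }

  &lt;IN_TAG&gt; TOKEN :
  {
    // Lexical action only: normalize the image of the matched token.
    &lt;TAG_NAME: (["a"-"z","A"-"Z"])+ &gt;
      { matchedToken.image = image.toString().toLowerCase(); }
  |
    // No action; after the match, return to the state DEFAULT.
    &lt;TAG_CLOSE: "&gt;"&gt; : DEFAULT
  }
</PRE>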

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod16">expansion_choices</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod20">expansion</A> ( "|" <A HREF="#prod20">expansion</A> )*</TD>
</TR>
</TABLE>
<P>

Expansion choices are written as a list of one or more expansions
separated by "|"s.  A legal parse of the expansion choices is a legal
parse of any one of the contained expansions.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod20">expansion</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod22">expansion_unit</A> )*</TD>
</TR>
</TABLE>
<P>

An expansion is written as a sequence of expansion units.
A concatenation of legal
parses of the expansion units is a legal parse of the expansion.
<P>
For example, the expansion "{" decls() "}" consists of three expansion
units - "{", decls(), and "}".  A match for the expansion is a concatenation
of the matches for the individual expansion units - in this case, that would
be any string that begins with a "{", ends with a "}", and contains a match
for decls() in between.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod22">expansion_unit</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod21">local_lookahead</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_block</EM></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"(" <A HREF="#prod16">expansion_choices</A> ")" [ "+" | "*" | "?" ]</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"[" <A HREF="#prod16">expansion_choices</A> "]"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>[ <EM>java_assignment_lhs</EM> "=" ] <A HREF="#prod19">regular_expression</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>[ <EM>java_assignment_lhs</EM> "=" ] <EM>java_identifier</EM> "(" <EM>java_expression_list</EM> ")"</TD>
</TR>
</TABLE>
<P>

An expansion unit can be a <A HREF="#prod21">local LOOKAHEAD specification</A>.
This instructs the
generated parser on how to make choices at choice points.  For details
on how LOOKAHEAD specifications work and how to write LOOKAHEAD specifications,
<A HREF="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</A>.
<P>
An expansion unit can be a set of Java declarations and code enclosed
within braces (the Java block).  These are also called <EM>parser
actions</EM>.  This is generated into the method parsing the
non-terminal at the appropriate location.  This block is executed
whenever the parsing process crosses this point successfully.
When JavaCC processes the Java block, it does not perform any detailed
syntax or semantic checking.  Hence it is possible that the Java compiler
will find errors in your actions that have been processed by JavaCC.
<EM>Actions are not executed during
<A HREF="lookahead.html">lookahead evaluation</A>.</EM>
<P>
An expansion unit can be a parenthesized set of one or more
<A HREF="#prod16">expansion choices</A>.  In this case, a legal parse of the expansion
unit is any legal parse of the nested expansion choices.
The parenthesized set of expansion choices can be suffixed (optionally) by:
<UL>
<LI>
<STRONG>"+":</STRONG>
Then any legal parse of the expansion unit is one or more
repetitions of a legal parse of the parenthesized set of
expansion choices.
<LI>
<STRONG>"*":</STRONG>
Then any legal parse of the expansion unit is zero or more
repetitions of a legal parse of the parenthesized set of
expansion choices.
<LI>
<STRONG>"?":</STRONG>
Then a legal parse of the expansion unit is either the
empty token sequence or any legal parse of the nested expansion choices.
An alternate syntax for this construct is to enclose the
expansion choices within brackets "[...]".
</UL>
<P>
An expansion unit can be a <A HREF="#prod19">regular expression</A>.  Then a legal parse
of the expansion unit is any token that matches this regular
expression.  When a regular expression is matched, it creates an
object of type <A HREF="apiroutines.html">Token</A>.  This object
can be accessed by assigning it to a variable by prefixing the
regular expression with "variable =".  In general, you may have any
valid Java assignment left-hand side to the left of the "=".
<EM>This assignment is not performed during
<A HREF="lookahead.html">lookahead evaluation</A>.</EM>
<P>
An expansion unit can be a non-terminal (the last choice in the syntax
above).  In this case, it takes
the form of a method call with the non-terminal name used as the
name of the method.  A successful parse of the non-terminal causes
the parameters placed in the method call to be operated on and a
value returned (unless the non-terminal was declared to be
of type "void").  The return value can be assigned (optionally) to
a variable by prefixing the non-terminal with "variable =".
In general, you may have any
valid Java assignment left-hand side to the left of the "=".
<EM>This assignment is not performed during
<A HREF="lookahead.html">lookahead evaluation</A>.</EM>
Non-terminals may not be used in an expansion in a manner that introduces
left-recursion.  JavaCC checks this for you.
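<P>
The different kinds of expansion unit can be seen side by side in the
following sketch.  It assumes a token labeled IDENTIFIER and a non-terminal
"int Expression()" defined elsewhere; the production itself is made up
for illustration:
<P>
<PRE>
  void Declaration() :
  {
    Token name;
    int value = 0;
  }
  {
    "let"                              // regular expression (string literal)
    name = &lt;IDENTIFIER&gt;                // regular expression, Token assigned
    [ "=" value = Expression() ]       // optional part; non-terminal with return value
    ( "," &lt;IDENTIFIER&gt; )*              // zero or more repetitions
    ";"
    { System.out.println(name.image + " = " + value); }   // parser action
  }
</PRE>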

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod21">local_lookahead</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"LOOKAHEAD" "(" [ <EM>java_integer_literal</EM> ] [ "," ] [ <A HREF="#prod16">expansion_choices</A> ] [ "," ] [ "{" <EM>java_expression</EM> "}" ] ")"</TD>
</TR>
</TABLE>
<P>

A local lookahead specification is used to influence the way the generated
parser makes choices at the various
<A HREF="lookahead.html">choice points</A>
in the grammar.  A local lookahead specification starts with the reserved
word "LOOKAHEAD" followed by a set of lookahead constraints within parentheses.
There are three different kinds of lookahead constraints - a lookahead limit
(the integer literal), a syntactic lookahead (the expansion choices), and
a semantic lookahead (the expression within braces).  At least one lookahead
constraint must be present.  If more than one lookahead constraint is present,
they must be separated by commas.
<P>
For a detailed description of how lookahead works, please
<A HREF="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</A>.
A brief description of each kind of lookahead constraint is given below:
<P>
<UL>
<LI>
<STRONG>Lookahead Limit:</STRONG>
This is the maximum number of tokens of lookahead that may be used for choice
determination purposes.  This overrides the default value which is specified
by the <A HREF="#prod2">LOOKAHEAD option</A>.  This lookahead limit applies
only to the <A HREF="lookahead.html">choice point</A>
at the location of the local lookahead specification.
If the local lookahead specification is not at a choice point, the lookahead
limit (if any) is ignored.
<P>
<LI>
<STRONG>Syntactic Lookahead:</STRONG>
This is an expansion (or expansion choices) that is used for the purpose of
determining whether or not the particular choice that this local lookahead
specification applies to is to be taken.  If this is not provided, the parser
uses the expansion that is to be selected as the syntactic lookahead.
If the local lookahead specification is not at a
<A HREF="lookahead.html">choice point</A>, the syntactic
lookahead (if any) is ignored.
<P>
<LI>
<STRONG>Semantic Lookahead:</STRONG>
This is a boolean expression that is evaluated whenever the parser crosses this
point during parsing.  If the expression evaluates to true, the parsing
continues normally.  If the expression evaluates to false and the local
lookahead specification is at a <A HREF="lookahead.html">choice point</A>,
the current choice is not taken and the next choice is considered.
If the expression evaluates to false and the local lookahead specification
is <EM>not</EM> at a choice point, then parsing aborts with a parse error.
Unlike the other two lookahead constraints that are ignored at non-choice
points, semantic lookahead is always evaluated.  In fact, semantic lookahead
is even evaluated if it is encountered during the evaluation of some other
syntactic lookahead check (for more details
<A HREF="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</A>).
</UL>
<P>
<STRONG>Default values for lookahead constraints:</STRONG>
If a local lookahead specification has been provided, but not all lookahead
constraints have been included, then the missing ones are assigned default
values as follows:
<P>
<UL>
<LI>
If the lookahead limit is not provided and if the syntactic lookahead is
provided, then the lookahead limit defaults to the largest integer value
(2147483647).  This essentially implements "infinite lookahead" - namely,
look ahead as many tokens as necessary to match the syntactic lookahead that
has been provided.
<P>
<LI>
If neither the lookahead limit nor the syntactic lookahead has been
provided (which means the semantic lookahead is provided), the lookahead
limit defaults to 0.  This means that syntactic lookahead is not performed
(it passes trivially), and only semantic lookahead is performed.
<P>
<LI>
If the syntactic lookahead is not provided, it defaults to the choice
to which the local lookahead specification applies.  If the local lookahead
specification is not at a choice point, then the syntactic lookahead is
ignored - hence a default value is not relevant.
<P>
<LI>
If the semantic lookahead is not provided, it defaults to the boolean
expression "true".  That is, it trivially passes.
</UL>
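<P>
The three kinds of constraint and their defaults can be sketched as follows.
All non-terminal names and the token IDENTIFIER are assumptions made for
this example:
<P>
<PRE>
  void Statement() :
  {}
  {
    // Lookahead limit only: look at most 2 tokens ahead.
    LOOKAHEAD(2) LabeledStatement()
  |
    // Syntactic lookahead only: the limit defaults to 2147483647.
    LOOKAHEAD( Type() &lt;IDENTIFIER&gt; ) LocalVariableDeclaration()
  |
    // Semantic lookahead only: the limit defaults to 0.
    LOOKAHEAD( { getToken(1).image.equals("assert") } ) AssertStatement()
  |
    // No local lookahead: the global LOOKAHEAD option applies.
    Expression() ";"
  }
</PRE>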
<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod19">regular_expression</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" [ [ "#" ] <EM>java_identifier</EM> ":" ] <A HREF="#prod29">complex_regular_expression_choices</A> "&gt;"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" <EM>java_identifier</EM> "&gt;"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" "EOF" "&gt;"</TD>
</TR>
</TABLE>
<P>

There are two places in a grammar file where regular expressions may be
written:
<UL>
<LI>
Within a <A HREF="#prod18">regular expression specification</A>
(part of a <A HREF="#prod10">regular expression production</A>),
<P>
<LI>
As an <A HREF="#prod22">expansion unit</A> within an <A HREF="#prod20">expansion</A>.
When a regular expression is used in this manner, it is as if the regular expression
were defined in the following manner at this location and then referred to by its
label from the expansion unit:
<P>
<PRE>
    &lt;DEFAULT&gt; TOKEN :
    {
      regular expression
    }
</PRE>
<P>
That is, this usage of regular expression can be rewritten using the other
kind of usage.
</UL>
<P>
The complete details of regular expression matching by the token manager are
available in
<A HREF="tokenmanager.html">the minitutorial on the token manager</A>.  The
description of the syntactic constructs follows.
<P>
The first kind of regular expression is a string literal.  The input being
parsed matches this regular expression if the token manager is in a
<A HREF="#prod10">lexical state</A> for which this regular expression applies
and the next set of characters in the input stream is the same (possibly with
case ignored) as this string literal.
<P>
A regular expression may also be a more <A HREF="#prod29">complex regular expression</A>,
with which more involved regular expressions (than string literals) can be defined.
Such a regular expression is placed within angle brackets "&lt;...&gt;", and
may be labeled optionally with an identifier.  This label may be used to refer
to this regular expression from
<A HREF="#prod22">expansion units</A>
or from within other regular expressions.
If the label is preceded by a "#", then this regular expression may not be
referred to from expansion units, but only from within other regular expressions.
When the "#" is present, the regular expression is referred to as a
"private regular expression".
<P>
A regular expression may be a reference to some other labeled regular expression,
in which case it is written as the label enclosed in angle brackets "&lt;...&gt;".
<P>
Finally, a regular expression may be a reference to the predefined regular
expression "&lt;EOF&gt;" which is matched by the end of file.
<P>
Private regular expressions are not matched as tokens by the token manager.
Their purpose is solely to facilitate the definition of other more complex
regular expressions.
<P>
Consider the following example defining Java floating point literals:
<P>
<PRE>
TOKEN :
{
  &lt; FLOATING_POINT_LITERAL:
        (["0"-"9"])+ "." (["0"-"9"])* (&lt;EXPONENT&gt;)? (["f","F","d","D"])?
      | "." (["0"-"9"])+ (&lt;EXPONENT&gt;)? (["f","F","d","D"])?
      | (["0"-"9"])+ &lt;EXPONENT&gt; (["f","F","d","D"])?
      | (["0"-"9"])+ (&lt;EXPONENT&gt;)? ["f","F","d","D"]
  &gt;
|
  &lt; #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ &gt;
}
</PRE>
<P>
In this example, the token FLOATING_POINT_LITERAL is defined using the
definition of another token, namely, EXPONENT.  The "#" before the label
EXPONENT indicates that this exists solely for the purpose of defining other
tokens (FLOATING_POINT_LITERAL in this case).  The definition of
FLOATING_POINT_LITERAL is not affected by the presence or absence of the "#".
However, the token manager's behavior is.  If the "#" is omitted, the
token manager will
erroneously recognize a string like E123 as a legal token of kind EXPONENT
(instead of IDENTIFIER in the Java grammar).
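<P>
The four forms of regular expression described above might appear together
as in the following sketch.  The labels and the production "Input" are
invented for the example:
<P>
<PRE>
  TOKEN :
  {
    &lt;IF: "if"&gt;                     // labeled regular expression
  | &lt;NUMBER: ( &lt;DIGIT&gt; )+ &gt;        // refers to the private DIGIT
  | &lt;#DIGIT: ["0"-"9"] &gt;           // private: never matched as a token
  }

  void Input() :
  {}
  {
    // &lt;IF&gt; and &lt;NUMBER&gt; are references by label, ";" is a string
    // literal used directly in the expansion, and &lt;EOF&gt; is predefined.
    ( &lt;IF&gt; | &lt;NUMBER&gt; | ";" )* &lt;EOF&gt;
  }
</PRE>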

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod29">complex_regular_expression_choices</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod30">complex_regular_expression</A> ( "|" <A HREF="#prod30">complex_regular_expression</A> )*</TD>
</TR>
</TABLE>
<P>

Complex regular expression choices are made up of a list of one or more
<A HREF="#prod30">complex regular expressions</A> separated by "|"s.
A match for a complex regular expression choice is a match of any of its
constituent complex regular expressions.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod30">complex_regular_expression</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod31">complex_regular_expression_unit</A> )*</TD>
</TR>
</TABLE>
<P>

A complex regular expression is a sequence of complex regular expression units.
A match for a complex regular expression is a concatenation of matches to
the complex regular expression units.

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod31">complex_regular_expression_unit</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" <EM>java_identifier</EM> "&gt;"</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod32">character_list</A></TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>|</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>"(" <A HREF="#prod29">complex_regular_expression_choices</A> ")" [ "+" | "*" | "?" ]</TD>
</TR>
</TABLE>
<P>

A complex regular expression unit can be a string literal,  in which case
there is exactly one match for this unit, namely, the string literal itself.
<P>
A complex regular expression unit can be a reference to another regular
expression.  The other regular expression has to be labeled so that it
can be referenced.  The matches of this unit are all the matches of this
other regular expression.  Such references in regular expressions cannot
introduce loops in the dependency between tokens.
<P>
A complex regular expression unit can be a <A HREF="#prod32">character list</A>.
A character list is a way of defining a set of characters.  A match for this
kind of complex regular expression unit is any character that is allowed
by the character list.
<P>
A complex regular expression unit can be a parenthesized set of
complex regular expression choices.  In this case, a legal match of
the unit is any legal match of the nested choices.  The parenthesized
set of choices can be suffixed (optionally) by:
<UL>
<LI>
<STRONG>"+":</STRONG>
Then any legal match of the unit is one or more
repetitions of a legal match of the parenthesized set of
choices.
<LI>
<STRONG>"*":</STRONG>
Then any legal match of the unit is zero or more
repetitions of a legal match of the parenthesized set of
choices.
<LI>
<STRONG>"?":</STRONG>
Then a legal match of the unit is either the
empty string or any legal match of the nested choices.
</UL>
Note that unlike the BNF <A HREF="#prod20">expansions</A>,
the regular expression "[...]" is not equivalent
to the regular expression "(...)?".  This is because the [...]
construct is used to describe <A HREF="#prod32">character lists</A>
in regular expressions.
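<P>
The kinds of unit can be combined as in the following sketch (the labels are
chosen for illustration).  Note that the optional minus sign is written as
( "-" )? and not as ["-"], since the latter is a character list that matches
exactly one "-":
<P>
<PRE>
  TOKEN :
  {
    // References to other labeled regular expressions, with "*".
    &lt;IDENTIFIER: &lt;LETTER&gt; ( &lt;LETTER&gt; | &lt;DIGIT&gt; )* &gt;
  |
    // A string literal made optional with "?", then "+" repetition.
    &lt;SIGNED_INT: ( "-" )? ( &lt;DIGIT&gt; )+ &gt;
  |
    // Character lists.
    &lt;#LETTER: ["a"-"z","A"-"Z","_"] &gt;
  | &lt;#DIGIT: ["0"-"9"] &gt;
  }
</PRE>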

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod32">character_list</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE>[ "~" ] "[" [ <A HREF="#prod33">character_descriptor</A> ( "," <A HREF="#prod33">character_descriptor</A> )* ] "]"</TD>
</TR>
</TABLE>
<P>

A character list describes a set of characters.  A legal match for a
character list is any character in this set.  A character list is a list
of character descriptors separated by commas within square brackets.
Each character descriptor describes a single character or a range of characters
(see <A HREF="#prod33">character descriptor</A> below),
and this is added to the set of characters of the character
list.  If the character list is prefixed by the "~" symbol, the set of
characters it represents is any Unicode character not in the specified set.
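<P>
A few character lists are sketched below.  The labels are hypothetical, and
the three regular expressions are shown together only for illustration;
they overlap and would not normally appear in the same production:
<P>
<PRE>
  TOKEN :
  {
    // Three ranges: any hexadecimal digit.
    &lt;HEX_DIGIT: ["0"-"9","a"-"f","A"-"F"] &gt;
  |
    // Negated list: any character other than a line terminator.
    &lt;NOT_NEWLINE: ~["\n","\r"] &gt;
  |
    // Negated empty list: any character at all.
    &lt;ANY_CHAR: ~[] &gt;
  }
</PRE>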

<P>
<HR>
<P>

<TABLE>
<TR>
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod33">character_descriptor</A></TD>
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
<TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM> [ "-" <EM>java_string_literal</EM> ]</TD>
</TR>
</TABLE>
<P>

A character descriptor can be a single character string literal, in which
case it describes a singleton set containing that character; or it is
two single character string literals separated by a "-", in which case, it
describes the set of all characters in the range between and including these
two characters.

<P>

</BODY>
</HTML>