<HTML>
<HEAD><TITLE>THE Reference - Appendix 7 </TITLE></HEAD>
<BODY BGCOLOR="#F1EDD1" LINK = "#0000FF" VLINK = "#FF0022" ALINK = "#808000">
<CENTER> <img WIDTH="64" HEIGHT="64" HSPACE="20" SRC="the64.png" ALT="THE"> </CENTER>
<A NAME="APPENDIX7"></A>
<H2> APPENDIX 7 - REGULAR EXPRESSIONS IN THE </H2>
<HR>
This appendix contains details on regular expression usage in THE. THE uses regular expressions in two places: in targets for commands such as <A HREF = "comm.html#LOCATE">LOCATE</A> and <A HREF = "comm.html#ALL">ALL</A>, and in the pattern specifications of THE Language Definition files used for syntax highlighting.<BR>
<P>
THE uses the GNU Regular Expression Library to implement regular expressions. This library has several different regular expression syntaxes that can be used when specifying targets.<BR>
<P>
Note that pattern specifications used for syntax highlighting always use the EMACS regular expression syntax.<BR>
<P>
The following table lists the features of each of the regular expression syntaxes that can be set via the <A HREF = "commset.html#SETREGEXP">SET REGEXP</A> command. Each feature in the table is explained later.<BR>
<P>
This appendix is not intended to explain everything about regular expressions. If you want to find out more about GNU Regular Expressions, then view the on-line documentation at <a href="http://hessling-editor.sf.net/doc/regex/">http://hessling-editor.sf.net/doc/regex/</a> .<BR>
<P>
<CENTER><TABLE BORDER=1 CELLSPACING=1 CELLPADDING=2>
<TR><TH>Syntax</TH><TH>Features</TH></TR>
<TR><TD>EMACS<BR></TD><TD>None set<BR></TD></TR>
<TR><TD>AWK<BR><BR><BR><BR><BR><BR><BR></TD><TD>BACKSLASH_ESCAPE_IN_LISTS<BR>DOT_NOT_NULL<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_REFS<BR>NO_BACKSLASH_VBAR<BR>NO_EMPTY_RANGES<BR>UNMATCHED_RIGHT_PAREN_ORD<BR></TD></TR>
<TR><TD>POSIX_AWK<BR><BR><BR><BR><BR><BR><BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>CONTEXT_INDEP_ANCHORS<BR>CONTEXT_INDEP_OPS<BR>NO_BACKSLASH_BRACES<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_VBAR<BR>UNMATCHED_RIGHT_PAREN_ORD<BR>BACKSLASH_ESCAPE_IN_LISTS<BR></TD></TR>
<TR><TD>GREP<BR><BR><BR><BR><BR></TD><TD>BACKSLASH_PLUS_QM<BR>CHAR_CLASSES<BR>HAT_LISTS_NOT_NEWLINE<BR>INTERVALS<BR>NEWLINE_ALT<BR></TD></TR>
<TR><TD>EGREP<BR><BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>HAT_LISTS_NOT_NEWLINE<BR>NEWLINE_ALT<BR>CONTEXT_INDEP_ANCHORS<BR>CONTEXT_INDEP_OPS<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_VBAR<BR></TD></TR>
<TR><TD>POSIX_EGREP<BR><BR><BR><BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>HAT_LISTS_NOT_NEWLINE<BR>NEWLINE_ALT<BR>CONTEXT_INDEP_ANCHORS<BR>CONTEXT_INDEP_OPS<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_VBAR<BR>NO_BACKSLASH_BRACES<BR>INTERVALS<BR></TD></TR>
<TR><TD>SED<BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>BACKSLASH_PLUS_QM<BR></TD></TR>
<TR><TD>POSIX_BASIC<BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>BACKSLASH_PLUS_QM<BR></TD></TR>
<TR><TD>POSIX_MINIMAL_BASIC<BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>LIMITED_OPS<BR></TD></TR>
<TR><TD>POSIX_EXTENDED<BR><BR><BR><BR><BR><BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>CONTEXT_INDEP_ANCHORS<BR>CONTEXT_INDEP_OPS<BR>NO_BACKSLASH_BRACES<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_VBAR<BR>UNMATCHED_RIGHT_PAREN_ORD<BR></TD></TR>
<TR><TD>POSIX_MINIMAL_EXTENDED<BR><BR><BR><BR><BR><BR><BR><BR><BR><BR><BR><BR></TD><TD>CHAR_CLASSES<BR>DOT_NEWLINE<BR>DOT_NOT_NULL<BR>INTERVALS<BR>NO_EMPTY_RANGES<BR>CONTEXT_INDEP_ANCHORS<BR>CONTEXT_INVALID_OPS<BR>NO_BACKSLASH_BRACES<BR>NO_BACKSLASH_PARENS<BR>NO_BACKSLASH_REFS<BR>NO_BACKSLASH_VBAR<BR>UNMATCHED_RIGHT_PAREN_ORD<BR></TD></TR>
</TABLE></CENTER><P>
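The table above can also be read as a mapping from syntax name to feature set. The following is a minimal sketch in Python of that mapping, useful for comparing two syntaxes. The names SYNTAX_FEATURES, POSIX_COMMON, syntaxes_with and only_in_first are hypothetical and are not part of THE or of the GNU library.<BR>

```python
# Feature sets transcribed from the table above. All names defined here are
# illustrative only; they are not part of THE or of the GNU regex library.

# Five features shared by every POSIX_* syntax and by SED (a convenience
# grouping for this sketch).
POSIX_COMMON = {
    "CHAR_CLASSES", "DOT_NEWLINE", "DOT_NOT_NULL", "INTERVALS",
    "NO_EMPTY_RANGES",
}

# Features shared by EGREP and POSIX_EGREP.
EGREP_COMMON = {
    "CHAR_CLASSES", "HAT_LISTS_NOT_NEWLINE", "NEWLINE_ALT",
    "CONTEXT_INDEP_ANCHORS", "CONTEXT_INDEP_OPS",
    "NO_BACKSLASH_PARENS", "NO_BACKSLASH_VBAR",
}

SYNTAX_FEATURES = {
    "EMACS": set(),
    "AWK": {
        "BACKSLASH_ESCAPE_IN_LISTS", "DOT_NOT_NULL", "NO_BACKSLASH_PARENS",
        "NO_BACKSLASH_REFS", "NO_BACKSLASH_VBAR", "NO_EMPTY_RANGES",
        "UNMATCHED_RIGHT_PAREN_ORD",
    },
    "POSIX_AWK": POSIX_COMMON | {
        "CONTEXT_INDEP_ANCHORS", "CONTEXT_INDEP_OPS", "NO_BACKSLASH_BRACES",
        "NO_BACKSLASH_PARENS", "NO_BACKSLASH_VBAR",
        "UNMATCHED_RIGHT_PAREN_ORD", "BACKSLASH_ESCAPE_IN_LISTS",
    },
    "GREP": {
        "BACKSLASH_PLUS_QM", "CHAR_CLASSES", "HAT_LISTS_NOT_NEWLINE",
        "INTERVALS", "NEWLINE_ALT",
    },
    "EGREP": set(EGREP_COMMON),
    "POSIX_EGREP": EGREP_COMMON | {"NO_BACKSLASH_BRACES", "INTERVALS"},
    "SED": POSIX_COMMON | {"BACKSLASH_PLUS_QM"},
    "POSIX_BASIC": POSIX_COMMON | {"BACKSLASH_PLUS_QM"},
    "POSIX_MINIMAL_BASIC": POSIX_COMMON | {"LIMITED_OPS"},
    "POSIX_EXTENDED": POSIX_COMMON | {
        "CONTEXT_INDEP_ANCHORS", "CONTEXT_INDEP_OPS", "NO_BACKSLASH_BRACES",
        "NO_BACKSLASH_PARENS", "NO_BACKSLASH_VBAR",
        "UNMATCHED_RIGHT_PAREN_ORD",
    },
    "POSIX_MINIMAL_EXTENDED": POSIX_COMMON | {
        "CONTEXT_INDEP_ANCHORS", "CONTEXT_INVALID_OPS", "NO_BACKSLASH_BRACES",
        "NO_BACKSLASH_PARENS", "NO_BACKSLASH_REFS", "NO_BACKSLASH_VBAR",
        "UNMATCHED_RIGHT_PAREN_ORD",
    },
}


def syntaxes_with(feature):
    """Return the sorted names of all syntaxes that set the given feature."""
    return sorted(
        name for name, features in SYNTAX_FEATURES.items()
        if feature in features
    )


def only_in_first(first, second):
    """Return the features set by syntax `first` but not by syntax `second`."""
    return SYNTAX_FEATURES[first] - SYNTAX_FEATURES[second]
```

For example, syntaxes_with("LIMITED_OPS") names only POSIX_MINIMAL_BASIC, and only_in_first("POSIX_AWK", "POSIX_EXTENDED") contains only BACKSLASH_ESCAPE_IN_LISTS.<BR>
<P>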
<B> BACKSLASH_ESCAPE_IN_LISTS </B><P>
If this feature is not set, then \ inside a bracket expression is literal.<BR>
If set, then such a \ quotes the following character.<BR>
<P>
<B> BACKSLASH_PLUS_QM </B><P>
If this feature is not set, then + and ? are operators, and \+ and \? are literals.<BR>
If set, then \+ and \? are operators and + and ? are literals.<BR>
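<P>
As a rough illustration of the two readings, the sketch below uses Python's re module as a stand-in. This is an assumption made purely for demonstration: Python's re is not the GNU library that THE uses, but its syntax happens to behave as if BACKSLASH_PLUS_QM were not set.<BR>

```python
import re

# Python's re module is a stand-in here, not the GNU library used by THE.
# It behaves as if BACKSLASH_PLUS_QM were NOT set.

# Unescaped + and ? act as operators.
plus_is_operator = re.fullmatch(r"ab+", "abbb") is not None
qm_is_operator = re.fullmatch(r"ab?", "a") is not None

# Escaped \+ and \? match the literal characters.
escaped_plus_is_literal = re.fullmatch(r"ab\+", "ab+") is not None
escaped_qm_is_literal = re.fullmatch(r"ab\?", "ab?") is not None

# The escaped form no longer repeats the preceding character.
escaped_plus_does_not_repeat = re.fullmatch(r"ab\+", "abbb") is None
```

Under a syntax with BACKSLASH_PLUS_QM set, such as GREP or SED in the table above, the roles of the escaped and unescaped forms would be swapped.<BR>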
<P>
<B> CHAR_CLASSES </B><P>
If this feature is set, then character classes are supported. They are:<BR>
[:alpha:], [:upper:], [:lower:], [:digit:], [:alnum:], [:xdigit:], [:space:], [:print:], [:punct:], [:graph:], and [:cntrl:].<BR>
If not set, then character classes are not supported.<BR>
<P>
<B> CONTEXT_INDEP_ANCHORS </B><P>
If this feature is set, then ^ and $ are always anchors (outside bracket expressions, of course).<BR>
If this feature is not set, then it depends:<BR>
^ is an anchor if it is at the beginning of a regular expression or after an open-group or an alternation operator;<BR>
$ is an anchor if it is at the end of a regular expression, or before a close-group or an alternation operator.<BR>
<P>
This feature could be (re)combined with CONTEXT_INDEP_OPS, because POSIX draft 11.2 says that * etc. in leading positions is undefined.<BR>
<P>
<B> CONTEXT_INDEP_OPS </B><P>
If this feature is set, then special characters are always special regardless of where they are in the pattern.<BR>
If this feature is not set, then special characters are special only in some contexts; otherwise they are ordinary. Specifically, *, +, ? and intervals are special only when they do not appear at the beginning of the regular expression or immediately after an open-group or alternation operator.<BR>
<P>
<B> CONTEXT_INVALID_OPS </B><P>
If this feature is set, then *, +, ?, and { cannot be first in an RE or immediately after an alternation or begin-group operator.<BR>
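<P>
Python's re module, used here only as a stand-in for the GNU library, rejects leading repetition operators in much the same way as a syntax with CONTEXT_INVALID_OPS set. The helper name is_rejected is hypothetical.<BR>

```python
import re


def is_rejected(pattern):
    """Return True if Python's re module refuses to compile the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return True
    return False


# A repetition operator first in the expression is rejected.
leading_star = is_rejected("*a")
leading_plus = is_rejected("+a")
leading_qm = is_rejected("?a")

# So is one placed immediately after an alternation or begin-group operator.
after_alternation = is_rejected("a|*b")
after_open_group = is_rejected("(*a)")

# The same operator following an ordinary character is accepted.
trailing_star = is_rejected("a*")
```

Here every value except trailing_star is True.<BR>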
<P>
<B> DOT_NEWLINE </B><P>
If this feature is set, then . matches newline. If not set, then it does not. <P>
<B> DOT_NOT_NULL </B><P>
If this feature is set, then . does not match NUL. If not set, then it does. <P>
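The two dot features above can be approximated with Python's re module. This is a sketch under the assumption that Python is an acceptable stand-in; it does not call the GNU library that THE uses. The re.DOTALL flag plays the role of DOT_NEWLINE, and a nonmatching list is used to imitate DOT_NOT_NULL.<BR>

```python
import re

# Python's re module stands in for the GNU library (illustration only).
text_newline = "a\nb"
text_nul = "a\x00b"

# Default Python behaviour: . does not match newline,
# comparable to DOT_NEWLINE not set.
dot_skips_newline = re.fullmatch(r"a.b", text_newline) is None

# With re.DOTALL, . also matches newline, comparable to DOT_NEWLINE set.
dot_takes_newline = re.fullmatch(r"a.b", text_newline, re.DOTALL) is not None

# Python's . matches NUL, comparable to DOT_NOT_NULL not set.
dot_takes_nul = re.fullmatch(r"a.b", text_nul) is not None

# Imitating DOT_NOT_NULL set: a nonmatching list that excludes NUL
# (and newline) is used in place of the dot.
list_skips_nul = re.fullmatch(r"a[^\x00\n]b", text_nul) is None
```

All four values are True.<BR>
<P>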
<B> HAT_LISTS_NOT_NEWLINE </B><P>
If this feature is set, nonmatching lists [^...] do not match newline. If not set, they do.<BR>
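<P>
For comparison, Python's re module (a stand-in for illustration, not the GNU library used by THE) behaves as if HAT_LISTS_NOT_NEWLINE were not set. The set behaviour can be imitated by adding newline to the nonmatching list explicitly.<BR>

```python
import re

# Python's re module is used as a stand-in for the GNU library.
text = "a\nb"

# A nonmatching list matches newline in Python,
# comparable to HAT_LISTS_NOT_NEWLINE not set.
list_takes_newline = re.fullmatch(r"a[^x]b", text) is not None

# Listing newline explicitly imitates HAT_LISTS_NOT_NEWLINE set.
list_skips_newline = re.fullmatch(r"a[^x\n]b", text) is None

# Other characters are unaffected by the extra list member.
list_takes_other = re.fullmatch(r"a[^x\n]b", "a-b") is not None
```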
<P>
<B> INTERVALS </B><P>
If this feature is set, either \{...\} or {...} defines an interval, depending on NO_BACKSLASH_BRACES.<BR>
If not set, \{, \}, {, and } are literals.<BR>
<P>
<B> LIMITED_OPS </B><P>
If this feature is set, +, ? and | are not recognized as operators. If not set, they are.<BR>
<P>
<B> NEWLINE_ALT </B><P>
If this feature is set, newline is an alternation operator. If not set, newline is literal.<BR>
<P>
<B> NO_BACKSLASH_BRACES </B><P>
If this feature is set, then {...} defines an interval, and \{ and \} are literals. If not set, then \{...\} defines an interval.<BR>
<P>
<B> NO_BACKSLASH_PARENS </B><P>
If this feature is set, (...) defines a group, and \( and \) are literals. If not set, \(...\) defines a group, and ( and ) are literals.<BR>
<P>
<B> NO_BACKSLASH_REFS </B><P>
If this feature is set, then \&lt;digit&gt; matches &lt;digit&gt;. If not set, then \&lt;digit&gt; is a back-reference.<BR>
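<P>
A back-reference matches the same text that an earlier group matched. The sketch below uses Python's re module as a stand-in (an assumption for illustration; it is not the GNU library), which behaves as if NO_BACKSLASH_REFS were not set.<BR>

```python
import re

# Python's re module stands in for the GNU library here.
# In Python, a backslash followed by 1 is a back-reference to group 1,
# comparable to NO_BACKSLASH_REFS not set.

# The back-reference must repeat exactly what group 1 matched.
repeats_group = re.fullmatch(r"(ab)\1", "abab") is not None

# It does not match the digit character itself.
not_literal_digit = re.fullmatch(r"(ab)\1", "ab1") is None

# It does not match different text either.
not_other_text = re.fullmatch(r"(a.)\1", "abac") is None
```

With NO_BACKSLASH_REFS set, as in the AWK syntax in the table above, the same pattern would instead match the text ab followed by the digit 1.<BR>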
<P>
<B> NO_BACKSLASH_VBAR </B><P>
If this feature is set, then | is an alternation operator, and \| is literal. If not set, then \| is an alternation operator, and | is literal.<BR>
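<P>
Python's re module, used here only as a stand-in for the GNU library, happens to behave as if NO_BACKSLASH_BRACES, NO_BACKSLASH_PARENS and NO_BACKSLASH_VBAR were all set. The sketch shows the unescaped characters acting as operators and the escaped ones as literals.<BR>

```python
import re

# Python's re module stands in for the GNU library (illustration only).

# Unescaped parentheses group and an unescaped bar alternates.
match = re.fullmatch(r"(ab|cd)", "cd")
group_text = match.group(1) if match else None

# Escaped parentheses and an escaped bar match the literal characters.
literal_parens_and_bar = re.fullmatch(r"\(ab\|cd\)", "(ab|cd)") is not None

# Unescaped braces define an interval.
interval = re.fullmatch(r"a{2}", "aa") is not None

# Escaped braces match the literal characters.
literal_braces = re.fullmatch(r"a\{2\}", "a{2}") is not None
```

Under the EMACS syntax, which sets none of these features, the escaped forms in the second and fourth patterns would be the operators instead.<BR>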
<P>
<B> NO_EMPTY_RANGES </B><P>
If this feature is set, then an ending range point collating lower than the starting range point, as in [z-a], is invalid.<BR>
If not set, then when the ending range point collates lower than the starting range point, the range is ignored.<BR>
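<P>
Python's re module, a stand-in used for illustration rather than the GNU library, treats a reversed range such as [z-a] as an error, which is comparable to NO_EMPTY_RANGES being set. The helper name compiles is hypothetical.<BR>

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# A range whose end point sorts before its start point is rejected.
reversed_range_ok = compiles("[z-a]")

# The same range written in ascending order is accepted.
ascending_range_ok = compiles("[a-z]")
```

Here reversed_range_ok is False and ascending_range_ok is True.<BR>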
<P>
<B> UNMATCHED_RIGHT_PAREN_ORD </B><P>
If this feature is set, then an unmatched ) is ordinary. If not set, then an unmatched ) is invalid.<BR>
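<P>
Python's re module, again only a stand-in for the GNU library, rejects an unmatched ), which is comparable to UNMATCHED_RIGHT_PAREN_ORD not being set. Escaping the character gives the ordinary reading. The helper name compiles is hypothetical.<BR>

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# An unmatched right parenthesis is invalid in Python.
unmatched_ok = compiles("a)")

# Escaped, it is an ordinary character and matches itself.
escaped_matches = re.fullmatch(r"a\)", "a)") is not None
```

Here unmatched_ok is False and escaped_matches is True.<BR>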
<P>
<HR>
<ADDRESS>
The Hessling Editor is Copyright © <A HREF = "http://www.rexx.org/">Mark Hessling</A>, 1990-2016
&lt;<A HREF = "mailto:mark@rexx.org">mark@rexx.org</A>&gt;
<BR>Generated on: 7 Feb 2016
</ADDRESS><HR>
Return to <A HREF = "index.html#TOC"> Table of Contents </A><BR>
</BODY> </HTML>