<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
<HEAD><TITLE>Hlins: Hyper-Link Insertions in HTML documents
Version 0.39</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<META name="GENERATOR" content="hevea 1.06">
</HEAD>
<BODY >
<!--HEVEA command line is: /usr/bin/hevea -->
<!--HTMLHEAD-->
<!--ENDHTML-->
<!--PREFIX <ARG ></ARG>-->
<!--CUT DEF section 1 -->
<H1 ALIGN=center>Hlins: Hyper-Link Insertions in HTML documents<BR>
Version 0.39</H1>
<H3 ALIGN=center><a href="http://www.lri.fr/~treinen">Ralf Treinen</a></H3>
<H3 ALIGN=center>May 1, 2003</H3>
<!--TOC section An Introductory Example-->
<H2><A NAME="htoc1">1</A> An Introductory Example</H2><!--SEC END -->
<EM>Hlins</EM> inserts into an <A HREF="http://www.w3.org/TR/html40">HTML</A>
document the url's (uniform resource locators) for certain names
(normally the names of people), according to a data base that associates
url's with names.<BR>
<BR>
First you have to create a data base that associates url's to names,
let's call it <TT>addresses</TT>:
<BLOCKQUOTE>
<PRE>
Donald Knuth = http://www-cs-staff.stanford.edu/...
Leslie Lamport = http://www.research.digital.com/...
</PRE>
</BLOCKQUOTE>
Suppose that you have an HTML document <TT>mytext.html</TT> that
contains text such as
<BLOCKQUOTE>
<PRE>
A milestone in the development of digital typesetting was the TeX
system developed by Stanford computer science professor Donald
Knuth, which was used by L. Lamport as a base to build the more user-friendly
(but less powerful) LaTeX system.
</PRE>
</BLOCKQUOTE>
Calling <TT>hlins -db addresses -o newtext.html mytext.html</TT> will
generate a file <TT>newtext.html</TT> that now contains the following
text
<BLOCKQUOTE>
<PRE>
A milestone in the development of digital typesetting was the TeX
system developed by Stanford computer science professor <a
href="http://www-cs-staff.stanford.edu/...">Donald Knuth</a>, which
was used by <a
href="http://www.research.digital.com/...">L. Lamport</a> as a base to
build the more user-friendly (but less powerful) LaTeX system.
</PRE></BLOCKQUOTE>
which will eventually be rendered by a browser as something like
<BLOCKQUOTE>
A milestone in the development of digital typesetting was the
TeX system developed by Stanford computer science professor
<A HREF="http://www-cs-staff.stanford.edu/%7eknuth/index.html">Donald
Knuth</A>, which was used by
<A HREF="http://www.research.digital.com/SRC/personal/Leslie_Lamport/home.html">L. Lamport</A>
as a base to build the more user-friendly (but less powerful) LaTeX
system.
</BLOCKQUOTE>
Note that the url insertion knows about abbreviating first names (as
for Leslie Lamport) and works over line breaks (as for Donald Knuth).<BR>
<BR>
<!--TOC section Usage-->
<H2><A NAME="htoc2">2</A> Usage</H2><!--SEC END -->
<PRE>
hlins [options] [inputfile]
</PRE>Hlins can be used in three different modes (see below).
The following general options exist:
<DL COMPACT=compact><DT>
<B><TT>-h</TT></B><B>, <TT>--help</TT></B><DD>
Show summary of options and exit.
<DT><B><TT>-v</TT></B><B>, <TT>--version</TT></B><DD>
Show version of program and exit.
<DT><B><TT>-q</TT></B><B>, <TT>--quiet</TT></B><DD>
Suppress diagnostic output.
<DT><B><TT>-db</TT></B><B>, <TT>--data-bases</TT></B><B> <I>files</I></B><B> ...</B><DD>
Use <I>files</I> as address data bases.
The string <I>files</I> is a blank-separated list of data base
files, which means that you have to protect the blanks from your shell
when using several data base files. Multiple <CODE>-db</CODE> options are
accepted.
Examples of usage strings in the
<I>csh</I> shell are
<PRE>
hlins -db myaddresses
hlins -db "friends groupmembers"
hlins -db friends -db groupmembers
</PRE>The last two invocations are equivalent.
</DL>
<!--TOC subsection Usage in filter mode-->
<H3><A NAME="htoc3">2.1</A> Usage in filter mode</H3><!--SEC END -->
In filter mode, hlins reads html from one source and writes to a
different target. Input is taken from the <I>inputfile</I> argument
if given, otherwise from <I>stdin</I>. Output goes by default to
<I>stdout</I>.
<DL COMPACT=compact><DT>
<B><TT>-o</TT></B><B>, <TT>--output-file</TT></B><B> <I>file</I></B><DD>
Write to <I>file</I> instead of standard output.
</DL>
<!--TOC subsection Usage in modify mode-->
<H3><A NAME="htoc4">2.2</A> Usage in modify mode</H3><!--SEC END -->
In modify mode, hlins modifies html files in place.
<DL COMPACT=compact><DT>
<B><TT>-m</TT></B><B>, <TT>--modify-files</TT></B><B> <I>files</I></B><B> ...</B><DD>
Modify the <I>files</I> in place.
<DT><B><TT>-R</TT></B><B>, <TT>--recursive</TT></B><DD>
Recursively descend into directories and operate on all files with
names ending in <TT>.html</TT>. Only effective with the
<TT>--modify</TT> option.<BR>
<BR>
For instance, ``<TT>hlins -db ... -m /WWW -R</TT>'' makes hlins
operate on your complete <TT>WWW</TT> tree.
<DT><B><TT>-td</TT></B><B>, <TT>--tempdir</TT></B><B> <I>dir</I></B><DD>
When doing in-place modifications of files use the directory
<I>dir</I> to create temporary files. Default is the value of the
<TT>TMPDIR</TT> environment variable, and <TT>/tmp</TT> if
<TT>TMPDIR</TT> is not set.
</DL>
<!--TOC subsection Usage in database list mode-->
<H3><A NAME="htoc5">2.3</A> Usage in database list mode</H3><!--SEC END -->
<DL COMPACT=compact><DT>
<B><TT>--db-to-html</TT></B><DD>
Lists the contents of the databases in html to standard output. This
can be handy to create an address book.
</DL>
<!--TOC section Secondary Effects on the HTML Text-->
<H2><A NAME="htoc6">3</A> Secondary Effects on the HTML Text</H2><!--SEC END -->
Hlins replaces special characters of HTML (such as <CODE>&eacute;</CODE> or
<CODE>&#233;</CODE>) by the corresponding ISO-8859-1 character, which is in
this case <CODE>é</CODE>. Hence, you can use Hlins without any database
argument to replace HTML special characters in an HTML document.<BR>
<BR>
In some cases, non-empty sequences of white space characters may be
replaced by one space. However, this happens only when the white space
is part of a prefix of some name in the data base. Anyway, this
replacement is irrelevant for the rendering of HTML documents.<BR>
<BR>
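The replacement of special characters described above can be sketched in a few lines of Python. This is only an illustration and not the code of hlins (which is written in Objective Caml). The function name <TT>normalise_entities</TT> is hypothetical, and the restriction to code points 160 to 255 (so that <CODE>&lt;</CODE> and <CODE>&amp;</CODE> stay untouched) is an assumption about what "special character" means here.

```python
import html
import re

# A character reference: named (&eacute;), decimal (&#233;) or hexadecimal (&#xE9;).
_REFERENCE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def normalise_entities(text):
    """Replace character references by the ISO-8859-1 character they denote."""

    def replace(match):
        char = html.unescape(match.group(0))
        # Assumption: only the upper half of ISO-8859-1 is replaced, so that
        # markup-relevant references such as &lt; and &amp; are left alone.
        if len(char) == 1 and 0xA0 <= ord(char) <= 0xFF:
            return char
        return match.group(0)

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    print(normalise_entities("Claude March&eacute; and Jean-Christophe Filli&#226;tre"))
```

References that the sketch does not recognise, or that fall outside the assumed range, are copied to the output unchanged.<BR>
<BR>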
<!--TOC section Address Data Bases-->
<H2><A NAME="htoc7">4</A> Address Data Bases</H2><!--SEC END -->
Every line of the file must be either a comment line or an address
specification. A comment-line is a line that either consists only of
white space, or that starts with the comment-symbol <CODE>#</CODE> (possibly
preceded by white space).<BR>
<BR>
An address specification consists of a name and a url that are
separated by the character <CODE>=</CODE> . Leading white space of the line
is ignored. In the name, the character <CODE>=</CODE> must be written as
<CODE>==</CODE>. <BR>
<BR>
Special characters in the name can be either written in HTML or as 8bit
characters. The number of spaces separating the words of a name is not
relevant.<BR>
<BR>
The syntax of the url is not checked.<BR>
<BR>
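The data base format described above is simple enough to read with a short scanner. The following Python sketch is an illustration under stated assumptions and not the parser used by hlins: the function name <TT>parse_line</TT> is hypothetical, the url is assumed to be everything after the first unescaped <CODE>=</CODE> with surrounding white space removed, and HTML special characters in the name are not normalised here.

```python
def parse_line(line):
    """Return (name, url) for an address specification, None for a comment line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # comment line: only white space, or starting with '#'
    name_chars = []
    i = 0
    while i < len(stripped):
        char = stripped[i]
        if char == "=":
            if stripped[i + 1 : i + 2] == "=":
                name_chars.append("=")  # '==' stands for a literal '=' in the name
                i += 2
                continue
            # A single '=' separates the name from the url.
            name = " ".join("".join(name_chars).split())  # spacing is not relevant
            url = stripped[i + 1 :].strip()  # the url syntax is not checked
            return name, url
        name_chars.append(char)
        i += 1
    raise ValueError("neither a comment nor an address specification: %r" % line)


if __name__ == "__main__":
    print(parse_line("Donald   Knuth = http://www-cs-staff.stanford.edu/..."))
```

A line without any separating <CODE>=</CODE> is rejected in this sketch; how hlins itself reports such a line is not specified above.<BR>
<BR>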
<!--TOC section Variants of Names-->
<H2><A NAME="htoc8">5</A> Variants of Names</H2><!--SEC END -->
Several variants of the names in the data base are recognized as
well. To find the variants of a name we first split it at white spaces
into <EM>components</EM>.
<UL><LI>
If a name consists of just one component then it has no variant
other than itself.
<LI>Otherwise, the variants of the name are obtained by considering
all possible combinations of variants of the components. The last
component is treated differently from the other components:
<UL><LI>
If the last component contains the symbol <CODE>-</CODE> then
the name without this <CODE>-</CODE> and everything behind is also
recognized. Hence, if you have an entry for <I>Egon Müller-Meier</I>
then <I>Egon Müller</I> is also recognized.
<LI>A component which is not the last component may be abbreviated,
unless it consists of only one letter or ends with a
dot. The abbreviation of such a component is its first letter followed by
a dot. A word starting with <TT>St</TT> has the further
abbreviation <TT>St</TT> followed by a dot, and a word starting with
<TT>Ch</TT> has the additional abbreviation <TT>Ch</TT> followed by a dot.
Composite first names are abbreviated in both components, hence
<TT>Marc-Stephane</TT> becomes <TT>M.-St.</TT> (but not, for instance,
<TT>M.-Stephane</TT>).
<LI>In any case generation of variants is suppressed if you write
the component in angle brackets like <TT><Marc-Stephane></TT>.
This mechanism is used in the <A HREF="hlins-doc.adr">data
base to produce this document</A>, to have matching of
<CODE>Objective Caml</CODE> but to avoid matching of <CODE>O. Caml</CODE>.
</UL>
</UL>
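The rules for variants can be made concrete with a small Python sketch. It is a reading of the rules above and not the code of hlins. All function names are hypothetical. Two points are assumptions: the last component is cut at its first <CODE>-</CODE>, and a composite first name yields every combination of abbreviated parts (so both <TT>M.-S.</TT> and <TT>M.-St.</TT>), but never a mix of abbreviated and unabbreviated parts.

```python
from itertools import product


def abbreviations(word):
    """Abbreviated forms of one word; a one-letter or dotted word stays as it is."""
    if len(word) <= 1 or word.endswith("."):
        return [word]
    forms = [word[0] + "."]
    for prefix in ("St", "Ch"):
        if word.startswith(prefix):
            forms.append(prefix + ".")
    return forms


def inner_variants(component):
    """Variants of a component that is not the last one."""
    parts = component.split("-")
    # Composite names are abbreviated in all parts at once, never in only some.
    short = ["-".join(combo) for combo in product(*(abbreviations(p) for p in parts))]
    return list(dict.fromkeys([component] + short))  # drop duplicates, keep order


def last_variants(component):
    """Variants of the last component: optionally drop '-' and what follows."""
    head, sep, _ = component.partition("-")  # assumption: cut at the first '-'
    return [component, head] if sep and head else [component]


def variants(name):
    """All variants of a data base name, the name itself first."""
    components = name.split()
    choices = []
    for index, comp in enumerate(components):
        if len(comp) > 2 and comp.startswith("<") and comp.endswith(">"):
            choices.append([comp[1:-1]])  # angle brackets suppress variants
        elif len(components) == 1:
            choices.append([comp])  # a one-component name has no variants
        elif index == len(components) - 1:
            choices.append(last_variants(comp))
        else:
            choices.append(inner_variants(comp))
    return [" ".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for variant in variants("Marc-Stephane Müller-Meier"):
        print(variant)
```

With this sketch, <TT>variants("Leslie Lamport")</TT> yields the name itself and <TT>L. Lamport</TT>, which is the abbreviation used in the introductory example.<BR>
<BR>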
<!--TOC section The Exact Rules of Searching Names-->
<H2><A NAME="htoc9">6</A> The Exact Rules of Searching Names</H2><!--SEC END -->
Names are searched starting from the beginning of the text. If there
are overlapping matches then the match starting at the earlier
position wins. For example, if the data base contains entries for
<CODE>Egon Meier</CODE> and for <CODE>Hans Egon Meier-Müller</CODE> then the second
one matches on input <CODE>Hans Egon Meier-Müller</CODE>. <BR>
<BR>
A match is extended to longer matches if possible. That is, if the
data base contains entries for <CODE>Hans Egon</CODE> and for
<CODE>Hans Egon Meier</CODE> then the second one matches on input
<CODE>Hans Egon Meier</CODE>.<BR>
<BR>
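The two search rules (the earliest start wins, and a match is extended as far as possible) can be imitated with a regular expression. The Python sketch below is an illustration and not the matching algorithm of hlins. The names <TT>build_pattern</TT> and <TT>find_names</TT> are hypothetical, the list of names is assumed to already contain all variants, and the requirement that a match is not embedded in a longer word is an assumption.

```python
import re


def build_pattern(names):
    """Compile one pattern that matches any of the given names."""
    normalised = {" ".join(n.split()) for n in names if n.strip()}
    # Longest names first: at a given start position the alternatives are
    # tried in order, so the longest possible match is preferred.
    ordered = sorted(normalised, key=len, reverse=True)
    # Words of a name may be separated by any white space, including line breaks.
    alternatives = [r"\s+".join(re.escape(w) for w in n.split()) for n in ordered]
    # Assumption: a match must not start or end inside a longer word.
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)")


def find_names(text, names):
    """Return (start, matched text) pairs, scanning from the beginning of the text."""
    if not any(n.strip() for n in names):
        return []
    # finditer scans left to right and never returns overlapping matches,
    # so of two overlapping candidates the one starting earlier wins.
    return [(m.start(), m.group(0)) for m in build_pattern(names).finditer(text)]


if __name__ == "__main__":
    print(find_names("Hans Egon Meier-Müller", ["Egon Meier", "Hans Egon Meier-Müller"]))
```

Both examples of this section behave as described when run through <TT>find_names</TT>: the entry starting earlier wins in the first one, and the longer entry wins in the second.<BR>
<BR>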
<!--TOC section The Exact Rules of URL Insertion-->
<H2><A NAME="htoc10">7</A> The Exact Rules of URL Insertion</H2><!--SEC END -->
Hlins does not touch any text between
<CODE><a ... href= ...></CODE> and <CODE></a></CODE>. Note that this applies only if
the <CODE><a></CODE> tag contains the <CODE>href</CODE> attribute, that is hlins
<EM>does</EM> look at text inside of <CODE><a name=...></CODE> and
<CODE></a></CODE>. As a consequence, hlins is idempotent: if you
apply hlins twice to a file (for instance using the <TT>--modify</TT> option)
you get the same result as with just one application. Hence, when you
extend your database, you can safely rerun hlins on your html
files.<BR>
<BR>
The replacement mechanism (including the normalisation of HTML special
characters) is bypassed for any text inside the following tags:
<UL><LI>
<CODE><head></CODE> ... <CODE></head></CODE>
<LI><CODE><samp></CODE> ... <CODE></samp></CODE>
<LI><CODE><kbd></CODE> ... <CODE></kbd></CODE>
<LI><CODE><pre></CODE> ... <CODE></pre></CODE>
<LI><CODE><div nohlins></CODE> ... <CODE></div></CODE>
</UL>
The rationale is that the first four tags of this list are intended to
mark some kind of verbatim text (see the
<A HREF="http://www.w3.org/TR/html401/">HTML 4.01 specification</A>). The
last one is an escape mechanism in case you have to overrule hlins'
mechanism. Text from the beginning of one of the start
tags to the first occurrence of the corresponding end tag is ignored.
As a consequence, nested tags of the same kind from the above list are
not handled correctly.<BR>
<BR>
Furthermore, text inside angle brackets <CODE><</CODE> and <CODE>></CODE> is not
processed by hlins. <BR>
<BR>
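The regions that hlins leaves alone can be approximated by a single regular expression. The following Python sketch only illustrates the rules above and is not how hlins is implemented. The names <TT>PROTECTED</TT> and <TT>split_regions</TT> are hypothetical, and tag names are assumed to be case-insensitive. Like the rules above, it stops at the first matching end tag and therefore does not handle nested tags of the same kind.

```python
import re

PROTECTED = re.compile(
    r"<a\b[^>]*\bhref\b[^>]*>.*?</a\s*>"           # anchors that carry href
    r"|<(head|samp|kbd|pre)\b[^>]*>.*?</\1\s*>"    # verbatim-like elements
    r"|<div\b[^>]*\bnohlins\b[^>]*>.*?</div\s*>"   # the explicit escape mechanism
    r"|<[^>]*>",                                   # any other tag by itself
    re.IGNORECASE | re.DOTALL,
)


def split_regions(html_text):
    """Yield (is_protected, chunk) pairs that cover the input in order."""
    position = 0
    for match in PROTECTED.finditer(html_text):
        if match.start() > position:
            yield False, html_text[position : match.start()]
        yield True, match.group(0)
        position = match.end()
    if position < len(html_text):
        yield False, html_text[position:]


if __name__ == "__main__":
    sample = '<a name="x">Donald Knuth</a> and <a href="u">L. Lamport</a>'
    for is_protected, chunk in split_regions(sample):
        print(is_protected, repr(chunk))
```

Only the chunks reported as unprotected would be searched for names. Since an inserted link is itself an anchor with <CODE>href</CODE>, a second pass skips it, which is the idempotence property mentioned above.<BR>
<BR>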
If there are several different url's for a string <I>foundname</I>
then the following rules apply to determine the url inserted:
<OL type=1><LI>
An address specification ``<I>name</I> = <I>url</I>'' where
<I>name</I> matches exactly (modulo white space and HTML special
characters) <I>foundname</I> has priority over a name specification
``<I>name</I> = <I>url</I>'' where <I>foundname</I> is an
abbreviation for <I>name</I>.
<LI>In the list obtained from the above priority rule, the first
match is taken.
</OL>
A warning is issued in case of a conflict, unless the <CODE>--quiet</CODE>
option has been given.<BR>
<BR>
For instance, your data base might contain something like
<BLOCKQUOTE>
<PRE>
Hans Meyer = http://address.for.full.name
H. Meyer = http://address.for.abbreviated.name
</PRE>
</BLOCKQUOTE>
On input <CODE>H. Meyer</CODE>, the second address specification is selected
(and a warning is issued).<BR>
<BR>
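The priority rules can be written down as a small selection function. This Python sketch is an illustration under assumptions and not the code of hlins: the names <TT>normalise</TT> and <TT>choose_url</TT> are hypothetical, the candidates are assumed to be all data base entries that match the found string (exactly or through an abbreviation) in data base order, and HTML special characters are assumed to be normalised already.

```python
def normalise(name):
    """Compare names modulo white space."""
    return " ".join(name.split())


def choose_url(foundname, candidates):
    """Pick the url for foundname from matching (name, url) pairs.

    Returns (url, conflict), where conflict tells whether several different
    urls were available, which is the situation that triggers a warning.
    """
    found = normalise(foundname)
    exact = [url for name, url in candidates if normalise(name) == found]
    abbreviated = [url for name, url in candidates if normalise(name) != found]
    ranked = exact + abbreviated  # rule 1: exact matches have priority
    if not ranked:
        return None, False
    conflict = len(set(ranked)) > 1
    return ranked[0], conflict  # rule 2: the first match in the list is taken


if __name__ == "__main__":
    entries = [
        ("Hans Meyer", "http://address.for.full.name"),
        ("H. Meyer", "http://address.for.abbreviated.name"),
    ]
    print(choose_url("H. Meyer", entries))
```

For the data base of the example above, the sketch selects the url of the second entry on input <CODE>H. Meyer</CODE> and reports a conflict.<BR>
<BR>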
<!--TOC section Implementation-->
<H2><A NAME="htoc11">8</A> Implementation</H2><!--SEC END -->
Hlins is written in <a href="http://caml.inria.fr/">Objective Caml</a>.<BR>
<BR>
<!--TOC section License and Installation-->
<H2><A NAME="htoc12">9</A> License and Installation</H2><!--SEC END -->
Hlins is covered by the <A HREF="LICENSE">Gnu General Public License</A>.
See the <A HREF="http://www.lsv.ens-cachan.fr/%7etreinen/hlins">Hlins home page</A> for
binary and source distributions.<BR>
<BR>
<!--TOC section Credits-->
<H2><A NAME="htoc13">10</A> Credits</H2><!--SEC END -->
Thanks to <a href="http://www.lri.fr/~marche">Claude Marché</a> and <a href="http://www.lri.fr/~filliatr">Jean-Christophe Filliâtre</a> for their
remarks and suggestions.<BR>
<BR>
<!--HTMLFOOT-->
<!--ENDHTML-->
<!--FOOTER-->
<HR SIZE=2>
<BLOCKQUOTE><EM>This document was translated from L<sup>A</sup>T<sub>E</sub>X by
</EM><A HREF="http://pauillac.inria.fr/~maranget/hevea/index.html"><EM>H<FONT SIZE=2><sup>E</sup></FONT>V<FONT SIZE=2><sup>E</sup></FONT>A</EM></A><EM>.
</EM></BLOCKQUOTE>
</BODY>
</HTML>