/usr/share/doc/lire/dev-manual/ch02s05.html is in lire-devel-doc 2:2.1.1-2.1.
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>The Meta-Data Methods</title><meta name="generator" content="DocBook XSL Stylesheets V1.75.2"><link rel="home" href="index.html" title="Lire Developer's Manual"><link rel="up" href="ch02.html" title="Chapter 2. Writing a New DLF Converter"><link rel="prev" href="ch02s04.html" title="Adding a Constructor"><link rel="next" href="ch02s06.html" title="Registering Your DLF Converter with the Lire Framework"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">The Meta-Data Methods</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ch02s04.html">Prev</a> </td><th width="60%" align="center">Chapter 2. 
Writing a New DLF Converter</th><td width="20%" align="right"> <a accesskey="n" href="ch02s06.html">Next</a></td></tr></table><hr></div><div class="section" title="The Meta-Data Methods"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id402617"></a>The Meta-Data Methods</h2></div></div></div><p>The <span class="interface">Lire::DlfConverter</span> interface
requires two kinds of methods. First, it requires methods
which provide information to the framework on your
converter. Second, it requires methods which will actually
implement the conversion process. It is the former that
this section documents.
</p><div class="section" title="The DLF Converter Name"><div class="titlepage"><div><div><h3 class="title"><a name="id402633"></a>The DLF Converter Name</h3></div></div></div><p>The method <code class="methodname">name()</code> should
return the name of our DLF converter. It is this name
that is passed to the <span class="command"><strong>lr_log2report</strong></span>
command. This name must be unique among all the registered
converters and it should be restricted to alphanumeric
characters (hyphens, periods and underscores can also be
used).
</p><p>We will name our converter
<code class="literal">common_syslog</code>:
</p><pre class="programlisting">
sub name {
return "common_syslog";
}
</pre><p>
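</p><p>The naming rule above can be checked mechanically. The following
is only a minimal sketch: <code class="function">is_valid_converter_name</code>
is a hypothetical helper that is not part of <span class="application">Lire</span>,
and it only covers the character restriction, not the uniqueness requirement:
</p><pre class="programlisting">
use strict;
use warnings;

# Hypothetical helper (not part of Lire): accepts only names made of
# alphanumeric characters, hyphens, periods and underscores.
sub is_valid_converter_name {
    my ( $name ) = @_;
    return 0 unless defined $name;
    return $name =~ /\A[A-Za-z0-9._-]+\z/ ? 1 : 0;
}

# Usage: the first name follows the rule, the second contains a space.
for my $candidate ( "common_syslog", "common syslog" ) {
    my $verdict = is_valid_converter_name( $candidate ) ? "ok" : "invalid";
    print "$candidate: $verdict\n";
}
</pre><p>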
</p></div><div class="section" title="Providing Information To Users"><div class="titlepage"><div><div><h3 class="title"><a name="id402670"></a>Providing Information To Users</h3></div></div></div><p>The next two required methods are used to give more
verbose information on your converter to the users. The
converter's <code class="methodname">title()</code> and
<code class="methodname">description()</code> can be used to
display information about your converter from the user
interface or to generate documentation.
</p><p>The <code class="methodname">title()</code> should simply
return a string:
</p><pre class="programlisting">
sub title {
return "Common Log Format embedded in Syslog DLF Converter";
}
</pre><p>
</p><p>The <code class="methodname">description()</code> method
should return a <span class="application">DocBook</span>
fragment describing your converter and the log formats it
supports. If you don't know
<span class="application">DocBook</span>, just restrict yourself
to using <code class="sgmltag-element">para</code> elements to make
paragraphs:
</p><pre class="programlisting">
sub description {
return <<EOD;
<para>This DLF Converter extracts web server's requests and error
information from a syslog file.
</para>
<para>The requests and errors should be logged under the
<literal>httpd</literal> program name. The errors are mapped to the
<type>syslog</type> schema, the requests are mapped to the
<type>www</type> schema.
</para>
<para>Syslog records from programs other than
<literal>httpd</literal> are ignored.
</para>
EOD
}
</pre><p>
</p></div><div class="section" title="Providing Information to the Framework"><div class="titlepage"><div><div><h3 class="title"><a name="id402727"></a>Providing Information to the Framework</h3></div></div></div><p>Two other meta-data methods are used by the framework
itself. The first one specifies which DLF schemas your
DLF converter converts to:
</p><pre class="programlisting">
sub schemas {
return ( "www", "syslog" );
}
</pre><p>
In our case, we are converting to the <span class="type">syslog</span>
and <span class="type">www</span> schemas. As stated in our
converter's description, we will map the web server's
error messages to the <span class="type">syslog</span> schema and the
request logs to the <span class="type">www</span> schema. Other
alternatives would have been to map only the request
information to the <span class="type">www</span> schema or to map all the
non-request records to the <span class="type">syslog</span> schema.
The rationale behind the current choice (besides this
being an example) is that it makes it convenient to process
one log file to obtain a report containing the requests
and errors from our web server. For that use case, it is
best to ignore the records unrelated to the web server.
</p><p>The other method affects how the conversion process
will be handled. <span class="application">Lire</span> offers two modes of conversion:
line-oriented and file-oriented. (Both are
described in the next section.) If your log file is
line-oriented (each line is one log record), like most log
files are, you should use the line-oriented conversion
mode:
</p><pre class="programlisting">
sub handle_log_lines {
return 1;
}
</pre><p>
</p></div><div class="section" title="The Conversion Methods"><div class="titlepage"><div><div><h3 class="title"><a name="id402797"></a>The Conversion Methods</h3></div></div></div><p>The actual conversion process is handled through three
methods: <code class="methodname">init_dlf_converter()</code>,
<code class="methodname">finish_conversion()</code> and either
<code class="methodname">process_log_file()</code> or
<code class="methodname">process_log_line()</code> depending on
the conversion mode (as determined by
<code class="methodname">handle_log_lines()</code>'s return value).
</p><div class="section" title="Conversion Initialization"><div class="titlepage"><div><div><h4 class="title"><a name="id402823"></a>Conversion Initialization</h4></div></div></div><p>The method
<code class="methodname">init_dlf_converter()</code> will be
called once before the log file is processed. It should
be used to initialize the state of your converter. Since
our DLF Converter needs neither initialization nor
configuration, the method is simply empty:
</p><pre class="programlisting">
sub init_dlf_converter {
my ( $self, $process ) = @_;
return;
}
</pre><p>
</p><p>The <code class="varname">$process</code> parameter, which is
passed to all the processing methods, is an instance of
<code class="classname">Lire::DlfConverterProcess</code>. This
is the object which drives the conversion process,
and it defines several methods which you will use in the
actual conversion process.
</p></div><div class="section" title="Conversion Finalization"><div class="titlepage"><div><div><h4 class="title"><a name="id402856"></a>Conversion Finalization</h4></div></div></div><p>The method
<code class="methodname">finish_conversion()</code> will be
called once after the log file has been completely
processed. This method is mostly of use to stateful
converters, that is, DLF converters which generate DLF
records from more than one line. Since this is not our
case, we simply leave the method empty:
</p><pre class="programlisting">
sub finish_conversion {
my ( $self, $process ) = @_;
return;
}
</pre><p>
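</p><p>To illustrate why a stateful converter needs
<code class="methodname">finish_conversion()</code>, here is a
self-contained sketch. Everything in it is an assumption made for the
illustration: <code class="classname">MockProcess</code> is a stand-in that
only mimics <code class="methodname">write_dlf()</code>,
<code class="classname">MultiLineConverter</code> and its
<code class="methodname">flush()</code> helper are hypothetical, and the
log format (continuation lines starting with whitespace) is invented:
</p><pre class="programlisting">
use strict;
use warnings;

# Hypothetical stand-in for the process object; it only records
# what is passed to write_dlf().
package MockProcess;
sub new {
    my ( $class ) = @_;
    return bless { written => [] }, $class;
}
sub write_dlf {
    my ( $self, $schema, $dlf ) = @_;
    push @{ $self->{written} }, [ $schema, $dlf ];
    return;
}

# Hypothetical stateful converter: a line starting with whitespace
# continues the message of the previous line.
package MultiLineConverter;
sub new {
    my ( $class ) = @_;
    return bless { pending => undef }, $class;
}
sub init_dlf_converter {
    my ( $self, $process ) = @_;
    $self->{pending} = undef;    # reset the state before each log file
    return;
}
sub process_log_line {
    my ( $self, $process, $line ) = @_;
    if ( $line =~ /\A\s+(.*)/ and defined $self->{pending} ) {
        $self->{pending}{message} .= " $1";
        return;
    }
    $self->flush( $process );    # the previous record is now complete
    $self->{pending} = { message => $line };
    return;
}
sub finish_conversion {
    my ( $self, $process ) = @_;
    $self->flush( $process );    # emit the record still being assembled
    return;
}
sub flush {
    my ( $self, $process ) = @_;
    return unless defined $self->{pending};
    $process->write_dlf( "syslog", $self->{pending} );
    $self->{pending} = undef;
    return;
}

package main;
my $process   = MockProcess->new;
my $converter = MultiLineConverter->new;
$converter->init_dlf_converter( $process );
$converter->process_log_line( $process, $_ )
    for ( "first", "  continued", "second" );
$converter->finish_conversion( $process );
print scalar @{ $process->{written} }, " records written\n";
</pre><p>Without the call in <code class="methodname">finish_conversion()</code>,
the last record would stay in the converter's state and never be written.
</p><p>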
</p></div><div class="section" title="The DLF Conversion Process"><div class="titlepage"><div><div><h4 class="title"><a name="id402880"></a>The DLF Conversion Process</h4></div></div></div><p>Whether you are using the file-oriented or
line-oriented conversion mode, the principles are the
same. You extract information from the log file and
create DLF records from it. Your DLF converter
communicates with the framework by calling methods on
the <code class="classname">Lire::DlfConverterProcess</code>
object which is passed as a parameter to your methods.
</p><p>Here is the complete code of our conversion method:
</p><pre class="programlisting">
use Lire::Apache qw/parse_common/;
sub process_log_line {
my ( $self, $process, $line ) = @_;
my $sys_rec = eval { $self->{syslog_parser}->parse( $line ) };
if ( $@ ) {
$process->error( $@, $line );
return;
} elsif ( $sys_rec->{process} ne 'httpd' ) {
$process->ignore_log_line( $line, "not an httpd record" );
return;
} else {
my $common_dlf = {};
eval { parse_common( $sys_rec->{content}, $common_dlf ) };
if ( $@ ) {
$sys_rec->{message} = $sys_rec->{content};
$process->write_dlf( "syslog", $sys_rec );
} else {
$process->write_dlf( "www", $common_dlf );
}
}
}
</pre><p>
</p><p>The first thing that should be noted is that in the
line-oriented conversion mode, the method
<code class="methodname">process_log_line()</code> will be
called once for each line in the log file.
</p><p>Secondly, the actual parsing of the line is done
using two functions: <code class="function">parse_common</code>
and <code class="classname">Lire::Syslog</code>'s
<code class="methodname">parse</code>. These simply
use regular expressions to extract the appropriate
information from the line and put it in a hash
reference. What is important is that they
already use the schema's field names as key names.
</p><p>Finally, there are four different methods which can be
called on the <code class="varname">$process</code> object to
report different kinds of information (our example uses three of them):
</p><div class="variablelist"><dl><dt><span class="term">Reporting Error</span></dt><dd><p>The example uses the
<code class="function">eval</code> statement to trap
errors during the syslog record parsing. If the
line cannot be parsed as a valid syslog record,
it is an error and it is reported through the
<code class="methodname">error()</code> method. The
first parameter is the error message and the
second one is the line to which the error is
associated. This last parameter is optional.
</p></dd><dt><span class="term">Ignoring Information</span></dt><dd><p>When the syslog event doesn't come from the
<span class="command"><strong>httpd</strong></span> process, we ignore the
line. Ignored lines are reported to the framework
by using the
<code class="methodname">ignore_log_line()</code>
method. The first parameter is the line which is
ignored. The second optional parameter gives the
reason why the line was ignored.
</p></dd><dt><span class="term">Creating DLF Records</span></dt><dd><p>Finally, DLF records are created by using
the <code class="methodname">write_dlf()</code> method.
Its first parameter is the schema to which the
DLF record complies. This schema must be one
that is listed by your converter's
<code class="methodname">schemas()</code> method. The
second parameter is the DLF data contained in a
hash reference. The DLF record will be created
by taking for each field in the schema the value
under the same name in the hash. (In the
<span class="type">syslog</span> schema, the field which
contains the actual log message is called
<em class="structfield"><code>message</code></em>; this is
why we
assign the <span class="property">content</span>
value to the <span class="property">message</span> key.)
Missing fields
or fields whose value is
<code class="literal">undef</code> will contain the
special <code class="literal">LR_NA</code> missing value
marker. Keys in the hash that don't map to a
schema's field are simply ignored.
</p><p>In our example, we distinguish between the
server's error message (mapped to the
<span class="type">syslog</span> schema) and the request
information (mapped to the <span class="type">www</span>
schema) based on whether
<code class="function">parse_common</code> succeeded in
parsing the line.
</p></dd><dt><span class="term">Saving Log Line</span></dt><dd><p>Another possibility, not shown in our
example, is to ask that the line be saved for
later processing. This is mostly of use to
converters which maintain state between lines. In
such cases, it often happens that
related lines are missing from the end of
the log file. You can then save the lines,
and they will automatically be seen by the next run
of your converter on the same DLF store. This
option is only available in the line-oriented
mode of conversion.
</p></dd></dl></div><p>
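</p><p>The mapping rules described under
<code class="methodname">write_dlf()</code> can be pictured with the
following sketch. It is not <span class="application">Lire</span>'s
implementation: <code class="function">build_dlf_record</code> is a
hypothetical function, the list of field names is assumed for the
example, and the string <code class="literal">"LR_NA"</code> merely
stands in for the missing value marker:
</p><pre class="programlisting">
use strict;
use warnings;

# Hypothetical illustration (not Lire's code): for each field of the
# schema, take the value stored under the same name in the hash.
# Missing or undef values become the "LR_NA" placeholder; keys which
# are not schema fields are never looked at, so they are ignored.
sub build_dlf_record {
    my ( $fields, $data ) = @_;
    return [ map { defined $data->{$_} ? $data->{$_} : "LR_NA" } @$fields ];
}

# Field names assumed for the example.
my @fields = qw( timestamp hostname process message );
my $record = build_dlf_record(
    \@fields,
    {
        timestamp => 1000000000,
        process   => "httpd",
        message   => "File not found",
        content   => "not a schema field",
    },
);
print join( " | ", @$record ), "\n";
</pre><p>Here <code class="literal">hostname</code> is absent from the hash and
is therefore filled with the placeholder, while the
<code class="literal">content</code> key is dropped.
</p><p>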
</p><div class="section" title="File-Oriented Conversion"><div class="titlepage"><div><div><h5 class="title"><a name="id403097"></a>File-Oriented Conversion</h5></div></div></div><p>The same principles apply when you are using the
file-oriented mode of conversion. This mode will
usually be used for binary log formats or formats which
aren't line-oriented, like XML.
</p><p>For demonstration purposes, the following code could be
added to transform our line-oriented converter into a
file-oriented one:
</p><pre class="programlisting">
sub handle_log_lines {
return 0;
}
sub process_log_file {
my ( $self, $process, $fh ) = @_;
my $line;
while ( defined( $line = <$fh> ) ) {
chomp $line;
$self->process_log_line( $process, $line );
}
}
</pre><p>
</p><p>The difference between the above code and using
the line-oriented mode is that the framework won't be
aware of the number of log lines processed and your
converter might have trouble when processing log
files which use a different line-ending convention
than the host you are running on. The bottom line is that
you should use the line-oriented conversion mode when
your log format is line-oriented.
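</p><p>The line-ending problem comes from <code class="function">chomp</code>,
which by default only removes a trailing newline and so leaves the
carriage return of a CRLF-terminated line in place. If you must use the
file-oriented mode on line-based data, one possible workaround is
sketched below; <code class="function">strip_line_ending</code> is a
hypothetical helper, not something provided by
<span class="application">Lire</span>:
</p><pre class="programlisting">
use strict;
use warnings;

# Hypothetical helper: removes a trailing LF or CRLF, whatever the
# line-ending convention of the host is.
sub strip_line_ending {
    my ( $line ) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Usage: the same request line with Unix, DOS and no line ending.
for my $raw ( "GET /\n", "GET /\r\n", "GET /" ) {
    print "[", strip_line_ending( $raw ), "]\n";
}
</pre><p>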
</p></div></div></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ch02s04.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="ch02.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="ch02s06.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Adding a Constructor </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Registering Your DLF Converter with the <span class="application">Lire</span> Framework</td></tr></table></div></body></html>