<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>marc2ris</title><link rel="stylesheet" type="text/css" href="manual.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.78.1" /><link rel="home" href="index.html" title="RefDB handbook" /><link rel="up" href="ch14.html#idp69978912" title="Tools" /><link rel="prev" href="re14.html" title="en2ris" /><link rel="next" href="re16.html" title="med2ris" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">marc2ris</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="re14.html">Prev</a> </td><th width="60%" align="center">Tools</th><td width="20%" align="right"> <a accesskey="n" href="re16.html">Next</a></td></tr></table><hr /></div><div class="refentry"><a id="refentry-marc2ris"></a><div class="titlepage"></div><div class="refnamediv"><a id="marc2ris-name"></a><h2>Name</h2><p>marc2ris — converts MARC bibliographic data to the RIS format</p></div><div class="refsynopsisdiv"><a id="marc2ris-synopsis"></a><h2>Synopsis</h2><div class="cmdsynopsis"><p><code class="command">marc2ris</code> [-e <em class="replaceable"><code>log-destination</code></em>] [-h ] [-l <em class="replaceable"><code>log-level</code></em>] [-L <em class="replaceable"><code>log-file</code></em>] [-m ] [-o <em class="replaceable"><code>outfile</code></em>] [-O <em class="replaceable"><code>outfile</code></em>] [-t <em class="replaceable"><code>input_type</code></em>] [-u <em class="replaceable"><code>t|f</code></em>] <em class="replaceable"><code>file</code></em> </p></div></div><div class="refsect1"><a id="marc2ris-description"></a><h2>Description</h2><p>marc2ris attempts to extract the information useful to RefDB from <acronym class="acronym">MARC</acronym> datasets. 
<acronym class="acronym">MARC</acronym> (MAchine-Readable Cataloging) is a standard that originated in the 1960s and is widely used by libraries and bibliographic agencies. Most libraries that offer Z39.50 access can provide records in at least one <acronym class="acronym">MARC</acronym> format (as with most other "standards", there are several to choose from). The following <acronym class="acronym">MARC</acronym> dialects are currently supported:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="emphasis"><em>MARC21</em></span></span></dt><dd><p>This is an attempt to consolidate existing MARC variants (mainly USMARC and CANMARC) and will most likely be the format supported by all libraries in the near future. The format is described on the <a class="ulink" href="http://www.loc.gov/marc/" target="_top">Library of Congress MARC pages</a>.</p></dd><dt><span class="term"><span class="emphasis"><em>UNIMARC</em></span></span></dt><dd><p>This is the European equivalent of a standardization attempt. The specification is available on the <a class="ulink" href="http://www.ifla.org/VI/3/p1996-1/sec-uni.htm" target="_top">IFLA web site</a>.</p></dd><dt><span class="term"><span class="emphasis"><em>UKMARC</em></span></span></dt><dd><p>This format is fairly close to the USMARC variant and is mainly used by libraries in the United Kingdom and Ireland. Libraries supporting this format may switch to MARC21 in the future. 
Unfortunately there is no online description of this format, but this <a class="ulink" href="http://www.bl.uk/services/bibliographic/marcchange.pdf" target="_top">PDF document</a> describes the main differences between USMARC and UKMARC.</p></dd></dl></div></div><div class="refsect1"><a id="marc2ris-options"></a><h2>Options</h2><p>By default the script reads MARC21 data from stdin and sends RIS data to stdout.</p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="option">-e</code> <em class="replaceable"><code>log-destination</code></em></span></dt><dd><p>log-destination can have the values 0, 1, or 2, or the equivalent strings <span class="emphasis"><em>stderr</em></span>, <span class="emphasis"><em>syslog</em></span>, or <span class="emphasis"><em>file</em></span>, respectively. This value specifies where the log information goes. <code class="literal">0</code> (zero) means the messages are sent to stderr. They are immediately available on the screen but may interfere with command output. <code class="literal">1</code> sends the messages to the syslog facility. Keep in mind that syslog must be configured to accept log messages from user programs; see the syslog(8) man page for further information. Unix-like systems usually save these messages in <code class="filename">/var/log/user.log</code>. <code class="literal">2</code> sends the messages to a custom log file, which can be specified with the <code class="option">-L</code> option.</p></dd><dt><span class="term"><code class="option">-h</code></span></dt><dd><p>Displays the help and usage screen, then exits.</p></dd><dt><span class="term"><code class="option">-l</code> <em class="replaceable"><code>log-level</code></em></span></dt><dd><p>Specify the priority up to which events are logged. 
This is either a number between <code class="literal">0</code> and <code class="literal">7</code> or one of the strings <span class="emphasis"><em>emerg</em></span>, <span class="emphasis"><em>alert</em></span>, <span class="emphasis"><em>crit</em></span>, <span class="emphasis"><em>err</em></span>, <span class="emphasis"><em>warning</em></span>, <span class="emphasis"><em>notice</em></span>, <span class="emphasis"><em>info</em></span>, <span class="emphasis"><em>debug</em></span>, respectively (see also Log level definitions). <code class="option">-1</code> disables logging completely. A low log level like <code class="literal">0</code> means that only the most critical messages are logged. A higher log level means that less critical events are logged as well. <code class="literal">7</code> will include debug messages. The latter can be verbose and abundant, so you want to avoid this log level unless you need to track down problems.</p></dd><dt><span class="term"><code class="option">-L</code> <em class="replaceable"><code>log-file</code></em></span></dt><dd><p>Specify the full path to a log file that will receive the log messages. Typically this would be <code class="filename">/var/log/refdba</code>.</p></dd><dt><span class="term"><code class="option">-m</code></span></dt><dd><p>Switch on additional MARC output. The output data will be the RIS output interspersed with the source MARC data used to generate the output. This is useful to fix conversion errors manually.</p></dd><dt><span class="term"><code class="option">-o</code> <em class="replaceable"><code>file</code></em></span></dt><dd><p>Send output to <span class="emphasis"><em>file</em></span>. If <span class="emphasis"><em>file</em></span> exists, its contents will be overwritten.</p></dd><dt><span class="term"><code class="option">-O</code> <em class="replaceable"><code>file</code></em></span></dt><dd><p>Send output to <span class="emphasis"><em>file</em></span>. 
If <span class="emphasis"><em>file</em></span> exists, the output will be appended.</p></dd><dt><span class="term"><code class="option">-t</code> <em class="replaceable"><code>input_type</code></em></span></dt><dd><p>Specify the MARC input type. The default is <span class="emphasis"><em>MARC21</em></span>. Other available types are <span class="emphasis"><em>UNIMARC</em></span> and <span class="emphasis"><em>UKMARC</em></span>.</p></dd><dt><span class="term"><code class="option">-u <em class="replaceable"><code>t|f</code></em></code></span></dt><dd><p>Request Unicode output if set to "t" (this is the default). marc2ris attempts to convert the input data into Unicode (unless the dataset explicitly states that it already uses Unicode). If the conversion does not seem to work, set this to "f" as some MARC variants do not state the character encoding explicitly.</p></dd></dl></div></div><div class="refsect1"><a id="marc2ris-configuration"></a><h2>Configuration</h2><p><span class="command"><strong>marc2ris</strong></span> evaluates the file <code class="filename">marc2risrc</code> to initialize itself.</p><div class="table"><a id="idp74251952"></a><p class="title"><strong>Table 14.5. 
marc2risrc</strong></p><div class="table-contents"><table summary="marc2risrc" border="1"><colgroup><col /><col /><col /></colgroup><thead><tr><th>Variable</th><th>Default</th><th>Comment</th></tr></thead><tbody><tr><td>outfile</td><td>(none)</td><td>The default output file name.</td></tr><tr><td>outappend</td><td>t</td><td>Determines whether output is appended (<em class="replaceable"><code>t</code></em>) to an existing file or overwrites (<em class="replaceable"><code>f</code></em>) an existing file.</td></tr><tr><td>unmapped</td><td>t</td><td>If set to <em class="replaceable"><code>t</code></em>, unknown tags in the input data will be output following a &lt;unmapped&gt; tag; the resulting data can be inspected and then be sent through <span class="command"><strong>sed</strong></span> to strip off these additional lines. If set to <em class="replaceable"><code>f</code></em>, unknown tags will be gracefully ignored.</td></tr><tr><td>logfile</td><td>/var/log/med2ris.log</td><td>The full path of a custom log file. This is used only if logdest is set appropriately.</td></tr><tr><td>logdest</td><td>1</td><td>The destination of the log information. 0 = print to stderr; 1 = use the syslog facility; 2 = use a custom logfile. The latter needs a proper setting of logfile.</td></tr><tr><td>loglevel</td><td>6</td><td>The log level up to which messages will be sent. A low setting (0) allows only the most important messages, a high setting (7) allows all messages including debug messages. -1 means nothing will be logged.</td></tr></tbody></table></div></div><br class="table-break" /></div><div class="refsect1"><a id="marc2ris-data-processing"></a><h2>Data Processing</h2><p>The purpose of the MARC format is entirely different from the purpose of the RIS format, so you shouldn't be too surprised that the import of MARC data is somewhat rough at the edges. 
The filter apparently deals fine with quite a lot of datasets, but the following shortcomings are known (and more are likely to be discovered by the interested reader):</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Some fields, like 846, are currently ignored completely. This, of course, is bound to change.</p></li><li class="listitem"><p>Author names specified in the natural order, i.e. something like First Middle Last, are not normalized due to the problems with multiple middle or last names. Author names in the inverse order, i.e. something like Last, First Middle, are normalized correctly in most cases. Handling of non-European names is a matter of trial and error.</p></li><li class="listitem"><p>Character set handling is somewhat limited. Only the unaltered input character encoding or UTF-8 are available for the output data.</p></li></ul></div><p>That said, there is still some hope. The <code class="option">-m</code> command line option switches on additional MARC output. That is, the generated output will contain interspersed lines that show the contents of the original MARC fields used to generate the following RIS line or lines. For example, the following output snippet shows how <span class="command"><strong>marc2ris</strong></span> generated the author lines from the MARC input:</p><pre class="programlisting">&lt;marc&gt;empty author field (100)
&lt;marc&gt;:Author(Ind1): 1
&lt;marc&gt;:Author($a): Ershov, A. P.
&lt;marc&gt;:Author($b):
&lt;marc&gt;:Author($c):
&lt;marc&gt;:Author(Ind1): 1
&lt;marc&gt;:Author($a): Knuth, Donald Ervin,
&lt;marc&gt;:Author($b):
&lt;marc&gt;:Author($c):
AU - Ershov,A.P.
AU - Knuth,Donald Ervin</pre><p>If you feel marc2ris does not translate your data appropriately, the easiest way might be to use the <code class="option">-m</code> switch and redirect the output into a file. Then you can analyze the situation and fix the RIS lines as you see fit. Finally you can strip the MARC lines off with a command like:</p><pre class="screen"><code class="prompt">~$ </code>grep -v "&lt;marc&gt;" &lt; withmarc.ris &gt; womarc.ris</pre></div><div class="refsect1"><a id="marc2ris-files"></a><h2>Files</h2><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="filename">PREFIX/etc/refdb/marc2risrc</code></span></dt><dd><p>The global configuration file of marc2ris.</p></dd><dt><span class="term"><code class="filename">$HOME/.marc2risrc</code></span></dt><dd><p>The user configuration file of marc2ris.</p></dd></dl></div></div><div class="refsect1"><a id="marc2ris-see_also"></a><h2>See also</h2><p><span class="emphasis"><em>RefDB</em></span> (7),
<span class="emphasis"><em><a class="link" href="re12.html" title="bib2ris">bib2ris</a></em></span> (1),
<span class="emphasis"><em><a class="link" href="re13.html" title="db2ris">db2ris</a></em></span> (1),
<span class="emphasis"><em><a class="link" href="re14.html" title="en2ris">en2ris</a></em></span> (1),
<span class="emphasis"><em><a class="link" href="re16.html" title="med2ris">med2ris</a></em></span> (1).</p><p><span class="emphasis"><em>RefDB manual (local copy) </em></span> PREFIX/share/doc/refdb-&lt;version&gt;/refdb-manual/index.html</p><p><span class="emphasis"><em>RefDB manual (web) </em></span> &lt;<a class="ulink" href="http://refdb.sourceforge.net/manual/index.html" target="_top">http://refdb.sourceforge.net/manual/index.html</a>&gt;</p><p><span class="emphasis"><em>RefDB on the web </em></span> &lt;<a class="ulink" href="http://refdb.sourceforge.net/" target="_top">http://refdb.sourceforge.net/</a>&gt;</p></div><div class="refsect1"><a id="marc2ris-author"></a><h2>Author</h2><p>marc2ris was written by Markus Hoenicka &lt;markus@mhoenicka.de&gt;.</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="re14.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="ch14.html#idp69978912">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="re16.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">en2ris </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> med2ris</td></tr></table></div></body></html>