/usr/share/doc/refdb/refdb-manual/ch08s09.html is in refdb-doc 1.0.2-3ubuntu1.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Character encoding issues</title><link rel="stylesheet" type="text/css" href="manual.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="home" href="index.html" title="RefDB handbook" /><link rel="up" href="ch08.html" title="Chapter 8. Reference management" /><link rel="prev" href="ch08s08.html" title="Create periodical synonyms" /><link rel="next" href="ch08s10.html" title="Use pdfroot" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Character encoding issues</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ch08s08.html">Prev</a> </td><th width="60%" align="center">Chapter 8. Reference management</th><td width="20%" align="right"> <a accesskey="n" href="ch08s10.html">Next</a></td></tr></table><hr /></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="sect1-character-encoding"></a>Character encoding issues</h2></div></div></div><p>The 7-bit ASCII character set originally employed by PCs in the days of yore turned out to be insufficient for languages other than English. Reference data may require characters not included in the ASCII character set. The string sorting order may also follow different rules. RefDB supports national character sets as well as Unicode, which is effectively a superset of all national character sets. 
As a RefDB user and administrator you'll have to deal with character encoding issues at different levels.</p><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp65678848"></a>Character encodings of databases</h3></div></div></div><p>While it is possible to convert the data during import and export (see the following sections), it is still worthwhile to give some thought to the character encoding used by your reference databases. If possible, use an encoding that ensures a suitable string sorting order for your data. Choosing a proper encoding also avoids unnecessary character encoding conversions when importing or exporting data.</p><p>The available encodings are limited by your database engine:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">SQLite</span></dt><dd><p>SQLite currently supports only ISO-8859-1 (the default) and, as a compile-time option, UTF-8. If you install a binary package, it most likely uses ISO-8859-1.</p></dd><dt><span class="term">SQLite3</span></dt><dd><p>SQLite3 uses UTF-8 by default. UTF-16 is supported by the database engine, but not by the libdbi library which RefDB uses to access the engine.</p></dd><dt><span class="term">MySQL</span></dt><dd><p>This database engine supports a fairly large number of encodings, but versions prior to 4.1 allow only one encoding per server instance. That is, all databases have to use the same character encoding. Please see the <a class="ulink" href="http://www.mysql.org" target="_top">MySQL documentation</a> for the growing list of supported encodings.</p></dd><dt><span class="term">PostgreSQL</span></dt><dd><p>This database engine supports a variety of encodings as a per-database option. That is, each reference database may use a different encoding. 
Please see the <a class="ulink" href="http://www.postgresql.org" target="_top">PostgreSQL documentation</a> for a current list of supported encodings.</p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp65690880"></a>Character encodings of imported data</h3></div></div></div><p>We'll have to distinguish two different sorts of data:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">RIS</span></dt><dd><p>This plain-text format does not have a built-in way to declare the character encoding of the data. Instead you have to use the <code class="option">-E</code> option of the <a class="link" href="re11.html#app-c-command-addref" title="addref">addref</a> and <a class="link" href="re11.html#app-c-command-updateref" title="updateref">updateref</a> commands to specify the encoding if it is different from the default (UTF-8).</p><p>Please note that the import filters <a class="link" href="re16.html" title="med2ris">med2ris</a>, <a class="link" href="re14.html" title="en2ris">en2ris</a>, and, to a limited extent, also <a class="link" href="re15.html" title="marc2ris">marc2ris</a> support on-the-fly character encoding conversion.</p></dd><dt><span class="term">risx and xnote</span></dt><dd><p>These are XML formats that can use the XML way of declaring the encoding. This is done in the XML declaration, which is the first line of an XML file. Due to a limitation of the parser used for importing XML data, only four encodings are accepted by RefDB: UTF-8, UTF-16, ISO-8859-1, US-ASCII. 
If your data use a different encoding, use the <span class="command"><strong>iconv</strong></span> command line utility (usually a part of the libiconv package) to convert your data to one of the accepted encodings.</p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp65702192"></a>Character encodings of exported data</h3></div></div></div><p>By default, data are exported without a character conversion, i.e. the data will use whatever encoding the database uses. If you want the exported data in a different encoding, request it with the <code class="option">-E</code> option. This option is accepted by the <a class="link" href="re11.html#app-c-command-getref" title="getref">getref</a> and <a class="link" href="re11.html#app-c-command-getnote" title="getnote">getnote</a> commands of refdbc as well as by the <a class="link" href="ch15.html" title="Chapter 15. Tools for bibliographies">refdbib</a> client. You may request any encoding that your local libiconv installation supports. <span class="command"><strong>man 3 iconv</strong></span> or <span class="command"><strong>man iconv_open</strong></span> should give a clue as to which encodings are available.</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ch08s08.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="ch08.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="ch08s10.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Create periodical synonyms </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Use pdfroot</td></tr></table></div></body></html>