/usr/share/doc/festival-doc/html/Available-lexicons.html is in festival-doc 1:2.5.0-1.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Available lexicons (Festival Speech Synthesis System)</title>
<meta name="description" content="Available lexicons (Festival Speech Synthesis System)">
<meta name="keywords" content="Available lexicons (Festival Speech Synthesis System)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Index.html#Index" rel="index" title="Index">
<link href="Index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Lexicons.html#Lexicons" rel="up" title="Lexicons">
<link href="Post_002dlexical-rules.html#Post_002dlexical-rules" rel="next" title="Post-lexical rules">
<link href="Lexicon-requirements.html#Lexicon-requirements" rel="prev" title="Lexicon requirements">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="Available-lexicons"></a>
<div class="header">
<p>
Next: <a href="Post_002dlexical-rules.html#Post_002dlexical-rules" accesskey="n" rel="next">Post-lexical rules</a>, Previous: <a href="Lexicon-requirements.html#Lexicon-requirements" accesskey="p" rel="prev">Lexicon requirements</a>, Up: <a href="Lexicons.html#Lexicons" accesskey="u" rel="up">Lexicons</a> [<a href="Index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html#Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Available-lexicons-1"></a>
<h3 class="section">13.7 Available lexicons</h3>
<a name="index-lexicon-2"></a>
<p>Festival currently supports a number of different lexicons. They are
all defined in the file <samp>lib/lexicons.scm</samp>, each with a number of
common extra words added to its addenda. They are:
</p><dl compact="compact">
<dt>‘<samp>CUVOALD</samp>’</dt>
<dd><a name="index-CUVOALD-lexicon"></a>
<a name="index-Oxford-Advanced-Learners_0027-Dictionary"></a>
<p>The Computer Usable Version of the Oxford Advanced Learner’s Dictionary is
available from the Oxford Text Archive
<a href="ftp://ota.ox.ac.uk/pub/ota/public/dicts/710">ftp://ota.ox.ac.uk/pub/ota/public/dicts/710</a>. It contains about
70,000 entries and is a part of the BEEP lexicon. It is more consistent
than BEEP as a whole in its marking of stress, though its syllable marking is
not what works best for our synthesis methods. Its many syllabic
‘<samp>l</samp>’s, ‘<samp>n</samp>’s, and ‘<samp>m</samp>’s upset the
syllabification algorithm, so results sometimes sound over-reduced. It is,
however, our current default lexicon. It is also the only lexicon with
part-of-speech tags that can be distributed (for non-commercial use).
</p></dd>
<dt>‘<samp>CMU</samp>’</dt>
<dd><a name="index-CMU-lexicon-1"></a>
<p>This is automatically constructed from <samp>cmu_dict-0.4</samp>, available
from many places on the net (see the <code>comp.speech</code> archives). It does
not use the ‘<samp>mrpa</samp>’ phoneset because its pronunciations are
American English. Although mappings exist between its phoneset
(‘<samp>darpa</samp>’) and ‘<samp>mrpa</samp>’, the results for British
English speakers are not very good. However, this is probably the biggest,
most carefully specified lexicon available. It contains just under 100,000
entries. Our distribution has been modified to include part-of-speech tags on
words we know to be homographs.
</p></dd>
<dt>‘<samp>mrpa</samp>’</dt>
<dd><a name="index-mrpa-lexicon"></a>
<p>A version of the CSTR lexicon that has been in circulation for years.
It contains about 25,000 entries. A new, updated, free version of
this is due to be released soon.
</p></dd>
<dt>‘<samp>BEEP</samp>’</dt>
<dd><a name="index-BEEP-lexicon-1"></a>
<p>A British English rival for the <samp>cmu_lex</samp>. BEEP was made
available by Tony Robinson at Cambridge and can be found in many
archives. It contains 163,000 entries and has been converted to the
‘<samp>mrpa</samp>’ phoneset (which was a trivial mapping). Although large, it
suffers from a certain randomness in its stress markings, which makes its use
for synthesis dubious.
</p></dd>
</dl>
<p>All of the above lexicons have some distribution restrictions (though
mostly light ones), but as most are freely available we provide
programs that can convert the originals into Festival’s format.
</p>
<a name="index-MOBY-lexicon"></a>
<p>The MOBY lexicon has recently been released into the public domain and
will be converted into our format soon.
</p>
</body>
</html>