/usr/share/doc/pytidylib-doc/html/index.html is in pytidylib-doc 0.3.2~dfsg-1.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>PyTidyLib: A Python Interface to HTML Tidy &#8212; pytidylib module</title>
    
    <link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    './',
        VERSION:     '',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="index" title="Index" href="genindex.html" />
    <link rel="search" title="Search" href="search.html" />
    <link rel="top" title="pytidylib module" href="#" />
   
  <link rel="stylesheet" href="_static/custom.css" type="text/css" />
  
  <meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />

  </head>
  <body role="document">
  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body" role="main">
            
  <div class="section" id="pytidylib-a-python-interface-to-html-tidy">
<h1>PyTidyLib: A Python Interface to HTML Tidy<a class="headerlink" href="#pytidylib-a-python-interface-to-html-tidy" title="Permalink to this headline"></a></h1>
<p><a class="reference external" href="http://countergram.com/open-source/pytidylib/">PyTidyLib</a> is a Python package that wraps the <a class="reference external" href="http://tidy.sourceforge.net/">HTML Tidy</a> library. This allows you, from Python code, to &#8220;fix&#8221; invalid (X)HTML markup. Some of the library&#8217;s many capabilities include:</p>
<ul class="simple">
<li>Clean up unclosed tags and unescaped characters such as ampersands</li>
<li>Output HTML 4 or XHTML, strict or transitional, and add missing doctypes</li>
<li>Convert named entities to numeric entities, which can then be used in XML documents without an HTML doctype.</li>
<li>Clean up HTML from programs such as Word (to an extent)</li>
<li>Indent the output, including proper (i.e. no) indenting for <code class="docutils literal"><span class="pre">pre</span></code> elements, which some (X)HTML indenting code overlooks.</li>
</ul>
<p>As of the latest PyTidyLib maintenance update, HTML Tidy itself had not been updated since 2008, and it may have trouble with newer HTML. PyTidyLib is just a thin Python wrapper around HTML Tidy, which is a separate project.</p>
<p>As of 0.2.3, both Python 2 and Python 3 are supported with passing tests.</p>
<div class="section" id="naming-conventions">
<h2>Naming conventions<a class="headerlink" href="#naming-conventions" title="Permalink to this headline"></a></h2>
<p><a class="reference external" href="http://tidy.sourceforge.net/">HTML Tidy</a> is a longstanding open-source library written in C that implements the actual functionality of cleaning up (X)HTML markup. It provides a shared library (<code class="docutils literal"><span class="pre">so</span></code>, <code class="docutils literal"><span class="pre">dll</span></code>, or <code class="docutils literal"><span class="pre">dylib</span></code>) that can variously be called <code class="docutils literal"><span class="pre">tidy</span></code>, <code class="docutils literal"><span class="pre">libtidy</span></code>, or <code class="docutils literal"><span class="pre">tidylib</span></code>, as well as a command-line executable named <code class="docutils literal"><span class="pre">tidy</span></code>. For clarity, this document will consistently refer to it by the project name, HTML Tidy.</p>
<p><a class="reference external" href="http://countergram.com/open-source/pytidylib/">PyTidyLib</a> is the name of the Python package discussed here. As this is the package name, <code class="docutils literal"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">pytidylib</span></code> is correct (package names are case-insensitive). The <em>module</em> name is <code class="docutils literal"><span class="pre">tidylib</span></code>, so <code class="docutils literal"><span class="pre">import</span> <span class="pre">tidylib</span></code> is correct in Python code. This document will consistently use the package name, PyTidyLib, outside of code examples.</p>
</div>
<div class="section" id="installing-html-tidy">
<h2>Installing HTML Tidy<a class="headerlink" href="#installing-html-tidy" title="Permalink to this headline"></a></h2>
<p>You must have both <a class="reference external" href="http://tidy.sourceforge.net/">HTML Tidy</a> and <a class="reference external" href="http://countergram.com/open-source/pytidylib/">PyTidyLib</a> installed in order to use the functionality described here. There is no affiliation between the two projects. The following briefly outlines what you must do to install HTML Tidy. See the <a class="reference external" href="http://tidy.sourceforge.net/">HTML Tidy</a> web site for more information.</p>
<p><strong>Linux/BSD or similar:</strong> First, try to use your distribution&#8217;s package management system (<code class="docutils literal"><span class="pre">apt-get</span></code>, <code class="docutils literal"><span class="pre">yum</span></code>, etc.) to install HTML Tidy. It might go under the name <code class="docutils literal"><span class="pre">libtidy</span></code>, <code class="docutils literal"><span class="pre">tidylib</span></code>, <code class="docutils literal"><span class="pre">tidy</span></code>, or something similar. Otherwise see <em>Building from Source</em>, below.</p>
<p><strong>OS X:</strong> You may already have HTML Tidy installed. In the Terminal, run <code class="docutils literal"><span class="pre">locate</span> <span class="pre">libtidy</span></code> and see if you get any results, which should end in <code class="docutils literal"><span class="pre">dylib</span></code>. Otherwise see <em>Building from Source</em>, below.</p>
<p><strong>Windows:</strong> (Do not use pre-0.2.0 PyTidyLib.) You may be able to find prebuilt DLLs. The DLL sources linked from previous versions of this documentation now return 404 errors and have no obvious replacements.</p>
<p>Once you have a DLL (which may be named <code class="docutils literal"><span class="pre">tidy.dll</span></code>, <code class="docutils literal"><span class="pre">libtidy.dll</span></code>, or <code class="docutils literal"><span class="pre">tidylib.dll</span></code>), you must place it in a directory on your system path. If you are running Python from the command line, placing the DLL in the current working directory will work, but this is unreliable otherwise (e.g. for server software).</p>
<p>See the articles <a class="reference external" href="http://www.computerhope.com/issues/ch000549.htm">How to set the path in Windows 2000/Windows XP</a> (ComputerHope.com) and <a class="reference external" href="http://www.question-defense.com/2009/06/22/modify-a-users-path-in-windows-vista-vista-path-environment-variable/">Modify a Users Path in Windows Vista</a> (Question Defense) for more information on your system path.</p>
<p><strong>Building from Source:</strong> The HTML Tidy developers have chosen to make the source code downloadable <em>only</em> through CVS, and not from the web site. Use the following CVS checkout at the command line:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">cvs</span> <span class="o">-</span><span class="n">z3</span> <span class="o">-</span><span class="n">d</span><span class="p">:</span><span class="n">pserver</span><span class="p">:</span><span class="n">anonymous</span><span class="nd">@tidy</span><span class="o">.</span><span class="n">cvs</span><span class="o">.</span><span class="n">sourceforge</span><span class="o">.</span><span class="n">net</span><span class="p">:</span><span class="o">/</span><span class="n">cvsroot</span><span class="o">/</span><span class="n">tidy</span> <span class="n">co</span> <span class="o">-</span><span class="n">P</span> <span class="n">tidy</span>
</pre></div>
</div>
<p>Then see the instructions packaged with the source code or on the <a class="reference external" href="http://tidy.sourceforge.net/">HTML Tidy</a> web site.</p>
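<p>Whichever route you take, it can help to check whether a shared library with one of the names mentioned above is visible to the system loader. The following is a minimal sketch that uses only the standard library's <code class="docutils literal"><span class="pre">ctypes.util.find_library</span></code>. The helper name <code class="docutils literal"><span class="pre">find_html_tidy</span></code> is hypothetical, and this lookup is not necessarily the same one PyTidyLib performs internally.</p>

```python
import ctypes.util

# Candidate base names taken from the text above; the real name varies by platform.
CANDIDATE_NAMES = ("tidy", "libtidy", "tidylib")


def find_html_tidy():
    """Return the first library name or path the loader can locate, or None."""
    for name in CANDIDATE_NAMES:
        found = ctypes.util.find_library(name)
        if found:
            return found
    return None


location = find_html_tidy()
print(location or "HTML Tidy shared library not found")
```

<p>A result of <code class="docutils literal"><span class="pre">None</span></code> only means the standard lookup did not find the library. A DLL placed in the working directory on Windows, for example, may still load.</p>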
</div>
<div class="section" id="installing-pytidylib">
<h2>Installing PyTidyLib<a class="headerlink" href="#installing-pytidylib" title="Permalink to this headline"></a></h2>
<p>PyTidyLib is available on the Python Package Index:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">pip</span> <span class="n">install</span> <span class="n">pytidylib</span>
</pre></div>
</div>
<p>You can also download the latest source distribution from PyPI manually.</p>
</div>
<div class="section" id="small-example-of-use">
<h2>Small example of use<a class="headerlink" href="#small-example-of-use" title="Permalink to this headline"></a></h2>
<p>The following code cleans up an invalid HTML document and sets an option:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">tidylib</span> <span class="k">import</span> <span class="n">tidy_document</span>
<span class="n">document</span><span class="p">,</span> <span class="n">errors</span> <span class="o">=</span> <span class="n">tidy_document</span><span class="p">(</span><span class="s1">&#39;&#39;&#39;&lt;p&gt;f&amp;otilde;o &lt;img src=&quot;bar.jpg&quot;&gt;&#39;&#39;&#39;</span><span class="p">,</span>
    <span class="n">options</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;numeric-entities&#39;</span><span class="p">:</span><span class="mi">1</span><span class="p">})</span>
<span class="nb">print</span><span class="p">(</span><span class="n">document</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">errors</span><span class="p">)</span>
</pre></div>
</div>
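<p>The second return value is a plain-text report. As a rough sketch, it could be split into structured entries as shown below. This assumes each report line has the form <code class="docutils literal"><span class="pre">line N column N - Severity: message</span></code>, which may differ between HTML Tidy versions. The helper <code class="docutils literal"><span class="pre">parse_report</span></code> and the sample text are illustrative only and are not part of PyTidyLib.</p>

```python
import re

# Assumed line format: "line <n> column <n> - <Severity>: <message>"
REPORT_LINE = re.compile(r"^line (\d+) column (\d+) - (\w+): (.*)$")


def parse_report(errors):
    """Split a report string into (line, column, severity, message) tuples."""
    entries = []
    for raw in errors.splitlines():
        match = REPORT_LINE.match(raw.strip())
        if match:
            line, column, severity, message = match.groups()
            entries.append((int(line), int(column), severity, message))
    return entries


# Illustrative input, not captured from a real run.
sample = (
    "line 1 column 1 - Warning: missing <!DOCTYPE> declaration\n"
    "line 1 column 16 - Warning: <img> lacks \"alt\" attribute\n"
)
entries = parse_report(sample)
for entry in entries:
    print(entry)
```

<p>Lines that do not match the assumed format are skipped rather than raising, so unexpected report text does not break the caller.</p>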
</div>
<div class="section" id="configuration-options">
<h2>Configuration options<a class="headerlink" href="#configuration-options" title="Permalink to this headline"></a></h2>
<p>The Python interface allows you to pass options directly to HTML Tidy. For a complete list of options, see the <a class="reference external" href="http://tidy.sourceforge.net/docs/quickref.html">HTML Tidy Configuration Options Quick Reference</a> or, from the command line, run <code class="docutils literal"><span class="pre">tidy</span> <span class="pre">-help-config</span></code>.</p>
<p>This module sets certain default options, as follows:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">BASE_OPTIONS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s2">&quot;indent&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>           <span class="c1"># Pretty; not too much of a performance hit</span>
    <span class="s2">&quot;tidy-mark&quot;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>        <span class="c1"># No tidy meta tag in output</span>
    <span class="s2">&quot;wrap&quot;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>             <span class="c1"># No wrapping</span>
    <span class="s2">&quot;alt-text&quot;</span><span class="p">:</span> <span class="s2">&quot;&quot;</span><span class="p">,</span>        <span class="c1"># Help ensure validation</span>
    <span class="s2">&quot;doctype&quot;</span><span class="p">:</span> <span class="s1">&#39;strict&#39;</span><span class="p">,</span>   <span class="c1"># Little sense in transitional for tool-generated markup...</span>
    <span class="s2">&quot;force-output&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>     <span class="c1"># May not get what you expect but you will get something</span>
    <span class="p">}</span>
</pre></div>
</div>
<p>If you do not like these options to be set for you, do the following after importing <code class="docutils literal"><span class="pre">tidylib</span></code>:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">tidylib</span><span class="o">.</span><span class="n">BASE_OPTIONS</span> <span class="o">=</span> <span class="p">{}</span>
</pre></div>
</div>
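<p>Instead of clearing the defaults entirely, a single default can be overridden for one call. The sketch below assumes that options passed to a call are layered on top of <code class="docutils literal"><span class="pre">BASE_OPTIONS</span></code>, with the per-call value winning. It uses plain dictionaries and a hypothetical helper, <code class="docutils literal"><span class="pre">effective_options</span></code>, to show the intended result without calling HTML Tidy.</p>

```python
# Copy of the defaults listed above.
BASE_OPTIONS = {
    "indent": 1,
    "tidy-mark": 0,
    "wrap": 0,
    "alt-text": "",
    "doctype": "strict",
    "force-output": 1,
}


def effective_options(options=None, base=BASE_OPTIONS):
    """Return a new dict: the defaults, overridden by any per-call options."""
    merged = dict(base)            # copy, so the defaults are never mutated
    merged.update(options or {})   # per-call values take precedence
    return merged


merged = effective_options({"doctype": "transitional", "numeric-entities": 1})
print(merged["doctype"])   # the per-call value
print(merged["indent"])    # the default is still present
```

<p>Because the helper copies the defaults before updating them, repeated calls with different options do not affect one another.</p>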
</div>
<div class="section" id="function-reference">
<h2>Function reference<a class="headerlink" href="#function-reference" title="Permalink to this headline"></a></h2>
<dl class="function">
<dt id="tidylib.tidy_document">
<code class="descclassname">tidylib.</code><code class="descname">tidy_document</code><span class="sig-paren">(</span><em>text</em>, <em>options=None</em>, <em>keep_doc=False</em><span class="sig-paren">)</span><a class="headerlink" href="#tidylib.tidy_document" title="Permalink to this definition"></a></dt>
<dd></dd></dl>

<dl class="function">
<dt id="tidylib.tidy_fragment">
<code class="descclassname">tidylib.</code><code class="descname">tidy_fragment</code><span class="sig-paren">(</span><em>text</em>, <em>options=None</em>, <em>keep_doc=False</em><span class="sig-paren">)</span><a class="headerlink" href="#tidylib.tidy_fragment" title="Permalink to this definition"></a></dt>
<dd></dd></dl>

<dl class="function">
<dt id="tidylib.release_tidy_doc">
<code class="descclassname">tidylib.</code><code class="descname">release_tidy_doc</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#tidylib.release_tidy_doc" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
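<p>As a hedged usage sketch, <code class="docutils literal"><span class="pre">tidy_fragment</span></code> can be wrapped so that calling code degrades gracefully when PyTidyLib or the HTML Tidy shared library is missing. The wrapper name <code class="docutils literal"><span class="pre">clean_fragment</span></code> is hypothetical. The sketch assumes that <code class="docutils literal"><span class="pre">tidy_fragment</span></code> returns a pair of cleaned markup and report text, as <code class="docutils literal"><span class="pre">tidy_document</span></code> does in the example above, and that a missing shared library surfaces as <code class="docutils literal"><span class="pre">OSError</span></code>.</p>

```python
try:
    from tidylib import tidy_fragment
except (ImportError, OSError):
    # PyTidyLib, or the HTML Tidy shared library, is not available.
    tidy_fragment = None


def clean_fragment(markup):
    """Return (markup, report); fall back to the unchanged input without Tidy."""
    if tidy_fragment is None:
        return markup, ""
    try:
        return tidy_fragment(markup)
    except OSError:
        # Assumed failure mode when the shared library cannot be loaded.
        return markup, ""


fragment, report = clean_fragment("<p>unclosed paragraph")
print(fragment)
print(report)
```

<p>Returning the input unchanged keeps the caller working, at the cost of silently skipping cleanup. Code that requires valid markup should treat an unavailable library as an error instead.</p>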

</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
  <h3><a href="#">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">PyTidyLib: A Python Interface to HTML Tidy</a><ul>
<li><a class="reference internal" href="#naming-conventions">Naming conventions</a></li>
<li><a class="reference internal" href="#installing-html-tidy">Installing HTML Tidy</a></li>
<li><a class="reference internal" href="#installing-pytidylib">Installing PyTidyLib</a></li>
<li><a class="reference internal" href="#small-example-of-use">Small example of use</a></li>
<li><a class="reference internal" href="#configuration-options">Configuration options</a></li>
<li><a class="reference internal" href="#function-reference">Function reference</a></li>
</ul>
</li>
</ul>
<div class="relations">
<h3>Related Topics</h3>
<ul>
  <li><a href="#">Documentation overview</a><ul>
  </ul></li>
</ul>
</div>
  <div role="note" aria-label="source link">
    <h3>This Page</h3>
    <ul class="this-page-menu">
      <li><a href="_sources/index.txt"
            rel="nofollow">Show Source</a></li>
    </ul>
   </div>
<div id="searchbox" style="display: none" role="search">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <div><input type="text" name="q" /></div>
      <div><input type="submit" value="Go" /></div>
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="footer">
      &copy;2009-2016 Jason Stitt.
      
      |
      Powered by <a href="http://sphinx-doc.org/">Sphinx 1.4.9</a>
      &amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.8</a>
      
      |
      <a href="_sources/index.txt"
          rel="nofollow">Page source</a>
    </div>

    

    
  </body>
</html>