/usr/share/doc/gcc-python-plugin-doc/html/gcc-overview.html is in gcc-python-plugin-doc 0.15-4.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Overview of GCC’s internals — gcc-python-plugin 0.15 documentation</title>
<link rel="stylesheet" href="_static/classic.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
VERSION: '0.15',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true,
SOURCELINK_SUFFIX: '.txt'
};
</script>
<script type="text/javascript" src="_static/jquery.js"></script>
<script type="text/javascript" src="_static/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Example scripts" href="examples.html" />
<link rel="prev" title="Requirements" href="basics.html" />
</head>
<body role="document">
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="examples.html" title="Example scripts"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="basics.html" title="Requirements"
accesskey="P">previous</a> |</li>
<li class="nav-item nav-item-0"><a href="index.html">gcc-python-plugin 0.15 documentation</a> »</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="overview-of-gcc-s-internals">
<h1>Overview of GCC’s internals<a class="headerlink" href="#overview-of-gcc-s-internals" title="Permalink to this headline">¶</a></h1>
<p>To add a new compiler warning to GCC, it’s helpful to have a high-level
understanding of how GCC works, so here’s the 10,000 foot view of how GCC turns
source code into machine code.</p>
<p>The short version is that GCC applies a series of optimization passes to your
code, gradually converting it from a high-level representation into machine
code, via several different internal representations.</p>
<p>Each programming language supported by GCC has a “frontend”, which parses the
source files.</p>
<p>For the case of C and C++, the preprocessor manipulates the code first
before the frontend sees it. You can see the preprocessor output with the
<cite>-E</cite> option.</p>
<p>Exactly what happens in each frontend varies by language: some language
frontends emit language-specific trees, and some convert to a
language-independent tree representation known as <cite>GENERIC</cite>. In any case, we
eventually we reach a representation known as <cite>GIMPLE</cite>. The GIMPLE
representation contains simplified operations, with temporary variables added as
necessary to avoid nested sub-expressions.</p>
<p>For example, given this C code:</p>
<blockquote>
<div><div class="highlight-c"><div class="highlight"><pre><span></span><span class="kt">int</span>
<span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">**</span><span class="n">argv</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"argc: %i</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">argc</span><span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">argc</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"argv[%i]: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">argv</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="p">}</span>
<span class="n">helper_function</span><span class="p">();</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
</div>
</div></blockquote>
<p>we can see a dump of a C-like representation of the GIMPLE form by passing
<cite>-fdump-tree-gimple</cite> to the command-line:</p>
<blockquote>
<div><div class="highlight-bash"><div class="highlight"><pre><span></span>$ gcc -fdump-tree-gimple test.c
$ cat test.c.004t.gimple
</pre></div>
</div>
</div></blockquote>
<p>giving something like this:</p>
<div class="highlight-c"><div class="highlight"><pre><span></span><span class="n">main</span> <span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span> <span class="o">*</span> <span class="n">argv</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="kr">restrict</span> <span class="n">D</span><span class="mf">.3258</span><span class="p">;</span>
<span class="kt">long</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">D</span><span class="mf">.3259</span><span class="p">;</span>
<span class="kt">long</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">D</span><span class="mf">.3260</span><span class="p">;</span>
<span class="kt">char</span> <span class="o">*</span> <span class="o">*</span> <span class="n">D</span><span class="mf">.3261</span><span class="p">;</span>
<span class="kt">char</span> <span class="o">*</span> <span class="n">D</span><span class="mf">.3262</span><span class="p">;</span>
<span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="kr">restrict</span> <span class="n">D</span><span class="mf">.3263</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">D</span><span class="mf">.3264</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
<span class="n">D</span><span class="mf">.3258</span> <span class="o">=</span> <span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="kr">restrict</span><span class="p">)</span> <span class="o">&</span><span class="s">"argc: %i</span><span class="se">\n</span><span class="s">"</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="n">printf</span> <span class="p">(</span><span class="n">D</span><span class="mf">.3258</span><span class="p">,</span> <span class="n">argc</span><span class="p">);</span>
<span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">goto</span> <span class="o"><</span><span class="n">D</span><span class="mf">.2050</span><span class="o">></span><span class="p">;</span>
<span class="o"><</span><span class="n">D</span><span class="mf">.2049</span><span class="o">>:</span>
<span class="n">D</span><span class="mf">.3259</span> <span class="o">=</span> <span class="p">(</span><span class="kt">long</span> <span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)</span> <span class="n">i</span><span class="p">;</span>
<span class="n">D</span><span class="mf">.3260</span> <span class="o">=</span> <span class="n">D</span><span class="mf">.3259</span> <span class="o">*</span> <span class="mi">8</span><span class="p">;</span>
<span class="n">D</span><span class="mf">.3261</span> <span class="o">=</span> <span class="n">argv</span> <span class="o">+</span> <span class="n">D</span><span class="mf">.3260</span><span class="p">;</span>
<span class="n">D</span><span class="mf">.3262</span> <span class="o">=</span> <span class="o">*</span><span class="n">D</span><span class="mf">.3261</span><span class="p">;</span>
<span class="n">D</span><span class="mf">.3263</span> <span class="o">=</span> <span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="kr">restrict</span><span class="p">)</span> <span class="o">&</span><span class="s">"argv[%i]: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="n">printf</span> <span class="p">(</span><span class="n">D</span><span class="mf">.3263</span><span class="p">,</span> <span class="n">D</span><span class="mf">.3262</span><span class="p">);</span>
<span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="o"><</span><span class="n">D</span><span class="mf">.2050</span><span class="o">>:</span>
<span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o"><</span> <span class="n">argc</span><span class="p">)</span> <span class="k">goto</span> <span class="o"><</span><span class="n">D</span><span class="mf">.2049</span><span class="o">></span><span class="p">;</span> <span class="k">else</span> <span class="k">goto</span> <span class="o"><</span><span class="n">D</span><span class="mf">.2051</span><span class="o">></span><span class="p">;</span>
<span class="o"><</span><span class="n">D</span><span class="mf">.2051</span><span class="o">>:</span>
<span class="n">helper_function</span> <span class="p">();</span>
<span class="n">D</span><span class="mf">.3264</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">return</span> <span class="n">D</span><span class="mf">.3264</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
</div>
<p>It’s far easier to see the GIMPLE using:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>./gcc-with-python examples/show-gimple.py test.c
</pre></div>
</div>
<p>which generates bitmaps showing the “control flow graph” of the functions in
the file, with source on the left-hand side, and GIMPLE on the right-hand side:</p>
<blockquote>
<div><div class="figure">
<a class="reference internal image-reference" href="_images/sample-gimple-cfg.png"><img alt="image of a control flow graph in GIMPLE form" src="_images/sample-gimple-cfg.png" style="width: 850.5px; height: 538.5px;" /></a>
</div>
</div></blockquote>
<p>Each function is divided into “basic blocks”. Each basic block consists of a
straight-line sequence of code with a single entrypoint and exit: all branching
happens between basic blocks, not within them. The basic blocks form a
“control flow graph” of basic blocks, linked together by edges. Each block
can contain a list of <a class="reference internal" href="gimple.html#gcc.Gimple" title="gcc.Gimple"><code class="xref py py-class docutils literal"><span class="pre">gcc.Gimple</span></code></a> statements.</p>
<p>You can work with this representation from Python using <a class="reference internal" href="cfg.html#gcc.Cfg" title="gcc.Cfg"><code class="xref py py-class docutils literal"><span class="pre">gcc.Cfg</span></code></a></p>
<p>Once the code is in GIMPLE form, GCC then attempts a series of optimizations on
it.</p>
<p>Some of these optimizations are listed here:
<a class="reference external" href="http://gcc.gnu.org/onlinedocs/gccint/Tree-SSA-passes.html">http://gcc.gnu.org/onlinedocs/gccint/Tree-SSA-passes.html</a></p>
<p>If you’re looking to add new compiler warnings, it’s probably best to hook
your code into these early passes.</p>
<p>The GIMPLE representation actually has several forms:</p>
<blockquote>
<div><ul class="simple">
<li>an initial “high gimple” form, potentially containing certain high-level
operations (e.g. control flow, exception handling)</li>
<li>the lower level gimple forms, as each of these operations are rewritten
in lower-level terms (turning control flow from jumps into a CFG etc)</li>
<li>the SSA form of GIMPLE. In Static Single Assignment form, every variable
is assigned to at most once, with additional versions of variables added
to help track the impact of assignments on the data flowing through
a function. See <a class="reference external" href="http://gcc.gnu.org/onlinedocs/gccint/SSA.html">http://gcc.gnu.org/onlinedocs/gccint/SSA.html</a></li>
</ul>
</div></blockquote>
<p>You can tell what form a function is in by looking at the flags of the current
pass. For example:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="k">if</span> <span class="n">ps</span><span class="o">.</span><span class="n">properties_provided</span> <span class="o">&</span> <span class="n">gcc</span><span class="o">.</span><span class="n">PROP_cfg</span><span class="p">:</span>
<span class="c1"># ...then this gcc.Function ought to have a gcc.Cfg:</span>
<span class="n">do_something_with_cfg</span><span class="p">(</span><span class="n">fn</span><span class="o">.</span><span class="n">cfg</span><span class="p">)</span>
<span class="k">if</span> <span class="n">ps</span><span class="o">.</span><span class="n">properties_provided</span> <span class="o">&</span> <span class="n">gcc</span><span class="o">.</span><span class="n">PROP_ssa</span><span class="p">:</span>
<span class="c1"># ...then we have SSA data</span>
<span class="n">do_something_with_ssa</span><span class="p">(</span><span class="n">fn</span><span class="p">)</span>
</pre></div>
</div>
<p>Here’s our example function, after conversion to GIMPLE SSA:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>./gcc-with-python examples/show-ssa.py test.c
</pre></div>
</div>
<div class="figure">
<a class="reference internal image-reference" href="_images/sample-gimple-ssa-cfg.png"><img alt="image of a control flow graph in GIMPLE SSA form" src="_images/sample-gimple-ssa-cfg.png" style="width: 866.5px; height: 550.5px;" /></a>
</div>
<p>You can see that the local variable <cite>i</cite> has been split into three versions:</p>
<blockquote>
<div><ul class="simple">
<li><cite>i_4</cite>, assigned to in block 2</li>
<li><cite>i_11</cite>, assigned to at the end of block 3</li>
<li><cite>i_1</cite>, assigned to at the top of block 4.</li>
</ul>
</div></blockquote>
<p>As is normal with SSA, GCC inserts fake functions known as “PHI” at the start
of basic blocks where needed in order to merge the multiple possible values of
a variable. You can see one in our example at the top of the loop in block 4:</p>
<div class="highlight-c"><div class="highlight"><pre><span></span><span class="n">i_1</span> <span class="o">=</span> <span class="n">PHI</span> <span class="o"><</span><span class="n">i_4</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">i_11</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span><span class="o">></span>
</pre></div>
</div>
<p>where i_1 either gets the value of i_4, or of i_11, depending on whether we
reach here via block 2 (at the start of the iteration) or block 3 (continuing
the “for” loop).</p>
<p>After these optimizations passes are done, GCC converts the GIMPLE SSA
representation into a lower-level representation known as Register Transfer
Language (RTL). This is probably too low-level to be of interest to those
seeking to add new compiler warnings: at this point it’s attempting to work
with the available opcodes and registers on the target CPU with the aim of
generating efficient machine code.</p>
<p>See <a class="reference external" href="http://gcc.gnu.org/onlinedocs/gccint/RTL.html">http://gcc.gnu.org/onlinedocs/gccint/RTL.html</a></p>
<p>The RTL form uses the same Control Flow Graph machinery as the GIMPLE
representation, but with RTL expressions within the basic blocks.</p>
<p>Once in RTL, GCC applies a series of further optimizations, before finally
generating assembly language (which it submits to <cite>as</cite>, the GNU assembler):
<a class="reference external" href="http://gcc.gnu.org/onlinedocs/gccint/RTL-passes.html">http://gcc.gnu.org/onlinedocs/gccint/RTL-passes.html</a>
You can see the assembly language using the <cite>-S</cite> command line option.</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>$ ./gcc -S test.c
$ cat test.s
</pre></div>
</div>
</div>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<h4>Previous topic</h4>
<p class="topless"><a href="basics.html"
title="previous chapter">Requirements</a></p>
<h4>Next topic</h4>
<p class="topless"><a href="examples.html"
title="next chapter">Example scripts</a></p>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="_sources/gcc-overview.rst.txt"
rel="nofollow">Show Source</a></li>
</ul>
</div>
<div id="searchbox" style="display: none" role="search">
<h3>Quick search</h3>
<form class="search" action="search.html" method="get">
<div><input type="text" name="q" /></div>
<div><input type="submit" value="Go" /></div>
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
>index</a></li>
<li class="right" >
<a href="examples.html" title="Example scripts"
>next</a> |</li>
<li class="right" >
<a href="basics.html" title="Requirements"
>previous</a> |</li>
<li class="nav-item nav-item-0"><a href="index.html">gcc-python-plugin 0.15 documentation</a> »</li>
</ul>
</div>
<div class="footer" role="contentinfo">
© Copyright 2011-2017, David Malcolm.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.6.
</div>
</body>
</html>
|