.. -*- mode: rst -*-
==============
Builtin Tokens
==============
.. module:: pygments.token
In the :mod:`pygments.token` module, there is a special object called `Token`
that is used to create token types.
You can create a new token type by accessing an attribute of `Token`:
.. sourcecode:: pycon

    >>> from pygments.token import Token
    >>> Token.String
    Token.String
    >>> Token.String is Token.String
    True

Note that token types are singletons, so you can use the ``is`` operator for
comparing them.
As of Pygments 0.7 you can also use the ``in`` operator to perform set tests:
.. sourcecode:: pycon

    >>> from pygments.token import Comment
    >>> Comment.Single in Comment
    True
    >>> Comment in Comment.Multi
    False

This can be useful in :doc:`filters <filters>` and if you write lexers on your
own without using the base lexers.
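For example, a filter can use the set test to drop all comments from a token
stream. Here is a minimal sketch (the filter class name is made up for this
example):

.. sourcecode:: python

    from pygments.filter import Filter
    from pygments.token import Comment

    class DropCommentsFilter(Filter):
        """Example filter: discard every comment token."""
        def filter(self, lexer, stream):
            for ttype, value in stream:
                # The ``in`` test matches Comment itself as well as all of
                # its subtypes, e.g. Comment.Single and Comment.Multiline.
                if ttype not in Comment:
                    yield ttype, value

Such a filter can then be attached to any lexer with its `add_filter()` method.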
You can also split a token type into its hierarchy, and get its parent:
.. sourcecode:: pycon

    >>> from pygments.token import String
    >>> String.split()
    [Token, Token.Literal, Token.Literal.String]
    >>> String.parent
    Token.Literal

In principle, you can create an unlimited number of token types, but there is
no guarantee that a style defines rules for every token type. Because of that,
Pygments provides a set of standard token types in the
`pygments.token.STANDARD_TYPES` dict.
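The dict maps each standard token type to a short name; these short names are
also what the HTML formatter uses as CSS classes:

.. sourcecode:: pycon

    >>> from pygments.token import STANDARD_TYPES, Comment, Keyword
    >>> STANDARD_TYPES[Comment]
    'c'
    >>> STANDARD_TYPES[Keyword]
    'k'
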
For some token types, aliases are already defined:
.. sourcecode:: pycon

    >>> from pygments.token import String
    >>> String
    Token.Literal.String

Inside the :mod:`pygments.token` module the following aliases are defined:

============= ============================ ====================================
`Text`        `Token.Text`                 for any type of text data
`Whitespace`  `Token.Text.Whitespace`      for specially highlighted whitespace
`Error`       `Token.Error`                represents lexer errors
`Other`       `Token.Other`                special token for data not
                                           matched by a parser (e.g. HTML
                                           markup in PHP code)
`Keyword`     `Token.Keyword`              any kind of keywords
`Name`        `Token.Name`                 variable/function names
`Literal`     `Token.Literal`              any literals
`String`      `Token.Literal.String`       string literals
`Number`      `Token.Literal.Number`       number literals
`Operator`    `Token.Operator`             operators (``+``, ``not``...)
`Punctuation` `Token.Punctuation`          punctuation (``[``, ``(``...)
`Comment`     `Token.Comment`              any kind of comments
`Generic`     `Token.Generic`              generic tokens (have a look at
                                           the explanation below)
============= ============================ ====================================

The `Whitespace` token type is new in Pygments 0.8. Currently it is used only
by the `VisibleWhitespaceFilter`.
Normally you just create token types using the already defined aliases. For
each of those token aliases, a number of subtypes exist (excluding the special
tokens `Token.Text`, `Token.Error` and `Token.Other`).
The `is_token_subtype()` function in the `pygments.token` module can be used to
test if a token type is a subtype of another (such as `Name.Tag` and `Name`).
(This is the same as ``Name.Tag in Name``. The overloaded ``in`` operator was
introduced in Pygments 0.7; the function still exists for backwards
compatibility.)
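For example:

.. sourcecode:: pycon

    >>> from pygments.token import Name, is_token_subtype
    >>> is_token_subtype(Name.Tag, Name)
    True
    >>> is_token_subtype(Name, Name.Tag)
    False
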
Since Pygments 0.7, it's also possible to convert strings to token types (for
example if you want to supply a token type from the command line):
.. sourcecode:: pycon

    >>> from pygments.token import String, string_to_tokentype
    >>> string_to_tokentype("String")
    Token.Literal.String
    >>> string_to_tokentype("Token.Literal.String")
    Token.Literal.String
    >>> string_to_tokentype(String)
    Token.Literal.String

Keyword Tokens
==============
`Keyword`
For any kind of keyword (especially if it doesn't match any of the subtypes,
of course).
`Keyword.Constant`
For keywords that are constants (e.g. ``None`` in future Python versions).
`Keyword.Declaration`
For keywords used for variable declaration (e.g. ``var`` in some programming
languages like JavaScript).
`Keyword.Namespace`
For keywords used for namespace declarations (e.g. ``import`` in Python and
Java, and ``package`` in Java).
`Keyword.Pseudo`
For keywords that aren't really keywords (e.g. ``None`` in old Python
versions).
`Keyword.Reserved`
For reserved keywords.
`Keyword.Type`
For builtin types that can't be used as identifiers (e.g. ``int``,
``char`` etc. in C).
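To see which keyword tokens a lexer actually emits, you can inspect its token
stream. A rough example with the Python lexer (whitespace tokens filtered out,
unicode string prefixes omitted; the exact types may vary between Pygments
versions):

.. sourcecode:: pycon

    >>> from pygments.lexers import PythonLexer
    >>> [(t, v) for t, v in PythonLexer().get_tokens('import os') if v.strip()]
    [(Token.Keyword.Namespace, 'import'), (Token.Name.Namespace, 'os')]
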
Name Tokens
===========
`Name`
For any name (variable names, function names, classes).
`Name.Attribute`
For all attributes (e.g. in HTML tags).
`Name.Builtin`
Builtin names; names that are available in the global namespace.
`Name.Builtin.Pseudo`
Builtin names that are implicit (e.g. ``self`` in Ruby, ``this`` in Java).
`Name.Class`
Class names. Because no lexer can know if a name is a class, a function or
something else, this token is meant for class declarations.
`Name.Constant`
Token type for constants. In some languages you can recognise a constant by
the way it's defined (e.g. the value after a ``const`` keyword). In other
languages constants are uppercase by definition (Ruby).
`Name.Decorator`
Token type for decorators. Decorators are syntactic elements in the Python
language. Similar syntax elements exist in C# and Java.
`Name.Entity`
Token type for special entities (e.g. ``&nbsp;`` in HTML).
`Name.Exception`
Token type for exception names (e.g. ``RuntimeError`` in Python). Some
languages define exceptions in the function signature (Java); in that case you
can highlight the exception name using this token.
`Name.Function`
Token type for function names.
`Name.Label`
Token type for label names (e.g. in languages that support ``goto``).
`Name.Namespace`
Token type for namespaces (e.g. import paths in Java/Python), and for names
following the ``module``/``namespace`` keyword in other languages.
`Name.Other`
Other names. Normally unused.
`Name.Tag`
Tag names (in HTML/XML markup or configuration files).
`Name.Variable`
Token type for variables. Some languages have prefixes for variable names
(PHP, Ruby, Perl). You can highlight them using this token.
`Name.Variable.Class`
same as `Name.Variable` but for class variables (also static variables).
`Name.Variable.Global`
same as `Name.Variable` but for global variables (used in Ruby, for
example).
`Name.Variable.Instance`
same as `Name.Variable` but for instance variables.
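Again, inspecting a token stream shows several of these name tokens in context
(whitespace filtered out and output slightly simplified, as above):

.. sourcecode:: pycon

    >>> from pygments.lexers import PythonLexer
    >>> [(t, v) for t, v in PythonLexer().get_tokens('def foo(): pass') if v.strip()]
    [(Token.Keyword, 'def'), (Token.Name.Function, 'foo'),
     (Token.Punctuation, '('), (Token.Punctuation, ')'),
     (Token.Punctuation, ':'), (Token.Keyword, 'pass')]
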
Literals
========
`Literal`
For any literal (if not further defined).
`Literal.Date`
for date literals (e.g. ``42d`` in Boo).
`String`
For any string literal.
`String.Backtick`
Token type for strings enclosed in backticks.
`String.Char`
Token type for single characters (e.g. Java, C).
`String.Doc`
Token type for documentation strings (for example Python).
`String.Double`
Double quoted strings.
`String.Escape`
Token type for escape sequences in strings.
`String.Heredoc`
Token type for "heredoc" strings (e.g. in Ruby or Perl).
`String.Interpol`
Token type for interpolated parts in strings (e.g. ``#{foo}`` in Ruby).
`String.Other`
Token type for any other strings (for example ``%q{foo}`` string constructs
in Ruby).
`String.Regex`
Token type for regular expression literals (e.g. ``/foo/`` in JavaScript).
`String.Single`
Token type for single quoted strings.
`String.Symbol`
Token type for symbols (e.g. ``:foo`` in LISP or Ruby).
`Number`
Token type for any number literal.
`Number.Bin`
Token type for binary literals (e.g. ``0b101010``).
`Number.Float`
Token type for float literals (e.g. ``42.0``).
`Number.Hex`
Token type for hexadecimal number literals (e.g. ``0xdeadbeef``).
`Number.Integer`
Token type for integer literals (e.g. ``42``).
`Number.Integer.Long`
Token type for long integer literals (e.g. ``42L`` in Python).
`Number.Oct`
Token type for octal literals.
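A simple expression illustrates some of the number subtypes (same caveats as
in the earlier stream examples):

.. sourcecode:: pycon

    >>> from pygments.lexers import PythonLexer
    >>> [(t, v) for t, v in PythonLexer().get_tokens('x = 0x2a + 42.0') if v.strip()]
    [(Token.Name, 'x'), (Token.Operator, '='), (Token.Number.Hex, '0x2a'),
     (Token.Operator, '+'), (Token.Number.Float, '42.0')]
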
Operators
=========
`Operator`
For any punctuation operator (e.g. ``+``, ``-``).
`Operator.Word`
For any operator that is a word (e.g. ``not``).
Punctuation
===========
.. versionadded:: 0.7
`Punctuation`
For any punctuation which is not an operator (e.g. ``[``, ``(``...)
Comments
========
`Comment`
Token type for any comment.
`Comment.Multiline`
Token type for multiline comments.
`Comment.Preproc`
Token type for preprocessor comments (also ``<?php``/``<%`` constructs).
`Comment.Single`
Token type for comments that end at the end of a line (e.g. ``# foo``).
`Comment.Special`
Special data in comments. For example code tags, author and license
information, etc.
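For example, the C lexer distinguishes the two comment styles (output slightly
simplified; the trailing newline token is filtered out):

.. sourcecode:: pycon

    >>> from pygments.lexers import CLexer
    >>> [(t, v) for t, v in CLexer().get_tokens('// note\n/* block */') if v.strip()]
    [(Token.Comment.Single, '// note\n'), (Token.Comment.Multiline, '/* block */')]
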
Generic Tokens
==============
Generic tokens are for special lexers like the `DiffLexer`, which doesn't
really highlight a programming language but rather a patch file.
`Generic`
A generic, unstyled token. Normally you don't use this token type.
`Generic.Deleted`
Marks the token value as deleted.
`Generic.Emph`
Marks the token value as emphasized.
`Generic.Error`
Marks the token value as an error message.
`Generic.Heading`
Marks the token value as a headline.
`Generic.Inserted`
Marks the token value as inserted.
`Generic.Output`
Marks the token value as program output (e.g. for the Python console lexer).
`Generic.Prompt`
Marks the token value as a command prompt (e.g. for the Bash lexer).
`Generic.Strong`
Marks the token value as bold (e.g. for the reST lexer).
`Generic.Subheading`
Marks the token value as a subheadline.
`Generic.Traceback`
Marks the token value as a part of an error traceback.
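Finally, the `DiffLexer` mentioned above shows the generic tokens in action
(output slightly simplified):

.. sourcecode:: pycon

    >>> from pygments.lexers import DiffLexer
    >>> list(DiffLexer().get_tokens('+added line\n-removed line\n'))
    [(Token.Generic.Inserted, '+added line\n'),
     (Token.Generic.Deleted, '-removed line\n')]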