/usr/share/perl5/TM/Tau.pm is in libtm-perl 1.56-7.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 | package TM::Tau;
use TM::Tau::Filter;
use base qw(TM::Tau::Filter);
use Data::Dumper;
=pod
=head1 NAME
TM::Tau - Topic Maps, Tau Expressions
=head1 SYNOPSIS
use TM::Tau;
# read a map from an XTM file
$tm = new TM::Tau ('test.xtm'); # or
$tm = new TM::Tau ('file:test.xtm'); # or
$tm = new TM::Tau ('file:test.xtm >'); # or
$tm = new TM::Tau ('file:test.xtm > null:');
# read it now and write it back to the file when object goes out of scope
$tm = new TM::Tau ('test.xtm > test.xtm');
# create empty map at start and then let it automatically flush onto file
$tm = new TM::Tau ('null: > test.xtm'); # or
$tm = new TM::Tau ('> test.xtm');
# read-in at the start (i.e. constructor time) and then flush it back
$tm = new TM::Tau ('> test.xtm >');
# load and merge maps at constructor time
$tm = new TM::Tau ('file:test.xtm + http://..../test.atm');
# load map and filter it with a constraint at constructor time
$tm = new TM::Tau ('mymap.atm * myontology.ont');
# convert between different formats
$tm = new TM::Tau ('test.xtm > test.atm');
=head1 DESCRIPTION
When you need to make maps persistent, then you can resort to either using the prefabricated
packages L<TM::Materialized::*>, or you can build your own persistent forms using any of the
available synchronizable traits. In either case your application will have to invoke methods like
C<sync_in> and C<sync_out> to copy content from the resource into memory and back.
While this gives you great flexibility, in some cases your needs may be much simpler:
=over
=item consumer model:
A map should be sourced into memory when the map object is created.
A typical use case is a web server application which accesses the map on disk with every request and
which returns parts of the map to an HTTP client.
=item producer model:
A map is created first in memory and is flushed onto disk at destruction time.
One example here is a script which extracts content from a relational database, puts it into a map
in memory. At the end all map content is copied onto disk.
=item maintainer model:
A map is sourced from the disk at map object creation time, you update it
and it will be flushed back to the same disk location at object destruction.
Your application may be started with with new content to be put into an existing map. So first the
map will be loaded, the new content added, and after that the map will be written back from where it
came.
=item translator model:
A map is sourced from the disk, is translated into some other representation
and is written back to disk to another location or format.
As an example, you might want to convert between XTM and CTM format.
=item filter model:
A map is sourced from some backend, is transformed and/or filtered before being
used.
Your application could be one which only needs a particular portion of the map. So before processing
the map is filtered down to the necessary parts.
=item integration model:
One or more maps are sourced from backends and are merged before
processing.
If you want to provide a consolidated view over several different data resources, you could first
bring them all into topic map form, and then merge them before handing it to the application.
=back
What is common to all these cases is that there is a I<breath-in> phase when the map object is
constructed, and a I<breath-out> phase when it is destroyed. In between theses phases the map
object is just a normal instance of L<TM>.
=head1 TAU EXPRESSIONS
=head2 Overview
To control what happens in these two phases, this package provides a simple expression language,
call B<Tau>. With it you can control
=over
=item * where maps are supposed to come from, or go to,
Here the language provides a URI mechanism for addressing, such as
file:tm.atm
or
http://topicmaps/some/map.xtm
=item * when (or how) they should be merged,
To merge two (manifested or virtual) topic maps together the C<+> operator can be used
file:tm.atm + http://topicmaps/some/map.xtm
=item * when (or how) they should be transformed,
To transform product data to only something a customer is supposed to see, the C<*> can be used:
product_data.atm * file:customer_view.tmql
=item * when (or whether at all) they should be loaded oder saved
=back
B<NOTE>: Later versions of this package will heavily overload the operators to also operate on other
objects.
=head2 Syntax
The Tau expression language supports two binary operators, C<+> and C<*>. The C<+> operator
intuitively puts things together, the C<*> applies the right-hand operand to the left-hand operand
and behaves as a transformer or a filter. The exact semantics depends on the operands. In any case,
the C<*> binds stronger than the C<+>, and that precedence order can be overridden with parentheses.
The parser understands the following syntax for Tau expression:
tau_expr -> mul_expr
mul_expr -> source { ('>' | '*') filter }
source -> '(' add_expr ')' | primitive
add_expr -> mul_expr { '+' mul_expr }
filter -> '(' filter ')' | primitive
primitive -> uri [ module_spec ]
module_spec -> '{' name '}'
Terms in quotes are terminals, terms inside {} can appear any number of times (also zero), terms
inside [] are optional. All other terms are non-terminals.
B<NOTE>: Filters are planned to be composite, hence the optional bracketing in the grammar.
=cut
#== tau expressions =======================================================
$::RD_HINT = 1;
our $tau_grammar = q{
{
my $sources;
my $filters;
my $ms;
use Data::Dumper;
sub _mk_node {
my $uri = shift;
my $spec = shift;
my $first = shift || 0;
my $last = shift || 0;
my $node;
$uri = ( $first ? 'io:stdin' : 'io:stdout' ) if $uri eq '-'; # decide what - actually should mean
if (ref ($spec)) { # if it is a list, then we have filter with traits
$node = new TM::Tau::Filter (url => $uri, baseuri => $uri ); # in any case this will be a filter
bless $node, 'TM::Tau' if $last; # but if it is the last in the row, then a TM::Tau
foreach my $trait (@{ $spec }) { # the rest of the list are traits
eval {
Class::Trait->apply ( $node => $trait => { exclude => [ 'mtime', 'sync_out', 'source_in' ] } ); # which we add now
}; die "cannot apply trait '$trait' for URI '$uri' ($@)" if $@;
}
} else { # otherwise it is a simple module
my $module = $spec; # take that
eval "use $module"; # try to load it on the fly
eval { # try to
$node = $module->new (url => $uri, baseuri => $uri ); # instantiate an object
};
die "cannot load '$module' for URI '$uri' ($@)" if $@;
die "cannot instatiate object for '$module' ($@)" unless $node;
}
return $node;
}
sub _mk_tree {
my $spec = shift;
my $top = shift || 0; # are we at the top?
#warn "mktree: ". Dumper $spec;
my $t; # here we collect the tree
while (my $m = shift @$spec) { # walk through the mul_expr's
my $c; # find a new chain member
if (ref ($m) eq 'ARRAY') { # this means that this operand (can only be the first) is an add_expr
my $d1 = _mk_tree (shift @{$m}); # take the first and make it a node
while (my $d2 = _mk_tree (shift @{$m})) { # if there are more things to add
use TM::Tau::Federate;
$d1 = new TM::Tau::Federate (left => $d1, # build a federation
right => $d2,
url => 'what:ever');
}
$c = $d1; # tuck it away for the end of the loop
} elsif (ref ($m) eq 'HASH') { # this is just a primitive source/filter
$c = _mk_node (%$m, # create a source/filter node
!defined $t, # this is the first in a chain, so we have no $t yet
$top && ! @$spec); # let it also know whether this is the top-top-top, so last, last, last
} else {
die "now this is bad";
}
if ($t) { # we know there was something in the chain already and c is a filter
$c->left ($t);
$t = $c;
} else {
$t = $c;
}
}
return $t;
}
}
startrule : { $sources = $arg[0]; $filters = $arg[1]; } # collect parameters
tau_expr
tau_expr : mul_expr { $return = _mk_tree ($item[1], 1); } # a tau expr is a filter
mul_expr : source ( '*' filter )(s?) { $return = [ $item[1], @{$item[2]} ]; }
source : '(' add_expr ')' { $return = $item[2]; }
| primitive[$sources]
add_expr : <leftop: mul_expr '+' mul_expr>
filter : '(' filter ')' { $return = $item[2]; } # we allow arbitrary ()-nesting here, but
| primitive[$filters] # a filter cannot be composite (yet)
primitive : <rulevar: $schemes = $arg[0]>
primitive : /[^\s()>\*\{\}]+/ module(?)
{
#warn "using schemes ".Dumper ($schemes)." for $item[1]";
my $uri = $item[1];
if (@{$item[2]} && $item[2]->[0]) { # its defined and there is a module specification
$return = { $uri, $item[2]->[0] }; # take that
} else { # no module, so we have to guess via schemes
$return = undef;
foreach my $s (keys %$schemes) { # look at all of them
if ($uri =~ /$s/) { # if it matches
$return = { $uri, $schemes->{$s} };
last; # if we found something, we stop
}
}
die "expression parser: undefined scheme '$uri' detected" unless $return; # loop exhausted and nothing found => bad
}
}
module : '{' /[\w:]*/ '}' { $return = $item[2]; }
};
my $parser; # will be compiled once when it is needed and then will be kept, this is faster
sub _parse {
my $tau = shift;
my $ms = shift;
use Parse::RecDescent;
# $::RD_TRACE = 1;
# $::RD_HINT = 1;
# $::RD_WARN = 1;
$parser ||= new Parse::RecDescent ($tau_grammar) or $TM::log->logdie (scalar __PACKAGE__ . ": problem in tau grammar");
my $f = $parser->startrule (\$tau, 1, \%sources, # predefined sources
\%filters) # add the currently known filters
;
$TM::log->logdie (scalar __PACKAGE__ . ": found unparseable '$tau'") if $tau =~ /\S/s ;
return $f;
}
=pod
The (pre)parser supports the following shortcuts (I hate unnecessary typing):
=over
=item *
"whatever" is interpreted as "(whatever) > -"
=item *
"whatever >" is interpreted as "(whatever) > -"
=item *
"> whatever" is interpreted as "- > (whatever)"
=item *
"< whatever >" is interpreted as "whatever > whatever", sync_in => 0
=item *
"> whatever <" is interpreted as "whatever > whatever", sync_out => 0
=item *
"> whatever >" is interpreted as "whatever > whatever"
=item *
"< whatever <" is interpreted as "whatever > whatever", sync_in => 0, sync_out => 0
=item *
The URI C<-> as source is interpreted as STDIN (via the L<TM::Serializable::AsTMa> trait).
Unless you override that.
=item *
The URI C<-> as filter is interpreted as STDOUT (via the L<TM::Serializable::Dumper> trait).
Unless you override that.
=back
=head2 Examples
# memory-only map
null: > null:
# read at startup, sync out when map goes out of scope
file:test.atm > file:test.atm
# copy AsTMa= to XTM
file:test.atm > file:test.xtm
# using a dedicated driver to load a map, store it onto a file
dns:my.dns.server { My::DNS::Driver } > file:dns_snapshot.atm
# this will only work if the My::DNS::Driver supports to materialize
# the whole map
# read a map and compute the statistics
file:test.atm * http://psi.tm.bond.edu.au/queries/1.0/statistics
=head2 Map Source URLs
URIs are used to address maps. An XTM map, for example, stored in the file system might be addressed
as
file:mydir/somemap.xtm
for a relative URL (relative to an application's current working directory), or via an
absolute URI such as
http://myserver/somemap.atm
The package supports all those access methods (file:, http:, ...) which L<LWP> supports.
=head2 Drivers
Obviously a different deserializer package has to be used for an XTM file than for an AsTMa or LTM
file. Some topic map content may be in a TM backend database, some content may only exist virtually,
being emulated by a dedicated package. While you may be mostly fine with system defaults, in some
cases you may want to have precise control on how files and other external sources are to be
interpreted. By their nature, drivers for sources must be subclasses of L<TM>.
A similar consideration applies to filters. Also here the specified URI determines which filter
actually has to be applied. It also can define where the content eventually is stored to. Drivers
for filters must be either subclasses of L<TM::Tau::Filter>, or alternatively must be a trait
providing a method C<sync_out>.
=head2 Binding by Schemes (implicit)
When a Tau expression is parsed, the parser tries to identify which driver to use for which part of
that composite map denoted by the expression. For this purpose a pattern matching approach is used
to map regular expression patterns to driver package names. If you would like to learn about the
current state of affairs do a
use Data::Dumper;
print Dumper \%TM::Tau::sources;
print Dumper \%TM::Tau::filters;
Obviously, there is a distinction made between the namespace of resources (residing data) and
filters (and transformers).
Each entry in any of the hashes contains as key a regular expression and as value the name of the
driver to be used. That key is matched against the parsed URI and the first match wins. Since the
keys in a hash are not naturally ordered, that is undefined.
At any time you can override values there:
$TM::Tau::sources{'null:'} = 'TM';
$TM::Tau::sources{'tm:server\.com'} = 'My::Private::TopicMap::Driver';
or delete existing ones. The only constraint is that the driver package must already be C<require>d
into your Perl program.
During parsing of a Tau expression, two cases are distinguished:
=over
=item *
If the URI specifies a I<source>, then this URI will be matched against the regexps in the
C<TM::Tau::sources> hash. The value of that entry will be used as class name to instantiate an
object whereby one component (C<uri>) will be passed as parameter like this:
I<$this_class_name>->new (uri => I<$this_uri>, baseuri => I<$this_uri>)
This class should be a subclass of L<TM>.
=item *
If the URI specifies a I<filter>, then you have two options: Either you use as entry the name of a
subclass of L<TM::Tau::Filter>. Then an object is created like above. Alternatively, the entry is a
list reference containing names of traits. Then a generic L<TM::Tau::Filter> node is generated first
and each of the traits are applied like this:
Class::Trait->apply ( $node => I<$trait> => {
exclude => [ 'mtime',
'sync_out',
'source_in' ]
} );
=back
If there is no match, this results in an exception.
=cut
our %sources = (
'^null:$' => 'TM::Materialized::Null',
'^(file|ftp|http):.*\.atm$' => 'TM::Materialized::AsTMa',
'^(file|ftp|http):.*\.ltm$' => 'TM::Materialized::LTM',
'^(file|ftp|http):.*\.ctm$' => 'TM::Materialized::CTM',
'^file:/tmp/.*' => 'TM::Materialized::AsTMa',
'^(file|ftp|http):.*\.xtm$' => 'TM::Materialized::XTM',
'^inline:.*' => 'TM::Materialized::AsTMa',
'^io:stdin$' => 'TM::Materialized::AsTMa',
'^-$' => 'TM::Materialized::AsTMa', # in "- > whatever:xxx" the - is the map coming via STDIN
);
our %filters = ( # TM::Tau::Filter::* packages are supposed to register there
'^null:$' => [ 'TM::Serializable::Dumper' ],
'^(file|ftp|http):.*\.atm$' => [ 'TM::Serializable::AsTMa' ],
'^(file|ftp|http):.*\.ltm$' => [ 'TM::Serializable::LTM' ],
'^(file|ftp|http):.*\.xtm$' => [ 'TM::Serializable::XTM' ],
'^(file|ftp|http):.*\.ctm$' => [ 'TM::Serializable::CTM' ],
'^-$' => [ 'TM::Serializable::Dumper' ], # in "whatever > -" the - is an empty filter
'^io:stdout$' => [ 'TM::Serializable::Dumper' ], # stdout can be a URL for a filter
);
# make sure all registered packages have been loaded
use TM;
use TM::Tau::Filter;
=pod
=head2 Binding by Package Pragmas (Explicit)
Another way to define which package should be used for a particular map
is to specify this directly in the I<tau> expression:
http://.../map.xtm { My::BrokenXTM }
In this case the resource is loaded and is processed using
C<My::BrokenXTM> as package to parse it (see L<TM::Materialized::Stream> on how to write
such a driver).
=head1 INTERFACE
=head2 Constructor
The constructor accepts a string following the I<Tau expression> L</Syntax>. If that string is
missing, C<< null: >> will be assumed. An appropriate exception will be raised if the syntax is
violated or one of the mentioned drivers is not preloaded.
Examples:
# map only existing in memory
my $map = new TM::Tau;
# map will be loaded as result of this tau expression
my $map = new TM::Tau ('file:music.atm * file:beatles.tmql');
Apart from the Tau expression the constructor optionally interprets a hash with the following keys:
=over
=item C<sync_in> (default: C<1>)
If non-zero, in-synchronisation at constructor time will happen, otherwise it is suppressed. In that
case you can trigger in-synchronisation explicitly with the method C<sync_in>.
=item C<sync_out> (default: C<1>)
If non-zero, out-synchronisation at destruction time will happen, otherwise it is suppressed.
=back
Example:
my $map = new TM::Tau ('test.xtm',
sync_in => 0); # dont want to let it happen now
.... # time passes
$map->sync_in; # but now is a good time
=cut
sub new {
my $class = shift;
my $tau = shift || "null:";
my %opts = @_;
#warn "cano0 '$tau'";
# providing defaults
$opts{sync_in} = 1 unless defined $opts{sync_in};
$opts{sync_out} = 1 unless defined $opts{sync_out};
$_ = $tau; # we do a number of things now
# canonicalization, phase 0: remove leading/trailing blanks
s/^\s*//;
s/\s*$//;
# canonicalization, phase I: reduce the ><><>< crazyness to A > B
if (/^<(.*)>$/) {
$_ = "($1) * ($1)";
$opts{sync_in} = 0; $opts{sync_out} = 1;
} elsif (/^>(.*)<$/) {
$_ = "($1) * ($1)";
$opts{sync_in} = 1; $opts{sync_out} = 0;
} elsif (/^>(.*)>$/) {
$_ = "($1) * ($1)";
$opts{sync_in} = $opts{sync_out} = 1;
} elsif (/^<(.*)<$/) {
$_ = "($1) * ($1)";
$opts{sync_in} = $opts{sync_out} = 0;
} elsif (/^(.*)>$/) { # > - default
$_ = "($1) * -";
} elsif (/^>(.*)$/) { # - > default
$_ = "- * $1";
} elsif (/^(.*?)>(.*?)$/) { # there is a > somewhere in between
$_ = "( $1 ) * ( $2 )";
} else { # no > to be see anywhere
$_ = "($_) * -";
}
#warn "cano2 '$_'";
my $self = _parse ($_); # DIRTY, but then not
#warn "============> ". ref ($self->left) . " <-- left -- " . ref ($self);
#warn "base of top ".$self->{baseuri}." xxx";
$self->{sync_in} = $opts{sync_in}; # same here
$self->{sync_out} = $opts{sync_out};
$self->sync_in if $self->{sync_in}; # if user wants to sync at constructor time, lets do it
return $self;
}
=pod
=head1 SEE ALSO
L<TM>, L<TM::Tau::Filter>
=head1 AUTHOR
Copyright 200[0-68], Robert Barta E<lt>drrho@cpan.orgE<gt>, All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl
itself. http://www.perl.com/perl/misc/Artistic.html
=cut
our $VERSION = '1.15';
our $REVISION = '$Id: Tau.pm,v 1.13 2006/12/05 09:50:38 rho Exp $';
1;
__END__
|