This file is indexed.

/usr/include/trilinos/Phalanx_DoxygenDocumentation.hpp is in libtrilinos-phalanx-dev 12.12.1-5.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

   1
   2
   3
   4
   5
   6
   7
   8
   9
  10
  11
  12
  13
  14
  15
  16
  17
  18
  19
  20
  21
  22
  23
  24
  25
  26
  27
  28
  29
  30
  31
  32
  33
  34
  35
  36
  37
  38
  39
  40
  41
  42
  43
  44
  45
  46
  47
  48
  49
  50
  51
  52
  53
  54
  55
  56
  57
  58
  59
  60
  61
  62
  63
  64
  65
  66
  67
  68
  69
  70
  71
  72
  73
  74
  75
  76
  77
  78
  79
  80
  81
  82
  83
  84
  85
  86
  87
  88
  89
  90
  91
  92
  93
  94
  95
  96
  97
  98
  99
 100
 101
 102
 103
 104
 105
 106
 107
 108
 109
 110
 111
 112
 113
 114
 115
 116
 117
 118
 119
 120
 121
 122
 123
 124
 125
 126
 127
 128
 129
 130
 131
 132
 133
 134
 135
 136
 137
 138
 139
 140
 141
 142
 143
 144
 145
 146
 147
 148
 149
 150
 151
 152
 153
 154
 155
 156
 157
 158
 159
 160
 161
 162
 163
 164
 165
 166
 167
 168
 169
 170
 171
 172
 173
 174
 175
 176
 177
 178
 179
 180
 181
 182
 183
 184
 185
 186
 187
 188
 189
 190
 191
 192
 193
 194
 195
 196
 197
 198
 199
 200
 201
 202
 203
 204
 205
 206
 207
 208
 209
 210
 211
 212
 213
 214
 215
 216
 217
 218
 219
 220
 221
 222
 223
 224
 225
 226
 227
 228
 229
 230
 231
 232
 233
 234
 235
 236
 237
 238
 239
 240
 241
 242
 243
 244
 245
 246
 247
 248
 249
 250
 251
 252
 253
 254
 255
 256
 257
 258
 259
 260
 261
 262
 263
 264
 265
 266
 267
 268
 269
 270
 271
 272
 273
 274
 275
 276
 277
 278
 279
 280
 281
 282
 283
 284
 285
 286
 287
 288
 289
 290
 291
 292
 293
 294
 295
 296
 297
 298
 299
 300
 301
 302
 303
 304
 305
 306
 307
 308
 309
 310
 311
 312
 313
 314
 315
 316
 317
 318
 319
 320
 321
 322
 323
 324
 325
 326
 327
 328
 329
 330
 331
 332
 333
 334
 335
 336
 337
 338
 339
 340
 341
 342
 343
 344
 345
 346
 347
 348
 349
 350
 351
 352
 353
 354
 355
 356
 357
 358
 359
 360
 361
 362
 363
 364
 365
 366
 367
 368
 369
 370
 371
 372
 373
 374
 375
 376
 377
 378
 379
 380
 381
 382
 383
 384
 385
 386
 387
 388
 389
 390
 391
 392
 393
 394
 395
 396
 397
 398
 399
 400
 401
 402
 403
 404
 405
 406
 407
 408
 409
 410
 411
 412
 413
 414
 415
 416
 417
 418
 419
 420
 421
 422
 423
 424
 425
 426
 427
 428
 429
 430
 431
 432
 433
 434
 435
 436
 437
 438
 439
 440
 441
 442
 443
 444
 445
 446
 447
 448
 449
 450
 451
 452
 453
 454
 455
 456
 457
 458
 459
 460
 461
 462
 463
 464
 465
 466
 467
 468
 469
 470
 471
 472
 473
 474
 475
 476
 477
 478
 479
 480
 481
 482
 483
 484
 485
 486
 487
 488
 489
 490
 491
 492
 493
 494
 495
 496
 497
 498
 499
 500
 501
 502
 503
 504
 505
 506
 507
 508
 509
 510
 511
 512
 513
 514
 515
 516
 517
 518
 519
 520
 521
 522
 523
 524
 525
 526
 527
 528
 529
 530
 531
 532
 533
 534
 535
 536
 537
 538
 539
 540
 541
 542
 543
 544
 545
 546
 547
 548
 549
 550
 551
 552
 553
 554
 555
 556
 557
 558
 559
 560
 561
 562
 563
 564
 565
 566
 567
 568
 569
 570
 571
 572
 573
 574
 575
 576
 577
 578
 579
 580
 581
 582
 583
 584
 585
 586
 587
 588
 589
 590
 591
 592
 593
 594
 595
 596
 597
 598
 599
 600
 601
 602
 603
 604
 605
 606
 607
 608
 609
 610
 611
 612
 613
 614
 615
 616
 617
 618
 619
 620
 621
 622
 623
 624
 625
 626
 627
 628
 629
 630
 631
 632
 633
 634
 635
 636
 637
 638
 639
 640
 641
 642
 643
 644
 645
 646
 647
 648
 649
 650
 651
 652
 653
 654
 655
 656
 657
 658
 659
 660
 661
 662
 663
 664
 665
 666
 667
 668
 669
 670
 671
 672
 673
 674
 675
 676
 677
 678
 679
 680
 681
 682
 683
 684
 685
 686
 687
 688
 689
 690
 691
 692
 693
 694
 695
 696
 697
 698
 699
 700
 701
 702
 703
 704
 705
 706
 707
 708
 709
 710
 711
 712
 713
 714
 715
 716
 717
 718
 719
 720
 721
 722
 723
 724
 725
 726
 727
 728
 729
 730
 731
 732
 733
 734
 735
 736
 737
 738
 739
 740
 741
 742
 743
 744
 745
 746
 747
 748
 749
 750
 751
 752
 753
 754
 755
 756
 757
 758
 759
 760
 761
 762
 763
 764
 765
 766
 767
 768
 769
 770
 771
 772
 773
 774
 775
 776
 777
 778
 779
 780
 781
 782
 783
 784
 785
 786
 787
 788
 789
 790
 791
 792
 793
 794
 795
 796
 797
 798
 799
 800
 801
 802
 803
 804
 805
 806
 807
 808
 809
 810
 811
 812
 813
 814
 815
 816
 817
 818
 819
 820
 821
 822
 823
 824
 825
 826
 827
 828
 829
 830
 831
 832
 833
 834
 835
 836
 837
 838
 839
 840
 841
 842
 843
 844
 845
 846
 847
 848
 849
 850
 851
 852
 853
 854
 855
 856
 857
 858
 859
 860
 861
 862
 863
 864
 865
 866
 867
 868
 869
 870
 871
 872
 873
 874
 875
 876
 877
 878
 879
 880
 881
 882
 883
 884
 885
 886
 887
 888
 889
 890
 891
 892
 893
 894
 895
 896
 897
 898
 899
 900
 901
 902
 903
 904
 905
 906
 907
 908
 909
 910
 911
 912
 913
 914
 915
 916
 917
 918
 919
 920
 921
 922
 923
 924
 925
 926
 927
 928
 929
 930
 931
 932
 933
 934
 935
 936
 937
 938
 939
 940
 941
 942
 943
 944
 945
 946
 947
 948
 949
 950
 951
 952
 953
 954
 955
 956
 957
 958
 959
 960
 961
 962
 963
 964
 965
 966
 967
 968
 969
 970
 971
 972
 973
 974
 975
 976
 977
 978
 979
 980
 981
 982
 983
 984
 985
 986
 987
 988
 989
 990
 991
 992
 993
 994
 995
 996
 997
 998
 999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
// @HEADER
// ************************************************************************
//
//        Phalanx: A Partial Differential Equation Field Evaluation 
//       Kernel for Flexible Management of Complex Dependency Chains
//                    Copyright 2008 Sandia Corporation
//
// Under terms of Contract DE-AC04-94AL85000, there is a non-exclusive
// license for use of this work by or on behalf of the U.S. Government.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// 1. Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the Corporation nor the names of the
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact Roger Pawlowski (rppawlo@sandia.gov), Sandia
// National Laboratories.
//
// ************************************************************************
// @HEADER


#ifndef PHX_DOXYGEN_DOCUMENTATION_HPP
#define PHX_DOXYGEN_DOCUMENTATION_HPP

/*!

\mainpage

\section main_index Index

- \ref overview
- \ref news
- \ref user_guide
- \ref faq
- \ref bugs
- \ref history
- \ref authors
- \ref copyright
- \ref questions

\section overview Overview

Phalanx is a local graph-based field evaluation toolkit.  While the
intended use case is for solving general partial differential
equations (PDEs), there is NO specific code implemented for PDEs in
Phalanx.  It can be applied to any system that requires function
evaluation. In terms of PDE discretization schemes it can be used for
finite element, finite difference and finite volume.

Phalanx is a local node evaluation tool. Phalanx relies on the Kokkos
package for performance portability and provides a simple performant
interface to multicore and manycore architectures.  While its main use
is for large scale parallel high performance computing, the MPI
communication (possibly required for ghosting) must be handled by
other packages in the toolchain (separation of concerns).  Users can
handle this manually as Phalanx places no requirements on this but we
recomend the Tpetra package in Trilinos be leveraged.

The main goal of Phalanx is to decompose a complex problem into a
number of simpler problems with managed dependencies to support rapid
development and extensibility.  Through the use of template
metaprogramming concepts, Phalanx supports arbitrary user defined data
types and evaluation types. This allows for extreme flexibility in
integration with user applications and provides extensive support for
embedded technology such as automatic differentiation for sensitivity
analysis, optimization, and uncertainty quantification. This approach,
coupled with the template capabilities of C++ offers a number of
unique and powerful capabilities:

<ul>
<li> Fast Integration with Flexible and Extensible Models:

     <ul> 
     
     <li> Increased flexibility because each simpler piece of the
     decomposed problem becomes an extension point that can be swapped
     out with different implementations.

     <li> Easier to implement new code because each piece is simpler,
     more focused and easier to test in isolation.

     <li> Easier for users to add their own models.  While Phalanx is
     designed for maximum flexibility and efficiency, the interfaces
     exposed to users are very simple, requiring minimal training
     and/or knowledge of the C++ language to use.

     <li> Through the use of template metaprogramming concepts,
     Phalanx supports arbitrary user defined data types.  This allows
     for unprecedented flexibility for direct integration with user
     applications and opens the door to embedded technology.
   
     </ul>

<li> Support for Advanced Embedded Technology: Phalanx is fully
compatible with advanced embedded technologies. Embedded technology
consists of replacing the default scalar type (i.e., "double" values)
in a code with an object that overloads the typical mathematical
operators.  By replacing the scalar type, we can reuse the same code
base to produce different information.  For example, if we were to
compute a function \f$ F(x) \in R^n\f$, where \f$ F: R^n \rightarrow
R^n \f$ and \f$ x \in R^n \f$.  We could automatically compute the
sensitivities of the function, such as the Jacobian, \f$ J_{ij} =
\frac{\partial F_i}{\partial x_j} \f$ merely by replacing the scalar
type of double with an automatic differentiation (AD) object.  With
the C++ template mechanism, this is a trivial task.  Embedded
technology provides:

     <ul> 

     <li> Applications can reuse the exact same code base written to
     evaluate a function yet produce new information including
     arbitrary precision, sensitivities (via embedded automatic
     differentiation), and uncertainty quantification.  

     <li> By reusing the same code for both function and sensitivity
     evaluation, developers avoid having to hand code time consuming
     and error prone analytic derivatives and ensure that the equation
     sets are consistent.

     </ul>

<li> Consistent field evaluations via dependency chain management:
When users switch models, the dependencies for fields evaluated by the
model can change.  This can force field to be evaluated in a different
order based on the new dependencies.  Phalanx will automatically
perform the sorting and ordering of the evaluator routines.  For
example, if density were a field to be evaluated, different models
could have different dependencies.  Model A could depend on
temperature and pressure, whille model B could depend on temperature,
pressure, and species mass fractions of a multicomponent mixture.  The
order of evaluation of fields could change based on the change in
dependencies.  Using Model B will require that the species mass
fractions be evaluated and available for the density evaluation while
Model A does not even require them.  This is a simple example, but the
dependency chain can become quite complex when solving thousands of
coupled PDE equations.

<li> Efficient evaluation of field data: Phalanx was designed to
support the concept of worksets, although this is NOT REQUIRED.  A
workset is an arbitrarily sized block of cells (or egdesor faces or
nodes) to be evaluated on the local processor.  Phalanx evaluates all
fields of interest at once for each block of cells.  By using the
contiguous allocator and correctly sizing the workset to fit in
processor cache (if possible), one can arrange all fields to exist in
a contiguous block of memory, independent of the data type objects
used to store the data (one must still be careful of data alignement
issues).  By keeping all fields in cache, the code should run much
faster.  This workset idea will also, in the future, allow for
multi-core distrubution of the cell evaluations.  

NOTE: with the transition to the Kokkos::View as the underlying data
type, the contiguous memory allocation is no longer supported as
Kokkos will control data placement to maximize hardware locality
(NUMA).

</ul>

Phalanx is a hammer.  It's use should be carefully considered.  We
recommend its use when writing a general PDE framework where one needs
support for flexibility in equation sets, discretizations, and
material models.  It should not be used for a fixed set of equations
and a single discretization that never changes (Although the template
metaprogramming support could still be leveraged).  There are some
drawbacks to using Phalanx that should be considered:

- A potential performance loss due to fragmentation of the over-all
algorithm (e.g., several small loops instead of one large loop).
Since the granularity of the directed acyclic graph is controlled by
the user, a poor choice in functional decomposition is possible.  This
could result in inefficient use of memory and costly kernel launches
if there is little work in the evaluator.

- A potential loss of visibility of the original, composite problem
 (since the code is scattered into multiple places).  

Managing these trade-offs can result in application code that both
performs well and supports rapid development and extensibility.

\section news News

Phalanx has just completed the transition from shards::Array to the
Kokkos::View.  Leveraging Kokkos, Phalanx is now performance portable
to GPUs.  The examples and user guide, however, have not been updated
to reflect the transition.  Please look at the MultiDimensionalArray
example and the unit tests for examples on how to use Phalanx.  We
will update the user guide and other examples asap.

\section bugs Reporting Bugs and Making Enhancement Requests

  To reports bugs or make enhancement requests, visit the <A HREF="http://software.sandia.gov/bugzilla/">Trilinos Bugzilla (Bug Tracking) Database</A>, and use the following instructions.
      <UL>
      <LI>Click on "Enter a new bug report"
      <LI>Choose "Phalanx"
      <LI>Either login or create a new account
      <LI>Submit your bug report
      </UL>

\section history History

Phalanx grew out of the Variable Manger in the Charon code and the Expression Manager in the Aria code at Sandia National Laboratires.  It is an attempt at merging the two capabilities.

\section authors Authors and Contributors

The following have contributed to the design through ideas, discussions, and/or code development of Phalanx:

  - Roger Pawlowski (Lead Developer), SNL
  - Irina Demeshko, SNL
  - Eric Phipps, SNL
  - Pat Notz, SNL

\section copyright Copyright

\verbinclude copyright.txt

\section questions For All Questions and Comments...
  
   Please contact Roger Pawlowski (rppawlo@sandia.gov).

*/

/* ************************************************************************ */
/* ************************************************************************ */

/*! \page user_guide Users Guide

\section user_guide_index Index

- \ref user_guide_getting_started
- \ref user_guide_domain_model
- \ref user_guide_mdarray_domain_model
- \ref performance
- \ref user_guide_step1
- \ref user_guide_step2
- \ref user_guide_step3
- \ref user_guide_step4
- \ref user_guide_step5
- \ref user_guide_step6
- \ref questions

\section user_guide_getting_started Getting Started

\subsection ug_dummy_1 A. Understand Templates
Phalanx is a complex package that make heavy use of the C++ templating
mechanism.  We recommend that users of the Phalanx package first
familiarize themselves with C++ templates.  An excellent reference is
"C++ Templates: The Complete Guide" by Vandevoorde and Josuttis, 2003.
While users do not need to understand template metaprogramming to use
Phalanx, the concepts underlying many operations in Phalanx can be
found in "C++ Template Metaprogramming" by Abrahams and Gurtovoy,
2004 and the Boost template metaprogramming library (MPL).

Once Phalanx is integrated into a code by a template savvy developer,
the only operation application users should perform is to extend the
evaluation routines to new equations and new models by adding new
Evaluators.  We realize that application developers and application
users typically have distictly different skill sets. Much design work
has been invested in making the Evaluator implementation very clean
and as template free as possible.  Therefore, users of the code who
only write new Evaluators DO NOT need to know templates.

\subsection ug_dummy_2 B. Learn the Phalanx Nomenclature
Users should then learn the nomeclature used in the package defined in
the \ref user_guide_domain_model.

\subsection ug_dummy_3 C. Tutorial

The main concept of Phalanx is to evaluate fields typically for
solving PDEs (although it is not limited to this).  We demonstrate the
integration process using the simple example found in
phalanx/example/MultiDimensionalArray.  Suppose that we want to solve
the heat equation over the physical space \f$ \Omega \f$:

\f[
  \nabla \cdot (-k \nabla T) + s = 0
\f] 

where \f$ T \f$ is the temparature (and our degree of freedom), \f$ k \f$ is the thermal conductivity, and \f$s\f$ is a nonlinear source term.  We pose this in terms of a conservation law system:

\f[
  \nabla \cdot (\mathbf{q}) + s = 0
\f]

where \f$ \mathbf{q} = -k \nabla T \f$ is the heat flux.  The specific discretization technique whether finite element (FE) or finite volume (FV) will ask for \f$\mathbf{q}\f$ and \f$s\f$ at points on the cell.  Phalanx will evaluate \f$\mathbf{q}\f$ and \f$s\f$ at those points and return them to the discretization driver.

Using finite elements, we pose the problem in variational form:  Find \f$ u \in {\mathit{V^h}} \f$ and \f$ \phi \in {\mathit{S^h}} \f$ such that:

\f[
  - \int_{\Omega} \nabla \phi \cdot \mathbf{q} d\Omega 
  + \int_{\Gamma} \phi \mathbf{n} \cdot \mathbf{q} 
  + \int_{\Omega} \phi s d\Omega = 0 
\f]

Phalanx will evaluate \f$\mathbf{q}\f$ and \f$s\f$ at the quadrature
points of the cells and pass them off to the integrator such as <a
href="http://trilinos.sandia.gov/packages/intrepid">Intrepid</a>.

This is a trivial example, but the dependency chains can grow quite
complex if performing something such as a chemically reacting flow
calculation coupled to Navier-Stokes and energy conservation.

Follow the steps below to integrate Phalanx into your application. The example code shown in the steps comes from the energy flux example in the directory "phalanx/example/EnergyFlux".  Note that many classes are named with the word "My" such as MyWorkset, MyTraits, and MyFactory traits.  Any object that starts with the word "My" denotes that this is a user defined class.  The user must implement this class specific to their application.  All Evaluator derived objects are additionally implemented by the user even though they do not follow the convention of starting with the word "My".

- \ref user_guide_step1
- \ref user_guide_step2
- \ref user_guide_step3
- \ref user_guide_step4
- \ref user_guide_step5
- \ref user_guide_step6

\section user_guide_domain_model Phalanx Domain Model

<ul>
<li><b>Cell</b>

Partial differential equations are solved in a domain.  This domain is
discretized into cells (also called elements for the finite element
method).  This library assumes that the block of cells being iterated
over is of the same type!  If different evaluators (i.e. different
material properties) are required in different blocks of cells, a new
FieldMangager must be used for each unique block of elements.  This is
required for efficiency.

<li><b>Parallel and Serial Architectures</b> 

Phalanx can be used on both serial and multi-processor architectures.
The library is designed to perform "local" evalautions on a "local"
set of cells.  The term local means that all cell and field data
required for an evaluation is on the processor that the evaluation is
executed.  So for parallel runs, the cells are distributed over the
processors and a FieldManager is built on each processor to evaluate
only the cells that are assigned to that processor.  If there is any
data distributed to another processor that is required for the
evaluation, the user must handle pulling that information on to the
evaluation processor.  The design of Phalanx will also allow for users
to take advantage of multi-core architectures through a variety of
implementations.

<li><b>%Workset</b>

For performance and memory limitations, the evaluation of fields can
be divided into worksets.  While Phalanx does not require the use of
worksets, we recomend their use for controlling the memory footprint (this
can be important when offloading the kernels onto an accelerator and
even for host node memory). The goal of using worksets is to fit all
the required fields into the processor cache so that an evaluation is
not slowed down by paging memory.  Suppose we have a cell-based
discretization with 2020 cells to evaluate on a 4 node machine.  We
might distribute the load so that 505 cells are on each MPI processor.
Now the user must figure out the workset size.  This is the number of
cells to per evaluation call so that the field memory will fit into
cache.  If we have 505 cells on a processor, suppose we find that only
50 cells at a time will fit into cache.  Then we will create a
FieldManager with a size of 50 cells.  This number is specified in the
construction of data layouts.  During the call to
postRegistrationSetup(), the FieldManager will allocate workspace
storage for all fields relevant to the evaluation.

For our example, there will be 11 worksets.  The first 10 worksets
will have the 50 cell maximum and the final workset will have the 5
remaining cells.  The evaluation routine is called for each workset,
where workset information can be passed in through the evaluate call:

\code
    std::vector<WorksetData> workset_data;
        .
        .
    // Initialize workset data 
        .
        .
    for (std::size_t i = 0; i < workset_data.size(); ++i) {
      field_manager.evaluateFields<MyTraits::Residual>(workset_data[i]);
	.
        .  
      // use evaluated fields 
        .
        .
    }
\endcode

Note that the call to evaluateFields() takes a workset data object in
this example.  The field_manager is templated on a traits class that
allows the user to define the actual object passed into the
evaluatFields call.  This does not have to be a workset_data object
but can be any class or struct the user defines (even void).

Note that you do not have to use the workset idea.  In this cell-based
discretization example, one could just evaluate all local elements in
one loop (equivalent to a single workset).  Be aware that this can
result in a possibly large performance hit.

Phalanx, in fact, does not restrict you to cell based iteration.  You
can iterate over any entity type such as edge or face structures.

<li><b>Consistent Evaluation</b>

Phalanx was imtended to perform consistent evaluations.  By consistent,
we mean that all dependencies of a field evaluation are current with
respect to the current degree of freedom values.  For example, suppose
we need to evaluate the the energy flux.  This has dependencies on the
diffusivity, and the temperature gradient.  Each of these quantities
in turn depends on the temperature.  So before the diffusivity and
temperature gradient are evaluated, the temperature must be evaluated.
Before the energy flux can be evaluated, the density, diffusivity, and
temperature gradient must be evaluated.  Phalanx forces an ordered
evaluation that updates fields in order to maintain consistency of the
dependency chain.  Without this, one might end up with lagged values
being used from a previous evaluate call.

This does not rule out the use of semi-implicit or operator split
schemes.  In fact these have been demonstrated in Phalanx.  It just
rquires that users store any lagged history and provide a field
evaluator to pull that history in.

<li><b>Scalar Type</b>

A scalar type, typically the template argument ScalarT in Phalanx
code, is the type of scalar used in an evaluation.  It is typically a
double or float, but can be special object types for extended
precision, complex values, and special embedded methods data types
such as sensitivity analysis.  For example, for sensitivity analysis,
a double scalar type is replaced with a foward automatic
differentiation object (FAD) or a reverse automatic differentaion
object (RAD) to produce sensitivity information.  Whatever type is
used, the standard mathematical operators are overloaded for the
particular embedded technology.  For an example of this, see the 
<a href="http://trilinos.sandia.gov/packages/sacado">Sacado Automatic
Differentiation Library</a>. Some sample scalar types include:

<ul>
<li> float
<li> double
<li> Sacado::Fad::DFad<double> (for sensitivity analysis)
</ul>


<li><b>Data Type</b>

The data type is a deprecated concept used in the original Phalanx
implementation.  It is now equivalent to the scalar type.  You might
see use older code specifying a DataT template parameter, but this is
now equivalent the ScalarT template parameter.

<li><b>Evaluation Type</b>

The evaluation type, typically the template argument EvalT in Phalanx code, defines a unique type of evaluation to perform.  The user is free to choose/create the evaluation types - they implement their own class or struct for their own evaluation types.  An EvaluationContainer is allocated for each evaluation type specified in the users traits class.  Examples that we usually use include:

<ul>
<li> Residual (simple function evaluation)
<li> Jacobian (function sensitivities with respect to state variables)
<li> Tangent (parameter sensitivity)
</ul>

The evaluation type must be associated with one default scalar type
and can optionally support additional scalar types.  The scalar type
usually determines what is being evaluated. For example, to evaluate
the equation residuals, the scalar type is usually a double, a float
or an extended precision type.  To evaluate a Jacobian, the scalar
type could be a forward automatic differentiation object,
Sacado::Fad::DFAD<double>.  By introducing the evaluation type in
Phalanx, the same scalar type can be used for different evaluation
types and can be specialized accordingly.  For example computing the
Jacobian and computing parameter sensitivities both could use the
Sacado::Fad::DFAD<double> scalar type.

<li><b>Storage</b>

A DataContainer object stores all fields of a particular data type.
Each EvaluationContainer holds a vector of DataContainers, one
DataContainer for each vaid data type that is associated with that
particular evaluation type.  One EvaluationContainer is constructed
for each evaluation type.

NOTE: this concept was needed for memory allocation in the first
generation Phalanx library when all fields for an evaluation type were
stored contiguously in memory.  Since the Kokkos transition, memory in
now controlled by the Kokkos::View.  We could remove this concept and
corresponding code in the future.  This object is not used by the user
and is an underlying implementatino detail.  It's removal will NOT
result in changes to user code.

<li><b>Data Layout</b>

The DataLayout object is used to define layout of the multidimensional
array used to store Phalanx Fields.  It provides the size of each rank
in the multidimensional array and provides a unique string name to
distinguish fields with the same name that might exist on different
layouts.  For example, supposed we have written an evaluator the
computes the "Density" field for a set of points in the cell.  Now we
want to evaluate the density at a different set of points in the cell.
We might have a "Density" field in a cell associated with a set of
integration points (quadrature/integration points in finite elements)
and another field associated with the nodes (nodal basis degree of
freedom points in finite elements).  We use the same field name (so we
can reuse the same Evaluator), "Density", but use two different
DataLayouts, one for integration points and one for nodal point.  Now
a FieldTag comparison will differentiate the fields due to the
different DataLayout.

<li><b>Field Tag</b>

The FieldTag is a description of a field.  It is templated on the data type, DataT.  It is used to identify a field stored in the field manager.  It contains a unique identifier (an stl std::string) and a pointer to a data layout object.  If two FieldTags are equal, the DataLayout, the data type, and the string name are exactly the same.

<li><b>Field</b>

A Field is a set of values for a particular quantity of interest that must be evaluated.  It is templated on the data type, DataT.  It consists of a FieldTag and a reference counted smart pointer (Teuchos::RCP<DataT>) to the data array where the values are stored.  

<li><b>Evaluator</b>

An Evaluator is an object that evaluates a set of Fields.  It contains two vectors of Fields, one set is the fields it will evaluate and one set is the fields it depends on for the evaluation.  For example to evaluate a density field that is a function of temperature and pressure, the field the evaluator will evaluate (the first field set) is a density field, and the set of fields it requires to meet its dependencies are the temperature and pressure fields (the second field set).  The evaluator is templated on the evaluation type and the traits class.  

<li><b>Evaluator Manager</b>

The main object that stores all Fields and Evaluators.  The evaluator manager (EM) sorts the evaluators and determines which evaluators to call and the order necessary to ensure consistency in the fields.  The EM also allocates the memory for storage of all fields so that if we want to force all fields to be in a contiguous block, we have that option available.  Users can write their own allocator for memory management.

<li><b>Multidimensional Array</b>

Instead of using the concept of an algebraic type that the user implements, it may be easier to use a multidimensional array.  For example, suppose we have ten cells, and in each cell there are four points (specifically quadrature points) where we want to store a 3x3 matrix for the stress tensor.  Using the concepts of algebraic types, we would use a user defined matrix in a Phalanx Field:

\code
  PHX::MDA::Layout<Cell,QP> layout(10,4);
  PHX::Field< MyTensor<double> > stress("Stress",layout);
\endcode

However, Phalanx also implements the idea of a multidimensional array with optional compile time checking on rank accessors.

\code
  PHX::MDA::Layout<Cell,QP,Dim,Dim> layout(10,4,3,3);
  PHX::MDField<double,Cell,QP,Dim,Dim> stress("Stress",layout);
\endcode

Here, the "Cell", "QP", and "Dim" objects are small structs that allow users to describe the ordinates associated with the multidimensional array.

The benefits of using the multidimensional array are that (1) checking of the rank accessor at either compile time or runtime (runtime checking is only enabled for debug builds for efficiency) prevent coding errors and (2) the documentation of the ordinals is built into the code - no relying on comments that go out of date.

The EnergyFlux example is reimplemented using the multidimensional array instead of the algebric types in the directory Trilinos/packages/phalanx/example/MultiDimensionalArray/.

  Our recomendation is to use the multidimensional array version as future codes plan to use this object.  The PHX::MDField is fully compatible with Intrepid whereas the PHX::Field is not.  We keep the PHX::Field object around for performance measurements since it directly accesses the Teuchos::ArrayRCP object where the PHX::MDField uses a shards::Array object.  If the compiler optimizes correctly, there should be no difference in performance.  By testing both the Field and MDField, we can test compiler optimization.   

</ul>

\section performance Performance

Some recomendations for efficient code:
<ul>

<li> <b>Use worksets</b> This may eliminate cache misses. 

<li> <b>Enable compiler optimization:</b>  The Field and MDField classes use inlined bracket operators for data access.  That means if you build without optimization some compilers will produce very slow code.  To see if your compiler is optimizing away the bracket operator overhead, run the test found in the directory "phalanx/test/Performance/BracketOperator".  If the timings are the same between a raw pointer array and the Field and MDField classes, your compiler has removed the overhead.   For example, on gnu g++ 4.2.4, compiling with -O0 shows approximately 2x overhead for bracket accessors, while -O2 shows about a 10% overhead, while -O3 completely removes the overhead.

       <li> <b>Algebraic Types:</b> Implementing your own algebraic types, while convenient for users, can introduce overhead as opposed to using raw arrays or the multidimensional array.  The tests found in "phalanx/test/Performance/AlgebraicTypes" demonstrate some of this overhead.  Expression templates should remove this overhead, but in our experience, this seems to be very compiler and implementation dependent.  If you build in tvmet support, you can compare expression templates, our "dumb" implementation for vectors and matrices, and our multidimensional array against raw array access.  Our testing on gnu compilers shows an overhead of about 20-25% when using "dumb" objects and the expresion templates as opposed to raw arrays.  We recommend using raw arrays with the multi-dimensional array (MDField) for fastest runtimes.

       <li> <b>Use Contiguous Allocator:</b> Phalanx has two allocators, one that uses the "new" command for each separate field, and one that allocates a single contiguous array for ALL fields.  If cache performance is the limiting factor, the contiguous allocator could have a big effect on performance.  Additionally, alignment issues can play a part in the allocators depending on how you implement your algrbraic types.  Our ContiguousAllocator allows users to choose the alignment based on a template parameter.  Typically, this is double.

<li> <b>Limit the number of Fields and/or Evaluators:</b>  The more evaluators used in your code, the more the loop sturcutre is broken up.  You go from a single loop to a bunch of small loops.  This can have an effect on the overall performance.  Users should also be judicious on choosing Fields.  Only select Fields as a place to introduce variability in your models or for reuse.  For example, if you require density in a single place in the code and there is only one single model that won't change, do not make it a field.  Just evaluate it once where needed.  But if you need density in multiple providers or you want to swap models at runtime, then it should be a field.  This prevents having to recompute the model or recompile your code to switch models.  This is usually not an issue as long as the amount of work in each evaluator is larger than vtable lookup to make the evaluate call.  In our experience, we have never observed this behaviour.  We point it our here just in case.

<li> <b>Slow compilation times:</b>  As the number of Evaluators in a code grows, the compilation times can become very long.  Making even a minor change can result in the code recompiling all Evaluator code.  Therefore, we recommend using explicit template instantiation.  An example can be found in Trilinos/packages/phalanx/example/FEM_Nonlinear.  All evaluators use explicit template instantiation if phalanx is built with explicit template instation enabled (in the cmake build system, use the flag -D Phalanx__EXPLICIT_TEMPLATE_INSTANTIATION=ON).

</ul>

\section user_guide_mdarray_domain_model Multi-Dimensional Array Domain Model

The multidimensional array was designed for interoperability between a number of Trilinos packages including shards, phalanx and intrepid.  These codes each have their own implementations but follow a basic set of requirements so that functions templated on the array type can use any of the multidimensional array implementations.  The required functions are:

<ul>
<li> ScalarT& operator[] - the bracket operator
<li> ScalarT& operator(1,2,3,...,N) - accessor operators for each rank N array.
<li> size_type rank() - integer number of ordinates
<li> size_type dimension(size_type ordinate) - size of ordinate
<li> size_type size() - total size of array
</ul>

Phalanx implements the multidimensional array in the PHX::MDField class and supports arrays with up to 8 ranks (one more than Fortran arrays support).  More information can be found in the shards library.

\section user_guide_step1 Step 1: Configuring, Building, and installing Phalanx

\subsection ug_step1_general A. General Library Requirements
Phalanx is distributed as a package in the <a href="http://trilinos.sandia.gov">Trilinos Framework</a>.  It can be enabled as part of a trilinos build with the configure option "-D Trilinos_ENABLE_Phalanx=ON".  Phalanx currently has direct dependencies on the following third party libraries:

- <b>Requires</b> the <a href="http://trilinos.sandia.gov/packages/teuchos">Teuchos</a> utilities library, part of the <a href="http://trilinos.sandia.gov/">Trilinos Framework</a>.  This will automatically be enabled when you enable the phalanx library.
 
 - <b>Requires</b> the <a href="http://trilinos.sandia.gov/packages/sacado">Sacado Automatic Differentiation Library</a>, part of the <a href="http://trilinos.sandia.gov/">Trilinos Framework</a>.  This will automatically be enabled when you enable the phalanx library.

\subsection ug_step1_performance B. Performance Example Requirements

 - <b>Optional:</b> Some performance tests run comparisons against <a href="http://tvmet.sourceforge.net/">TVMET: Tiny Vector Matrix library using Expression Templates</a>.  This is to get a feel for how our "dumb" vector matrix objects perform compared to expression templates.  You must enable the tvmet TPL add the path to the TVMET library during Trilinos configuration.  An example configuration file can be found in Trilinos/packages/phalanx/build_scripts/build_phalanx_gcc.sh.

TVMET is optional and hidden behind an ifdef.  The performance tests will be built regardless of whether tvmet is enabled/disabled. 

\subsection ug_step1_fem C. Nonlinear Finite Element Example Requirements

To build the example problem in "phalanx/example/FEM_Nonlinear", distributed vector, distributed sparse matrix, and corresponding linear solvers are required.  The following <b>Trilinos</b> packages need to be enabled to build the FEM_Nonlinear example:

 - <b>Requires:</b> the <a href="http://trilinos.sandia.gov/packages/epetra">Epetra Library</a>, supplies the linear algebra data structures for spares matrices.  Can be enabled during the Trilinos configure with the flag "-D Trilinos_ENABLE_Epetra=ON". 

 - <b>Requires:</b> the <a href="http://trilinos.sandia.gov/packages/ifpack">Ifpack Library</a>, supplies incomplete factorization preconditioners.  Can be enabled during the Trilinos configure with the flag "-D Trilinos_ENABLE_Ifpack=ON". 

 -  <b>Requires:</b> the <a href="http://trilinos.sandia.gov/packages/belos">Belos Library</a>, supplies block GMRES iterative linear solver.  Can be enabled during the Trilinos configure with the flag "-D Trilinos_ENABLE_Belos=ON".  

This example will be disabled if the above packages are not enabled.

\subsection db D. Configure Trilinos/Phalanx
The general instructions for building trilinos can be found at <a href="http://trilinos.sandia.gov/documentation.html">Trilinos Documentation Page</a>.  Of particular importance are the Overview, User Guide, and Tutorial documents.  At a minimum you must enable the Teuchos, Sacado, and Phalanx packages.  An example configure script is:

\verbinclude reconfigure.linux

Once configure is run, build and install the library with the command:

\code
make install
\endcode

\section user_guide_step2 Step 2: Determine Types

Users must next determine the evaluation types and data types they will require in their simulation.  Please see the section \ref user_guide_domain_model for detailed explanation of types.  Following the example in the phalanx/example/EnergyFlux directory, we will be requiring two evaluation types, one for the residual evaluation of the discretized PDE equation and one for the corresponding Jacobian.  Additional evaluation types might be for the parameter sensitivites and uncertainty quantification.  

Once the evaluation types are chosen, users must decide on a default scalar type and on all data types that are valid for each evaluation type.  The data types should all be templated on the default scalar type for the particular evaluation type, but this is not a requirement (expert users can violate this for performance reasons).  Uses must implement their own data types or get them from a separate library.  For sensitivities, the trilinos package <a href="http://trilinos.sandia.gov/packages/sacado">Sacado</a> should be used.  For uncertainty quantification, the Trilinos package <a href="http://trilinos.sandia.gov/packages/stokhos">Stokhos</a> should be used.

In our example, the user has written an implementation of vector (MyVector) and matrix (MyTensor) classes found in the file AlgebraicTypes.hpp.  They are templated on a scalar type and can be used for all evaluation types. 

<ul>
<li> Residual - Default scalar type will be "double".  The data types will be:
<ul>
<li> double
<li> MyVector<double>
<li> MyTensor<double>
</ul>
<li> Jacobian - Default scalar type will be "Sacado::Fad::DFad<double>".  This scalar type will carry both the residual information as well as sensitivity information.  The data types will be:
<ul>
<li> Sacado::Fad::DFad<double>
<li> MyVector< Sacado::Fad::DFad<double> >
<li> MyTensor< Sacado::Fad::DFad<double> >
</ul>
</ul>

Typical examples of algebraic types include vectors, matrices, or higher order tensor objects, but in reality can be any struct/class that the user desires.  Remember to template the objects on the scalar type if possible.  An example of a very inefficient Vector and Matrix implementation (operator overloading without expression templates) can be found in AlgebraicTypes.hpp.

\section user_guide_step3 Step 3: Write the Traits Object

The Traits object is a struct that defines the evaluation types, the data types for each evaluation type, the allocator type, and the user defined types that will be passed through evaluator calls.  We will go through each of these defninitions.

The basic class should derive from the PHX::TraitsBase object.

<b> The Traits struct must define each of the following typedef members of the struct:</b>

- \b EvalTypes - an mpl::vector of user defined evaluation types.  Each evaluation type must have a typedef member called ScalarT that provides the default scalar type.  This is used to automate the building of evaluators for each evaluation type using the EvaluatorFactory.
      
- \b EvalToDataMap - an mpl::map.  The key is an evaluation type and the value is an mpl::vector of valid data types for that particular evaluation type.
      
- \b Allocator type - type that defines the allocator class to use to allocate the memory for data storage.
      
- \b SetupData - A user defined type to be passed in to the postRegistrationSetup() call.  Allows users to pass in arbitrary data to the evaluators during setup.

- \b EvalData - A user defined type to be passed in to the evaluateFields() call.  Allows users to pass in arbitrary data.

- \b PreEvalData - A user defined type to be passed in to the preEvaluate() call.  Allows users to pass in arbitrary data.

- \b PostEvalData - A user defined type to be passed in to the postEvaluate() call.  Allows users to pass in arbitrary data.

\subsection td1 A. Basic Class
The basic outline of the traits struct is:
\code
struct MyTraits : public PHX::TraitsBase {
    .
    .
    .
};
\endcode

Inside this struct we need to implement all the typedefs listed above.  The example we will follow is in the file Traits.hpp in "phalanx/example/EnergyFlux" directory.

\subsection td2 B. EvalTypes
First we need to know the evaluation types.  Each evaluation type must include at least a typedef'd default scalar type argument as a public member called ScalarT.  Here is the example code:

\code
  struct MyTraits : public PHX::TraitsBase {
    
    // ******************************************************************
    // *** Scalar Types
    // ******************************************************************
    
    // Scalar types we plan to use
    typedef double RealType;
    typedef Sacado::Fad::DFad<double> FadType;
    
    // ******************************************************************
    // *** Evaluation Types
    // ******************************************************************
    struct Residual { typedef RealType ScalarT; };
    struct Jacobian { typedef FadType ScalarT;  };
    typedef Sacado::mpl::vector<Residual, Jacobian> EvalTypes;

     .
     .
     .
\endcode 

The typedefs RealType and FadType are done only for convenience.  They are not actually required but cut down on the typing.  Only the EvalTypes typedef is required in the code above.

\subsection td3 C. EvaltoDataMap
Next we need to link the data types to the evaluation type.  Note that one could use the same data type in multiple evaluation types:

\code
     .
     .
     .
    // Residual (default scalar type is RealType)
    typedef Sacado::mpl::vector< RealType, 
				 MyVector<RealType>,
				 MyTensor<RealType> 
    > ResidualDataTypes;
  
    // Jacobian (default scalar type is Fad<double>)
    typedef Sacado::mpl::vector< FadType,
				 MyVector<FadType>,
				 MyTensor<FadType> 
    > JacobianDataTypes;
     .
     .
     .
\endcode

\subsection td4 D. Allocator
Define the Allocator type to use.  Phalanx comes with two allocators, but the user can write their own allocator class if these aren't sufficient.  The Phalanx allocator classes are:
- PHX::NewAllocator: uses the C++ "new" command to allocate each field on the heap separately.
- PHX::ContiguousAllocator: allocates a single contiguous block of memory on the heap for all fields regardless of the type.  This allows us to fit a subset of elements into cache to speed up the evaluation.

Code using the NewAllocator is:
\code
     .
     .
     .
    // ******************************************************************
    // *** Allocator Type
    // ******************************************************************
    typedef PHX::NewAllocator Allocator;
     .
     .
     .
\endcode

\subsection td5 E. EvalData,PreEvalData,PostEvalData
  Users can pass their own data to the postRegistrationSetup(), evaluateFields(), preEvaluate() and postEvaluate() methods of the PHX::FiledManager class.  In this example, the user passes in a struct that they have written called MyEvalData.  This contains information about the cell workset.  The user is not required to write their own object.  They could just pass in a null pointer if they don't need auxiliary information passed into the routine.  This is demonstrated in the SetupSetup, PreEvalData, and PostEvalData.  A void* is set for the data member. 
\code
     .
     .
     .
    // ******************************************************************
    // *** User Defined Object Passed in for Evaluation Method
    // ******************************************************************
    typedef void* SetupData;
    typedef const MyEvalData& EvalData;
    typedef void* PreEvalData;
    typedef void* PostEvalData;

  };
\endcode

\section user_guide_step4 Step 4: Specialize the PHX::TypeString Object
For debugging information, Phalanx makes a forward declaration of the PHX::TypeString object.  This must be specialized for each evaluation type and each data type so that if there is a run-time error, phalanx can report detailed information on the problem.  We could have used the typeinfo from the stl, but the name() method is not demangled on every platform, so it can make debugging a challenge.  The specialized classes can go into their own file or can be added to the traits file above depending on how you use the Traits class.  During linking, if the compiler complains about multiple defninitions of your specialized traits classes, separate the traits implementation into their own .cpp file.

\code
namespace PHX {

  // ******************************************************************
  // ******************************************************************
  // Debug strings.  Specialize the Evaluation and Data types for the
  // TypeString object in the phalanx/src/Phalanx_TypeString.hpp file.
  // ******************************************************************
  // ******************************************************************

  // Evaluation Types
  template<> struct TypeString<MyTraits::Residual> 
  { static const std::string value; };

  template<> struct TypeString<MyTraits::Jacobian> 
  { static const std::string value; };

  const std::string TypeString<MyTraits::Residual>::value = 
    "Residual";

  const std::string TypeString<MyTraits::Jacobian>::value = 
    "Jacobian";

  // Data Types
  template<> struct TypeString<double> 
  { static const std::string value; };

  template<> struct TypeString< MyVector<double> > 
  { static const std::string value; };

  template<> struct TypeString< MyTensor<double> > 
  { static const std::string value; };

  template<> struct TypeString< Sacado::Fad::DFad<double> > 
  { static const std::string value; };

  template<> struct TypeString< MyVector<Sacado::Fad::DFad<double> > > 
  { static const std::string value; };

  template<> struct TypeString< MyTensor<Sacado::Fad::DFad<double> > > 
  { static const std::string value; };

  const std::string TypeString<double>::value = 
    "double";

  const std::string TypeString< MyVector<double> >::value = 
    "MyVector<double>";

  const std::string TypeString< MyTensor<double> >::value = 
    "MyTensor<double>";

  const std::string TypeString< Sacado::Fad::DFad<double> >::
  value = "Sacado::Fad::DFad<double>";
  
  const std::string TypeString< MyVector<Sacado::Fad::DFad<double> > >::
  value = "Sacado::Fad::DFad< MyVector<double> >";

  const std::string TypeString< MyTensor<Sacado::Fad::DFad<double> > >::
  value = "Sacado::Fad::DFad< MyTensor<double> >";
}
\endcode
  
\section user_guide_step5 Step 5: Write your Evaluators

Before Writing your evaluators, you must decide on how you plan to build the evaluators.  In most cases you will want to build one Evaluator for each evaluation type.  Phalanx provides an automated factory called the PHX::EvaluatorFactory that will build an evaluator for each evaluation type automatically, but this places a restriction on the constructor of all evaluators built this way.  If you plan to use the automated builder, the constructor for the Evaluator must contain only one argument - a Teuchos::ParameterList.  This paramter list must contain a key called "Type" with an integer value corresponding to the type of Evaluator object to build.  The parameterlist can contain any other information the user requires for proper construction of the Evaluator.  You are not restricted to using the automated factory for every Evalautor.  You can selectively use the automated factory where convenient.

For each field, you will need an evaluator.  Evaluators can evlauate multiple fields at the same time.  You can derive from the base class PHX::Evaluator, or you can derive from class PHX::EvaluatorWithBaseImpl that has most of the methods already implemented so that the same support code is not replicted in each evaluator.  We STRONGLY recommend deriving from the class PHX::EvaluatorWithBaseImpl.

An example for evaluating the density field is:

\code
#ifndef PHX_EXAMPLE_VP_DENSITY_HPP
#define PHX_EXAMPLE_VP_DENSITY_HPP

#include "Phalanx_config.hpp"
#include "Phalanx_Evaluator_WithBaseImpl.hpp"
#include "Phalanx_Evaluator_Derived.hpp"
#include "Phalanx_Field.hpp"

template<typename EvalT, typename Traits>
class Density : 
  public PHX::EvaluatorWithBaseImpl<Traits>,
  public PHX::EvaluatorDerived<EvalT, Traits> {
  
public:
  
  Density(const Teuchos::ParameterList& p);
  
  void postRegistrationSetup(typename Traits::SetupData d,
			     PHX::FieldManager<Traits>& vm);
  
  void evaluateFields(typename Traits::EvalData ud);
  
private:
  
  typedef typename EvalT::ScalarT ScalarT;

  double constant;

  PHX::Field<ScalarT> density;
  PHX::Field<ScalarT> temp;

  std::size_t data_layout_size;

};

#include "Evaluator_Density_Def.hpp"

#endif
\endcode

Note that if you want to use the automated factory PHX::EvaluatorFactory to build an object of each evaluation type, you must derive from the PHX::EvaluatorDerived class as shown in the example above.  This allows the variable manager to store a vector of base object pointers for each evaluation type in a single stl vector.

Also note that we pull the scalar type, ScalarT, out of the evaluation type.

The implementation is just as simple:
\code
// **********************************************************************
template<typename EvalT, typename Traits> Density<EvalT, Traits>::
Density(const Teuchos::ParameterList& p) :
  density("Density", p.get< Teuchos::RCP<PHX::DataLayout> >("Data Layout") ),
  temp("Temperature", p.get< Teuchos::RCP<PHX::DataLayout> >("Data Layout") )
{ 
  this->addEvaluatedField(density);
  this->addDependentField(temp);
  this->setName("Density");
}

// **********************************************************************
template<typename EvalT, typename Traits>
void Density<EvalT, Traits>::
postRegistrationSetup(typename Traits::SetupData d,
		      PHX::FieldManager<Traits>& vm)
{
  this->utils.setFieldData(density,vm);
  this->utils.setFieldData(temp,vm);

  data_layout_size = density.fieldTag().dataLayout().size();
}

// **********************************************************************
template<typename EvalT, typename Traits>
void Density<EvalT, Traits>::evaluateFields(typename Traits::EvalData d)
{ 
  std::size_t size = d.num_cells * data_layout_size;
  
  for (std::size_t i = 0; i < size; ++i)
    density[i] =  temp[i] * temp[i];
}
\endcode

The constructor pulls out data from the parameter list to set the correct data layout.  Additionally, it tells the FieldManager what fields it will evaluate and what fields it requires/depends on to perform the evaluation.

The postRegistrationSetup method gets pointers from the FieldManager to the array for storing data for each particular field.

  Writing evaluators can be tedious.  We have invested much time in minimizing the amount of code a user writes for a new evaluator.  Our experience is that you can literally have hundreds of evaluators.  So we have added macros to hide the boilerplate code in each evaluator.  Not only does this streamline/condense the code, but it also hides much of the templating.  So if your user base is uncomfortable with C++ templates, the macro definitions could be very helpful.  The definitions are found in the file Phalanx_Evaluator_Macros.hpp.  The same evaluator shown above is now implemented using the macro definitions:

Class declaration:
\code
#ifndef PHX_EXAMPLE_VP_DENSITY_HPP
#define PHX_EXAMPLE_VP_DENSITY_HPP

#include "Phalanx_Evaluator_Macros.hpp"
#include "Phalanx_Field.hpp"

PHX_EVALUATOR_CLASS(Density)

  double constant;

  PHX::Field<ScalarT> density;
  PHX::Field<ScalarT> temp;

  std::size_t data_layout_size;

PHX_EVALUATOR_CLASS_END

#include "Evaluator_Density_Def.hpp"

#endif
\endcode

Class definition:
\code
//**********************************************************************
PHX_EVALUATOR_CTOR(Density,p) :
  density("Density", p.get< Teuchos::RCP<PHX::DataLayout> >("Data Layout") ),
  temp("Temperature", p.get< Teuchos::RCP<PHX::DataLayout> >("Data Layout") )
{ 
  this->addEvaluatedField(density);
  this->addDependentField(temp);
  this->setName("Density");
}

//**********************************************************************
PHX_POST_REGISTRATION_SETUP(Density,data,fm)
{
  this->utils.setFieldData(density,fm);
  this->utils.setFieldData(temp,fm);

  data_layout_size = density.fieldTag().dataLayout().size();
}

//**********************************************************************
PHX_EVALUATE_FIELDS(Density,d)
{ 
  std::size_t size = d.num_cells * data_layout_size;
  
  for (std::size_t i = 0; i < size; ++i)
    density[i] =  temp[i] * temp[i];
}
\endcode

The evaluators for the example problem in "phalanx/example/EnergyFlux" have been rewritten using the macro definitions and can be found in the directory "phalanx/test/Utilities/evaluators".  

Finally, since writing even the above code contains much boilerplate, we have written a python script that will generate the above skeleton files for you.  All you need provide is the class name and the filename.  The script is called phalanx_create_evaluator.py and can be found in the "scripts" directory.  A "make install" will place the script in the "bin" directory.  To generate a skeleton for the above function, you would execute the following command at the prompt:

\code
> ./phalanx_create_evaluator.py -c -n Density Evaluator_Density
\endcode

\section user_guide_step6 Step 6: Implement the FieldManager in your code
  Adding the FieldManager to your code is broken into steps.  You must build each Evaluator for each field type, register the evaluators with the FieldManager, and then call the evaluate routines.  Continuing from our example:

\subsection fmd1 A. Building your Evaluators 
Users can build their own Evaluators and register them with the FieldManager or they can use our automated factory, PHX::EvaluatorFactory to handle this for them.  Normally, users will want to build an evaluator for each evaluation type.  The factory makes this very easy.  Additionally, you are not restricted to using the automated factory for every Evalautor.  You can selectively use the automated factory where convenient.  The next two sections describe how to build the evaluators.

\subsubsection fmd1s1 A.1 Building and Registering Evaluators Manually.
  To build an Evaluator manually, all you need to do is allocate the Evaluator on the heap using a reference counted smart pointer (Teuchos::RCP) to point ot the object.  This will ensure proper memory management of the object.  Here is an example of code to build a Density evaluator for each evaluation type and register it with the corresponding manager.

\code
// Create a FieldManager
FieldManager<MyTraits> fm;

// Constructor requirements
RCP<ParameterList> p = rcp(new ParameterList);
p->set< RCP<DataLayout> >("Data Layout", qp);
     
// Residual
Teuchos::RCP< Density<MyTraits::Residual,MyTraits> > residual_density = 
  Teuchos::rcp(new Density<MyTraits::Residual,MyTriats>(p));

fm.registerEvaluator<MyTraits::Residual>(residual_density);

// Jacobian
Teuchos::RCP< Density<MyTraits::Jacobian,MyTraits> > jacobian_density = 
  Teuchos::rcp(new Density<MyTraits::Jacobian,MyTriats>(p));

fm.registerEvaluator<MyTraits::Residual>(jacobian_density);
\endcode

As one can see, this becomes very tedious if there are many evaluation types.  It is much better to use the automated factory to build one for each evalaution type.  Where this method is useful is if you are in a class already templated on the evaluation type, and would like to build and register an evaluator in that peice of code.

\subsubsection fmd1s2 A.2 Using the Automated Factory

  Phalanx provides an automated builder PHX::EvaluatorFactory that will create an object of each evaluation type.  The following requirements are placed on each and every Evaluator that a user writes if they plan to use the automated factory:
<ol>

<li>  The constructor of your Evaluator must take exactly one argument that is a Teuchos::ParamterList object.  A Teuchos::ParameterList allows you to pass an arbitrary number of objects with arbitrary types.

<li>  In the Teuchos::ParamterList, you must have one key called "Type" that is associated with an integer value that uniquely corresponds to an Evaluator object written by the user.  This integer is defined in the users factory traits object described below.

<li>  The Evaluator must derive from the PHX::EvaluatorDerived class as shown in the example above.  This allows the variable manager to store a vector of base object pointers for each evaluation type in a single stl vector yet be able to return the derived class.

</ol>


To build an PHX::EvaluatorFactory, you must provide a factory traits class that gives the factory a list of its evaluator types.  An example of the factory traits class is the FactoryTraits object in the file FactoryTraits.hpp:

\code
#ifndef EXAMPLE_FACTORY_TRAITS_HPP
#define EXAMPLE_FACTORY_TRAITS_HPP

// mpl (Meta Programming Library) templates
#include "Sacado_mpl_vector.hpp"

// User Defined Evaluator Types
#include "Evaluator_Constant.hpp"
#include "Evaluator_Density.hpp"
#include "Evaluator_EnergyFlux_Fourier.hpp"
#include "Evaluator_FEInterpolation.hpp"
#include "Evaluator_NonlinearSource.hpp"


#include "Sacado_mpl_placeholders.hpp"
using namespace Sacado::mpl::placeholders;

template<typename Traits>
struct MyFactoryTraits {
  
  static const int id_constant = 0;
  static const int id_density = 1;
  static const int id_fourier = 2;
  static const int id_feinterpolation = 3;
  static const int id_nonlinearsource = 4;

  typedef Sacado::mpl::vector< Constant<_,Traits>,             // 0
 			       Density<_,Traits>,              // 1
 			       Fourier<_,Traits>,              // 2
 			       FEInterpolation<_,Traits>,      // 3
 			       NonlinearSource<_,Traits>       // 4
  > EvaluatorTypes;

};

#endif

\endcode 

Since the factory is built at compile time, we need to link a run-time choice to the compile-time list of object types.  Thus the user must provide the static const int identifiers that are unique for each type that can be constructed.  Users can ignore the "_" argument in the mpl vector.  This is a placeholder argument that allows us to iterate over and instert each evaluation type into the factory traits.

Now let's build the evaluators.  The following code can be found in the Example.cpp file in the directory "phalanx/example/EnergyFlux".  We create a ParameterList for each evaluator and call the factory constructor.  Since we are using the factory, we need to specify the "Type" argument set to the integer of the corresponding Evaluator in the factory traits that we wish to build:

\code
      RCP<DataLayout> qp = rcp(new FlatLayout("QP", 4));
      RCP<DataLayout> node = rcp(new FlatLayout("NODE", 4));

      // Parser will build parameter list that determines the field
      // evaluators to build
      map<string, RCP<ParameterList> > evaluators_to_build;
      
      { // Temperature
	RCP<ParameterList> p = rcp(new ParameterList);
	int type = MyFactoryTraits<MyTraits>::id_constant;
	p->set<int>("Type", type);
	p->set<string>("Name", "Temperature");
	p->set<double>("Value", 2.0);
	p->set< RCP<DataLayout> >("Data Layout", node);
	evaluators_to_build["DOF_Temperature"] = p;
      }
      { // Density
	RCP<ParameterList> p = rcp(new ParameterList);
	int type = MyFactoryTraits<MyTraits>::id_density;
	p->set<int>("Type", type);
	p->set< RCP<DataLayout> >("Data Layout", qp);
	evaluators_to_build["Density"] = p;
      }

      { // Constant Thermal Conductivity
	RCP<ParameterList> p = rcp(new ParameterList);
	int type = MyFactoryTraits<MyTraits>::id_constant;
	p->set<int>("Type", type);
	p->set<string>("Name", "Thermal Conductivity");
	p->set<double>("Value", 2.0);
	p->set< RCP<DataLayout> >("Data Layout", qp);
	evaluators_to_build["Thermal Conductivity"] = p;
      }
      
      { // Nonlinear Source
	RCP<ParameterList> p = rcp(new ParameterList);
	int type = MyFactoryTraits<MyTraits>::id_nonlinearsource;
	p->set<int>("Type", type);
	p->set< RCP<DataLayout> >("Data Layout", qp);
	evaluators_to_build["Nonlinear Source"] = p;
      }

      { // Fourier Energy Flux
	RCP<ParameterList> p = rcp(new ParameterList);
	int type = MyFactoryTraits<MyTraits>::id_fourier;
	p->set<int>("Type", type);
	p->set< RCP<DataLayout> >("Data Layout", qp);
	evaluators_to_build["Energy Flux"] = p;
      }

      { // FE Interpolation
	RCP<ParameterList> p = rcp(new ParameterList);

	int type = MyFactoryTraits<MyTraits>::id_feinterpolation;
	p->set<int>("Type", type);

	p->set<string>("Node Variable Name", "Temperature");
	p->set<string>("QP Variable Name", "Temperature");
	p->set<string>("Gradient QP Variable Name", "Temperature Gradient");

	p->set< RCP<DataLayout> >("Node Data Layout", node);
	p->set< RCP<DataLayout> >("QP Data Layout", qp);

	evaluators_to_build["FE Interpolation"] = p;
      }

      // Build Field Evaluators for each evaluation type
      EvaluatorFactory<MyTraits,MyFactoryTraits<MyTraits> > factory;
      RCP< vector< RCP<Evaluator_TemplateManager<MyTraits> > > > 
	evaluators;
      evaluators = factory.buildEvaluators(evaluators_to_build);
 
      // Create a FieldManager
      FieldManager<MyTraits> fm;

      // Register all Evaluators 
      registerEvaluators(evaluators, fm);

\endcode

The map "evaluators_to_build" has a key of type "std::string".  This key is irrelevant to the execution of the code.  It is an identifer that can help users debug code or search for a specific provider in the list.  What you put in the key is for your own use.

You are free to register evaluators until the time you call the method postRegistrationSetup() on the field manager.  Once this is called, you can not register any more evaluators.  You can make multiple registration calls to add more providers - there is no limit on the number of calls.

\subsection fmd2 B. Request which Fields to Evaluate

You must tell the FieldManager which quantities it should evaluate.  This step can occur before, after, or in-between the registration of Evaluators.  You must request variables separately for each evaluation type.  This is required since since data types can exist in multiple evaluation types.

\code
      // Request quantities to assemble RESIDUAL PDE operators
      {
	typedef MyTraits::Residual::ScalarT ResScalarT;

	Tag< MyVector<ResScalarT> > energy_flux("Energy_Flux", qp);
	fm.requireField<MyTraits::Residual>(energy_flux);

	Tag<ResScalarT> source("Nonlinear Source", qp);
	fm.requireField<MyTraits::Residual>(source);
      }

      // Request quantities to assemble JACOBIAN PDE operators
      {
	typedef MyTraits::Jacobian::ScalarT JacScalarT;

	Tag< MyVector<JacScalarT> > energy_flux("Energy_Flux", qp);
	fm.requireField<MyTraits::Jacobian>(energy_flux);

	Tag<JacScalarT> source("Nonlinear Source", qp);
	fm.requireField<MyTraits::Jacobian>(source);
      }
\endcode

You are free to request fields to evaluate until the time you call the method postRegistrationSetup() on the field manager.  Once this is called, you can not request any more fields. 

\subsection fmd3 C. Call FieldManager::postRegistrationSetup()

Once the evaluators are registered with the FieldManager and it knows which field it needs to provide, call the postRegistrationSetup() method on the FieldManager.  This method requires that the user specify the maximum number of cells for each evaluation - the workset size.  This number should be selected so that all fields can fit in the cache of the processor if possible.  

\code
      // Assume we have 102 cells on processor and can fit 20 cells in cache
      const std::size_t num_local_cells = 102;
      const std::size_t workset_size = 20;

      fm.postRegistrationSetup(workset_size);
\endcode

The postRegistrationSetup() method causes the following actions to take place in the FieldManager:

<ol>
<li> Based on the requested fields in \ref fmd2, the FieldManager will trace through the evaluators to determine which evaluators to call and the order in which they need to be called to achieve a consistent evaluation. Not all evaluators that are registered will be used.  They will only be called to satisfy dependencies of the required fields.
<li> Once the dependency chain is known, we can pull together a flat list of all fields that will be used.  Now the FieldManager will allocate memory to store all fields.  It will use the Allocator object from the users traits class to accomplish this.  By unifying the allocation into a single object, we can force all fields to be allocated in a single contiguous block of memory if desired.
<li> Once the memory for field data is allocated, we must set the pointer to that memory block inside each Field or MDField object in each evaluator.  The FieldManager does this by calling the postRegistrationSetup() method on each evaluator that is required for the computation. 
</ol>

Note: you can no longer register evaluators or request fields to evaluate once postRegistrationSetup() method is called.

\subsection fmd6 D. Setup Worksets

If the user plans to use worksets, these objects are typically passed in through the member of the evaluate call as defined in your traits class.  In our example, we chose to pass a user defined class called a MyWorkset in to the evaluate call.  Since we plan to use worksets, each object of MyWorkset contains information about the workset.  Here is the MyWorkset implementation: 

\code
#ifndef PHX_EXAMPLE_MY_WORKSET_HPP
#define PHX_EXAMPLE_MY_WORKSET_HPP

#include "Phalanx_config.hpp" // for std::vector
#include "Cell.hpp"

struct MyWorkset {
  
  std::size_t local_offset;

  std::size_t num_cells;
  
  std::vector<MyCell>::iterator begin;

  std::vector<MyCell>::iterator end;

};

#endif
\endcode

The user has written a MyCell class that contains data for each specific local cell.  MyWorkset contains iterators to the beginning and end of the chunk of cells for this particular workset.  The local_offset is the starting index into the local cell array.  The num_cells is the number of cells in the workset.   The MyWorkset objects are created one for each workset via the following code:

\code
      // Create Workset information: Cells and EvalData objects
      std::vector<MyCell> cells(num_local_cells);
      for (std::size_t i = 0; i < cells.size(); ++i)
	cells[i].setLocalIndex(i);
      std::vector<MyWorkset> worksets;

      std::vector<MyCell>::iterator cell_it = cells.begin();
      std::size_t count = 0;
      MyWorkset w;
      w.local_offset = cell_it->localIndex();
      w.begin = cell_it;
      for (; cell_it != cells.end(); ++cell_it) {
	++count;
	std::vector<MyCell>::iterator next = cell_it;
	++next;
	
	if ( count == workset_size || next == cells.end()) {
	  w.end = next;
	  w.num_cells = count;
	  worksets.push_back(w);
	  count = 0;

	  if (next != cells.end()) {
	    w.local_offset = next->localIndex();
	    w.begin = next;
	  }
	}
      }
\endcode

Note that you do not have to use the workset idea.  You could just pass in workset size equal to the number of local cells on the processor or you could use a workset size of one cell and wrap the evaluate call in a loop over the number of cells.  Be aware that this can result in a possible performance hit.

\subsection fmd4 E. Call evaluate()

Finally, users can call the evlauate routines and the pre/post evaluate routines if required.

\code
      fm.preEvaluate<MyTraits::Residual>(NULL);

      // Process all local cells on processor by looping over worksets
      for (std::size_t i = 0; i < worksets.size(); ++i) {

	fm.evaluateFields<MyTraits::Residual>(worksets[i]);
	  
	// Use workset values
                .
                .
                .
      }

      fm.postEvaluate<MyTraits::Residual>(NULL);
\endcode

\subsection fmd5 F. Accessing Data

Accessing field data is achieved as follows:

\code
      Field< MyVector<double> > ef("Energy_Flux", qp);

      fm.getFieldData<MyVector<double>,MyTraits::Residual>(ef);
\endcode

You do not need to use the Field objects to access field data.  You can get the reference counted smart pointer to the data array directly by using the field tag:

\code
      RCP<DataLayout> qp = rcp(new FlatLayout("QP", 4));
      PHX::FieldTag<double> s("Nonlinear Source", qp);
      Teuchos::ArrayRCP<double> source_values;
      fm.getFieldData<double,MyTraits::Residual>(s,source_values);
\endcode

*/

/* ************************************************************************ */
/* ************************************************************************ */

/*! \page faq Frequently Asked Questions

\section Questions
\ref faq1

\ref faq2

\ref faq3

\ref faq4

\ref faq5

\ref faq6

\ref faq7

\ref faq8

\ref faq9

\section faq1 1. Why name it Phalanx?  
The phalanx was one of the most dominant military formations of the Greek armies in the classical period.  It was a strictly ordered formation.  The Phalanx software library figures out ordered dependencies of field evaluators.   A second more obscure reference relates to the US Navy.  The Phalanx software package was designed to provide nonlinear functionality for the <a href="http://trilinos.sandia.gov/packages/intrepid">Intrepid</a> itegration library.  Intrepid was the name of an aircraft carrier during in WW II.  Modern US aircraft carriers are protected by a Close-In Weapons System (CIWS) named the Phalanx.  Finally, the PI of this project is an avid strategy warfare gamer and leans towards military references.

\section faq2 2. Why does the evaluation type exist?
This allows for multiple evaluations that used the same Scalar type, but different evaluators specific to the CalculationType.  For example, evaluating the Jacobian (derivative of the residual with respect to the degrees of freedom) and the parameter sensitivities (derivative of the residual with respect to the parameters) both use the same Sacado data type.  They both want derivatives but with respect to a different set of entities.

\section faq3 3. Why do evaluation types have arbitrary data types.  Shouldn't all fields of an evaluation type require the same scalar type?
In 99% of the use cases, the answer is yes!  By changing the scalar type for a specific type of computation, you can break the dependency chain and get incorrect answers.  For example, if computing a Jacobian, you would use a FAD scalar type instead of a double scalar type for the evaluation.  By mixing FAD and double fields in an evaluation type, you will lose derivative components and get incorrect sensitivities.  But there are some use cases where one might want to use different data types.  If the user knows that certain evaluation routines have no derivative components, they can bypass the AD chain by switching to doubles for that part of the computation.  The is an expert only user mode, but we allow for it.

\section faq4 4. Why are the evaluators templated on the evaluation type?
Each evaluator must be templated on the evaluation type to be able to (1) allow for the expert use mode described in FAQ \ref faq3; (2) use the automated factory class (PHX::EvaluatorFactory) to build the evaluator for each evalution type.  

\section faq5 5. Why is the MDALayout object not templated on the ArrayOrder type?
This introduces many complications in the comparison operator (operator==) during a comparison of a reverse and natural data layout with the same (but reversed index types).  Should these objects be the same?  Probably, but this might introduce some confusion.  So in an effort to keep things as clean as possible, there is no differentiation based on forward or reverse (Fortran) array ordering.  If we were to introduce this, then we would have to add another method to the DataLayout base class to allow for checking the ordering.  If mixing natural and reverse ordering becomes common, we may revisit this decision.

\section faq6 6. My code seems to be running slow.  How can I speed it up?
See the section on \ref performance in the Users Guide that gives recomendations on how to maximize the efficiency of your code.

\section faq7 7. Compilation take a long time when minor changes are made to an evaluator.  How can I speed this up?
Explicit template instantiation can be used to speed up evaluator development builds.  See the section on \ref performance in the Users Guide.  

\section faq8 8. Valgrind gives a memory leak error in all evaluators when using the PHX::ContiguousAllocator with Sacado::DFad based fields.  What is the problem?
The contiguous allocator is allocating space for all variables in an evaluation type in a single contiguous chunk of memory.  We allow for multiple scalar types in an evaluation type so the array is allocated using the type char and then based on scalar type sizes (using sizeof method) along with proper alignment, pointers to blocks of memory are handed to the fields using a reinterpret_cast.  DFad uses new to allocate the derivative arrays, so this memory is not contiguous.  The problem is that the field array memory is deleted as a char instead of it's various data types, so all DFad derivative arrays are leaked during the destruction.  There are two ways to alleviate this: (1) write a special Allocator to call the destructor of the field type for each field.  (2) Have Sacado implement a version of DFad that allows the user to allocate the DFad array and pass in the pointer during construction.  This will also require a new allocator.  Choice (2) will be handed in Phalanx in a later release.  Our current suggestion is that one should use DFad only if using the PHX::NewAllocator and use SFad if using the PHX::ContiguousAllocator. 

\section faq9 9. The evaluators contain boilerplate.  Is there an easy way to automate creating the evaluators?
YES!  There is a python script that will automatically generate the skeleton for evaluators.  It is found in the "scripts" directory of phalanx and is called "phalanx_create_evaluator.py".  It will be install in the "bin" directory when trilinos is installed.  Example use can be found at the end of section \ref user_guide_step5 of the \ref user_guide.

*/

/* ************************************************************************ */
/* ************************************************************************ */

/*! \page developer_notes Developer Notes

\section dnotes_notz Description

This is a description Pat Notz called his elevator speech on Phalanx:

\verbatim
The {NAME} package is a library to help decompose a complex problem
into a set of simpler problems with managed dependencies. Two benefits
of decomposing a problem using {NAME} are (1) increased flexibility
because each simpler piece becomes an extension point that can be
swapped out with different implementations and (2) easier to craft
code because each piece is simpler, more focused and easier to test in
isolation.  The counter-benefits to these are (1) a potential
performance loss due to fragmentation of the over-all algorithm (e.g.,
several small loops instead of one large loop) and (2) a potential
loss of visibility of the original, composite problem (since the code
is scattered into multiple places).  Managing these trade-offs can
result in application code that both performs well and supports rapid
development and extensibility.  
\endverbatim

\section dnotes_contiguous_allocator Contiguous Allocator

Simple concept for chunk-layout of heterogeneous members with proper
data alignment.

A 'malloc'ed chunk of memory is always suitable to hold any data type,
or array of data types (do a 'man malloc').

\verbatim
        unsigned char * chunk = malloc( chunk_size );
\endverbatim

If a 'double' is to occupy a portion of the chunk then is must be
aligned such that:

\verbatim
        double * d = (double *)( chunk + offset_d );
\endverbatim

where:

\verbatim
        0 == offset_d % sizeof(double);
\endverbatim

The key point is for the offset of each member within the chunk to
have the correct alignment, and of course not overlap any other
member.

\section dnotes_old_description Old email with description of variable manager 

Email description of Phalanx (formerly known as the variable manager):

Phalanx is a tool for controlling the evaluation of
data used in the assembly of a residual/Jacobian fill for PDE solvers.
I implemented the one in Charon and one also exists in Aria (although it
is used for a different purpose).  The idea behind the VM is to allow
for fast changes to dependency trees.  The VM provides variable
quantities.  A variable consists of a name (string), a type (scalar,
vector, tensor - all templated for ad), and the place that it lives
(quadrature point, node, edge, face, element, etc...).  A variable is
evaluated using a provider.  The provider contains lists of variables it
provides (evlauates) and a list of variables it requires to perform the
evaluation (its dependencies).  The VM figures out the correct order to
call the providers to get the dependency chain correct.

I developed this for Charon to address the following issues:

1. Determine the order of variable evaluation so that the correct
evaluation is performed: For example, to compute the density, it could
be a constant, it could be a function of temperature or it could be a
function of temperature and pressure.  A factory creates the density
provider and from this, the manager automatically figures out the
correct evaluation order - whether T and/or P will have to be
evaluated before the density.  This is an easy example, but there are
much more complex dependency chains.

2. Provides a single place to store all evaluation variables so that the
procedure by which the variable is evaluated is not known by the objects
using the variable.  For example, in Charon a quantity of interest is
the velocity components.  The velocity could be a degree of freedom or
it could be a user specified constant, or it could be a user specified
field (read from an exodus file).  Before we had the variable manager,
everywhere in Charon that used velocity, there were repeated loops that
looked like:

\verbatim
if (velocity is DOF)
  // get values from dof structure
else if (velocity is auxiliary data)
  // go get from aux data structure
else if (...)
\endverbatim

There were hard coded loops in multiple places that had to be updated
every time we added a new way to specify velocity.   Now, when velocity
is required, you just go to the VM and get it.  No questions asked.

3. By having a single place to store the variables, we eliminated the
reevaluation of temporary quantities that are used in different PDE
operators and different equations.  Prior to using the variable manager,
the reuse of variables like density was ignored. It was recomputed for
each operator evaluation.  After implementing the variable manager, my
residual evaluation times were reduced by a factor of 7 for reacting
flow problems!

4. By having the evaluation order, one can perform automatic
differentiation without any library like saccado.  When the user writes
a provider for a function, they can also write a local Jacobian for the
individual function.  Then by using the chain rule, the local element
Jacobian can be assembled by traversing the dependency tree generated by
the variable manager.  Ariawrote their VM to do this, since Sacado was
not around when they started.  Aria does use Sacado for user defined
functions, but uses the VM for internal functions.

There are other benefits, but you get the idea.  A primary goal of my
current ASC funding is to generalize the variable manager in Charon
(there are many things I can do better now to make it more efficient and
easier to use) as a trilinos package to be used with Intrepid to aid in
element assembly for nonlinear equation sets.

  */

/* ************************************************************************ */
/* ************************************************************************ */

/*! \page junk Junk

  \todo Nothing pending!

*/

/* ************************************************************************ */
/* ************************************************************************ */

#endif