/usr/lib/NAST-iEr/README is in nast-ier 20101212+dfsg-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

## NAST-iEr  ##

The NAST-iEr alignment utility aligns a single raw nucleotide sequence against one or more NAST-formatted sequences. 

The alignment algorithm performs global dynamic programming alignment against one or more fixed template sequences, with no end-gap penalty (think of Pearson's align0 program with a fixed template sequence containing arbitrary gap positions). Characters other than G, A, T, and C are not recognized and are treated as 'A' within the alignment algorithm.
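
To make the idea concrete, here is a minimal Python sketch of end-gap-free global alignment scoring. It is not the NAST-iEr implementation. The function names, the scoring values (match, mismatch, gap), and the choice to strip '-' and '.' gap characters from the template before scoring are all assumptions made for illustration. Only the mapping of non-GATC characters to 'A' comes from the description above.

```python
def normalize(seq):
    """Uppercase the sequence and map every non-GATC character to 'A'."""
    return "".join(c if c in "GATC" else "A" for c in seq.upper())


def end_gap_free_score(query, template, match=1, mismatch=-1, gap=-2):
    """Best global alignment score when gaps at either end cost nothing.

    The scoring values are illustrative assumptions, not NAST-iEr's own.
    """
    q = normalize(query)
    # Assumption: template gap columns ('-' or '.') are ignored for scoring.
    t = normalize(template.replace("-", "").replace(".", ""))

    prev = [0] * (len(t) + 1)  # row 0 is all zeros: leading gaps are free
    best = 0                   # an alignment made only of end gaps scores 0
    for i in range(1, len(q) + 1):
        cur = [0] * (len(t) + 1)  # column 0 stays zero: leading gaps are free
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if q[i - 1] == t[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        # Template fully consumed: the rest of the query is a free end gap.
        best = max(best, cur[-1])
        prev = cur
    # Query fully consumed: the rest of the template is a free end gap.
    return max(best, max(prev))
```

Because the first row and first column are zero and the result is taken over the last row and last column, a short query that matches the middle of a longer template is not penalized for the unaligned template ends.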


## INSTALLATION
The following software tools must be installed separately and be available on your PATH.

       megablast:  http://www.ncbi.nlm.nih.gov/BLAST/download.shtml

       cdbtools:  http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/download.pl?ftp_dir=software&file_dir=cdbfasta/cdbfasta.tar.gz

Build NAST-iEr from source like so:

        gcc NAST-iEr.c -o NAST-iEr





## Running NAST-iEr

Example data is provided in the sample_data/ directory.

Examine and execute the 'runMe.sh' script to see NAST-iEr align a single query sequence against a collection of reference sequences provided in NAST format.


## Converting an entire database of query sequences into NAST format

A wrapper to the NAST-iEr utility is provided by the script 'run_NAST-iEr.pl'.  

This script converts each sequence within a database of unaligned 16S sequences to NAST format by doing the following:

- performs a megablast search against a database of reference sequences, identifying the top hits
- extracts these top hits from a NAST-formatted version of the reference sequence database
- runs NAST-iEr to convert the query sequence to NAST format
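
The hit-selection step in the list above can be sketched in Python as follows. This is an illustration only, not the wrapper's code. It assumes the search results are available as tab-delimited lines, best hits first, with the reference sequence ID in the second column (as in BLAST's tabular report). The function name `top_hit_ids` and the default of 10 hits are hypothetical.

```python
def top_hit_ids(tabular_text, max_hits=10):
    """Return up to max_hits distinct reference IDs, in order of appearance.

    Assumes tab-delimited hit lines with the reference (subject) ID in the
    second column and the best hits listed first.
    """
    if max_hits <= 0:
        return []
    seen = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        subject = fields[1]
        if subject not in seen:
            seen.append(subject)
            if len(seen) == max_hits:
                break
    return seen
```

The returned IDs are the keys one would then look up in the NAST-formatted reference database before running NAST-iEr on the query.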

Before running this wrapper script, be sure to configure its default settings by indicating the paths to the reference 16S database in both FASTA and NAST formats.

Once configured, converting unaligned sequences to NAST format is as simple as running:

     run_NAST-iEr.pl --query_FASTA  mySequences.fasta  >  mySequences.NAST



Questions or comments? Contact Brian Haas: bhaas@broad.mit.edu