/usr/share/httrack/html/options.html

<html xmlns="http://www.w3.org/1999/xhtml" lang="en">

<head>
	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
	<meta name="description" content="HTTrack is an easy-to-use website mirror utility. It allows you to download a World Wide website from the Internet to a local directory,building recursively all structures, getting html, images, and other files from the server to your computer. Links are rebuiltrelatively so that you can freely browse to the local site (works with any browser). You can mirror several sites together so that you can jump from one toanother. You can, also, update an existing mirror site, or resume an interrupted download. The robot is fully configurable, with an integrated help" />
	<meta name="keywords" content="httrack, HTTRACK, HTTrack, winhttrack, WINHTTRACK, WinHTTrack, offline browser, web mirror utility, aspirateur web, surf offline, web capture, www mirror utility, browse offline, local  site builder, website mirroring, aspirateur www, internet grabber, capture de site web, internet tool, hors connexion, unix, dos, windows 95, windows 98, solaris, ibm580, AIX 4.0, HTS, HTGet, web aspirator, web aspirateur, libre, GPL, GNU, free software" />
	<title>HTTrack Website Copier - Offline Browser</title>

	<style type="text/css">
	<!--

body {
	margin: 0;  padding: 0;  margin-bottom: 15px;  margin-top: 8px;
	background: #77b;
}
body, td {
	font: 14px "Trebuchet MS", Verdana, Arial, Helvetica, sans-serif;
	}

#subTitle {
	background: #000;  color: #fff;  padding: 4px;  font-weight: bold; 
	}

#siteNavigation a, #siteNavigation .current {
	font-weight: bold;  color: #448;
	}
#siteNavigation a:link    { text-decoration: none; }
#siteNavigation a:visited { text-decoration: none; }

#siteNavigation .current { background-color: #ccd; }

#siteNavigation a:hover   { text-decoration: none;  background-color: #fff;  color: #000; }
#siteNavigation a:active  { text-decoration: none;  background-color: #ccc; }


a:link    { text-decoration: underline;  color: #00f; }
a:visited { text-decoration: underline;  color: #000; }
a:hover   { text-decoration: underline;  color: #c00; }
a:active  { text-decoration: underline; }

#pageContent {
	clear: both;
	border-bottom: 6px solid #000;
	padding: 10px;  padding-top: 20px;
	line-height: 1.65em;
	background-image: url(images/bg_rings.gif);
	background-repeat: no-repeat;
	background-position: top right;
	}

#pageContent, #siteNavigation {
	background-color: #ccd;
	}


.imgLeft  { float: left;   margin-right: 10px;  margin-bottom: 10px; }
.imgRight { float: right;  margin-left: 10px;   margin-bottom: 10px; }

hr { height: 1px;  color: #000;  background-color: #000;  margin-bottom: 15px; }

h1 { margin: 0;  font-weight: bold;  font-size: 2em; }
h2 { margin: 0;  font-weight: bold;  font-size: 1.6em; }
h3 { margin: 0;  font-weight: bold;  font-size: 1.3em; }
h4 { margin: 0;  font-weight: bold;  font-size: 1.18em; }

.blak { background-color: #000; }
.hide { display: none; }
.tableWidth { min-width: 400px; }

.tblRegular       { border-collapse: collapse; }
.tblRegular td    { padding: 6px;  background-image: url(fade.gif);  border: 2px solid #99c; }
.tblHeaderColor, .tblHeaderColor td { background: #99c; }
.tblNoBorder td   { border: 0; }


// -->
</style>

</head>

<table width="76%" border="0" align="center" cellspacing="0" cellpadding="0" class="tableWidth">
	<tr>
	<td><img src="images/header_title_4.gif" width="400" height="34" alt="HTTrack Website Copier" title="" border="0" id="title" /></td>
	</tr>
</table>
<table width="76%" border="0" align="center" cellspacing="0" cellpadding="3" class="tableWidth">
	<tr>
	<td id="subTitle">Open Source offline browser</td>
	</tr>
</table>
<table width="76%" border="0" align="center" cellspacing="0" cellpadding="0" class="tableWidth">
<tr class="blak">
<td>
	<table width="100%" border="0" align="center" cellspacing="1" cellpadding="0">
	<tr>
	<td colspan="6"> 
		<table width="100%" border="0" align="center" cellspacing="0" cellpadding="10">
		<tr> 
		<td id="pageContent"> 
<!-- ==================== End prologue ==================== -->

<h2 align="center"><em>Options</em></h2>

<ul>
  <li>Filters: <a href="filters.html">how to use them</a></li>
  <br><small>Here you can find informations on filters: how to accept all gif files in a mirror, for example</small>
  <br><br>
  <li>List of options</li>
</ul>

<tt>
<pre>

w  mirror with automatic wizard
This is the default scanning option, the engine automatically scans links according to the default options, and filters defined. It does not prompt a message when a "foreign" link is reached.

W  semi-automatic mirror with help-wizard (asks questions)
This option lets the engine ask the user if a link must be mirrored or not, when a new web has been found.

g  just get files (saved in the current directory)
This option forces the engine not to scan the files indicated - i.e. the engine only gets the files indicated.

i  continue an interrupted mirror using the cache
This option indicates to the engine that a mirror must be updated or continued.

rN  recurse get with limited link depth of N
This option sets the maximum recurse level. Default is infinite (the engine "knows" that it should not go out of current domain)

a   stay on the same address
This is the default primary scanning option, the engine does not go out of domains without permissions (filters, for example)

d   stay on the same principal domain
This option lets the engine go on all sites that exist on the same principal domain.
Example: a link located at www.someweb.com that goes to members.someweb.com will be followed.

l   stay on the same location (.com, etc.)
This option lets the engine go on all sites that exist on the same location.
Example: a link located at www.someweb.com that goes to www.anyotherweb.com will be followed.
Warning: this is a potentially dangerous option, limit the recurse depth with r option.

e   go everywhere on the web
This option lets the engine go on any sites.
Example: a link located at www.someweb.com that goes to www.anyotherweb.org will be followed.
Warning: this is a potentially dangerous option, limit the recurse depth with r option.

n   get non-html files 'near' an html file (ex: an image located outside)
This option lets the engine catch all files that have references on a page, but that exist outside the web site.
Example: List of ZIP files links on a page.

t   test all URLs (even forbidden ones)
This option lets the engine test all links that are not caught.
Example: to test broken links in a site

x   replace external html links by error pages
This option tells the engine to rewrite all links not taken into warning pages.
Example: to browse offline a site, and to warn people that they must be online if they click to external links.

sN  follow robots.txt and meta robots tags
This option sets the way the engine treats "robots.txt" files. This file is often set by webmasters to avoir cgi-bin directories, or other irrevelant pages.
Values: 
  s0  Do not take robots.txt rules
  s1  Follow rules, if compatible with internal filters
  s2  Always follow site's rules

bN  accept cookies in cookies.txt
This option activates or unactivates the cookie
  b0 do not accept cookies
  b1 accept cookies

S   stay on the same directory
This option asks the engine to stay on the same folder level.
Example: A link in /index.html that points to /sub/other.html will not be followed

D   can only go down into subdirs
This is the default option, the engine can go everywhere on the same directoy, or in lower structures

U   can only go to upper directories
This option asks the engine to stay on the same folder level or in upper structures

B   can both go up&down into the directory structure
This option lets the engine to go in any directory level

Y   mirror ALL links located in the first level pages (mirror links)
This option is activated for the links typed in the command line
Example: if you have a list of web sites in www.asitelist.com/index.html, then all these sites will be mirrored

NN  name conversion type (0 *original structure 1,2,3 html/data in one directory)
  N0 Site-structure (default)
  N1 Html in web/, images/other files in web/images/
  N2 Html in web/html, images/other in web/images
  N3 Html in web/,  images/other in web/
  N4 Html in web/, images/other in web/xxx, where xxx is the file extension (all gif will be placed onto web/gif, for example)
  N5 Images/other in web/xxx and Html in web/html

  N99 All files in web/, with random names (gadget !)

  N100 Site-structure, without www.domain.xxx/
  N101 Identical to N1 exept that "web" is replaced by the site's name
  N102 Identical to N2 exept that "web" is replaced by the site's name
  N103 Identical to N3 exept that "web" is replaced by the site's name
  N104 Identical to N4 exept that "web" is replaced by the site's name
  N105 Identical to N5 exept that "web" is replaced by the site's name
  N199 Identical to N99 exept that "web" is replaced by the site's name

  N1001 Identical to N1 exept that there is no "web" directory
  N1002 Identical to N2 exept that there is no "web" directory
  N1003 Identical to N3 exept that there is no "web" directory (option set for g option)
  N1004 Identical to N4 exept that there is no "web" directory
  N1005 Identical to N5 exept that there is no "web" directory
  N1099 Identical to N99 exept that there is no "web" directory

LN  long names
  L0 Filenames and directory names are limited to 8 characters + 3 for extension
  L1 No restrictions (default)

K   keep original links (e.g. http://www.adr/link) (K0 *relative link)
This option has only been kept for compatibility reasons

pN  priority mode:
  p0 just scan, don't save anything (for checking links)
  p1 save only html files
  p2 save only non html files
  p3 save all files
  p7 get html files before, then treat other files

cN  number of multiple connections (*c8)
Set the numer of multiple simultaneous connections

O   path for mirror/logfiles+cache (-O path_mirror[,path_cache_and_logfiles])
This option define the path for mirror and log files
Example: -P "/user/webs","/user/logs"

P   proxy use (-P proxy:port or -P user:pass@proxy:port)
This option define the proxy used in this mirror
Example: -P proxy.myhost.com:8080

F   user-agent field (-F \"user-agent name\
This option define the user-agent field
Example: -F "Mozilla/4.5 (compatible; HTTrack 1.2x; Windows 98)"

mN maximum file length for a non-html file
This option define the maximum size for non-html files
Example: -m100000

mN,N'  for non html (N) and html (N')
This option define the maximum size for non-html files and html-files
Example: -m100000,250000

MN maximum overall size that can be uploaded/scanned
This option define the maximum amount of bytes that can be downloaded
Example: -M1000000

EN maximum mirror time in seconds (60=1 minute, 3600=1 hour)
This option define the maximum time that the mirror can last
Example: -E3600

AN maximum transfer rate in bytes/seconds (1000=1kb/s max)
This option define the maximum transfer rate
Example: -A2000

GN pause transfer if N bytes reached, and wait until lock file is deleted
This option asks the engine to pause every time N bytes have been transferred, and restarts when the lock file "hts-pause.lock" is being deleted
Example: -G20000000

u check document type if unknown (cgi,asp..)
This option define the way the engine checks the file type
  u0 do not check
  u1 check but /
  u2 check always

RN number of retries, in case of timeout or non-fatal errors (*R0)
This option sets the maximum number of tries that can be processed for a file

o *generate output html file in case of error (404..) (o0 don't generate)
This option define whether the engine has to generate html output file or not if an error occured

TN timeout, number of seconds after a non-responding link is shutdown
This option define the timeout
Example: -T120

JN traffic jam control, minimum transfert rate (bytes/seconds) tolerated for a link
This option define the minimum transfer rate
Example: -J200

HN host is abandonned if: 0=never, 1=timeout, 2=slow, 3=timeout or slow
This option define whether the engine has to abandon a host if a timeout/"too slow" error occured

&P extended parsing, attempt to parse all links (even in unknown tags or Javascript)
This option activates the extended parsing, that attempt to find links in unknown Html code/javascript

j *parse Java Classes (j0 don't parse)
This option define whether the engine has to parse java files or not to catch included files

I *make an index (I0 don't make)
This option define whether the engine has to generate an index.html on the top directory

X *delete old files after update (X0 keep delete)
This option define whether the engine has to delete locally, after an update, files that have been deleted in the remote mirror, or that have been excluded

C *create/use a cache for updates and retries (C0 no cache)
This option define whether the engine has to generate a cache for retries and updates or not

k  store all files in cache (not useful if files on disk)
This option define whether the engine has to store all files in cache or not

V execute system command after each files ($0 is the filename: -V \"rm \\$0\
This option lets the engine execute a command for each file saved on disk

q  quiet mode (no questions)
Do not ask questions (for example, for confirm an option)

Q  log quiet mode (no log)
Do not generate log files

v  verbose screen mode
Log files are printed in the screen

f *log file mode
Log files are generated into two log files

z  extra infos log
Add more informations on log files

Z  debug log
Add debug informations on log files


--mirror  <URLs> *make a mirror of site(s) 
--get <URLs>  get the files indicated, do not seek other URLs
--mirrorlinks <URLs>  test links in pages (identical to -Y)
--testlinks   <URLs>  test links in pages
--spider  <URLs>  spider site(s), to test links (reports Errors & Warnings)
--update  <URLs>  update a mirror, without confirmation
--skeleton<URLs>  make a mirror, but gets only html files

--http10  force http/1.0 requests when possible

</pre>
</tt>

<!-- ==================== Start epilogue ==================== -->

		</td>
		</tr>
		</table>
	</td>
	</tr>
	</table>
</td>
</tr>
</table>

<table width="76%" border="0" align="center" valign="bottom" cellspacing="0" cellpadding="0">
	<tr>
	<td id="footer"><small>&copy; 2007 Xavier Roche & other contributors - Web Design: Leto Kauler.</small></td>
	</tr>
</table>

</body>

</html>
httrack-doc 3.48.1-4ubuntu1 / usr / share / httrack / html / options.html