Wget
From LaurasWiki
On Wget
- GNU Wget Manual, single page (http://www.gnu.org/software/wget/manual/wget.html)
- GNU Wget Manual, one page per node (http://www.gnu.org/software/wget/manual/html_node/index.html)
- Wget (http://en.wikipedia.org/wiki/Wget) at Wikipedia
Directory Options
Directory-Options (http://www.gnu.org/software/wget/manual/html_node/Directory-Options.html#Directory-Options)
-nH --no-host-directories Disable generation of host-prefixed directories. By default, invoking Wget with -r http://fly.srk.fer.hr/ will create a structure of directories beginning with fly.srk.fer.hr/. This option disables such behavior.
-x --force-directories The opposite of -nd (--no-directories): create a hierarchy of directories, even if one would not have been created otherwise. E.g. wget -x http://fly.srk.fer.hr/robots.txt will save the downloaded file to fly.srk.fer.hr/robots.txt.
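The two directory options can be sketched as small wrapper functions. The function names below are hypothetical and exist only for illustration; the flags and the example host come from the descriptions above.

```shell
# Hypothetical helper: fetch one file but keep the host/path hierarchy (-x).
# For http://fly.srk.fer.hr/robots.txt the file lands in fly.srk.fer.hr/robots.txt.
fetch_with_dirs() {
  wget -x "$1"
}

# Hypothetical helper: recursive fetch without the host-named top directory (-nH).
# The tree is created in the current directory instead of under fly.srk.fer.hr/.
fetch_recursive_no_host() {
  wget -r -nH "$1"
}
```

Typical use would be fetch_with_dirs http://fly.srk.fer.hr/robots.txt or fetch_recursive_no_host http://fly.srk.fer.hr/.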
Download Options
Download-Options (http://www.gnu.org/software/wget/manual/html_node/Download-Options.html#Download-Options)
-c --continue Continue getting a partially-downloaded file.
-w seconds --wait=seconds Wait the specified number of seconds between the retrievals. Use of this option is recommended, as it lightens the server load by making the requests less frequent.
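A minimal sketch of both download options follows. The function names, the example.com URLs in the usage note, and the 2-second default are assumptions chosen for illustration, not values recommended by the manual.

```shell
# Hypothetical helper: resume a partially downloaded file (-c).
resume_download() {
  wget -c "$1"
}

# Hypothetical helper: recursive fetch that pauses between retrievals (-w).
# $2 is the wait in seconds; 2 is an arbitrary default for this sketch.
polite_recursive() {
  wget -r -w "${2:-2}" "$1"
}
```

For example, resume_download http://example.com/big.iso picks up an interrupted transfer, and polite_recursive http://example.com/docs/ 5 waits five seconds between requests.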
HTTP Options
HTTP-Options (http://www.gnu.org/software/wget/manual/html_node/HTTP-Options.html#HTTP-Options)
-E --html-extension If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the local filename. This is useful, for instance, when you're mirroring a remote site that uses .asp pages, but you want the mirrored pages to be viewable on your stock Apache server. Another good use for this is when you're downloading CGI-generated materials. A URL like http://site.com/article.cgi?25 will be saved as article.cgi?25.html.
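The naming rule described for -E can be modelled in a few lines of shell. This is a simplified sketch of the rule as stated above, not Wget's actual implementation; the function name html_extension_name is hypothetical.

```shell
# Simplified model of the -E naming rule (not Wget's own code).
# $1 = local file name, $2 = Content-Type reported by the server.
html_extension_name() {
  name=$1
  type=$2
  case "$type" in
    text/html*|application/xhtml+xml*) ;;          # only these types are affected
    *) printf '%s\n' "$name"; return ;;            # anything else keeps its name
  esac
  case "$name" in
    *.[Hh][Tt][Mm]|*.[Hh][Tt][Mm][Ll])             # already ends in .htm or .html
      printf '%s\n' "$name" ;;
    *)
      printf '%s\n' "$name.html" ;;                # otherwise append .html
  esac
}
```

Calling html_extension_name 'article.cgi?25' text/html prints article.cgi?25.html, matching the example above. The real download would be wget -E 'http://site.com/article.cgi?25', with the URL quoted so the shell does not treat ? as a wildcard.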
Recursive Retrieval Options
Recursive-Retrieval-Options (http://www.gnu.org/software/wget/manual/html_node/Recursive-Retrieval-Options.html#Recursive-Retrieval-Options)
-k --convert-links After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-html content, etc.
-K --backup-converted When converting a file, back up the original version with a .orig suffix. Affects the behavior of -N (see HTTP Time-Stamping Internals).
-l depth --level=depth Set the maximum recursion depth to depth (see Recursive Download). The default maximum depth is 5.
-m --mirror Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps ftp directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing.
-p --page-requisites This option causes Wget to download all the files that are necessary to properly display a given html page. This includes such things as inlined images, sounds, and referenced stylesheets.
-r --recursive Turn on recursive retrieving. See Recursive Download for more details.
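The recursive options are usually combined. The sketch below shows two plausible combinations; the function names and the example.com URL in the usage note are hypothetical, and the flag sets are one reasonable choice among several.

```shell
# Hypothetical helper: mirror a site for offline viewing.
# -m is equivalent to -r -N -l inf --no-remove-listing (see above);
# -k rewrites links for local viewing, -K keeps .orig backups of the
# converted files, and -p also fetches images, stylesheets and the like.
mirror_offline() {
  wget -m -k -K -p "$1"
}

# Hypothetical helper: recursive fetch limited to depth $2.
# Falls back to 5, the same default Wget uses when -l is omitted.
fetch_depth() {
  wget -r -l "${2:-5}" -k -p "$1"
}
```

For example, fetch_depth http://example.com/ 2 follows links at most two levels deep from the start page.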
Directory-Based Limits
Directory-Based Limits (http://www.gnu.org/software/wget/manual/html_node/Directory_002dBased-Limits.html#Directory_002dBased-Limits)
-np --no-parent (wgetrc: no_parent = on) The simplest, and often very useful, way of limiting directories is to disallow retrieval of links that refer to the hierarchy above the beginning directory, i.e. to disallow ascent to the parent directory or directories.
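A minimal sketch of restricting a recursive download to one subtree is shown below. The function name and the example.com path in the usage note are hypothetical.

```shell
# Hypothetical helper: recursive fetch that never ascends above the
# starting directory (-np), so only the given subtree is retrieved.
fetch_subtree() {
  wget -r -np "$1"
}
```

With fetch_subtree http://example.com/docs/manual/, links pointing to /docs/ or to the site root are not followed, while pages under /docs/manual/ are. The trailing slash marks manual/ as the starting directory.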

