GNU Wget is a utility for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
Wget is non-interactive, meaning that it can work in the background while the user is not logged on. This lets the user start a retrieval, disconnect from the system, and leave Wget to finish the work. By contrast, most web browsers require the user's constant presence, which can be a great hindrance when a large amount of data is being transferred.
Wget can also follow links in HTML pages and create local versions of remote websites, recreating the directory structure of the original site. This is commonly called recursive downloading. While doing so, Wget still respects the Robot Exclusion Standard (/robots.txt). In addition, Wget can convert the links in downloaded HTML files to point to the local files for offline viewing.
Wget is designed to be robust against unstable network connections. When a download fails because of a network problem, Wget keeps retrying until the whole file has been retrieved completely.
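As a rough sketch of how this persistence can be requested explicitly, the function below combines the -t (tries) and -c (continue) options described under Download Options. The function name fetch_persistently and the example URL are placeholders chosen for illustration; they are not part of Wget.

```shell
# fetch_persistently is a hypothetical helper name, not part of Wget.
# -t 0 : retry without limit (0 and inf both mean infinite retrying)
# -c   : continue a partially downloaded file instead of starting over
fetch_persistently() {
    wget -t 0 -c "$1"
}

# Example call (placeholder URL):
# fetch_persistently http://example.com/large-file.iso
```

Unlimited retries are a deliberate choice here; on a link that is only occasionally flaky, a finite value for -t may be the more polite setting.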
Command: wget [option]... [URL]...
OPTIONS
Basic Startup Options
- -V
- --version
- Display the version of Wget.
- -h
- --help
- Print a help message describing all of Wget's command-line options.
- -b
- --background
- Go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.
- -e command
- --execute command
- Execute command as if it were a part of .wgetrc. A command thus invoked will be executed after the commands in .wgetrc, thus taking precedence over them.
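The -b option is what makes the "start a download and log out" workflow possible. A minimal sketch, assuming a hypothetical helper name start_background_fetch, a placeholder URL, and a log file name of our own choosing (without -o, Wget would write to wget-log instead):

```shell
# start_background_fetch is a hypothetical helper name, not part of Wget.
# -b : go to background immediately after startup
# -o : log all messages to the given file instead of the default wget-log
start_background_fetch() {
    url=$1
    logfile=${2:-download.log}
    wget -b -o "$logfile" "$url"
}

# Example (placeholder URL); after this it is safe to log out:
# start_background_fetch http://example.com/large-file.iso fetch.log
# tail fetch.log    # inspect progress later
```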
Logging and Input File Options
- -o logfile
- --output-file=logfile
- Log all messages to logfile. The messages are normally reported to standard error.
- -a logfile
- --append-output=logfile
- Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created.
- -d
- --debug
- Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe: Wget compiled with the debug support will not print any debug info unless requested with -d.
- -q
- --quiet
- Turn off Wget's output.
- -v
- --verbose
- Turn on verbose output, with all the available data. The default output is verbose.
- -nv
- --non-verbose
- Non-verbose output: turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed.
- -i file
- --input-file=file
- Read URLs from file, in which case no URLs need to be on the command line. If there are URLs both on the command line and in an input file, those on the command line will be the first ones to be retrieved. The file need not be an HTML document (but no harm if it is); it is enough if the URLs are just listed sequentially. However, if you specify --force-html, the document will be regarded as HTML. In that case you may have problems with relative links, which you can solve either by adding <base href="url"> to the documents or by specifying --base=url on the command line.
- -F
- --force-html
- When input is read from a file, force it to be treated as an HTML file. This enables you to retrieve relative links from existing HTML files on your local disk, by adding <base href="url"> to HTML, or using the --base command-line option.
- -B URL
- --base=URL
- When used in conjunction with -F, prepends URL to relative links in the file specified by -i.
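The logging and input-file options combine naturally when several URLs have to be fetched in one run. The following is a sketch only: fetch_url_list, the file names, and the URLs are placeholders, not anything defined by Wget.

```shell
# fetch_url_list is a hypothetical helper name, not part of Wget.
# -nv : keep error messages and basic information, drop the rest
# -a  : append messages to the log file instead of overwriting it
# -i  : read the URLs to retrieve from a file, listed one after another
fetch_url_list() {
    list=$1
    logfile=${2:-fetch.log}
    wget -nv -a "$logfile" -i "$list"
}

# Example: write two placeholder URLs to a list, then fetch them all.
# printf '%s\n' http://example.com/a.txt http://example.com/b.txt > urls.txt
# fetch_url_list urls.txt
```

Using -a rather than -o means repeated runs accumulate in one log, which tends to be handier when the same list is fetched on a schedule.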
Download Options
- --bind-address=ADDRESS
- When making client TCP/IP connections, "bind()" to ADDRESS on the local machine. ADDRESS may be specified as a hostname or IP address. This option can be useful if your machine is bound to multiple IPs.
- -t number
- --tries=number
- Set number of retries to number. Specify 0 or inf for infinite retrying.
- -O file
- --output-document=file
- The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If file already exists, it will be overwritten. If the file is -, the documents will be written to standard output. Including this option automatically sets the number of tries to 1.
- -nc
- --no-clobber
- If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved.
When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, "no-clobber" is actually a misnomer in this mode: it's not clobbering that's prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented.
When running Wget with -r, but without -N or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N.
Note that when -nc is specified, files with the suffixes .html or (yuck) .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
- -c
- --continue
- Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.
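To make the numeric-suffix behavior described under -nc concrete, here is a small sketch that imitates the naming rule in plain shell, without touching the network. The function next_save_name is hypothetical and only mirrors the rule as stated above (file, then file.1, file.2, and so on); it is not how Wget itself is implemented.

```shell
# next_save_name is a hypothetical helper that imitates the naming rule
# described for Wget run without -N, -nc, or -r.
# It prints NAME if that name is free, otherwise the first free
# NAME.1, NAME.2, ... in the current directory.
next_save_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return
    fi
    n=1
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}

# Example: in a directory that already holds index.html and index.html.1,
# next_save_name index.html prints the next unused name.
```

With -nc in effect there would be no such search at all: the existing file is kept and the download is skipped.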
=======================================================================================
Usage Examples
To download a file from a website:
wget http://fly.srk.fer.hr/
To convert the links in the downloaded HTML files so that they can be viewed offline:
wget --convert-links -r http://www.gnu.org/ -o gnulog