Tuesday, March 24, 2015

COMMAND Wget



GNU Wget is a utility for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
 
Wget is non-interactive, meaning it can work in the background while the user is not logged on. This lets you start a retrieval, disconnect from the system, and leave Wget to finish the work. By contrast, most web browsers require the user to stay present until the transfer completes, which can be a great hindrance when a lot of data is being transferred.

Wget can also follow links in HTML pages and create local versions of remote websites, recreating the directory structure of the original site. This is commonly called recursive downloading. While doing so, Wget respects the Robot Exclusion Standard (/robots.txt). In addition, Wget can convert the links in downloaded HTML files to point to the local files for offline viewing.

Wget is designed to be robust over unstable network connections. If a download fails because of a network problem, Wget keeps retrying, resuming where it left off when the server supports it, until the file has been retrieved completely or the retry limit is reached.

Syntax: wget [option]... [URL]...

OPTIONS


Basic Startup Options


-V
--version
Display the version of Wget.
-h
--help
Print a help message describing all of Wget's command-line options.
-b
--background
Go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.
-e command
--execute command
Execute command as if it were a part of .wgetrc. A command thus invoked will be executed after the commands in .wgetrc, thus taking precedence over them. 
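
As an illustration of -b together with -o, here is a minimal sketch that wraps a background download in a shell function. The function name start_background_download, the WGET variable, the log file name download.log, and the example.com URL are placeholders chosen for this example, not part of Wget itself.

```shell
#!/bin/sh
# Path to the wget binary; override WGET to point at a different build.
WGET=${WGET:-wget}

# Illustrative helper (not part of Wget): start a download in the background.
#   $1 = URL to fetch
#   $2 = log file (optional, defaults to download.log)
start_background_download() {
    # -b sends Wget to the background right after startup;
    # -o writes its messages to the log file instead of wget-log.
    "$WGET" -b -o "${2:-download.log}" "$1"
}

# Example call (placeholder URL):
#   start_background_download http://example.com/big-file.iso
#   tail -f download.log    # follow the progress
```

Because the messages go to a log file, the progress can still be checked later even though the shell that started the download has been closed.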

Logging and Input File Options

-o logfile
--output-file=logfile
Log all messages to logfile. The messages are normally reported to standard error.
-a logfile
--append-output=logfile
Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created.
-d
--debug
Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe: Wget compiled with debug support will not print any debug info unless requested with -d.
-q
--quiet
Turn off Wget's output.
-v
--verbose
Turn on verbose output, with all the available data. The default output is verbose.
-nv
--non-verbose
Non-verbose output: turn off verbose output without being completely quiet (use -q for that), which means that error messages and basic information still get printed.
-i file
--input-file=file
Read URLs from file, in which case no URLs need to be on the command line. If there are URLs both on the command line and in an input file, those on the command line will be the first ones to be retrieved. The file need not be an HTML document (but no harm if it is); it is enough if the URLs are just listed sequentially. However, if you specify --force-html, the document will be regarded as HTML. In that case you may have problems with relative links, which you can solve either by adding <base href="url"> to the documents or by specifying --base=url on the command line.
-F
--force-html
When input is read from a file, force it to be treated as an HTML file. This enables you to retrieve relative links from existing HTML files on your local disk, by adding <base href="url"> to the HTML, or by using the --base command-line option.
-B URL
--base=URL
When used in conjunction with -F, prepends URL to relative links in the file specified by -i.
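
The input-file options above can be sketched as follows. This is only an illustration: the file names urls.txt and fetch.log, the helper fetch_from_list, the WGET variable, and the example.com addresses are assumptions made for the example.

```shell
#!/bin/sh
# Path to the wget binary; override WGET to point at a different build.
WGET=${WGET:-wget}

# Write a plain list of URLs, one per line (placeholder addresses).
cat > urls.txt <<'EOF'
http://example.com/file1.tar.gz
http://example.com/file2.tar.gz
EOF

# Illustrative helper (not part of Wget): fetch every URL in a list file.
#   $1 = file containing the URLs
fetch_from_list() {
    # -i reads the URLs from the file; -nv keeps the output terse;
    # -a appends the messages to fetch.log rather than overwriting it.
    "$WGET" -i "$1" -nv -a fetch.log
}

# Example call:
#   fetch_from_list urls.txt
#
# For a local HTML file with relative links, the equivalent call adds
# -F and -B (links.html is a placeholder name):
#   wget -F -B http://example.com/ -i links.html
```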

Download Options

--bind-address=ADDRESS
When making client TCP/IP connections, "bind()" to ADDRESS on the local machine. ADDRESS may be specified as a hostname or IP address. This option can be useful if your machine is bound to multiple IPs.
-t number
--tries=number
Set number of retries to number. Specify 0 or inf for infinite retrying.
-O file
--output-document=file
The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If file already exists, it will be overwritten. If the file is -, the documents will be written to standard output. Including this option automatically sets the number of tries to 1.
-nc
--no-clobber
If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved. When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, "no-clobber" is actually a misnomer in this mode: it is not clobbering that is prevented (the numeric suffixes were already preventing clobbering), but rather the saving of multiple versions.
When running Wget with -r, but without -N or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N.
Note that when -nc is specified, files with the suffixes .html or (yuck) .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
-c
--continue
Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.
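
To show how -c and -t combine on an unreliable connection, here is a small sketch. The helper name resume_download, the WGET variable, the default of 5 tries, and the example.com URL are assumptions made for this example.

```shell
#!/bin/sh
# Path to the wget binary; override WGET to point at a different build.
WGET=${WGET:-wget}

# Illustrative helper (not part of Wget): resume a partial download.
#   $1 = URL to fetch
#   $2 = number of tries (optional, defaults to 5)
resume_download() {
    # -c continues a partially downloaded file if one exists locally;
    # -t sets the number of tries (0 or inf means retry indefinitely).
    "$WGET" -c -t "${2:-5}" "$1"
}

# Example calls (placeholder URL):
#   resume_download http://example.com/big-file.iso
#   resume_download http://example.com/big-file.iso inf
```

Running the same call again after an interrupted transfer picks up the existing partial file instead of starting over, provided the server supports resuming.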
=======================================================================================
Usage Examples
To download a file from a website:

 wget http://fly.srk.fer.hr/
 
To download a site recursively and convert the links in the HTML files so they can be viewed offline (messages are logged to gnulog):
 wget --convert-links -r http://www.gnu.org/ -o gnulog
 


 
 
 




