wget
The wget command, used extensively throughout this aide-memoire, downloads files over HTTP, HTTPS, and FTP. The example below uses the -O switch to save the downloaded file under a different name on the local machine:
wget -O report_wget.pdf https://www.offensive-security.com/reports/penetration-testing-sample-report-2013.pdf
Additional Options
wget has some other handy options that you should be familiar with.
Specifying an Output Directory
We can specify the directory a file is downloaded to with the -P option.
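A minimal sketch of the -P option follows. The helper name download_into_dir, the directory, and the URL in the usage comment are placeholders chosen for illustration, not values from these notes.

```shell
# Hypothetical helper: save the file at URL ($2) inside directory ($1).
download_into_dir() {
    wget -P "$1" "$2"
}

# Example call (placeholder directory and URL):
# download_into_dir /tmp/iso https://example.com/images/installer.iso
```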
Download Multiple Files
If you want to download multiple files at once, use the -i option, followed by the path to a local or external file containing a list of the URLs to be downloaded. Each URL needs to be on a separate line.
For example, if the linux-distros.txt file lists the URLs of the Arch Linux, Debian, and Fedora ISO files, passing that file to -i downloads all three.
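The list-file workflow could be sketched as below. The URLs in the comment are placeholders rather than real mirror addresses, and download_from_list is a hypothetical helper name.

```shell
# linux-distros.txt holds one URL per line, for example (placeholder URLs):
#   https://example.com/archlinux.iso
#   https://example.com/debian.iso
#   https://example.com/fedora.iso

# Hypothetical helper: download every URL listed in the file given as $1.
download_from_list() {
    wget -i "$1"
}

# Example call:
# download_from_list linux-distros.txt
```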
Resume a Download
We can also resume an interrupted download with the -c option, which continues from the partially downloaded file instead of starting again.
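As a rough illustration of -c, assuming the partial file sits in the current directory (resume_download and the URL in the usage comment are made up for this sketch):

```shell
# Hypothetical helper: continue a partially downloaded file for URL ($1).
# Run it from the directory that holds the partial file. Resuming only
# works when the server supports ranged requests.
resume_download() {
    wget -c "$1"
}

# Example call (placeholder URL):
# resume_download https://example.com/images/installer.iso
```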
Download in the Background
We can also run a download in the background with the -b option. Unless a log file is named with -o, wget writes its output to wget-log in the current directory.
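One way to use -b is sketched here. The helper name download_in_background and the URL are assumptions for the example only.

```shell
# Hypothetical helper: start the download of URL ($1) in the background.
# wget detaches right away and keeps downloading after the prompt returns.
download_in_background() {
    wget -b "$1"
}

# Example call (placeholder URL), then follow the default log file:
# download_in_background https://example.com/images/installer.iso
# tail -f wget-log
```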
Change the User-Agent String
We can also set the User-Agent string manually with the --user-agent= option, so that wget can identify itself as any browser when downloading a file.
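The sketch below shows the shape of such a command. The Firefox-style string and the URL are illustrative values, and download_as_browser is a hypothetical helper.

```shell
# Hypothetical helper: request URL ($2) while sending the User-Agent in $1.
# Quoting "$1" keeps a string that contains spaces as a single argument.
download_as_browser() {
    wget --user-agent="$1" "$2"
}

# Example call (illustrative User-Agent string and placeholder URL):
# download_as_browser "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" https://example.com/
```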
Download a Local Copy of a Website
To create a mirror of a website with wget, use the -m option. This creates a local copy of the website by following and downloading all internal links as well as the website resources (JavaScript, CSS, images).
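A bare mirroring call might look like this, with mirror_site and the URL standing in as placeholders:

```shell
# Hypothetical helper: mirror the website rooted at URL ($1) into the
# current directory.
mirror_site() {
    wget -m "$1"
}

# Example call (placeholder URL):
# mirror_site https://example.com/
```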
If you want to browse the downloaded website locally, pass the -k and -p options in addition to -m.
The -k option makes wget convert the links in the downloaded documents so that they work for local viewing. The -p option tells wget to download all the files needed to display each HTML page.
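Combining -m with -k and -p, a mirror intended for offline browsing could be sketched as follows. The helper name mirror_for_offline and the URL are assumptions.

```shell
# Hypothetical helper: mirror URL ($1) and prepare the copy for offline use.
#   -m  mirror the site
#   -k  convert links so they point at the local files
#   -p  fetch the files each page needs to display
mirror_for_offline() {
    wget -m -k -p "$1"
}

# Example call (placeholder URL):
# mirror_for_offline https://example.com/
```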
Skipping the SSL Certificate Check
If you want to download a file over HTTPS from a host that has an invalid SSL certificate, use the --no-check-certificate option, which skips certificate validation.
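For illustration only, with a made-up helper name and a placeholder host:

```shell
# Hypothetical helper: download URL ($1) without validating the server's
# certificate. The connection is still encrypted, but the identity of the
# host is not verified.
download_skip_cert_check() {
    wget --no-check-certificate "$1"
}

# Example call (placeholder URL):
# download_skip_cert_check https://self-signed.example.com/file.tar.gz
```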
Download to Standard Output (STDOUT)
wget can also write a download to STDOUT instead of a file. With -q (quiet) and -O - (write to STDOUT), the latest WordPress archive can be piped to the tar utility, which extracts it to the /var/www directory.
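The pipeline can be sketched as below. The helper name extract_remote_tarball is hypothetical, and the URL in the usage comment is assumed to serve the latest WordPress release as a gzip-compressed tarball.

```shell
# Hypothetical helper: stream the gzip-compressed tarball at URL ($1)
# straight into tar and unpack it under directory ($2).
extract_remote_tarball() {
    # -q silences wget; "-O -" writes the download to standard output.
    # tar reads the archive from standard input ("-f -") and extracts
    # it into the directory given after -C.
    wget -q -O - "$1" | tar -xzf - -C "$2"
}

# Example call (assumed URL, target directory from the text above):
# extract_remote_tarball https://wordpress.org/latest.tar.gz /var/www
```

Nothing is written to disk until tar unpacks the stream, so no intermediate archive is left behind to clean up.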