
Downloading all files and folders with Wget

A guide to downloading all files and folders at a URL using Wget, with options to tidy up the download location and path names. A basic Wget rundown is covered in a separate post.

GNU Wget is a popular open-source command-line tool for downloading files and directories over HTTP, HTTPS and FTP.

The Wget docs cover many more options than the ones used here.

For this example, assume the files (and folders) we want to download all live at this URL:

https://domain.com/uploads/

One file example

The simple command to download just one file is:

wget 'https://domain.com/uploads/file.zip'

This will download file.zip into the current directory as file.zip.
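As a rough illustration of where that default name comes from, the sketch below takes whatever follows the last slash of the URL and falls back to index.html when nothing follows it. The default_name helper is a made-up name, not part of Wget, and it ignores query strings and other edge cases.

```shell
# Sketch only: approximates the file name Wget picks by default.
# default_name is a hypothetical helper, not a Wget feature.
default_name() {
  dn=${1#*://}              # drop the scheme, e.g. https://
  case $dn in
    */*) dn=${dn##*/} ;;    # keep what follows the last slash
    *)   dn= ;;             # bare hostname with no path at all
  esac
  printf '%s\n' "${dn:-index.html}"
}

default_name 'https://domain.com/uploads/file.zip'   # prints file.zip
default_name 'https://domain.com/uploads/'           # prints index.html
```

The second call hints at why recursive downloads of directory listings tend to leave index.html files behind.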

To download the file and save it under a different name, use -O:

wget -O 'example_name.zip' 'https://domain.com/uploads/file.zip'

Recursive downloading

Now to download everything, start by adding the recursive and no-parent flags:

wget -r -np https://domain.com/uploads/

The -r flag means recursive download, which follows links and directories (the default maximum depth is 5). -np is “no parent”: only directories below the stated one are fetched, and Wget never climbs up to the parent directory.

Optional, but something I find appealing, is adding -nH, which removes the hostname directory from the save path. Without it everything is saved under domain.com/uploads/. You will see the effect below.

Assume we are in the wget_testing directory (wget_testing/):

wget -r -np -nH https://domain.com/uploads/

This will download everything into wget_testing/uploads/

Cut directories from save name path

Next is the --cut-dirs flag, which removes the given number of leading directory names from the save path, counting from just after the hostname.

wget -r -np -nH --cut-dirs=1 https://domain.com/uploads/

This downloads everything straight into wget_testing/. Notice how the uploads directory got cut.
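To make the effect of -nH and --cut-dirs concrete, here is a rough sketch in plain shell that mimics how the save path shrinks as the number goes up. The local_path helper and the docs/ subfolder are made up for illustration; this only approximates what Wget does for simple URLs and is not Wget's own logic.

```shell
# Sketch only: approximate the save path Wget would use with -nH and
# --cut-dirs=N. local_path is a hypothetical helper, not part of Wget.
# Usage: local_path URL N
local_path() {
  lp_path=${1#*://}          # drop the scheme  -> domain.com/uploads/docs/file.zip
  lp_path=${lp_path#*/}      # drop the host (-nH) -> uploads/docs/file.zip
  lp_i=0
  while [ "$lp_i" -lt "$2" ]; do
    case $lp_path in
      */*) lp_path=${lp_path#*/} ;;   # cut one leading directory
    esac                              # the file name itself is never cut
    lp_i=$((lp_i + 1))
  done
  printf '%s\n' "$lp_path"
}

local_path 'https://domain.com/uploads/docs/file.zip' 0   # prints uploads/docs/file.zip
local_path 'https://domain.com/uploads/docs/file.zip' 1   # prints docs/file.zip
local_path 'https://domain.com/uploads/docs/file.zip' 2   # prints file.zip
```

With --cut-dirs=1 only the uploads level is removed, so any subfolders below it are kept as they are.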

Set a folder to download into

To download into a specific folder, use the -P flag:

wget -r -np -nH --cut-dirs=1 -P afolder https://domain.com/uploads/

Downloads all into wget_testing/afolder/

Don’t download index files

Avoid keeping all of the index.html files by using the reject flag -R. Wget still fetches these pages so it can follow their links, then deletes them afterwards:

wget -r -np -nH --cut-dirs=1 -R "index.html*" https://domain.com/uploads/

Downloads all into wget_testing/ with NO index.html files.
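The flags covered so far can be wrapped in a small shell function so they do not have to be retyped each time. This is only a sketch: mirror_dir and the DRY_RUN variable are names invented here, and by default the function just prints the command it would run so you can check it first.

```shell
# Sketch only: mirror_dir and DRY_RUN are hypothetical names, not part of Wget.
# Usage: mirror_dir URL CUT_DIRS [TARGET_DIR]
mirror_dir() {
  md_url=$1
  md_cut=$2
  md_dest=${3:-.}
  # -r recursive, -np no parent, -nH no hostname directory,
  # --cut-dirs drops leading remote directories,
  # -R rejects index pages, -P sets the local folder to save into.
  set -- wget -r -np -nH --cut-dirs="$md_cut" -R 'index.html*' -P "$md_dest" "$md_url"
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '%s\n' "$*"     # show the command (quotes are not reproduced)
  else
    "$@"                   # actually run wget
  fi
}

mirror_dir 'https://domain.com/uploads/' 1 afolder
# prints: wget -r -np -nH --cut-dirs=1 -R index.html* -P afolder https://domain.com/uploads/
```

Setting DRY_RUN=0 before the call runs the download for real.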

Run Wget in the background

Finally, to run a command in the background so it keeps going after you log out, put nohup before the wget command and add & after it:

nohup wget -r -np -nH https://domain.com/uploads/ &
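If you would rather choose where the output goes and keep hold of the process ID, the same nohup idea can be wrapped up as below. This is a sketch under assumptions: run_bg and the wget.log file name are invented for this example and are not part of Wget or nohup.

```shell
# Sketch only: run_bg is a hypothetical helper.
# Runs any command under nohup in the background and sends its output
# (stdout and stderr) to the log file given as the first argument.
# Usage: run_bg LOGFILE COMMAND [ARGS...]
run_bg() {
  rb_log=$1
  shift
  nohup "$@" >"$rb_log" 2>&1 &
  echo "$!"    # process ID of the background job, handy for kill
}

# Example (needs network access, so it is left commented out):
# run_bg wget.log wget -r -np -nH https://domain.com/uploads/
# tail -f wget.log
```

Following the log with tail -f is a simple way to watch progress while the download carries on in the background.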

 
