Linux: Scrape a website structure with wget

Sometimes you just need to grab (or scrape) all the content from a site quickly and easily; you can then extract what you actually need later.

Fortunately, doing this with a tool like wget is easy!

wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --restrict-file-names=windows \
     --domains website.com \
     --no-parent \
         www.website.com
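
In case you don't want to dig through the man page straight away, here's roughly what each option is doing (paraphrased from the wget documentation):

# --recursive                     follow links and download the whole tree
# --no-clobber                    skip files that have already been downloaded
# --page-requisites               also fetch the images, CSS and JS needed to render each page
# --html-extension                save pages with a .html extension
# --convert-links                 rewrite links so the local copy can be browsed offline
# --restrict-file-names=windows   escape characters that aren't valid in Windows filenames
# --domains website.com           don't follow links off to other domains
# --no-parent                     don't crawl above the starting directory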

You can read up on all of these options, and plenty more, on the wget man page.
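
Once the mirror finishes, the "extract what you need later" part can often be done with standard command-line tools. A minimal sketch, assuming wget dropped everything into a www.website.com/ directory (by default it creates a directory named after the host) and you want a rough list of every link referenced across the site:

# List every page that was saved
find www.website.com -name '*.html'

# Pull out all href values across the mirror (a rough first pass, not a real HTML parser)
grep -rhoE 'href="[^"]+"' www.website.com/ | sort -u > links.txt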
