How to Extract Images from A .webarchive File Using Terminal

One way to extract content from a .webarchive file is through the Terminal app. The command to use is textutil -convert html.
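For anyone who prefers typing the whole command over dragging and dropping, here is a minimal sketch. The file path and the helper name extract_webarchive are made up for illustration, and the checks around the command are my own additions rather than anything textutil requires.

```shell
#!/bin/sh
# Hypothetical path: replace it with the location of your own .webarchive file.
archive="$HOME/Downloads/ariel-atom.webarchive"

# Hypothetical helper that wraps the one command this post is about.
extract_webarchive() {
    # textutil ships with macOS only, so stop early on other systems.
    if ! command -v textutil >/dev/null 2>&1; then
        echo "textutil not found (it is a macOS tool)" >&2
        return 1
    fi
    # Refuse to run if the archive is not where we expect it.
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    # Converts the archive to HTML; the output lands next to the archive.
    textutil -convert html "$1"
}

extract_webarchive "$archive" || echo "nothing extracted"
```

The quotes around the path matter when you type it by hand, because file names saved from Safari often contain spaces.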

Saving a webpage as a .webarchive file in Safari

The Ariel Atom and Ariel Nomad are unique and beautiful cars. In fact, they are so unique and beautiful I wanted to use images of them for desktop wallpapers.

While there must be other places to look for the images, for me the most obvious choice was Ariel North America’s website. They make the cars, so if anyone’s got great photos it’s them.

A lot of the high-resolution images on the website are available for download, and they make for amazing wallpapers too. You simply need to click on them.

That’s the easy part.

The difficult part is getting those beautiful images the site uses in the banners. You can click on them all you want, but you just don’t get a download or save option.

By now I should mention I was using Safari on my Mac.

First, I tried to save the whole web page in the hope of getting a Zip file with the images in it. However, Safari only allows web pages to be saved in two formats: Page Source and Web Archive. The first one merely saves the source text, while the second one saves the source text as well as the images and other contents. I went with the second format and got a .webarchive file.

Saving Page with Safari Format One: Page Source

Saving Page with Safari Format Two: Web Archive

Now, a .webarchive file is not something that you can open using the stock Archive Utility app, or the commonly used third-party app “The Unarchiver”.

Also, you might wonder if it helps to change the file’s extension to .zip. Well, it doesn’t.

Extracting content from a .webarchive file

One way to extract the images from the .webarchive file is through the Terminal app. Here’s how.

1 – First, press Command and space to launch Spotlight, type in “Terminal” and hit Enter. That’ll open the Terminal app.

2 – Then type in this command without hitting Enter. Note there is a space at the end after “html”. 

textutil -convert html

3 – Lastly, drag the .webarchive file into Terminal, which adds its path to the command. Hit Enter and you’ll find all the extracted files in the same folder where the .webarchive file is located.

Using Terminal Command to Extract Images from .webarchive File

Oh, one more thing, did I mention I was using Safari to visit the site and download images? It turns out it’s much easier with Google’s Chrome.

With Chrome, you can simply save the web page using the “Webpage, Complete” format and that’ll give you a folder with all the page content in it, including the images.

Saving Web Page with Chrome

Lesson of the day?

If there’s anything that I learnt from this experience, it’s that Chrome is a better web browser. I mean, Safari and Edge are great too, especially when you use them to download Chrome.

7 thoughts on “How to Extract Images from A .webarchive File Using Terminal”

  1. The Terminal app did the trick! Thank you very much for the crystal-clear instructions!

  2. Well, the conclusion is not really true. There are many pages that can’t be downloaded from Chrome, but can be downloaded as webarchives in Safari. So, when Chrome works it is good, but there are many scenarios where it doesn’t.

  3. Thanks, this is very helpful. The big benefit to this approach is that you get images in a usable format. Chrome and Firefox will download webp files if the site uses them, and then you need to futz around with other tools to be able to view those on a Mac. Using this approach is simpler, IMHO.

