How To Build A Macro To Download Webpages With Excel

When conducting research, it’s not uncommon to perform repetitive tasks like downloading a group of webpages. A favorite tool I use to automate this is iMacros — it’s free and powerful. Just be sure to use the Firefox version, as its capabilities are limited in other browsers. Say you want a list of the Fortune […]

Chat with SearchBot

When conducting research, it’s not uncommon to perform repetitive tasks like downloading a group of webpages. A favorite tool I use to automate this is iMacros — it’s free and powerful. Just be sure to use the Firefox version, as its capabilities are limited in other browsers.

Say you want a list of the Fortune 500 companies’ websites. Normally, this would be an arduous task. CNN’s Fortune magazine website links to the websites, but you have to visit every company profile on a separate page. Watch what I do with a spreadsheet and a macro.

Get a list of company page URLs from the CNN Money Fortune 500 list.

I use the Web Developer tool to get a list of links to all the company profiles. Just select Information and View Link Information.

View links on a webpage.

View links on a webpage

Then, copy the links you want to scrape.

A list of webpage links.

A list of webpage links

And, paste into Excel.

A  list of webpage links in Excel.

A list of webpage links in Excel

Add these formulas to row 1:

Column B: 1
You will assign a unique number to each row. This will be used to name your download files.

Column C: =”URL GOTO=” & A1
This is the command that will tell the macro what web page to visit.

Column D: =”SAVEAS TYPE=HTM FOLDER=c:\web-pages FILE=” & B1 & “.html”
This saves a copy of the web page on your hard drive.

Column E: WAIT SECONDS=3
This instruction tells the macro to pause for three seconds. I find a brief pause between pages helps keep the macro running smoothly.

In Excel, number column B 1 to n (1, 2, 3, 4, etc.).

Copy row 1 of columns C, D, and E then paste them into every row so your spreadsheet looks like this:

A Web browser macro in Excel.

A Web browser macro in Excel

You are now done with Excel and ready to create your macro. First, each cell has to become a separate cell. Below is my trick for accomplishing this.

Copy the macro in Excel, columns C, D, and E; only the completed cells, no empty cells.

Paste special as text into Word.

Formatting the Excel web browser macro in Word.

Formatting the Excel Web browser macro in Word

 

In Word, use Find and Replace. Find ^t, tabs, and replace with ^p, paragraph breaks. Replace All. Your macro will change to one line per instruction.

The formatted macro in Word.

The formatted macro in Word

Before you go further, you will need to create your download folder. In this tutorial, it is

C:/web-pages

Install iMacros for Firefox. Avoid using the Chrome and other browser versions as they lack important features. After installing, you will find the program in your View Sidebars menu.

iMacros for Firefox

iMacros for Firefox

Copy your macro from Word.

In iMacro, open #Current.imm, or any macro. It does not matter because you will give your macro its own name. Select Edit then Edit Macro. Paste your macro into the editor. Instead of the Save button, click on the disk icon – save-icon. Name your macro #Fortune.imm.

The iMacros editor.

The iMacros editor

Go back to the iMacro menu. Select Edit, Refresh Macro List.

Click on #Fortune.imm in the macro list. Select Play, Play (the big button).

Your macro will run and download all the pages into C:/web-pages.

And there you have it. This tutorial shows three tricks I use:

  • Using macros to automate repetitive tasks
  • Using Excel formulas to write my macros
  • Using Word to convert tabular Excel content into separate lines

I admit I am leaving you hanging because, at this point, you have a set of HTML files on your hard drive, but you do not yet have a list of links to the actual Fortune 500 websites. To automate that part, I use a PHP script that opens each HTML file and extracts the URLs using PHP’s pattern matching command.

I’ll share my PHP script in my next column.

I use macros to capture all sorts of information, including SEO tools data for multiple sites and even Google keyword tool suggestions. Macros make it a breeze to dig through lots of pages, and Excel makes the creation of macros almost effortless.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Tom Schmitz
Contributor
Thomas Schmitz is a longtime digital marketing professional who works with startups, SMBs, enterprise, media and not-for-profit organizations. Regarded as an expert in inbound and content marketing, search engine optimization and social media, Tom's an innovative growth creator and turnaround specialist. Follow Tom at @TomSchmitz.

Get the must-read newsletter for search marketers.