A Foolproof Approach To Writing Complex Excel Formulas

There comes a time in a marketer’s life when making pretty charts with a predefined dataset just doesn’t cut it. And finding the sum and average of a column of data just doesn’t satisfy you anymore.

Eventually — and it’s really inevitable — you will actually have to dive deep into the data and cull out a smaller dataset or manipulate it in some way to make it cough up what you need. And, as freaking intimidating as they can be, formulas become your lifeline in these moments.

Admittedly, the learning curve with formulas can be pretty steep, but the only way they’ll become intuitive is exposure to them and practice.

Excel Formulas Still Scare Me Yet I Did This (You Can Too!)

However, when I was first learning them, I’d see a formula like the one below, which I wrote to find the last directory in a URL (just to see if I could, not for any particular use), and think that I’d never ever be able to do anything like that. Turns out I was dead wrong.

advanced text extraction formula in Excel

Click for larger image

Before you throw in the towel, let me tell you a trick I first saw Bill Jelen (AKA Mr. Excel) do that makes writing formulas — even advanced formulas like this one — much simpler.

If you put this hideous monstrosity in front of me with no context and asked me to explain step by step what each part of that formula means, I probably wouldn’t be able to do it. Not because I copied it off the Internet. I can honestly say I didn’t get any help with it at all. And I purposely chose something kind of esoteric so I wouldn’t be tempted to look it up and to test myself to see if this approach would work for a really advanced formula. And it did. Like a charm.

So, enough pontificating. Let me show it to you in action. But first let’s cover some terminology. And don’t worry; we’ll start with a much simpler task than that data voodoo.

Definitions

Formula: Formulas provide some kind of instructions for Excel to calculate something. They always start with an equal sign. It can be as simple as =2+2 or ridiculously complex like the one above.

Function: In the simplest terms, functions are formulas that come prepackaged in Excel. I mean, you could create your own functions, but most of the time you’ll be using one of the 80 bajillion offered natively in Excel (unless, of course, you want percent difference, which Excel doesn’t offer … go figure). With the more advanced formulas you’ll write, you will use multiple functions in one formula.

Argument: Each function starts with the function name, followed by a set of parentheses, e.g., =SUM(A3:A67) or =CONCATENATE(B2,C2,D2). The individual elements inside those surrounding parentheses, separated by commas, are arguments.

Square brackets: Sometimes you’ll see arguments listed in square brackets in the tool tip or on the Microsoft site. The brackets indicate that the argument is optional.

Boolean: Returns a TRUE or FALSE value.

Helper cells: Okay, this isn’t an actual term; it’s what I call the cells you use to break down a complicated formula into easy-to-understand steps.

Download

If you’d like to download the Excel workbook I worked from in the following examples, you can access it here. I even included the crazy formula you see above. You’ll see on one tab I break the process down into simple steps (which is exactly what I did first thing) and then step by step with all the helper cells.

Then in the last tab I replace all of the helper cells with the formulas they contain and just kept sweeping right until I got to the final formula. But,for whatever reason, I kept coming up with one extra character that I didn’t need. I checked my formulas several times and tried retracing my steps.

Finally, I just subtracted 1 on the end, and it all worked as planned. At the end of the day that’s what matters, so I went with it.

Task 1: Extract Domain From URLs

Let’s say you have a list of URLs (perhaps backlinks), and you want to extract just the domain. I did this recently so that I could use that column in a pivot table to group all target URLs that were linked to from a particular domain.

For example, let’s say www.linkingsite.com linked to www.mysite.com/landing-page-02 and www.mysite.com/landing-page-45. The two landing pages would show up under the www.linkingsite.com domain in my pivot table, the way I had it organized.

pivot table in Excel

Click for larger image

So, how’d I extract those? Easy. But first some background.

Important Stuff You Need To Understand

When you need to extract a string of text from a longer string in Excel, the three primary functions you want to use are LEFT, RIGHT, and MID. You can get tricky with REPLACE, which works a lot like Find and Replace in Excel or Word, but we won’t get into that here.

So basically what each of these functions does is asks you:

  1. What cell you want to extract the text from
  2. How many characters you want to extract

With the MID function you also specify the starting place because, well, you’re pulling from the middle of the string.

That’s all well and good, but with our column of URLs the number of characters we need to extract changes from URL to URL, so we can’t use a static number. That’s where the SEARCH function comes in.

Quick aside: If you’ve dabbled in formulas much, you might be asking why I don’t use the FIND function. Well, the SEARCH and FIND functions are very similar, only the FIND function is more limiting in that it’s case sensitive and it doesn’t support wildcard characters. Lame. So the only time I use FIND is when I want to specify case, which the last time I did that was last … Lessee here … Yeah, never.

So what we’re going to do is use the SEARCH function to specify the number of characters we need because it returns the position of whatever you ask Excel to find. If it doesn’t find what you’ve searched for it returns a nasty #N/A error, which we’ll actually USE in the next example. However, for the task at hand, what we need is the position of the first forward slash after the domain.

Buuut, there’s a problem.

See those two forward slashes before the domain (http://)? Yeah, those are going to trip us up. But no worries. The SEARCH function has an optional argument for the starting number (start_num), which we’ll use to our benefit.

The Playas

So for our formula we’re going to combine two functions: LEFT and SEARCH. Here’s the syntax for each:

LEFT(text, [num_chars])
SEARCH(find_text,within_text,[start_num])

Okay, so let’s get poppin here …

The Strategy

So, the secret sauce is we’re going to split the formula into two steps. Then when we’re finished, we’re going to combine them into one formula.

Excel helps you each step of the way by highlighting the argument you’re working on at the time. Here are what we’re going to use for each:

find_text: “/” This says we’re looking for a forward slash.

within_text: B3 This is the first cell we’re going to extract the domain from. When we finish, we’ll hover over the bottom-right corner of the cell to drag the formula down the column.

start_num: 9 We need to just choose a number that’s after the last / before the domain. We could have used 8 here, but if any of the URLs are secure, you would need one more character. Check it out for yourself.

https://* <– The * is in the 9th position, which is where we want to start looking for the / that comes after the domain.

So, the final formula looks like this:

SEARCH function in Excel

Click for larger image.

 

This tells us what position each of the forward slashes directly following the domain is in. And we’ll use that to provide the number of characters to extract in the LEFT function.

Here are the arguments we’ll use for the LEFT function:

text: B3

[num_chars]: C3

So, the final formula looks like this:

LEFT function in Excel

Click for larger image.

Now, all we need to do is combine the two so that instead of referencing cell C3, we’re embedding the SEARCH function into the final formula. To do that, just hit the Esc key to exit out of the cell you’re in, then copy the SEARCH function to your clipboard (not the = sign, though).

Next, hit the Esc key again to back out of that cell, and go back to the cell that houses your final formula and replace the C3 reference with the formula you just copied from C3. Your formula should now look like this:

LEFT function in Excel

Click for larger image.

Task 2: Compare Two Datasets

In my audit Google Doc, one of the tasks I include is a check for orphan pages on the site (on the Links tab). One way to check for those is to compare your site’s sitemap against a site crawl, which can be performed with a tool like Screaming Frog. (Disclosure: I’m not affiliated with Screaming Frog in any way.)

If you find that there are URLs in your sitemap that weren’t found in the Screaming Frog crawl (or whatever tool you use), there’s a good chance that these are orphaned or blocked pages. Either way, they warrant investigation and probably shouldn’t be in your sitemap.

Playas

For the formula, we’ll use the following functions:

IF: IF(logical_test, [value_if_true], [value_if_false])

ISNA: ISNA(value)

MATCH: MATCH(lookup_value, lookup_array, [match_type])

For the sake of time, I won’t go into all the details about what these mean. Each of the links takes you to the page on the Microsoft site that explains all about that particular function.

Since this formula requires a couple more steps, I’ll break them out into steps.

Step 1: Look to see if the URL in C2 is anywhere in Column B.

The MATCH function is just another lookup function Excel offers, and it’s especially versatile. But, what it returns is the row number of whatever you’re looking up.  If it can’t find it, it returns an #N/A error (which we’ll use).

MATCH function in Excel

Click for larger image.

Step 2: ISNA simply returns a Boolean value that just lets you know if the value in the cell you’re referencing is an #N/A error.

ISNA function in Excel

Step 3: Write an IF statement that assigns the value “Orphan” if the URL isn’t in the Screaming Frog column (in other words, is returning an #N/A error) and “Found” if it is found.

IF function in Excel

Step 4: Replace every helper cell reference with the formula inside that cell (everything but the =). If you spread your helper cells from left to right across your spreadsheet, like I do, start to the far-left and work your way to the right.

If you spread them out vertically, start at the top and work your way down to the last formula you wrote. That way, you won’t accidentally miss any helper cells.

complex formula in Excel

Resources

The definitive guide to Excel for marketers is Distilled’s Microsoft Excel for SEOs guide. It’s a great jumping-off point if you want to start your journey by focusing on functions marketers use most.

If you really want to go hardcore, I highly recommend Bill Jelen’s YouTube channel. His playlists are especially helpful if you’re looking to concentrate on a particular, weak area. Another podcast I really like is Mike Girvin’s ExcelIsFun YouTube channel.

I also regularly publish Excel video tutorials on my blog.

Final Thoughts

If you didn’t understand all of the nuances of the two formulas we worked with, don’t worry about it. Just grasp the principle of using helper cells to test your functions along the way and to break your process down into easily, digestible steps. I used to try to jam everything into one cell and would find myself all tangled up in the formula, even with the tool tips.

Now, I’m much more comfortable with navigating my way through somewhat complex formulas, but when they get too tortuous, I start parsing pieces into their own cells.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

Related Topics: Channel: Analytics | Search & Analytics

Sponsored


About The Author: is an SEO and analytics consultant. Her areas of expertise are analytics, technical SEO, and everything to do with data — collection, analysis, and beautification. She’s on a mission to rid the world of ugly data, one spreadsheet at a time. If you just can’t get enough data visualization tips, you can check out her blog, Annielytics.com.

Connect with the author via: Email | Twitter | Google+ | LinkedIn



SearchCap:

Get all the top search stories emailed daily!  

Share

Other ways to share:

Read before commenting! We welcome constructive comments and allow any that meet our common sense criteria. This means being respectful and polite to others. It means providing helpful information that contributes to a story or discussion. It means leaving links only that substantially add further to a discussion. Comments using foul language, being disrespectful to others or otherwise violating what we believe are common sense standards of discussion will be deleted. Comments may also be removed if they are posted from anonymous accounts. You can read more about our comments policy here.
  • http://www.facebook.com/wksears William Sears

    Hi, Annie! Thanks for this. You know we are both hardcore Excel fans. :o)

    Funny I was just working up the formula to extract domains from URLs myself (for the zillionth time) a couple of days ago, and this time I took a second to keep the formula handy in Evernote for the next time I need it. I see our approaches are pretty similar.

    I do it in a Table, so the URL should be in any column titled “Website” (which is the what it is by default in an AWR export).

    This is the way I did it:

    =IFERROR(LEFT(SUBSTITUTE(SUBSTITUTE([@Website],”http://”,”"),
    “https://”,”"),FIND(“/”,SUBSTITUTE(SUBSTITUTE([@Website],”http://”,”"),
    “https://”,”"))-1),SUBSTITUTE(SUBSTITUTE([@Website],”http://”,”"),”https://”,”"))

    This method will work irrespective of whether or not the URL (“Website”) ends in a trailing slash.

    I’m sure there is way to tighten this up. I repeat chunks of it in order to wrap it in the IFERROR that traps for URLs that don’t end in slashes.

    I might make it into a function also, so it’s handy at all times.

    Another way to seriously clean up formulas and make them easier to understand is to use named ranges instead of cell references–but I’ll leave that for another day.

  • http://twitter.com/LeeLKennedy Lee L Kennedy

    Love the post, Annie. I always feel like I’ve accomplished something when I hit the max number of nested functions in Excel :).

    This is the perfect type of post to send to beginner Excelers to show them the power of nested functions. You could even introduce the Name Manager to them to clean up the final formula if you’re feeling adventurous!

  • http://twitter.com/smichaelgriffin Michael Griffin

    Great info! Really wish I had this a few years ago when I (painfully, slowly) started digging into Excel formulas.

  • http://www.annielytics.com/ Annie Cushing

    William! So good to see you here! My formulas bring all the boys to my yard.

    Anywaaay …

    Yeah, I *always* work from a formatted table. I just didn’t with this demo b/c I wanted the formulas to look less intimidating. But w/ your formula, it wouldn’t matter if a URL ended with a trailing slash or not, if you’re just extracting the domain out. It’s a gorgeous formula, nonetheless! That’s the amazing thing about Excel: so many paths can lead to the same outcome.

    My crazy formula for extracting the last directory though won’t work if there’s not a trailing slash. That’s how it detects it’s a directory. I thought about trying to compensate for it, but my brain was mush by that point.

  • http://www.annielytics.com/ Annie Cushing

    Me too!

  • http://www.annielytics.com/ Annie Cushing

    Yeah, I name my ranges for things like lookups to simplify the formula. Outside of those I usually just go grab what I need. I should probably get better about that.

  • Doc Sheldon

    Awesome stuff, Annie! I took one look at your formula at the beginning, groaned and backed away slowly. But you explained it beautifully! Who says you can’t teach an old dog new tricks? Thanks for sharing the wisdom!

  • http://www.annielytics.com/ Annie Cushing

    You know, that was my hesitation in starting off with a beastly formula. But if you download the Excel file and see the individual steps, it’s pretty simple. In any case, glad it helped in some way, Doc! :)

  • http://www.ezmaal.com/ hyderali

    Hi Annie,

    Thanks for this excel tips. I’m a n00b in excel & finding little bit hard the one you shown in the post. But I’ll definitely try it & see where I’m in the world of excel.

  • http://twitter.com/MaryKayLofurno Mary Kay Lofurno

    Hi Anne,

    Thanks for the great article, its a keeper. I have been trying to buckle down in excel and bought an elearning course on advanced excel. My goal is to go through all your articles that I have bookmarked once I finished the course.

  • http://www.annielytics.com/ Annie Cushing

    Yeah, like I said in the post, even if you don’t understand what I did in those formulas, the principle of using helper cells still stands. Try it with simpler formulas, and build your confidence. Then go big.

  • http://www.annielytics.com/ Annie Cushing

    Good luck, Mary Kay! I think time spent learning Excel is time well spent.

  • http://www.annielytics.com/ Annie Cushing

    Yeah, good point! I’ve just been using the IF(ISERROR()) for so long. Some habits are hard to break …

  • http://www.annielytics.com/ Annie Cushing

    I store formulas by appending them with the single quote, but I don’t use them when I’m sculpting my formulas. Instead I use F9 to evaluate portions of a complex formula to convert them to arrays. You just have to remember to press Ctrl-Z to change it back to a formula.

  • http://twitter.com/dergal Gerry White

    for domains, seotools for the win! – niels bosma – I know that the point of the article was to teach people how to do it, but sometimes you got to be lazy! – Love using tables and pivots, so many cool functions within Excel no one knows!

  • http://www.annielytics.com/ Annie Cushing

    Very true!

 

Get Our News, Everywhere!

Daily Email:

Follow Search Engine Land on Twitter @sengineland Like Search Engine Land on Facebook Follow Search Engine Land on Google+ Get the Search Engine Land Feed Connect with Search Engine Land on LinkedIn Check out our Tumblr! See us on Pinterest

 
 

Click to watch SMX conference video

Join us at one of our SMX or MarTech events:

United States

Europe

Australia & China

Learn more about: SMX | MarTech


Free Daily Search News Recap!

SearchCap is a once-per-day newsletter update - sign up below and get the news delivered to you!

 


 

Search Engine Land Periodic Table of SEO Success Factors

Get Your Copy
Read The Full SEO Guide