Machine Learning With AdWords Scripts And Google Prediction API
In this helpful how-to, columnist Russell Savage explains how to use the Google Prediction API in conjunction with AdWords scripts to glean insights about your PPC data.
For many of us, analyzing our AdWords data starts with downloading a massive CSV file into Excel, running various calculations, and building charts. After that, we turn into fortune tellers, trying to read the analytical tea leaves of our data and predict what changes to make.
That analysis is time-consuming, difficult, and biased by our personal experiences and emotions. Machine learning can help us fix that. Today, we are going to use the Google Prediction API and AdWords Scripts to predict the future.
Asking For A Prediction
With the Google Prediction API, you no longer need a dedicated team of PhDs to build and maintain an analytical model for your pay-per-click (PPC) data. All you need to do is format and push your data in, then ask it for a prediction. The more data you can provide, the more accurate that prediction should be.
The world of machine learning seems a little daunting at first, but I’m going to give you a crash course to help you get started quickly. I’ll start by saying that I’ve never taken any advanced statistics courses or programmed anything in R, but I am able to use the Prediction API without a problem, and you can, too.
We need to start with a question, or something that we want our model to be able to predict. I’m going to build a model that is able to predict what the average CPC will be for a given temperature, wind speed, and weather condition in my account. Of course, we all know that weather impacts our bids, but this will tell me exactly how much of an impact I should expect.
Collect Historical Data
Just like a student, my model needs to be taught (or trained) with examples before it can make a prediction. That means I will need to collect historical weather data for the locations in my account. This will allow my model to understand the relationships between the data. It will use those training examples to return a prediction for a new query it has never seen before.
In this post, we are going to be writing two scripts at the account level. The first one is simply to gather and store training data for the account. The second one will use that training data to build and update a model.
The training data is made up of two parts: the example value, which will be the answer returned, and a set of features. In this example, my value is going to be the average CPC for a specific location and the features are going to include all the information I know at the time. (This is just an example, so I don’t have all the data, but this should get you started.)
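To make the value-plus-features idea concrete, here is a minimal sketch of one training row. The helper name and the property names on the weather object are hypothetical, and putting the value in the first column is an assumption that follows the convention the Prediction API uses for CSV training files.

```javascript
// Hypothetical helper: build one training row.
// The value to predict (average CPC) goes first, followed by the features.
function buildTrainingRow(avgCpc, weather) {
  return [
    avgCpc,               // value the model should learn to predict
    weather.temperature,  // feature: temperature (keep the units consistent)
    weather.windSpeed,    // feature: wind speed
    weather.condition     // feature: text label such as 'Rain' or 'Clear'
  ];
}

// Example: one observation for one location on one day.
var exampleRow = buildTrainingRow(1.25, {
  temperature: 72,
  windSpeed: 8,
  condition: 'Clear'
});
```

Every row you collect should use the same column order, because the model treats each position as a distinct feature.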
On to script one. Because weather is based on location, a good place to start is a function that pulls the Geo Performance Report. We can use that data to get an idea of where our traffic is coming from.
Of course, if you have very specific campaign targets in your account, you could simply supply a list of locations you are interested in, but where’s the fun in that? Here is a function to help us grab the performance data by geo.
You can use that function to grab data for any date range you want. For the initial data pull, you may want to look back 30 days or more. After that, you can schedule the script to run daily to continue collecting new information.
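A minimal sketch of a geo report helper along those lines is below. The field list, the function names, and the `Clicks > 0` filter are my assumptions rather than the original listing. The report object is passed in as a parameter so the function can be tried outside AdWords; inside a script you would pass the global `AdWordsApp`.

```javascript
// Sketch only: the field list is an assumption about what the training data needs.
var GEO_FIELDS = ['Date', 'CountryCriteriaId', 'RegionCriteriaId',
                  'CityCriteriaId', 'Impressions', 'Clicks', 'Cost', 'AverageCpc'];

// Build the AWQL query for the Geo Performance Report.
// dateRange is a named range such as 'LAST_30_DAYS'
// or an explicit 'YYYYMMDD,YYYYMMDD' pair.
function buildGeoQuery(dateRange) {
  return 'SELECT ' + GEO_FIELDS.join(', ') +
         ' FROM GEO_PERFORMANCE_REPORT' +
         ' WHERE Clicks > 0' +
         ' DURING ' + dateRange;
}

// Run the report and copy each row into a plain object.
// reportApi must expose report(query), as AdWordsApp does.
function getGeoPerformance(reportApi, dateRange) {
  var rows = reportApi.report(buildGeoQuery(dateRange)).rows();
  var results = [];
  while (rows.hasNext()) {
    var row = rows.next();
    var record = {};
    for (var i = 0; i < GEO_FIELDS.length; i++) {
      record[GEO_FIELDS[i]] = row[GEO_FIELDS[i]];
    }
    results.push(record);
  }
  return results;
}
```

For the initial pull you might call `getGeoPerformance(AdWordsApp, 'LAST_30_DAYS')`, then switch to `'YESTERDAY'` once the script is scheduled daily.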
For pulling historical weather data, I am going to use Weather Underground. It is free to sign up and get started with an API key, but you will hit the limits pretty quickly. Another option is the Open Weather Map API, but I found it a little more confusing to use. We are just trying to get some training data, so the limits aren’t as important right now. I have added variables and caching to the final version of the script to help deal with any limits you might run into.
We will need to translate the locations in the AdWords report into locations that Weather Underground can understand. For that, we can use their AutoComplete API. The following code uses the CityCriteriaId, RegionCriteriaId, and CountryCriteriaId information from the geo report and looks up the weather location URL to use with Weather Underground.
One thing you will notice is that the AdWords Geo report returns the full country names, while Weather Underground uses only the two-letter ISO country code. Here is a quick function that will build a mapping of full country names to two-letter codes based on the data from Open Knowledge.
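As a rough sketch, the mapping could be built like this once the country list has been downloaded and parsed. The `Name` and `Code` property names are an assumption about the parsed data, and both function names are hypothetical.

```javascript
// Build a lookup from full country name to two-letter ISO code.
// records is an array of objects with Name and Code properties;
// that shape is an assumption about how the country list was parsed.
function buildCountryCodeMap(records) {
  var map = {};
  for (var i = 0; i < records.length; i++) {
    var name = String(records[i].Name).trim().toLowerCase();
    map[name] = String(records[i].Code).trim().toUpperCase();
  }
  return map;
}

// Look up a name case-insensitively; returns null when the name is unknown.
function getCountryCode(map, countryName) {
  var key = String(countryName).trim().toLowerCase();
  return map.hasOwnProperty(key) ? map[key] : null;
}
```

Normalizing the case and whitespace on both sides keeps small differences in how a country name is written from turning into failed lookups.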
This code utilizes caches to speed up the lookups for city and reduce API calls. Once we find the city, we store it in the CITY_LOOKUP_CACHE variable so that we don’t need to request it again.
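The caching pattern itself is simple. In this sketch, `fetchLocation` is a hypothetical function standing in for the remote lookup, and the cache key format is my own choice.

```javascript
var CITY_LOOKUP_CACHE = {};

// Return the weather location for a city, calling fetchLocation
// only when the city has not been seen before.
// fetchLocation is a hypothetical wrapper around the remote lookup.
function lookupCity(city, region, countryCode, fetchLocation) {
  var key = [city, region, countryCode].join('|').toLowerCase();
  if (!CITY_LOOKUP_CACHE.hasOwnProperty(key)) {
    CITY_LOOKUP_CACHE[key] = fetchLocation(city, region, countryCode);
  }
  return CITY_LOOKUP_CACHE[key];
}
```

Because the cache lives in a script variable, it only lasts for a single run; that is still enough to collapse the many report rows that share a city into one API call.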
Now that we have the geo data from AdWords and the location information from Weather Underground, we can look up the historical weather data for the location. The following function looks up the historical weather information for the given date and location. Again, we are using a cache to limit the number of calls to the API.
When we put all these pieces together, we have a script that we can run daily to store our training data in a spreadsheet, which we can then access from our modeling script. Here is the main function and some more helpers to tie things together.
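One of those helpers, the piece that writes rows to the spreadsheet, could be sketched as follows. The function name is hypothetical; `sheet` is assumed to be a Google Sheets `Sheet` object, of which only `getLastRow` and `appendRow` are used.

```javascript
// Append training rows to a sheet, writing the header first if the sheet is empty.
// header and rows are plain arrays in the same column order.
function storeTrainingRows(sheet, header, rows) {
  if (sheet.getLastRow() === 0) {
    sheet.appendRow(header);
  }
  for (var i = 0; i < rows.length; i++) {
    sheet.appendRow(rows[i]);
  }
  return rows.length; // number of data rows written
}
```

Inside a script, the sheet would typically come from something like `SpreadsheetApp.openByUrl(url).getSheetByName(name)`, with the URL and sheet name being your own.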
Now that we have a script to build and continuously add to our training data, we can create the second script that will actually build our model. To do this, we need to enable the Prediction API for our script under the Advanced APIs button and follow the link to the Developers Console to enable it there as well.
Once that’s done, we can create our model. The following code will pull the data from the same spreadsheet we built in part one and create a model.
One thing to note is the field for ignoring columns from our training data. When a field is unique for every row, such as a date, it doesn’t really help us with a prediction. Also, items that have the same level of uniqueness, such as Campaign Name and Campaign Id, don’t add much either. Many of these actually make your model a little less flexible because you will need to pass those values in with your query. So, I have ignored any column which does not impact the average cost-per-click (CPC).
I have also excluded values such as impressions, clicks and cost, only because those are items I won’t know when I query the model. Of course, you could pass desired values for these fields in your query to see how the output reacts. There are plenty of things to play around with here, and you can create and train as many model variations as you like if you want to compare performance. Just change the names.
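Dropping the ignored columns can be sketched as a small transformation from spreadsheet rows to training instances. The function name and column names here are hypothetical, and the `output` and `csvInstance` property names are my assumption about the shape the Prediction API expects for each training instance.

```javascript
// Convert spreadsheet rows into { output, csvInstance } training instances.
// header: array of column names; rows: array of arrays in the same order.
// outputColumn and ignoredColumns are column names chosen by the caller.
function buildTrainingInstances(header, rows, outputColumn, ignoredColumns) {
  var outputIndex = header.indexOf(outputColumn);
  if (outputIndex === -1) {
    throw new Error('Output column not found: ' + outputColumn);
  }
  var instances = [];
  for (var r = 0; r < rows.length; r++) {
    var features = [];
    for (var c = 0; c < header.length; c++) {
      if (c === outputIndex) { continue; }                           // the answer, not a feature
      if (ignoredColumns.indexOf(header[c]) !== -1) { continue; }    // skipped on purpose
      features.push(rows[r][c]);
    }
    instances.push({ output: rows[r][outputIndex], csvInstance: features });
  }
  return instances;
}
```

Whatever columns survive here are exactly the values, in the same order, that you will have to supply later when you query the model.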
The code to create the model only needs to run once, then it should be disabled. You can call it from the main function like this.
Now that your model has been created, you can continue to update it as more data is added to your training spreadsheet. You can use the following code, which is very similar to the training function, to add new training data to your model on a daily basis.
Just make sure that you don’t continue to update your model with previous training data. Once the data is in the model, you should move it to a different spreadsheet or just clear it.
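One way to guard against sending the same rows twice is to clear them only after the update succeeds. This is a sketch under assumptions: `sendToModel` is a hypothetical callback that wraps the model update, the sheet is assumed to have a header in row 1, and `sheet` is assumed to be a Google Sheets `Sheet` object.

```javascript
// Send every data row below the header to the model, then clear those rows.
// If sendToModel throws, the rows are left in place so nothing is lost.
function updateAndClear(sheet, sendToModel) {
  var lastRow = sheet.getLastRow();
  if (lastRow < 2) {
    return 0; // only a header (or nothing) in the sheet
  }
  var range = sheet.getRange(2, 1, lastRow - 1, sheet.getLastColumn());
  var rows = range.getValues();
  sendToModel(rows);     // hypothetical callback that updates the model
  range.clearContent();  // reached only when the update did not throw
  return rows.length;
}
```

If you would rather keep a history, the callback is also a reasonable place to copy the rows to an archive sheet before they are cleared.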
Now we can start to query the model pretty easily. The following code will accept an array of queries (which are just arrays of values, the same as the features in the training data) and return the prediction results for each row. One way to test your model is to grab a chunk of your training data, remove the output column, and pass it to this function.
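Here is a sketch of that test: it strips the output column from a chunk of training data to form the queries, then scores the predictions against the known values. It assumes the output sits in the first column of each row. The function names are hypothetical, and mean absolute error is my suggestion for a simple score, not something the Prediction API reports in this form.

```javascript
// Split training rows (output in the first column) into queries and expected values.
function splitTrainingRows(rows) {
  var queries = [];
  var expected = [];
  for (var i = 0; i < rows.length; i++) {
    expected.push(Number(rows[i][0]));
    queries.push(rows[i].slice(1));
  }
  return { queries: queries, expected: expected };
}

// Average absolute difference between predicted and actual values.
function meanAbsoluteError(predicted, expected) {
  var total = 0;
  for (var i = 0; i < expected.length; i++) {
    total += Math.abs(Number(predicted[i]) - expected[i]);
  }
  return expected.length ? total / expected.length : 0;
}
```

Pass `queries` to your query function, then compare its results with `expected`. A lower error on rows the model has already seen is only a sanity check; rows held back from training give a more honest picture.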
And that’s exactly what I do here as an example. Our main now looks like this:
And that’s all there is to it. If you made it this far, then congratulations! You will still need to tweak and test the model to make sure the data being returned makes sense. If not, take a look at the training data and see if there are things in there that don’t make any sense.
It’s pretty amazing to think that what used to take a team of PhD data scientists months to produce can now be built using AdWords Scripts and a few lines of code. And this is only the beginning. You can use any external data you want to build out your model. You can now go from “if greater than x, do y” to using the output from your own machine learning algorithm to determine your actions.
Of course, with great power comes great responsibility. It will take time and loads of data before your model is a good predictor of behavior, so you better start training now!
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.