Use ColdFusion to track your Google ranking! (while obeying the

If you haven’t jumped on the SEO (Search Engine Optimization) bus then you’re getting left behind. In this fast-paced e-world it is vital to know where your web pages rank in the search engines at all times, so you can decide what to optimize to rank better. Never before has the saying "here today, gone tomorrow" rung so true. For websites relying on visibility to turn a profit, being in the coveted "top 10" could mean life or death to the company.

There are many tools out there to help with SEO, but some are costly, and some just don’t give you what you need. And besides, we’re developers and it’s cool to make our own stuff, right?

About this tutorial
This tutorial will not explain all the "tricks of the trade", but it will show you how easy it can be to track your site’s ranking for your targeted keywords in Google using basic to intermediate ColdFusion coding instead of checking manually or buying expensive software. This is an intermediate tutorial, so some advanced knowledge of complex arrays, structures, CFCs and webservices is assumed.

What You’ll Learn
In this tutorial I will show you how to create a CFC (ColdFusion Component) with a few UDFs (User Defined Functions). You will also learn how to invoke and interact with both CFCs and webservices, as well as some very handy techniques such as storing and retrieving data from complex arrays with nested structures, functions and other arrays. To make it a little more "real-world" we will be developing an application that uses the Google SOAP Search API to determine your site’s search engine positioning.

Google requires developers to use this API when sending automated queries to its search engine. Violators may be penalized or banned from the search, and if the queries are traced back to your sites it could affect your ranking.

NOTE: You may only run 1000 queries per day, so if you are targeting many keywords you may need to break this down into multiple runs and span it across multiple days. Also, the Google API will only return 10 results at a time. That means if you are searching the top 3 pages it will cost 3 queries.
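
The query budget is easy to estimate before a run. The sketch below is only an illustration: the variable names and the figure of 400 keywords are made up, and it assumes the worst case where every keyword is searched to the full page depth.

```cfml
<cfscript>
    // Hypothetical figures for illustration only.
    keywordCount = 400;   // keywords you want to track
    pagesPerKey  = 3;     // result pages checked per keyword (10 results each)
    dailyLimit   = 1000;  // daily query allowance mentioned in the note above

    // Worst case: every keyword is searched through every page.
    queriesNeeded = keywordCount * pagesPerKey;
    // Round up to whole days.
    daysNeeded = ceiling(queriesNeeded / dailyLimit);

    writeOutput("Queries needed: " & queriesNeeded & ", days needed: " & daysNeeded);
</cfscript>
```

With these numbers the run needs 1200 queries, so it would have to be spread over 2 days.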

Prerequisites
To begin you will need to download the Google SOAP Search API, create a Google Account and obtain a license key (which Google will provide once you have signed up).

Off we go!
Once you have the prerequisites you can continue below to start developing your ColdFusion wrapper.

First you will create a folder structure under your web root. For this tutorial I will use c:\inetpub\wwwroot\seo\ with two subfolders, "cfc" and "ws" (see structure below).
Listing 1
c:\inetpub\wwwroot\seo\
    cfc\
    ws\

Extract the Google API wsdl file (GoogleSearch.wsdl) to the "ws" folder you just created. This is the webservice/API you will invoke to interact with the Google search function.

The Files
We will create two files in this tutorial to do all the work we need: google_API.cfc and google_ranking.cfm. Google_API.cfc will contain the functions to interact with Google’s API and format the results. Google_ranking.cfm will simply call the functions from google_API.cfc and display the results.

Now let’s create the CFC that will be the heart of this utility. Create a file named google_API.cfc in your "cfc" folder. The ".cfc" extension will inform ColdFusion that this is a ColdFusion Component. This CFC will contain 3 functions that will perform everything we need to search, gather and display the results to the screen. These functions are: getGoogleRanking, doGoogleSearch and makeTable. These functions are not by any means meant to be an extensive "do-it-all" solution, but rather a very basic, easy to digest starting point that you can adapt to your own situation. You could easily alter this to store the results in a database, or pass it to other functions/applications for further analyzing and reporting.

The first thing we need to do in the google_API.cfc is to add the <cfcomponent> tag block.

<cfcomponent>

</cfcomponent>

The functions in this file will be nested inside of the <cfcomponent> tags.

Now we will create the first function in our component. Create a function between the <cfcomponent> tags named "doGoogleSearch". Set the return type to Array, which is the format in which the Google API returns the search results. Also, set the access to private. This will make this function available only to other functions in this component. By now you should have something like the following:

doGoogleSearch
<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">

    </cffunction>
</cfcomponent>

If in the function declaration we specify a returntype we must return a value using <cfreturn>, otherwise an error is thrown. So let’s create our return container "arrResults" a blank one dimensional array:

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">

        <cfset arrResults = arrayNew(1)>

        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

Now this function will work, returning an empty array. To make this function useful and flexible we need to create some arguments. Using arguments will allow us to change the behavior of the function based on the values we pass in. ColdFusion requires <cfargument> tags to come before any other tags in the function body, so create the following arguments just above our arrResults declaration:

<cfargument name="q" type="string" required="yes">
<cfargument name="start" type="string" required="yes">

"Q" is the key phrase or query that you are submitting to the Google API. These are the keywords you are hoping your pages rank for.
Start is the results page number we are fetching from the Google API. This will let us focus on a single page at a time and extract the results just as they would be seen in a user’s browser.
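
One detail to keep in mind: the Google API’s own start parameter is a zero-based index of the first result wanted, not a page number, so a page number has to be converted before it is sent. A minimal sketch of that conversion, assuming 10 results per page (the variable names here are made up for illustration):

```cfml
<cfscript>
    resultsPerPage = 10;
    for (pageNum = 1; pageNum LTE 3; pageNum = pageNum + 1) {
        // Page 1 starts at result 0, page 2 at result 10, and so on.
        startIndex = (pageNum - 1) * resultsPerPage;
        writeOutput("Page " & pageNum & " starts at index " & startIndex & "<br>");
    }
</cfscript>
```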

Now the fun part. Let’s invoke the Google API. The Google API is actually a webservice, and the wsdl file was included in the API download we did earlier. Invoking the webservice will essentially make its public functions available to us within our application. Just below your declarations, and just above your <cfreturn>, invoke the Google API’s doGoogleSearch method as in the example below:

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">
        <!--- cfargument tags must come before any other tag in the function body --->
        <cfargument name="q" type="string" required="yes">
        <cfargument name="start" type="string" required="yes">
        <cfset arrResults = arrayNew(1)>

        <cfinvoke
                webservice="http://localhost/seo/ws/GoogleSearch.wsdl"
                method="doGoogleSearch"
                returnvariable="aGoogleSearchResult">
            <cfinvokeargument name="key" value="{YOUR KEY HERE}"/><!--- Change value to your license key --->
            <cfinvokeargument name="q" value="#arguments.q#"/>
            <!--- The API expects a zero-based result index, so convert the page number --->
            <cfinvokeargument name="start" value="#(arguments.start - 1) * 10#"/>
            <cfinvokeargument name="maxResults" value="10"/>
            <cfinvokeargument name="filter" value="false"/>
            <cfinvokeargument name="restrict" value=""/>
            <cfinvokeargument name="safeSearch" value="false"/>
            <cfinvokeargument name="lr" value=""/>
            <cfinvokeargument name="ie" value="latin1"/>
            <cfinvokeargument name="oe" value="latin1"/>
        </cfinvoke>

        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

The method is the method/function we are going to invoke from the webservice, and the returnvariable is the variable that will contain the results from the call to doGoogleSearch.

The <cfinvokeargument> tags essentially act in the same fashion as the <cfargument>, but are passed to the webservice. The above arguments are required by the Google API. Notice that we are using #arguments.q# and #arguments.start# as argument values in the <cfinvokeargument>s. Arguments is the named scope that ColdFusion uses to access the values of <cfargument>s.

NOTE: You must replace "http://localhost/seo/ws/GoogleSearch.wsdl" with your actual url to the wsdl file and you must replace "{YOUR KEY HERE}" with your Google API Key.
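
For comparison, the same kind of call can be written in cfscript with createObject("webservice", ...). This is only a sketch and has not been run against the live service: it assumes the WSDL URL and key placeholder from above, uses a made-up search phrase, and passes the arguments positionally in the order the Google API documentation lists them.

```cfml
<cfscript>
    // Assumes the same WSDL location and license key placeholder as above.
    googleWS = createObject("webservice", "http://localhost/seo/ws/GoogleSearch.wsdl");

    // Positional arguments: key, q, start, maxResults, filter,
    // restrict, safeSearch, lr, ie, oe.
    searchResult = googleWS.doGoogleSearch(
        "{YOUR KEY HERE}", "coldfusion tutorials", 0, 10,
        false, "", false, "", "latin1", "latin1"
    );
</cfscript>
```

The <cfinvoke> form used in this tutorial names every argument, which makes it easier to see what each value is for.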

The last thing we need to do in this function is to populate the return array we created earlier (arrResults) with the results from the doGoogleSearch method. To see what the entire result set looks like, do a cfdump just before the <cfreturn>:

<cfdump var="#aGoogleSearchResult#"><cfabort>

Note that the dump shows a complex object, some of whose members are functions. These functions will be used to extract the actual data later. By passing the results array to the calling page we make those functions available to it and can use them to extract the data we need. I’ll cover this a bit later.

Now, remove the <cfdump> and let’s populate arrResults with the aGoogleSearchResult.getResultElements() call. This will return an array of results from the Google API.

<cfset arrResults = aGoogleSearchResult.getResultElements()>

This function is now complete. Your code should look like this:
Completed doGoogleSearch Function

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">
        <!--- cfargument tags must come before any other tag in the function body --->
        <cfargument name="q" type="string" required="yes">
        <cfargument name="start" type="string" required="yes">
        <cfset arrResults = arrayNew(1)>

        <cfinvoke
                webservice="http://localhost/seo/ws/GoogleSearch.wsdl"
                method="doGoogleSearch"
                returnvariable="aGoogleSearchResult">
            <cfinvokeargument name="key" value="{YOUR KEY HERE}"/><!--- Change value to your license key --->
            <cfinvokeargument name="q" value="#arguments.q#"/>
            <!--- The API expects a zero-based result index, so convert the page number --->
            <cfinvokeargument name="start" value="#(arguments.start - 1) * 10#"/>
            <cfinvokeargument name="maxResults" value="10"/>
            <cfinvokeargument name="filter" value="false"/>
            <cfinvokeargument name="restrict" value=""/>
            <cfinvokeargument name="safeSearch" value="false"/>
            <cfinvokeargument name="lr" value=""/>
            <cfinvokeargument name="ie" value="latin1"/>
            <cfinvokeargument name="oe" value="latin1"/>
        </cfinvoke>

        <cfset arrResults = aGoogleSearchResult.getResultElements()>
        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

Now we’ll create the getGoogleRanking function that will call the doGoogleSearch function we just finished. The getGoogleRanking function will grab the results from doGoogleSearch, check them for the specified domain, and return the ranking data to the calling page.

Create the getGoogleRanking function as shown below (remember to create the function within the <cfcomponent> tags):

getGoogleRanking
<cffunction name="getGoogleRanking" access="public" returntype="struct">

</cffunction>

This defines the function getGoogleRanking as public, which means it can be called from outside this component, and will return a value in the format of a struct. Like before, this function by itself will not work because we are not returning anything. So, before we go any further let’s declare the return variable "structReturn":

<cffunction name="getGoogleRanking" access="public" returntype="struct">
    <cfset structReturn = structNew()>
    <cfreturn structReturn>

</cffunction>

Now the function would return a blank structure. Our next step is to populate that structure with the Google ranking data we are looking for. To do this we need to set up a few arguments that we will use in this function. Since <cfargument> tags have to come before any other tags in the function body, add the following arguments just above your structReturn declaration:

<cfargument name="maxpages" type="string" required="yes">
<cfargument name="keyword" type="string" required="yes">
<cfargument name="domain" type="string" required="yes">

Maxpages will be used to tell the function how many pages deep to parse from the Google results. I would recommend limiting it to 2 as each call to the Google API costs 1 query, of which there are only 1000 per day available.
Keyword is just that, the keyword you are checking to see if your site ranks for. This is the keyword you would expect a Google searcher to type in to find the product/services/information you are trying to promote on your site.
Domain is the domain you are looking for when you search for Keyword. You could use www.site.com for just the www subdomain to be searched for or you could use site.com to search for all subdomains under that domain.

Okay, now that we have the initial declarations set up, let’s start searching the Google index. Remember, we only want to look at the first "maxpages" results pages, so we will loop from 1 (first page) to maxpages, calling the Google API on each iteration and specifying the page number. So, underneath the declarations we just made, and above the <cfreturn>, create a loop like the following:

<cfloop from="1" to="#arguments.maxpages#" index="i">

</cfloop>

Now within this loop is where all the fun begins. First we are going to call the doGoogleSearch function we created earlier and store the results in the arrResults array. This will "pass" the array and all the nested functions returned by the Google API, which we will use to extract the result data. Remember that we created 2 required arguments in the doGoogleSearch function: "Q" and "start". We will need to pass in the query (Q) which is the arguments.keyword argument we declared as a required argument in this function and the results page (start), which is the current value of the loop’s index (i). This way it will be called once for every page.

<cfset arrResults = doGoogleSearch(arguments.keyword,i)>

Okay, doGoogleSearch returned the search results in a complex array: arrResults. To find out our domain’s ranking we need to check each result to see if it is from our domain. To do this we will loop through the array and check each result for arguments.domain. So, create a loop nested inside the loop we just created. The loop will need to iterate through the arrResults array to extract each result. Your loop should look like this:

<cfloop from="1" to="#arguments.maxpages#" index="i">
    <cfset arrResults = doGoogleSearch(arguments.keyword,i)>

    <cfloop from="1" to="#arraylen(arrResults)#" index="idx">

    </cfloop>
</cfloop>

Now, for each index of the array we check for our arguments.domain. This is done by calling the getURL() function returned in the array from the Google API. This should look like:

<cfif FindNoCase(arguments.domain,arrResults[idx].getURL())>

</cfif>
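
It helps to see how FindNoCase behaves before relying on it. The sketch below uses made-up URLs and a made-up domain; FindNoCase returns the position of the first match, or 0 when there is none, and because it is a plain substring search it will also match unrelated domains that merely end with the same text.

```cfml
<cfscript>
    // Hypothetical domain and URLs for illustration.
    domain = "site.com";

    // Non-zero: the match ignores case.
    writeOutput(findNoCase(domain, "http://www.Site.com/page.cfm") & "<br>");
    // Non-zero: subdomains match as well.
    writeOutput(findNoCase(domain, "http://shop.site.com/") & "<br>");
    // Also non-zero: "othersite.com" contains "site.com" as a substring.
    writeOutput(findNoCase(domain, "http://www.othersite.com/") & "<br>");
    // 0: no match at all.
    writeOutput(findNoCase(domain, "http://www.example.org/") & "<br>");
</cfscript>
```

If false matches like the third case matter for your domain, a stricter comparison on the host part of the URL would be needed.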

If the domain is found, then we use the Google API functions to grab the data and store it in the structReturn we declared at the beginning of this function. This will be passed back to the calling page. The code should look like this:

<cfloop from="1" to="#arguments.maxpages#" index="i">
    <cfset arrResults = doGoogleSearch(arguments.keyword,i)>
    <cfloop from="1" to="#arraylen(arrResults)#" index="idx">
        <cfif FindNoCase(arguments.domain,arrResults[idx].getURL())>

            <cfset structReturn.keyword = arguments.keyword>
            <cfset structReturn.url = arrResults[idx].getURL()>
            <cfset structReturn.title = arrResults[idx].getTitle()>
            <cfset structReturn.snippet = arrResults[idx].getSnippet()>

        </cfif>    
    </cfloop>
</cfloop>

Now that we have the results data, we can determine our position. To do this we need to know what page we are on and which element of the array we are on. The array will have at most 10 elements, since the Google API only returns 10 results at a time. So on the first page the position is simply the current iteration of the arrResults loop (idx), 1-10. Beyond the first page we need to calculate our position. We could use an if block with two different calculations, but instead let’s write one formula that works regardless of the page we are on: add the current iteration (idx) to the current page (i) minus 1, multiplied by 10. That is, idx + ((i-1)*10) = current position. So if we find our domain in the 4th result on the second page we would have 4 + ((2-1)*10), which puts us in the 14th position.

The code for this looks like the following:

<cfset pos = idx + ((i-1)*10)>

Now we simply add the pos value to the structReturn structure as such:

<cfset structReturn.position = pos>
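
To check the formula by hand, the following sketch prints the position for a few sample page and result combinations. It is a standalone illustration and does not call the Google API; only every third result is printed to keep the output short.

```cfml
<cfscript>
    // i is the page number, idx is the result's place on that page.
    for (i = 1; i LTE 2; i = i + 1) {
        for (idx = 1; idx LTE 10; idx = idx + 3) {
            pos = idx + ((i - 1) * 10);
            writeOutput("Page " & i & ", result " & idx & " = position " & pos & "<br>");
        }
    }
</cfscript>
```

The line for page 2, result 4 shows position 14, matching the worked example above.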

Okay, we have our position and there is no need to look any further. To spare precious CPU time and our limited number of daily queries, we need to break out of both loops. To do this, add a <cfbreak> just after you set the structReturn.position variable. This will break out of the arrResults loop. Now we also need to break out of the page loop by the same means. However, if we just put in a <cfbreak> it would always break out on the first iteration. So, to make sure it’s time to break out, we’ll simply test the structReturn variable to see if it has data. If it does, break out. This test goes just under the closing cfloop tag for the arrResults loop, and your code should look like the following:

            <cfbreak>
        </cfif>
    </cfloop>
    <cfif not StructIsEmpty(structReturn)>
        <cfbreak>
    </cfif>
</cfloop>
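
The same two-step break can be tried in isolation. This is a minimal cfscript sketch with stand-in data (three "pages" of five numbers, looking for the number 7) rather than real search results; the names are made up for illustration.

```cfml
<cfscript>
    found = structNew();
    for (i = 1; i LTE 3; i = i + 1) {
        for (idx = 1; idx LTE 5; idx = idx + 1) {
            if (((i - 1) * 5) + idx EQ 7) {
                found.page = i;
                found.index = idx;
                break;  // leaves the inner loop only
            }
        }
        // The struct only has data once a match was stored.
        if (NOT structIsEmpty(found)) {
            break;      // leaves the outer loop as well
        }
    }
    writeOutput("Found on page " & found.page & ", index " & found.index);
</cfscript>
```

Here the match is on page 2, index 2, and page 3 is never visited.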

Almost done. Now what if we are not found in the pages we searched? We need to return something, or our function will fail, right? What we need to do is test the structReturn again. If it is empty, we’ll populate it with some data to pass back to the calling page. So below the last closing cfloop tag, do another if block and set each of the structure elements that we will return.

<cfif structIsEmpty(structReturn)>
    <cfset structReturn.keyword = arguments.keyword>
    <cfset structReturn.url = "No pages found">
    <cfset structReturn.title = "&nbsp;">
    <cfset structReturn.snippet = "&nbsp;">
    <cfset structReturn.position = 0>
</cfif>

That’s it. Here’s the entire getGoogleRanking function:

Completed getGoogleRanking

<cffunction name="getGoogleRanking" access="public" returntype="struct">
    <!--- cfargument tags must come before any other tag in the function body --->
    <cfargument name="maxpages" type="string" required="yes">
    <cfargument name="keyword" type="string" required="yes">
    <cfargument name="domain" type="string" required="yes">
    <cfset structReturn = structNew()>
    <cfloop from="1" to="#arguments.maxpages#" index="i">
        <cfset arrResults = doGoogleSearch(arguments.keyword,i)>
        <cfloop from="1" to="#arraylen(arrResults)#" index="idx">
            <cfif FindNoCase(arguments.domain,arrResults[idx].getURL())>
                <cfset structReturn.keyword = arguments.keyword>
                <cfset structReturn.url = arrResults[idx].getURL()>
                <cfset structReturn.title = arrResults[idx].getTitle()>
                <cfset structReturn.snippet = arrResults[idx].getSnippet()>
                <cfset pos = idx + ((i-1)*10)>
                <cfset structReturn.position = pos>
                <cfbreak>
            </cfif>
        </cfloop>
        <cfif not StructIsEmpty(structReturn)>
            <cfbreak>
        </cfif>
    </cfloop>
    <cfif structIsEmpty(structReturn)>
        <cfset structReturn.keyword = arguments.keyword>
        <cfset structReturn.url = "No pages found">
        <cfset structReturn.title = "&nbsp;">
        <cfset structReturn.snippet = "&nbsp;">
        <cfset structReturn.position = 0>
    </cfif>
    <cfreturn structReturn>
</cffunction>

So far we have a function to interface with the Google API (doGoogleSearch) and a function that calls doGoogleSearch to collect the ranking data. Now we need a function to format the output and display it to the screen. For this tutorial we will make a simple function that outputs the data in a simple table format.

Create a function, still within the <cfcomponent> tags, named makeTable. This will be a public access function and will return a Boolean value (true/false). We will declare one required argument, tableData, which will be an array of the positioning data we gather from the previous functions. It should look something like the following:

<cffunction name="makeTable" access="public" returntype="boolean">
    <cfargument name="tableData" type="array" required="yes">

    <cfreturn true>
</cffunction>

Now, the next part is simple; build the html table. We will be outputting variables, so create a <cfoutput> block and the html table structure like below:

<cfoutput>
    <table width="100%" border="1">
        <tr>
            <th>Keyword</th>
            <th>Position</th>
            <th>URL</th>
            <th>Title</th>
            <th>Snippet</th>
        </tr>

    </table>
</cfoutput>

Alright. Not too bad. Now just before the closing </table> tag let’s loop through the array that is passed in as tableData. Each iteration of this loop will be a new row consisting of: Keyword, Position, URL, Title and Snippet. Create the loop shown below:

<cfloop from="1" to="#arrayLen(tableData)#" index="idx">
    <tr>
        <td nowrap>#tableData[idx].keyword#</td>
        <td>#tableData[idx].position#</td>
        <td>#tableData[idx].url#</td>
        <td>#tableData[idx].title#</td>
        <td>#tableData[idx].snippet#</td>
    </tr>

</cfloop>

That’s all for the function makeTable! Here’s the completed function:

Completed makeTable

<cffunction name="makeTable" access="public" returntype="boolean">
    <cfargument name="tableData" type="array" required="yes">
    <cfoutput>
    <table width="100%" border="1">
        <tr>
            <th>Keyword</th>
            <th>Position</th>
            <th>URL</th>
            <th>Title</th>
            <th>Snippet</th>
        </tr>

        <cfloop from="1" to="#arrayLen(tableData)#" index="idx">
        <tr>
            <td nowrap>#tableData[idx].keyword#</td>
            <td>#tableData[idx].position#</td>
            <td>#tableData[idx].url#</td>
            <td>#tableData[idx].title#</td>
            <td>#tableData[idx].snippet#</td>
        </tr>

        </cfloop>
    </table>
    </cfoutput>

    <cfreturn true>
</cffunction>

That’s it for the CFC, now let’s create the page to call these functions. In the "seo" folder create a file named google_ranking.cfm.

This can be done in either CFML or cfscript. Since everything else we’ve done has been in CFML, let’s make this part with cfscript. Create your <cfscript> block within the body of the new document. Now the first thing we need to do in our <cfscript> block is to invoke the CFC we created earlier into a new ColdFusion object: variables.google. To do that we will use the CF function createObject(). See below for the code:

<cfscript>
    variables.google = createObject("component","seo.cfc.google_API");
</cfscript>

The createObject takes two arguments. The first is the type of object. In this case it is a CFC, or component. The second is the name of the object you are creating. Remember in step 1 when we made the google_API.cfc? By specifying "seo.cfc.google_API" as the component name ColdFusion will look for the file google_API.cfc in the "seo\cfc\" folder and will create the object based on the component definition.

Now let’s do a little more prep work before we call our functions. First of all, we need to set up the variables to pass as arguments to our functions. Remember that getGoogleRanking requires 3 arguments: maxpages, keyword and domain? Let’s set those now:

intMax = 2;
lstKeywords = "keyword 1, keyphrase 2, keyword 3"; /*Replace with actual list of keywords*/
strDomain = "mysite.com"; /*Replace with actual domain (exclude www)*/

Notice that the variable I’m using to pass as the keyword is actually a list of keywords. It doesn’t have to be, but to make this whole thing worthwhile we had better be able to check multiple keywords, right? So you can either use a single keyword, or a whole list of them. Just keep in mind your limit of 1000 queries per day.

Okay, one last bit of setup then we’ll get to work. Recall our function makeTable? This is essentially what we’re here for, to display our ranking in a table format. Well that function requires an array type argument containing the ranking data. So let’s declare a one dimensional array: arrRanking, to pass to our makeTable function.

arrRanking = arrayNew(1);

Now on with the show. Since we made a list of keywords to search, we’ll need to loop through the keywords and call the getGoogleRanking function for each one. We will use the returned data to build the array that is ultimately used to output our data to the screen. Create the loop from 1 to the number of keywords in the list.

for (i=1;i LTE listLen(lstKeywords);i=i+1){

}
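
The list loop can be tried on its own before anything is wired to the CFC. One thing this sketch shows is that listGetAt keeps the space that follows each comma in the sample list, so wrapping it in trim() gives a cleaner keyword. The variable kw is a made-up name for illustration.

```cfml
<cfscript>
    lstKeywords = "keyword 1, keyphrase 2, keyword 3";
    for (i = 1; i LTE listLen(lstKeywords); i = i + 1) {
        // Without trim() the second and third items would start with a space.
        kw = trim(listGetAt(lstKeywords, i));
        writeOutput("[" & kw & "]<br>");
    }
</cfscript>
```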

Inside that loop create the new index of the arrRanking array as a new structure. Wait a minute. Why create a structure inside the array? Well, this is called creating an array of structures. Using this method will allow us to have many rows of named values. For instance, each index/row represents a query result from the Google API, which has associated with it several named values: keyword, position, title, url, snippet. If you remember the getGoogleRanking function returns a structure containing the keyword, position, title, url and snippet. To put it all together we are creating one row for each getGoogleRanking call which will contain the keyword, position, title, url and snippet information. Here’s how to create the next array index as a structure:

for (i=1;i LTE listLen(lstKeywords);i=i+1){
    arrRanking[arrayLen(arrRanking)+1] = structNew();
}
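
If arrays of structures are new to you, here is a small standalone sketch of the idea. The values are made up and stand in for real ranking data; arrDemo and row are names chosen only for this example.

```cfml
<cfscript>
    arrDemo = arrayNew(1);

    // Each array index holds one structure, which acts like a row of named values.
    arrDemo[1] = structNew();
    arrDemo[1].keyword = "keyword 1";
    arrDemo[1].position = 4;

    arrDemo[2] = structNew();
    arrDemo[2].keyword = "keyphrase 2";
    arrDemo[2].position = 0;

    for (row = 1; row LTE arrayLen(arrDemo); row = row + 1) {
        writeOutput(arrDemo[row].keyword & ": " & arrDemo[row].position & "<br>");
    }
</cfscript>
```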

Now to call getGoogleRanking and store the returned structure in our array. Before we do that, note that the first thing we did on this page was to invoke the CFC into the variable google. So the functions we need are now referenced through the google variable. Therefore our call to getGoogleRanking should look like: google.getGoogleRanking();. Add the code below inside our loop:

arrRanking[arrayLen(arrRanking)] = google.getGoogleRanking(maxpages: intMax, keyword: listGetAt(lstKeywords,i), domain: strDomain);

Notice that when I passed in the arguments I passed them in as named arguments. This is not strictly necessary, but it helps to show what each value is doing in that function call. It also allows you to pass the arguments in any order.
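
The difference between positional and named arguments is easier to see with a throwaway function. The function describeSearch below is hypothetical and exists only for this comparison; the named call uses the name=value form.

```cfml
<cfscript>
    // A throwaway function just to show the two calling styles.
    function describeSearch(maxpages, keyword, domain) {
        return "Checking " & domain & " for '" & keyword & "' across " & maxpages & " page(s)";
    }

    // Positional: the order of the values matters.
    writeOutput(describeSearch(2, "keyword 1", "mysite.com") & "<br>");
    // Named: the order does not matter.
    writeOutput(describeSearch(domain = "mysite.com", keyword = "keyword 1", maxpages = 2));
</cfscript>
```

Both calls produce the same sentence.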

Now we’re almost done. Last thing we need to do is call the makeTable function to display all the data we just captured. To do that all we need to do is pass the arrRanking array in as an argument to the makeTable function. This will generate the table and display it to the screen.

google.makeTable(arrRanking);

Here’s what the completed google_ranking.cfm template should look like:
Completed google_ranking.cfm

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
        <title>Untitled Document</title>
    </head>
    <body>

    <cfscript>
        variables.google = createObject("component","seo.cfc.google_API");
        intMax = 2;
        lstKeywords = "keyword 1, keyphrase 2, keyword 3"; /*Replace with actual list of keywords*/
        strDomain = "mysite.com"; /*Replace with actual domain (exclude www)*/
        arrRanking = arrayNew(1);
        for (i=1;i LTE listLen(lstKeywords);i=i+1){
            arrRanking[arrayLen(arrRanking)+1] = structNew();
            arrRanking[arrayLen(arrRanking)] = google.getGoogleRanking(maxpages: intMax, keyword: listGetAt(lstKeywords,i), domain: strDomain);
        }
        google.makeTable(arrRanking);

    </cfscript>
    </body>
</html>

That’s it! Now all you do is navigate to the google_ranking.cfm page in your favorite browser and in seconds you’ll have your current Google ranking! You could easily extend the google_ranking.cfm page by adding a submit form where you can pass the keywords and domain via form post.
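
As a starting point for that form, here is a rough sketch. The field names keywords and domain are assumptions made for this example, and the final step is left as a comment because it simply reuses the loop from google_ranking.cfm.

```cfml
<!--- Defaults so the page also works before the form is submitted. --->
<cfparam name="form.keywords" default="">
<cfparam name="form.domain" default="">

<cfoutput>
<form method="post" action="#cgi.script_name#">
    Keywords (comma separated):
    <input type="text" name="keywords" value="#htmlEditFormat(form.keywords)#"><br>
    Domain:
    <input type="text" name="domain" value="#htmlEditFormat(form.domain)#"><br>
    <input type="submit" value="Check ranking">
</form>
</cfoutput>

<cfif len(trim(form.keywords)) and len(trim(form.domain))>
    <cfset lstKeywords = form.keywords>
    <cfset strDomain = trim(form.domain)>
    <!--- ...then run the same keyword loop and makeTable call with these values... --->
</cfif>
```

Since every submitted keyword costs at least one query, it may be worth capping the length of the keyword list before running the loop.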

Challenge – Easy: Change the makeTable function to store results in a database.
Challenge - Advanced: Alter this CF utility to track and limit to 1000 queries per day and write a queue routine to chronologically run the next 1000 queries.

NOTE: Google has many indexes/datacenters that are used to query and return results to the Google user. Because of this there are usually discrepancies between datacenters. So, if you run this utility and compare it to a search through www.google.com you may see different results. This utility, and any others out there that perform such tasks, cannot be 100% accurate, but should be considered a good measuring tool as you analyze your website’s SEO.

Last Modified on: September 9th, 2006 (Pvarando @ Authors Request).

All ColdFusion Tutorials By Author: Abram Adams