If you haven’t jumped on the SEO (Search Engine Optimization) bus, then you’re getting left behind. In this fast-paced e-world it is vital to know where your web pages rank in the search engines at all times, so you can determine what you need to do to optimize your site to rank better. Never before has the saying "here today, gone tomorrow" had such a powerful truth to it. For websites that rely on visibility to turn a profit, being in the coveted "top 10" could mean life or death for the company.

There are many tools out there to help with SEO, but some are costly, and some just don’t give you what you need. And besides, we’re developers and it’s cool to make our own stuff, right?

About this tutorial
This tutorial will not explain all the "tricks of the trade", but will show you how easy it can be to track your site ranking for your targeted keywords in Google using some basic to intermediate ColdFusion coding instead of manually checking or buying expensive software. This is an intermediate tutorial, so some knowledge of complex arrays, structures, CFCs and webservices is assumed.

What You’ll Learn
In this tutorial I will show you how to create a CFC (ColdFusion Component) with a few UDFs (User Defined Functions). You will also learn how to invoke and interact with both CFCs and webservices, as well as some very handy techniques such as storing and retrieving data from complex arrays with nested structures, functions and other arrays. To make it a little more "real-world" we will be developing an application that uses the Google SOAP Search API to determine your site’s search engine positioning.

Google requires developers to use this API when sending automated queries to its search engine. Violators may be penalized or banned from the search (and, if the queries are found to be related to your sites, this could possibly affect your ranking).

NOTE: You may only run 1000 queries per day, so if you are targeting many keywords you may need to break this down into multiple runs and span it across multiple days. Also, the Google API will only return 10 results at a time. That means if you are searching the top 3 pages it will cost 3 queries.
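To get a feel for how quickly the daily allowance runs out, here is a small back-of-the-envelope sketch. Only the 1000-query limit and the one-query-per-results-page cost come from the note above; the keyword count and the number of pages per keyword are made-up values for illustration.

```cfml
<!--- Hypothetical workload: 500 keywords, checked 3 results pages deep. --->
<cfset dailyQueryLimit = 1000>
<cfset pagesPerKeyword = 3>
<cfset keywordCount = 500>

<!--- Each results page fetched costs one query. --->
<cfset queriesNeeded = keywordCount * pagesPerKeyword>
<!--- Whole keywords that fit into one day's allowance. --->
<cfset keywordsPerDay = int(dailyQueryLimit / pagesPerKeyword)>
<!--- Number of days needed to cover every keyword. --->
<cfset daysNeeded = ceiling(keywordCount / keywordsPerDay)>

<cfoutput>
Queries needed: #queriesNeeded#<br>
Keywords per day: #keywordsPerDay#<br>
Days needed: #daysNeeded#
</cfoutput>
```

With these numbers the run needs 1500 queries, so it would have to be split across 2 days at 333 keywords per day.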

Prerequisites
To begin you will need to download the Google SOAP Search API, create a Google Account and obtain a license key (which Google will provide once you have signed up).

Off we go!
Once you have the prerequisites you can continue below to start developing your ColdFusion wrapper.

First, create a folder structure under your web root. For this tutorial I will use c:\inetpub\wwwroot\seo\ with two subfolders, "cfc" and "ws" (see the structure below).
Listing 1
c:\inetpub\wwwroot\seo\
    cfc\
    ws\

Extract the Google API wsdl file (GoogleSearch.wsdl) to the "ws" folder you just created. This is the webservice/API you will invoke to interact with the Google search function.

The Files
We will create two files in this tutorial to do all the work we need: google_API.cfc and google_ranking.cfm. Google_API.cfc will contain the functions to interact with Google’s API and format the results. Google_ranking.cfm will simply call the functions from google_API.cfc and display the results.

Now let’s create the CFC that will be the heart of this utility. Create a file named google_API.cfc in your "cfc" folder. The ".cfc" extension will inform ColdFusion that this is a ColdFusion Component. This CFC will contain 3 functions that will perform everything we need to search, gather and display the results to the screen. These functions are: getGoogleRanking, doGoogleSearch and makeTable. These functions are not by any means meant to be an extensive "do-it-all" solution, but rather a very basic, easy to digest starting point that you can adapt to your own situation. You could easily alter this to store the results in a database, or pass it to other functions/applications for further analyzing and reporting.

The first thing we need to do in the google_API.cfc is to add the <cfcomponent> tag block.

<cfcomponent>

</cfcomponent>

The functions in this file will be nested inside of the <cfcomponent> tags.

Now we will create the first function in our component. Create a function between the <cfcomponent> tags named "doGoogleSearch". Set the return type to array, which is the format in which the Google API returns the search results. Also, set the access to private. This makes the function available only to other functions in this component. By now you should have something like the following:

doGoogleSearch
<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">

    </cffunction>
</cfcomponent>

If we specify a returntype in the function declaration we must return a value using <cfreturn>; otherwise an error is thrown. So let’s create our return container, "arrResults", as a blank one-dimensional array. Declaring it with the var keyword keeps it local to each call of the function:

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">

        <cfset var arrResults = arrayNew(1)>

        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

Now this function will work, returning an empty array. To make this function useful and flexible, we need to create some arguments. Using arguments will allow us to change the behavior of the function based on the values we pass in. Create the following arguments at the top of the function, above the arrResults declaration, because ColdFusion requires <cfargument> tags to come before any other tags in the function body:

<cfargument name="q" type="string" required="yes">
<cfargument name="start" type="string" required="yes">

"q" is the key phrase or query that you are submitting to the Google API. These are the keywords you are hoping your pages rank for.
"start" is the results page number we want to fetch. This lets us focus on a single page at a time and extract the results just as they would be seen in a user’s browser. Note that the Google API itself expects the zero-based index of the first result rather than a page number, so page 1 corresponds to index 0, page 2 to index 10, and so on.

Now the fun part. Let’s invoke the Google API. The Google API is actually a webservice, and the wsdl file was included in the API download we did earlier. Invoking the webservice will essentially make its public functions available to us within our application. Just below your declarations, and just above your <cfreturn>, invoke the Google API’s doGoogleSearch method as in the example below:

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">
        <!--- <cfargument> tags must come before any other tags in the function body --->
        <cfargument name="q" type="string" required="yes">
        <cfargument name="start" type="string" required="yes">
        <cfset var arrResults = arrayNew(1)>

        <cfinvoke
                webservice="http://localhost/seo/ws/GoogleSearch.wsdl"
                method="doGoogleSearch"
                returnvariable="aGoogleSearchResult">
            <cfinvokeargument name="key" value="{YOUR KEY HERE}"/><!--- Change value to your license key --->
            <cfinvokeargument name="q" value="#arguments.q#"/>
            <!--- The API expects a zero-based result index, so convert the page number --->
            <cfinvokeargument name="start" value="#(arguments.start - 1) * 10#"/>
            <cfinvokeargument name="maxResults" value="10"/>
            <cfinvokeargument name="filter" value="false"/>
            <cfinvokeargument name="restrict" value=""/>
            <cfinvokeargument name="safeSearch" value="false"/>
            <cfinvokeargument name="lr" value=""/>
            <cfinvokeargument name="ie" value="latin1"/>
            <cfinvokeargument name="oe" value="latin1"/>
        </cfinvoke>

        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

The method is the method/function we are going to invoke from the webservice, and the returnvariable is the variable that will contain the results from the call to doGoogleSearch.

The <cfinvokeargument> tags essentially act in the same fashion as the <cfargument>, but are passed to the webservice. The above arguments are required by the Google API. Notice that we are using #arguments.q# and #arguments.start# as argument values in the <cfinvokeargument>s. Arguments is the named scope that ColdFusion uses to access the values of <cfargument>s.
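As an aside, <cfinvoke> is not the only way to reach a webservice from ColdFusion. The sketch below uses createObject("webservice", ...) and passes the same ten values positionally. Treat it as an illustration only: the argument order (key, q, start, maxResults, filter, restrict, safeSearch, lr, ie, oe) is my assumption of the order declared in GoogleSearch.wsdl, the search phrase is made up, and the WSDL location and key placeholder are the same ones used above.

```cfml
<!--- Assumed positional order: key, q, start, maxResults, filter,
      restrict, safeSearch, lr, ie, oe. Verify against your copy of the WSDL. --->
<cfset googleService = createObject("webservice", "http://localhost/seo/ws/GoogleSearch.wsdl")>

<!--- "coldfusion tutorials" is a made-up query; 0 asks for the first page of results. --->
<cfset aGoogleSearchResult = googleService.doGoogleSearch(
    "{YOUR KEY HERE}", "coldfusion tutorials", 0, 10,
    false, "", false, "", "latin1", "latin1")>
```

The named <cfinvokeargument> form used in this tutorial is more verbose, but it does not depend on getting the argument order right.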

NOTE: You must replace "http://localhost/seo/ws/GoogleSearch.wsdl" with your actual url to the wsdl file and you must replace "{YOUR KEY HERE}" with your Google API Key.

The last thing we need to do in this function is to populate the return array we created earlier (arrResults) with the results from the doGoogleSearch method. To see what the entire result set looks like, do a cfdump just before the <cfreturn>:

<cfdump var="#aGoogleSearchResult#"><cfabort>

Note that the dump shows nested structures, some of whose members are functions. These functions will be used to extract the actual data later. By passing the array of results back to the caller we make those functions available to it, and the caller can use them to extract the data we need. I’ll cover this a bit later.

Now, remove the <cfdump> and let’s populate arrResults with the aGoogleSearchResult.getResultElements() call. This will return an array of results from the Google API.

<cfset arrResults = aGoogleSearchResult.getResultElements()>

This function is now complete. Your code should look like this:
Completed doGoogleSearch Function

<cfcomponent>
    <cffunction name="doGoogleSearch" access="private" returntype="array">
        <!--- <cfargument> tags must come before any other tags in the function body --->
        <cfargument name="q" type="string" required="yes">
        <cfargument name="start" type="string" required="yes">
        <cfset var arrResults = arrayNew(1)>

        <cfinvoke
                webservice="http://localhost/seo/ws/GoogleSearch.wsdl"
                method="doGoogleSearch"
                returnvariable="aGoogleSearchResult">
            <cfinvokeargument name="key" value="{YOUR KEY HERE}"/><!--- Change value to your license key --->
            <cfinvokeargument name="q" value="#arguments.q#"/>
            <!--- The API expects a zero-based result index, so convert the page number --->
            <cfinvokeargument name="start" value="#(arguments.start - 1) * 10#"/>
            <cfinvokeargument name="maxResults" value="10"/>
            <cfinvokeargument name="filter" value="false"/>
            <cfinvokeargument name="restrict" value=""/>
            <cfinvokeargument name="safeSearch" value="false"/>
            <cfinvokeargument name="lr" value=""/>
            <cfinvokeargument name="ie" value="latin1"/>
            <cfinvokeargument name="oe" value="latin1"/>
        </cfinvoke>

        <!--- Hand back just the array of individual results --->
        <cfset arrResults = aGoogleSearchResult.getResultElements()>
        <cfreturn arrResults>
    </cffunction>
</cfcomponent>

Now we’ll create the getGoogleRanking function that will call the doGoogleSearch function we just finished. The getGoogleRanking function will grab the results from the doGoogleSearch and check for the specified domain, then output the ranking results to the screen.

Create the getGoogleRanking function as shown below (remember to create the function within the <cfcomponent> tags):

getGoogleRanking
<cffunction name="getGoogleRanking" access="public" returntype="struct">

</cffunction>

This defines the function getGoogleRanking as public, which means it can be called from outside this component, and will return a value in the format of a struct. Like before, this function by itself will not work because we are not returning anything. So, before we go any further let’s declare the return variable "structReturn":

<cffunction name="getGoogleRanking" access="public" returntype="struct">
    <!--- var keeps structReturn local to each call of the function --->
    <cfset var structReturn = structNew()>
    <cfreturn structReturn>
</cffunction>

Now the function would return a blank structure. Our next step is to populate that structure with the Google ranking data we are looking for. To do this we need to set up a few arguments that we will use in this function. Add the following arguments at the top of the function, above your structReturn declaration, since <cfargument> tags must come before any other tags in the function body:

<cfargument name="maxpages" type="string" required="yes">
<cfargument name="keyword" type="string" required="yes">
<cfargument name="domain" type="string" required="yes">

Maxpages will be used to tell the function how many pages deep to parse from the Google results. I would recommend limiting it to 2 as each call to the Google API costs 1 query, of which there are only 1000 per day available.
Keyword is just that, the keyword you are checking to see if your site ranks for. This is the keyword you would expect a Google searcher to type in to find the product/services/information you are trying to promote on your site.
Domain is the domain you are looking for when you search for Keyword. You could use www.site.com for just the www subdomain to be searched for or you could use site.com to search for all subdomains under that domain.
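To make the three arguments a little more concrete, here is a sketch of how a calling page such as google_ranking.cfm might pass them in. The dotted component path assumes the CFC sits at {web root}/seo/cfc/google_API.cfc, and the keyword is a made-up example; adjust both to your own setup.

```cfml
<!--- Assumption: the component is reachable as seo.cfc.google_API
      (i.e. {web root}/seo/cfc/google_API.cfc). --->
<cfinvoke component="seo.cfc.google_API"
          method="getGoogleRanking"
          returnvariable="ranking">
    <!--- Look at most 2 results pages deep (2 queries at most). --->
    <cfinvokeargument name="maxpages" value="2">
    <!--- Hypothetical keyword, for illustration only. --->
    <cfinvokeargument name="keyword" value="coldfusion tutorials">
    <!--- No "www." prefix, so any subdomain of site.com will match. --->
    <cfinvokeargument name="domain" value="site.com">
</cfinvoke>

<!--- Inspect whatever the function returned. --->
<cfdump var="#ranking#">
```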

Okay, now that we have the initial declarations set up, let’s start searching the Google index. Remember, we only want to look at the first "maxpages" results pages, so we will loop from 1 (first page) to maxpages, calling the Google API on each iteration while specifying the page number. So, underneath the declarations we just made, and above the <cfreturn>, create a loop like the following:

<cfloop from="1" to="#arguments.maxpages#" index="i">

</cfloop>

Now within this loop is where all the fun begins. First we are going to call the doGoogleSearch function we created earlier and store the results in the arrResults array. This will "pass" the array and all the nested functions returned by the Google API, which we will use to extract the result data. Remember that we created 2 required arguments in the doGoogleSearch function: "Q" and "start". We will need to pass in the query (Q) which is the arguments.keyword argument we declared as a required argument in this function and the results page (start), which is the current value of the loop’s index (i). This way it will be called once for every page.

<cfset arrResults = doGoogleSearch(arguments.keyword,i)>

Okay, doGoogleSearch returned the search results in a complex array: arrResults. To find out our domain’s ranking we need to check each result to see if it is from our domain. To do this we will loop through the array and check each result for arguments.domain. So, create a loop nested inside the loop we just created. The loop will need to iterate through the arrResults array to extract each result. Your loop should look like this:

<cfloop from="1" to="#arguments.maxpages#" index="i">
    <cfset arrResults = doGoogleSearch(arguments.keyword,i)>

    <cfloop from="1" to="#arraylen(arrResults)#" index="idx">

    </cfloop>
</cfloop>

Now, for each index of the array we check for our arguments.domain. This is done by calling the getURL() function returned in the array from the Google API. This should look like:

<cfif FindNoCase(arguments.domain,arrResults[idx].getURL())>

</cfif>
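If you want to see how this test behaves without spending any queries, the sketch below runs the same FindNoCase() check against a handful of made-up URLs (none of them come from the Google API). FindNoCase() returns the position of the match, or 0 when there is none, and <cfif> treats any non-zero number as true.

```cfml
<cfset domain = "site.com">

<!--- Made-up sample URLs standing in for arrResults[idx].getURL(). --->
<cfset sampleUrls = arrayNew(1)>
<cfset arrayAppend(sampleUrls, "http://www.site.com/products.cfm")>
<cfset arrayAppend(sampleUrls, "http://blog.SITE.com/")>
<cfset arrayAppend(sampleUrls, "http://www.example.org/")>
<cfset arrayAppend(sampleUrls, "http://www.notmysite.com/")>

<cfoutput>
<cfloop from="1" to="#arrayLen(sampleUrls)#" index="n">
    <!--- Case-insensitive substring test, exactly as in the function. --->
    #sampleUrls[n]#: <cfif FindNoCase(domain, sampleUrls[n])>match<cfelse>no match</cfif><br>
</cfloop>
</cfoutput>
```

The first two URLs match and the third does not. Note that the last one matches as well, because "site.com" is a substring of "notmysite.com". If that kind of false positive matters to you, pass a more specific value such as "www.site.com" as the domain.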

If the domain is found, then we use the Google API functions to grab the data and store it in the structReturn we declared at the beginning of this function. This will be passed back to the calling page. The code should look like this:

<cfloop from="1" to="#arguments.maxpages#" index="i">
    <cfset arrResults = doGoogleSearch(arguments.keyword,i)>
    <cfloop from="1" to="#arraylen(arrResults)#" index="idx">
        <cfif FindNoCase(arguments.domain,arrResults[idx].getURL())>

            <cfset structReturn.keyword = arguments.keyword>
            <cfset structReturn.url = arrResults[idx].getURL()>
            <cfset structReturn.title = arrResults[idx].getTitle()>
            <cfset structReturn.snippet = arrResults[idx].getSnippet()>

        </cfif>    
    </cfloop>
</cfloop>

Now that we have the results data, we can determine our position. To do this we will need to know what page we are on, and what iteration of the array we are on. The array will have 10 iterations, since the Google API will only return 10 results at a time. So on the first page we know that the position (idx) is the actual iteration of the arrResults loop; 1-10. However, if we are beyond the first page we need to calculate our position. So we’d need an if block and two different ways of calculating the current position. Instead, let’s just write a formula that will calculate our position regardless of the page we are on. To do this we need to add the current iteration (idx) to the current page (i) minus 1 multiplied by 10: idx + ((i-1)*10) = current position. So if we find it on the second page on the 4th result we would have 4 + ((2-1)*10), or in the 14th position.

The code for this looks like the following:

<cfset pos = idx + ((i-1)*10)>

Now we simply add the pos value to the structReturn structure as such:

<cfset structReturn.position = pos>
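To convince yourself that the formula holds on every page, you can run it on its own for a few page and result combinations. The page numbers and result indexes below are arbitrary sample values.

```cfml
<cfoutput>
<!--- Pages 1 to 3, looking at the 1st, 4th and 10th result on each. --->
<cfloop from="1" to="3" index="i">
    <cfloop list="1,4,10" index="idx">
        <cfset pos = idx + ((i-1)*10)>
        Page #i#, result #idx#: position #pos#<br>
    </cfloop>
</cfloop>
</cfoutput>
```

Page 1 yields positions 1, 4 and 10, page 2 yields 11, 14 and 20, and page 3 yields 21, 24 and 30, so the second-page example from the text lands on position 14 as expected.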

Okay, we have our position and there is no need to look any further. To spare precious CPU time and our limited number of queries per day, we need to break out of both loops. To do this, add a <cfbreak> just after you set the structReturn.position variable. This will break out of the arrResults loop. Now we need to break out of the page loop by the same means. However, if we just put in a bare <cfbreak>, it will always break out on the first iteration. So, to make sure it’s time to break out, we’ll simply test the structReturn variable to see if it has data. If it does, break out. This second test goes just under the closing </cfloop> tag of the arrResults loop, so the end of the block should look like the following:

            <cfbreak>
        </cfif>
    </cfloop>
    <cfif not StructIsEmpty(structReturn)>
        <cfbreak>
    </cfif>
</cfloop>
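The same two-level break can be tried out in isolation. The sketch below swaps the Google API for made-up data: each entry in the hypothetical "pages" array stands in for one call to doGoogleSearch, shortened to three URLs per page, and "pagesVisited" is only there to show that the outer loop really stops early.

```cfml
<!--- Made-up result pages; none of these URLs come from Google. --->
<cfset pages = arrayNew(1)>
<cfset pages[1] = listToArray("http://a.example/,http://b.example/,http://c.example/")>
<cfset pages[2] = listToArray("http://d.example/,http://www.site.com/,http://e.example/")>
<cfset pages[3] = listToArray("http://f.example/,http://g.example/,http://h.example/")>
<cfset resultsPerPage = 3>
<cfset structReturn = structNew()>
<cfset pagesVisited = 0>

<cfloop from="1" to="#arrayLen(pages)#" index="i">
    <cfset pagesVisited = pagesVisited + 1>
    <cfloop from="1" to="#arrayLen(pages[i])#" index="idx">
        <cfif FindNoCase("site.com", pages[i][idx])>
            <cfset structReturn.url = pages[i][idx]>
            <cfset structReturn.position = idx + ((i-1)*resultsPerPage)>
            <!--- Leaves the inner (results) loop only. --->
            <cfbreak>
        </cfif>
    </cfloop>
    <!--- Leaves the outer (page) loop once something has been found. --->
    <cfif not StructIsEmpty(structReturn)>
        <cfbreak>
    </cfif>
</cfloop>

<cfoutput>Found at position #structReturn.position# after visiting #pagesVisited# page(s).</cfoutput>
```

Here the match sits on the second page, so the loop reports position 5 after visiting 2 pages and never touches the third page. In the real function that skipped page is one query saved.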

Almost done. Now what if we are not found in the pages we searched? We need to return something, or our function will fail, right? What we need to do is test the structReturn ag

About This Tutorial
Author: Abram Adams
Skill Level: Intermediate 
 
 
 
Platforms Tested: CFMX,CFMX7
Submission Date: August 02, 2006
Last Update Date: June 05, 2009
Discuss This Tutorial
  • It looks like the api was only available until 2005 now there is a new system you need to get an adwords account.I´m looking into it https://adwords.google.com/

  • Mr. Chiverton, You’re right. PageRank is not worth mucking around with. Fortunuately this tutorial isn't about tracking PageRank but page ranking. There is a difference. Also, I would have to disagree with you. I run many sites and "community post" referrals are only a small fraction of referrals compared to search engine referrals. I'm talking maybe 1% - 5%. Advertising costs money, natural search engine placement doesn't.

  • I so sorry! your codes are confused! to your audience. Maybe you can make them clear more! Reduce the nonsense markups.Then them will be better!

  • "haven’t jumped on the SEO (Search Engine Optimization) bus then you’re getting left behind" :shrungs Depends. Most sites cater to a particular group, and some targeted advertising / community posts are a much better traffic driver than mucking about with PageRank.
