Using Ajax and Search Referrer Info to Help Users Navigate Your Site

By Kyle Dent
July 18, 2009 | Comments: 5

On my web site, I host a FAQ covering many aspects of Postfix administration (http://www.seaglass.com/postfix/faq.html). Most of the traffic coming to the page arrives from Google searches. There are currently seventy-three questions (and answers) on the page, and it extends down quite some way. I've tried to set up the page so that it's easy to find what you're looking for. I purposely kept it to a single HTML file, so that you can use your browser's search function, and at the top of the page there is a list of just the questions with links to the answers for people who want to browse the list. I also recently added search to my site. Still, I wonder whether people who arrive from a search engine looking for something very specific are overwhelmed when they land on the page. When they arrive, it may not be obvious to them that what they are looking for is actually on the page. Do they get discouraged and go back to the search listing in favor of a page with more immediate results?

It occurred to me that since I know what search strings bring people to the FAQ, I should be able to provide quick access to the information they are looking for. I could also give pointers to other material on the site that might be useful to them. It would be nice if users coming from search engines saw the page with a little summary box at the top with pointers to what they are looking for. With the almost magical asynchronicity of Ajax, I can do that without disturbing the rest of the page or bothering browsers who come to the page other than through a search engine. The screen shot below shows the final result, or you can see it yourself by going through Google and entering the search term "mailq postfix." My site should be at or near the top of the list with a title of "[Kyle Dent] - Postfix FAQ."

[Screenshot: the FAQ page with the search-referrer summary box at the top (query_referrer_scrn.png)]

The box at the top appears only for people coming from a search engine, and it displays content based on the search string that they followed to get to the page. There are two pieces that make it work: the client-side JavaScript and a server-side Perl script that sends the HTML to display. Normally, I would do most of the processing on the client so that only accesses that need the additional content would send a request to the server. However, as I start working with this technique, I want to see that it behaves the way I expect, so I'm handling most of it on the server and logging what happens to keep an eye on it for a while. Once I'm satisfied with it, I may move things around a bit.

The JavaScript starts out with:

window.addEventListener('load', checkRefQueryString, false);
var qs_req;
var referrerUrl;

This tells the browser to invoke my function checkRefQueryString once the page has loaded and declares a couple of variables I'll be using to store an XMLHttpRequest object and the referrer URL as it comes from Google or another search engine.

function checkRefQueryString() {
        var checkReferrer = false;
        referrerUrl = document.referrer;

        if ( referrerUrl.indexOf('www.google.') != -1 ) {
                checkReferrer = true;
        } else if ( referrerUrl.indexOf('www.bing.com') != -1 ) {
                checkReferrer = true;
        }

        if ( checkReferrer ) {
                qs_req = createRequest();
                qs_req.onreadystatechange = setSearchReferrerHtml;
                qs_req.open("GET", "/cgi-bin/check_referrer.pl?referrerUrl="
                  + encodeURIComponent(referrerUrl), true);
                qs_req.send(null);
        }       
} 
The checkRefQueryString function checks the referring page to see if it's from a search engine. Currently I'm checking for Google and Bing, and I'll probably add Yahoo shortly. If the request was referred by a search engine, the function url-encodes the entire referrer URL and sends it to the Perl script on the server. This is an area where in the future I could move processing to the client. The JavaScript could parse the request and check for one of the topics I have summary information for. Only when it determines that the original search string contains one of those topics would it send the request to the server. The function createRequest() is a bit of boring code that handles browser variations to obtain an XMLHttpRequest object. There are examples of that strewn all around the Internet. The line
qs_req.onreadystatechange = setSearchReferrerHtml;
indicates which function to invoke once the server has returned its response. You'll notice that I'm calling encodeURIComponent on the referrerUrl because it's going to be sent within the HTTP request. The parameter is a full URL with its own request parameters as it was set up by the search engine. We have to be sure that characters like ampersands in this URL are encoded so that they are not interpreted as part of the current request. Characters that were previously encoded will be encoded again. In other words, the encoding will be encoded. That's no problem; I just have to decode it twice on the server to restore it to its original characters. For example, a space in the query terms will be encoded as a plus sign by the search engine. This second encoding will convert it to %2B. We'll see what happens on the server next.
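Before turning to the server, the double encoding can be seen in a small JavaScript sketch. The referrer URL here is made up for illustration, and the variable names are mine. It only shows what encodeURIComponent does to a URL that already contains encoded characters, and what two rounds of decoding give back.

```javascript
// Hypothetical referrer, as a search engine might send it: the space in
// the query is already a "+", and the quotation marks are already %22.
var sampleReferrer = 'http://www.google.com/search?q=postfix+%22mailq%22&hl=en';

// Client side: encode the whole URL so it can travel as one parameter value.
// "&" becomes %26, "+" becomes %2B, and the "%" of %22 becomes %25.
var encodedOnce = encodeURIComponent(sampleReferrer);

// First decode: back to the referrer exactly as the search engine wrote it.
var decodedOnce = decodeURIComponent(encodedOnce);

// Second decode: the search engine's own %22 escapes become quotation marks.
// Percent-decoding leaves the "+" alone, so the space is still a plus sign.
var decodedTwice = decodeURIComponent(decodedOnce);

console.log(encodedOnce);
console.log(decodedOnce);
console.log(decodedTwice);
```

The first line printed contains no bare ampersand, so the server cannot mistake the search engine's parameters for its own. The server-side script below has to undo both layers.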
use strict;
use CGI;

$|++;
my $subject = "";
my $html = "";
my $log = "/usr/local/apache2/logs/check_referrer.log";

print "Content-type: text/html\n\n";

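my $c = CGI->new();
# Default to an empty string so the code below never operates on undef.
my $referrerUrl = $c->param('referrerUrl') || "";

# CGI.pm has already decoded the parameter once; decode the remaining
# %XX escapes that the search engine put into its own URL.
$referrerUrl =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;
# Split off the query string, then break it into name=value pairs.
my($j, $qs) = split(/\?/, $referrerUrl);
my(@pairs) = split(/\&/, defined $qs ? $qs : "");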

my $pair;
foreach $pair ( @pairs ) {
        my($name, $value) = split(/=/, $pair);
        if ( $name eq "q" ) {
                if ( $value =~ /log/ ) {
                        $subject = "logging";
                } elsif ( $value =~ /mailq/ ) {
                        $subject = "mailq";
                } elsif ( $value =~ /loops.?back/ ) {
                        $subject = "loopsback";
                } elsif ( $value =~ /port/ ) {
                        $subject = "changeport";
                } elsif ( $value =~ /white.?list/ ) {
                        $subject = "whitelist";
                }
                last;
        }
}

This script uses the standard CGI.pm library to obtain the referrerUrl parameter sent by the JavaScript code. The CGI library decodes the request, but as I mentioned, it has to be decoded twice. The line

$referrerUrl =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;
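performs the second decoding. It replaces each remaining %XX escape with the character it stands for, so a quotation mark that the search engine encoded as %22, and that left the browser as %2522, becomes a literal quotation mark again. Using the example of a space, the CGI library turns the %2B back into a plus sign. The substitution itself does not touch plus signs, so the space remains a plus sign in the decoded string, which the .? in patterns such as /loops.?back/ accommodates.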
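For readers who don't know Perl, here is a rough JavaScript counterpart of that substitution. The function name decodePercentEscapes is mine and not part of the script. It is a sketch of the same idea, assuming plain single-byte characters, and is not meant as a drop-in replacement.

```javascript
// JavaScript counterpart of the Perl substitution
//   s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg
// Every "%" followed by two hex digits is replaced by the single
// character that has that hex code. Everything else is left alone.
function decodePercentEscapes(text) {
  return text.replace(/%([A-Fa-f0-9]{2})/g, function (match, hexDigits) {
    return String.fromCharCode(parseInt(hexDigits, 16));
  });
}

// The escapes are resolved; the plus sign stays a plus sign.
console.log(decodePercentEscapes('q=postfix+%22mailq%22'));
// prints: q=postfix+"mailq"
```

A "%" that is not followed by two hex digits does not match the pattern and passes through unchanged, which is also how the Perl version behaves.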

The script then parses the referrer URL to get the original query terms. If the query terms fit one of the topics it's checking for, it sets the $subject variable to one of the normalized values used within the script. Later, based on the normalized value, the script sends the browser some simple HTML with pointers to the related information on the site. Otherwise, it sends a blank response. The code to return the HTML is long and cluttered, so I haven't included it here, but the logic is simple. The script finally logs the complete referrer URL and the action it took.
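This parse-and-match step is the part that could eventually move to the client, as mentioned earlier. A minimal sketch of how that might look in JavaScript follows. The names TOPICS and subjectFromReferrer are hypothetical, the patterns are copied from the Perl script, and unlike the Perl script this version decodes the value after splitting the query string rather than before.

```javascript
// Topic patterns copied from the Perl script. Order matters: first match wins.
var TOPICS = [
  { pattern: /log/,         subject: 'logging' },
  { pattern: /mailq/,       subject: 'mailq' },
  { pattern: /loops.?back/, subject: 'loopsback' },
  { pattern: /port/,        subject: 'changeport' },
  { pattern: /white.?list/, subject: 'whitelist' }
];

// Returns the normalized subject for a referrer URL, or '' if no topic fits.
function subjectFromReferrer(referrerUrl) {
  var queryStart = referrerUrl.indexOf('?');
  if (queryStart === -1) {
    return '';
  }
  var pairs = referrerUrl.slice(queryStart + 1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split('=');
    if (parts[0] !== 'q') {
      continue;
    }
    var value = parts.slice(1).join('=');
    try {
      value = decodeURIComponent(value);
    } catch (e) {
      // Malformed escape sequence: fall back to matching the raw value.
    }
    for (var j = 0; j < TOPICS.length; j++) {
      if (TOPICS[j].pattern.test(value)) {
        return TOPICS[j].subject;
      }
    }
    return ''; // like the Perl "last": only the first q parameter counts
  }
  return '';
}

console.log(subjectFromReferrer('http://www.google.com/search?q=mailq+postfix&hl=en'));
// prints: mailq
```

With something like this in place, checkRefQueryString would only need to contact the server when the function returns a non-empty subject.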

Meanwhile, back at the client, the JavaScript receives the additional content from the server, formats it with some styling elements, and puts it into a <div> block on the page that was previously empty.

function setSearchReferrerHtml() {
        if ( qs_req.readyState == 4 ) {
                if ( qs_req.status == 200 ) {
                        var box = document.getElementById("search-ref");
                        box.innerHTML = qs_req.responseText;
                        box.style.fontSize = "14px";
                        box.style.background = "#f5fbef";
                        box.style.border = "3px solid #e3f6ce";
                        box.style.padding = "10px";
                        box.style.margin = "10px auto";
                } 
        } 
}
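
The createRequest() helper used earlier is not shown in this article. As a rough idea of what such a helper might look like, here is a minimal sketch. It is not the code running on my site: the optional env parameter is an assumption added so the function can be exercised outside a browser, and 'Microsoft.XMLHTTP' is the identifier commonly used for older versions of Internet Explorer.

```javascript
// Sketch of a createRequest() helper. Returns a request object,
// or null when the environment offers no way to create one.
// The optional env argument (hypothetical) stands in for window.
function createRequest(env) {
  env = env || (typeof window !== 'undefined' ? window : {});

  // Standard path: any browser with a native XMLHttpRequest.
  if (env.XMLHttpRequest) {
    return new env.XMLHttpRequest();
  }

  // Legacy path: older Internet Explorer exposes the object via ActiveX.
  try {
    return new env.ActiveXObject('Microsoft.XMLHTTP');
  } catch (e) {
    // ActiveXObject is missing or refused to create the object.
    return null;
  }
}
```

Because the helper can return null, a caller such as checkRefQueryString would do well to check the result before setting onreadystatechange on it.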

If you're interested in the complete source code, let me know.


5 Comments

That's lousy Perl (4?) code. I'm surprised to see such a thing on the O'Reilly website. It can be better written as ( http://pastebin.com/fa01ea40 ):


use strict;
use warnings;
use CGI ();
use URI ();

my $subject = subject() || '';

print "Subject: $subject\n";

sub subject {
    my $cgi = CGI->new;
    my $referrerUrl = $cgi->param('referrerUrl');
    my $uri = URI->new($referrerUrl);
    my $path = $uri->path;
    my %param = $uri->query_form;
    my $value = $param{q} || return;

    return $value =~ /log/          ? 'logging'
         : $value =~ /mailq/        ? 'mailq'
         : $value =~ /loops.+?back/ ? 'loopsback'
         : $value =~ /port/         ? 'changeport'
         : $value =~ /white.+?list/ ? 'whitelist'
         :                            ''
         ;
}

Burak,

Thanks for the attention to quality. I'm all for it. In this case, that little bit of code is mainly meant for illustration purposes. I wanted my code to read easily especially for someone who doesn't know Perl.

On a side note most of your suggestions depend on the URI.pm module, which was not installed on the stock OS on the system I was working on. I always hate it when even the simplest examples force me to install something else to make them work, so I reworked my sample code to run without it.

Thanks, Kyle

At least the Perl is in good company...

The JavaScript is also lousy. It uses the W3C event model without branching to support the Microsoft model, which leaves a lot of IE users out in the cold, it uses globals, and it doesn't check that the strings it checks for are in the domain part of the URL.

The styling is also poor - using inline style assigned by JS instead of a stylesheet, not to mention the pixel based font sizes.

But the real kicker is the whole concept.

Ajax allows a page to be updated from the server in response to events without loading the whole page again - but the only event here is "loading the whole page"!

This could be handled entirely server side; reducing the number of HTTP requests the user has to make, removing the dependency for JS to be supported and removing the way the page will jump down when it is extended from the top.

You doubt that many people are coding server side scripts in perl these days??? That seems to be a pretty unflattering remark to post on perl.com - a site supposedly devoted to perl programmers! What other language would you suggest? Ruby? Python? They each have their own pros and cons. The biggest advantage perl has over both of those is CPAN. The value of a software library like CPAN is immeasurable. But rather than encourage its use, you complain about the inconvenience. I don't think having to type "cpan URI" is such a big deal. You also say you want your code to be readable - but then you use a long, cryptic regex that involves the pack command with no explanation other than "this does the second decoding". No beginner will be able to follow that. The version using URI seems much easier to understand to me.

I agree that there's no need to complicate code for beginners. Code can be complex enough without adding extra coding that doesn't seem to make a lot of sense, at least without explanation.

Thank you for an informative post, that can help those of us working with any Web Design Company
