Some writers at the WSJ recently “discovered” that one of the longstanding conventions of the web, the referring URL, can leak unexpected information if websites put things like user names in their URLs. Although my first thought was, “that’s not news,” Facebook appears to have responded with some technical changes that protect referrer data better than the industry norm.
Quick technical background
- Assume you’re on a web page http://www.Site-A.com/origin-page.html
- There’s a link on that page that goes to http://www.Site-B.net/destination-page.html
- You click on the link.
- Your web browser finds the web server for www.Site-B.net and sends that server a request for /destination-page.html.
- The web browser usually also sends the www.Site-B.net web server the URL of the referring page: http://www.Site-A.com/origin-page.html
In short, when you’re on a web page on site A and you click on a link to site B, site B normally knows:
- That you came from site A,
- and the URL of the specific page on site A.
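To make the steps above concrete, here is a minimal Python sketch of roughly what the browser sends to www.Site-B.net. The function name `build_get_request` is made up for illustration, and a real browser sends many more headers than this.

```python
from urllib.parse import urlsplit


def build_get_request(destination_url: str, referring_url: str) -> str:
    """Return the text of a bare-bones HTTP/1.1 GET request.

    Illustrative only: real browsers add many more headers.
    """
    parts = urlsplit(destination_url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",
        # The header name is spelled "Referer" in HTTP.
        f"Referer: {referring_url}",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


request_text = build_get_request(
    "http://www.Site-B.net/destination-page.html",
    "http://www.Site-A.com/origin-page.html",
)
print(request_text)
```

The Referer line is the part that tells site B which page on site A you came from.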
How is referring URL data used?
The web server for site B often just ignores the referring URL. Occasionally it may serve up different content based on the referring URL. But the most common practice I’ve seen is for the receiving website to study the referring URL information in aggregate to analyze traffic sources. “Last month we saw a big jump in the number of users reaching our home page from referring URLs that look like search result pages,” a marketing manager might say, “so our SEO efforts must be working.”
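As a rough sketch of that kind of aggregate analysis, the Python below counts visits by referring host. The list of URLs and the `search.example` host are invented sample data, and `count_referring_hosts` is a hypothetical helper, not part of any analytics product.

```python
from collections import Counter
from urllib.parse import urlsplit

# Invented sample data standing in for referring URLs pulled from logs.
referring_urls = [
    "http://search.example/results?q=widgets",
    "http://search.example/results?q=blue+widgets",
    "http://www.Site-A.com/origin-page.html",
    "",  # the browser sent no referring URL
]


def count_referring_hosts(urls):
    """Count requests per referring host (host names are lowercased)."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # None when there is no host
        counts[host or "(none)"] += 1
    return counts


host_counts = count_referring_hosts(referring_urls)
for host, count in host_counts.most_common():
    print(host, count)
```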
What’s dangerous about referring URLs?
If the referring URL contains sensitive data, that data will be visible to the destination site’s web servers. E.g., if a social network profile page whose URL includes the user’s name links to an article on news-website.com, then the people who run the web servers for news-website.com need only read their server logs to find the name of the social network user who linked to their article.
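To show how little effort that takes, here is a small Python sketch that pulls the referring URL out of one access log line. It assumes the common "combined" log layout used by many web servers. The sample line, the `social-network.example` host, and the user name in it are all invented for illustration.

```python
import re

# Matches the tail of a "combined"-style access log line:
# "request" status size "referer" "user-agent"
COMBINED_LOG_PATTERN = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referer_from_log_line(line):
    """Return the referring URL in a log line, or None if there is none."""
    match = COMBINED_LOG_PATTERN.search(line)
    if match is None:
        return None
    referer = match.group("referer")
    # Servers commonly log "-" when no referring URL was sent.
    return None if referer in ("", "-") else referer


# Invented sample line; the profile URL is hypothetical.
sample_line = (
    '203.0.113.7 - - [01/Jun/2010:12:00:00 +0000] '
    '"GET /article.html HTTP/1.1" 200 5120 '
    '"http://social-network.example/profile/jane.doe" "ExampleBrowser/1.0"'
)
referer = referer_from_log_line(sample_line)
print(referer)
```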
If you work for an organization with an internal Wiki, you might have a page for your secret new project, with the project’s name in the page URL, that links to the competing product that you plan to crush. Your competitor need only study the referring URLs in their server logs to get the hint that you’re working on something called “Project Phoenix” that has something to do with their Widget-2000 product.
As early as 1996, the people writing the HTTP specification (the technical standard describing how web browsers talk to web servers) recognized the potential privacy problems with referring URLs. RFC 1945, the first version of the HTTP spec, introduced a guideline that has remained in all subsequent versions of the spec: because the source of a link may be private information, it is “strongly recommended” that the user be able to select whether or not the Referer field is sent.
Two things are noteworthy here:
- Yes, the spec spells the header name “Referer,” a misspelling of “referrer.” A programmer working on one of the earliest web browsers misspelled the word, and the misspelling became a de facto standard and then part of the official spec.
- As far as I know, none of the major web browsers has implemented the “strongly recommended” switch to disable the sending of referring URLs.
Defending against data leakage
In the absence of widespread browser support for hiding referring URLs, people who build websites can defend against data leakage on the server side.
One solution is to design the site so that the URLs don’t provide any externally meaningful information. Webmail services, for example, typically are designed so that no URL conveys information about who the user is. (If you use a webmail service and find that your username or user ID shows up in the service’s URLs, it’s probably time to switch providers.)
That solution isn’t always feasible, though. On social networking websites, for example, it has become popular to put a human-readable user name in URLs. In cases where the referring URL necessarily contains sensitive data, it is possible to keep most browsers from sending that URL to the destination site’s web servers. The technique for doing this is an old trick:
- On Site A, instead of linking directly to http://www.Site-B.net/destination-page.html, link to http://www.Site-A.com/some-URL-that-gives-away-no-user-data.html
- The page http://www.Site-A.com/some-URL-that-gives-away-no-user-data.html contains nothing but an HTML meta refresh tag that redirects immediately to http://www.Site-B.net/destination-page.html
- The web server for www.Site-B.net sees a referring URL of http://www.Site-A.com/some-URL-that-gives-away-no-user-data.html
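The trick above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `/redirect?to=` address on Site A and both function names are hypothetical, and nothing here is meant to describe how Facebook or any other site actually implements it.

```python
from html import escape
from urllib.parse import quote


def redirect_page_html(destination_url: str) -> str:
    """Build an intermediate page holding only a meta refresh tag."""
    # Escape the URL so it cannot break out of the HTML attribute.
    safe_url = escape(destination_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="refresh" content="0; url={safe_url}">\n'
        "</head><body></body></html>\n"
    )


def outbound_link(destination_url: str) -> str:
    """Rewrite an offsite link to go through the intermediate page.

    The /redirect?to= address is a hypothetical choice for this sketch.
    It carries the destination but no user data.
    """
    return "http://www.Site-A.com/redirect?to=" + quote(destination_url, safe="")


destination = "http://www.Site-B.net/destination-page.html"
link = outbound_link(destination)
page = redirect_page_html(destination)
print(link)
print(page)
```

Pages on Site A would publish `link` instead of the direct URL, and the server would answer a request for it with `page`.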
It appears that Facebook is now using this trick for all links that go offsite, including ads and links in user-generated content. Thus, while the WSJ’s article is arguably alarmist, it seems to have helped push Facebook to deploy more rigorous referrer protection than has been common in the industry.

Brian, there’s another name for the “trick” that Facebook is using, the one that they no doubt had to invest thousands of engineering hours on: “click tracking,” which any budget ad serving solution offers out of the box, including my perennial favorite, OAS.
P.S. I’ll give you one guess how easy it is to turn on click tracking for 3rd party served ads…