Protocol Relative URLs (and why not to use them)

Back in October 2010 (that long?!) I noticed a commit to Paul Irish’s (awesome) HTML5 Boilerplate project on GitHub that piqued my curiosity. I hadn’t really noticed the trick of linking to a resource in a protocol-independent manner before. So I drafted this post and then promptly forgot about publishing it. It’s still cool five years later—but not quite as cool, for reasons I’ll explain in a sec.

For the longest time, I thought links had to have a protocol specified, no matter what. I thought that was why Google Analytics used a kind of ugly detection hack to check document.location.protocol and switch the script src accordingly. Turns out that Google used that hack not because of the protocol itself, but because Analytics offered HTTPS on a different subdomain.

My mistake.

Cool

That commit got me to look it up, and sure enough, protocol-independent links are a thing. Until then I had no idea the protocol could be implied—though I knew the domain name is implied if the URI starts with /, and the entire base path is implied if there’s no initial slash at all.
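Those three levels of "implied" are easy to check with Python's standard `urllib.parse.urljoin`, which resolves relative references in the same general way a browser does. This is only a minimal sketch; the page address and file names are made up for illustration.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
base = "https://example.com/blog/2015/post.html"

# No leading slash: the whole base path (up to the last "/") is implied.
relative_path = urljoin(base, "script.js")

# Leading "/": scheme and domain are implied; the path starts at the site root.
root_relative = urljoin(base, "/js/script.js")

# Leading "//": only the scheme is implied.
protocol_relative = urljoin(base, "//cdn.example.net/js/script.js")

print(relative_path)      # https://example.com/blog/2015/script.js
print(root_relative)      # https://example.com/js/script.js
print(protocol_relative)  # https://cdn.example.net/js/script.js
```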

So, adding a script that loads over HTTPS when the page is secure and over plain HTTP when it isn’t is as easy as removing the http: or https: from the src attribute, leaving a URI that looks like //domain.name/path/to/script.js.
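To see the effect, here is a rough Python sketch that strips the scheme from a URL and then resolves the result against a secure and an insecure page. The helper name `to_protocol_relative` and the `example.com` page addresses are my own placeholders, not anything from Boilerplate.

```python
from urllib.parse import urljoin, urlsplit


def to_protocol_relative(url: str) -> str:
    """Drop a leading http: or https:, leaving //host/path (hypothetical helper)."""
    scheme = urlsplit(url).scheme
    if scheme in ("http", "https"):
        # Skip the scheme and the colon that follows it.
        return url[len(scheme) + 1:]
    return url


src = to_protocol_relative("http://domain.name/path/to/script.js")

# The same src picks up whichever scheme the containing page was loaded with.
on_secure_page = urljoin("https://example.com/index.html", src)
on_plain_page = urljoin("http://example.com/index.html", src)

print(src)             # //domain.name/path/to/script.js
print(on_secure_page)  # https://domain.name/path/to/script.js
print(on_plain_page)   # http://domain.name/path/to/script.js
```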

Not So Cool

But there’s a big caveat these days. In a 2014 update to that same post, Paul points out that as cool as this is, it’s an anti-pattern now. Why? Because if an attacker can trick the parent page into loading over an insecure connection, she can also get the user’s browser to load fake JavaScript. Long story short, China used insecure JavaScript to attack GitHub and it wasn’t pretty.[1]

So basically, I learned something five years ago that is now kind of frowned upon if you actually use it. Use HTTPS to load resources if the origin server supports it, period. Both server hardware and server software have gotten so good at encryption that the old argument, “It adds too much overhead,” no longer applies, and the upsides far outweigh the downsides.
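If you have a pile of old protocol-relative or plain-HTTP links to clean up, the fix is mechanical. Here is a minimal Python sketch, assuming every host involved really does serve the resource over HTTPS (the code cannot check that for you); `force_https` is a name I made up.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Give http and protocol-relative URLs an explicit https scheme.

    Assumes the host supports HTTPS. URLs without a host (such as
    "/js/local.js") and other schemes are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc and parts.scheme in ("", "http"):
        return urlunsplit(parts._replace(scheme="https"))
    return url


upgraded_relative = force_https("//domain.name/path/to/script.js")
upgraded_http = force_https("http://domain.name/path/to/script.js?v=1")
untouched_local = force_https("/js/local.js")

print(upgraded_relative)  # https://domain.name/path/to/script.js
print(upgraded_http)      # https://domain.name/path/to/script.js?v=1
print(untouched_local)    # /js/local.js
```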

Speaking of learning things: I looked today, and Google Analytics uses this trick now.[2] I guess at some point they decided a separate subdomain for SSL was silly.[3] Now it’ll take them who knows how long to decide that serving insecure JavaScript is silly, and just load over HTTPS all the time.


Notes:

  1. It’s ironic that NETRESEC’s site doesn’t even load over HTTPS. You’d think a network security blog would implement that, at the very least.
  2. I don’t know for how long, because the last time I actually touched tracking code myself must have been 2011 at the latest.
  3. It always was silly to put secure traffic on a separate subdomain, though I still see sites do it. I still don’t know why, especially in cases like Google Analytics where the hostname already load-balances across dozens or hundreds of machines.

dgw

I am an avid technology and software user, in addition to being reasonably well-versed in CSS, JavaScript, HTML, PHP, Python, and (though it still scares me) Perl. Aside from my technological tendencies, I am also a theatre technician, sound designer, violinist, singer, and actor.
