"The author does his point a disservice by being under-informed, imo."
I admit my Google searching turned up no useful results about hash-banging URLs for new websites, only Google's attempt to index sites that ignored progressive enhancement and hid all content behind Ajax calls.
"Sounds great, but what happens when you click to a new article, which is loaded with AJAX?"
If you want to load the content in via Ajax, you start with a valid and direct URL to a piece of content. You can then use JavaScript on the click handler to mangle the URL into a hash-bang, and then proceed as you wish. You get all the benefits of hash-bang without the downside of uncrawlable URLs.
"now it's "/#!downloads/5753887". I can bookmark this, send the link to a friend, etc etc."
* Bookmark it to a third party service - yes (assuming they don't truncate fragment identifiers!)
* That service then retrieves the content of that article (for summarising, entity recognition, extracting microformatted data) - no.
The hash-bang URL is not a direct reference to the piece of content. The indirection shenanigans aren't part of HTTP/1.1 or RFC 2396, so it's not RESTful for starters.
"after all preserving state isn't that important, right"
That can be done without making all links on the page non-traversable with a crawler.
"Conclusion: If you must load content with AJAX, using URL fragments to track state is the most functional and cleanest-URL option available."
A clean URL would not be an indirect reference that has to be reformatted into a direct reference before it can be curled.
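To make that concrete, here's a rough sketch of the rewriting a consumer has to do before it can fetch anything (the host is made up; the rewrite follows Google's _escaped_fragment_ convention):

var hashBangUrl = 'http://example.com/#!downloads/5753887';  // illustrative host, path from the example above
var parts = hashBangUrl.split('#!');
// "http://example.com/?_escaped_fragment_=downloads/5753887" - this is what you
// would actually have to curl; the pretty hash-bang form means nothing to HTTP.
var fetchableUrl = parts[0] + '?_escaped_fragment_=' + parts[1];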
State can be preserved without mangling URLs in this way.
> You can then use JavaScript on the click handler to mangle the URL into a hash-bang, and then proceed as you wish
I wish; unfortunately, this is only supported in the very latest browsers (I've literally seen one website do this [1]). You can only change the portion of the address bar following the "#" with JavaScript, which is why you see URLs like "/content/30485#!content/9384" in some places.
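For reference, this is the sort of thing I mean - a rough sketch (paths reuse the example above, and it only helps where the HTML5 History API exists):

if (window.history && window.history.pushState) {
    // newer browsers: load the article with Ajax, then rewrite the real path
    window.history.pushState(null, '', '/content/9384');
} else {
    // everyone else can only touch the part after the "#" without a reload
    window.location.hash = '#!content/9384';
}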
The short of it is, if you insist on using AJAX to load your primary content, and to power your primary navigation, then the way they do it is the best method at our disposal at this time.
Personally, I don't think the best method is good enough, but right now there is no better solution - imo, the only way to avoid this nastiness is not to load your content using AJAX in the first place! I don't see the benefit in throwing away half of HTTP, all of REST, breaking interoperability and SEO, requiring JavaScript, and having ugly URLs, in order for... what? It's hard to see the plus side, to be honest; this is categorically the Wrong Way to make a website, and I wish people would stop doing it!
"I wish, unfortunately this is only supported in the very latest browsers (I've literally seen one website do this [1]). You can only change the portion of the address bar following the "#" with javascript, which is why you see URLs like "/content/30485#!content/9384" in some places."
Okay, you are clearly missing the solution here.
This is what the HTML looks like:
<a href="path/to/page">Page</a>
Then add a click handler to the link (using your preferred JavaScript library):
$('a').click(function(e) {
  // use .attr('href') to get the path as written in the markup, not the
  // resolved absolute URL, so the hash gets the same value as before
  this.href = "#!" + $(this).attr('href');
});
So this code mangles your URL into a hashbang with JavaScript when the link is clicked.
Now the rest of your code remains exactly the same, and updates the window.location.hash the same way as before - it gets the same value as before.
Except the benefit is that a client with no JavaScript - like a crawler - sees a working URL. And the JavaScript then mangles the URL appropriately, leaving your framework blissfully unaware of the difference.
This is progressive enhancement - using JavaScript to mangle links that are only meant to work when JavaScript is available (and in Gawker's case, make it to the browser).
If you're quite confident in your JavaScript skills, you can mangle all the links on the page in one go right at DOMReady.
The key is not expecting JavaScript to be running. And when JavaScript is indeed running, then mangle links to your heart's content.
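As a rough sketch of that DOMReady version (the fragment check is only illustrative - adapt to taste):

$(function() {
    $('a').each(function() {
        var path = $(this).attr('href');
        // skip links that already point at a fragment
        if (path && path.charAt(0) !== '#') {
            $(this).attr('href', '#!' + path);
        }
    });
});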
Ah yes, I did misunderstand what the OP and you meant by that. Yes, progressive enhancement would be preferable; I (almost) always try to use the PE approach myself.
The problem with having this mix though is that someone could visit "/45382/some-content-url" with JS turned on and then click a mangled link - taking them to "/45382/some-content-url#!94812/some-other-content" which is absolutely horrible.
As far as I'm concerned, the only real solution is not to use JavaScript for loading page content rather than, you know, invoking actual page loads. I really don't see why this method is beneficial at all; there is nothing of merit in the design or functionality of Gawker, in my opinion.
Along these lines, one pattern is to build a widget or page that renders in straight HTML and has links that take you to pages to perform actions. If JavaScript exists, scoop up the HTML and introspect it to build your rich widget. If the JavaScript never loads, you have a working page.
This, I believe, is a good use of custom data attributes. When your JS introspects the HTML, it can use the data there for configuration and setting values. It's messy and often impossible to use the same values you output to HTML in your JavaScript models. (For instance, you might have a Number with arbitrary precision in your model, but only spit it onto the page rounded to a couple of decimal places.)
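Rough sketch of what I mean (the markup, class name and attribute are made up for illustration):

<span class="js-price" data-value="19.999999">20.00</span>

$(function() {
    $('.js-price').each(function() {
        // the full-precision number rides along in the data attribute; the markup
        // itself only shows the rounded value to visitors without JavaScript
        var precise = parseFloat($(this).attr('data-value'));
        // ...hand `precise` and this element to whatever builds the rich widget
    });
});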
But, yeah, sure. It's some extra work and planning we don't always have time for.