Let me preface this by admitting I know next to nothing about SEO. Those companies that advertise ‘top of Google’ rankings seem like modern-day snake oil salesmen to me. I mean, does Google even know how Google ranks search results? Seems like a dumb question, but think about it. What better way to keep your process secret than if you yourself don’t even know the whole thing? Maybe that little input box is just a tiny bit self-aware, and makes the final call on the order.
But it’s an important consideration when working with the web, particularly when working on a site for a company that depends heavily on search-related sales inquiries (everyone these days?). So we created a nice, shiny Angular 1.5-driven site for a company. It works, it’s quick, it’s got nice fades and slides and transitions. All was well.
Until Google started indexing. Or, rather, didn’t start indexing. Turns out, despite what they’ve stated, the crawler just wasn’t rendering our JS before indexing a page (most likely an issue with how we structured our app, but the results were the same). The company was dropping in search results, title and description tags were confusing strings of curly braces and dot notation variables- total anarchy.
So we turned to Prerender.io – and it was a great decision. This post isn’t about Prerender, but they really are great- and when we had questions about implementation, they were very quick to respond. After a few stumbles, we got it working and were climbing back up the ladder.
Until our cached links started to explode. 400, 500, even more. There definitely aren’t that many links on the site, and some were just complete gibberish: random strings of text, probably part of some automated Google crawling. But all of them were being cached by Prerender, because all of them were returning a 200 status code.
One of the limitations of Prerender with Angular is that you can’t use an ‘otherwise’ route (it uses a specific tag in your URL to serve cached pages to search bots, after the JS has rendered). But that also means that if you use dynamic routing (using the templateUrl function option to get the proper endpoint path from the URL, then passing it to your backend), every path returns a status 200 ‘OK’, valid or not. And all-knowing, all-seeing Google is indexing everything.
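To make the problem concrete, here is a rough sketch of what that kind of dynamic routing can look like. The endpoint prefix, the route pattern, and the param names (section, page) are all made up for illustration, not taken from the actual project. The point is only that the function happily builds a URL for any path it is handed.

```javascript
// Hypothetical endpoint prefix; the real backend paths are not given in this post.
const API_PREFIX = '/api/pages/';

// Builds a template URL from whatever route params the URL supplied.
// Nothing here knows whether the resulting page actually exists.
function templateUrlFor(params) {
  return API_PREFIX + [params.section, params.page].join('/');
}

// Possible wiring with ngRoute (assumed, and not executed here):
// app.config(function ($routeProvider) {
//   $routeProvider.when('/:section/:page', { templateUrl: templateUrlFor });
// });
```

A gibberish path like /xq9z/lkj2 matches the pattern just as well as a real one, so the app renders something and the crawler sees a 200.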
Prerender is so awesome that they support a meta tag that lets you return a 404 HTTP status to the crawler (<meta name="prerender-status-code" content="404">), but how would we add it? There’s only one HTML page with a ‘head’ in the whole damn project.
Our answer: dynamically add the tag to the head if our checks for valid routes fail. Using Angular’s route change hook ($routeChangeStart, if you’re on ngRoute), we could add the tag anytime we served up our custom ‘not found’ notice (which isn’t actually a 404 page). The last piece to the puzzle was finding a way around the missing ‘otherwise’ route. We were using the path on the URL itself to get the right endpoint from our (Django) API, which seemed really clever at the time, but now how do we deal with paths that are outside the range of our checking?
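Adding and removing the tag could look something like the sketch below. This is not our exact code. The helper names are invented, and the functions take the document as an argument purely so they can be exercised outside a browser. The meta name prerender-status-code is the one Prerender reads.

```javascript
// Name of the meta tag Prerender looks for when deciding which status to report.
const PRERENDER_META_NAME = 'prerender-status-code';

// Hypothetical helper: put a status-code meta tag in <head>, replacing any old one.
function setPrerenderStatus(doc, statusCode) {
  clearPrerenderStatus(doc);
  const meta = doc.createElement('meta');
  meta.setAttribute('name', PRERENDER_META_NAME);
  meta.setAttribute('content', String(statusCode));
  doc.head.appendChild(meta);
  return meta;
}

// Hypothetical helper: remove the tag again so the next route starts clean.
function clearPrerenderStatus(doc) {
  const existing = doc.head.querySelector('meta[name="' + PRERENDER_META_NAME + '"]');
  if (existing) {
    doc.head.removeChild(existing);
  }
}

// Possible wiring in an AngularJS run block (assumes ngRoute; not executed here).
// routeLooksValid is a placeholder for whatever validity check you use.
// app.run(function ($rootScope, $location, $document) {
//   $rootScope.$on('$routeChangeStart', function () {
//     clearPrerenderStatus($document[0]);
//     if (!routeLooksValid($location.path())) {
//       setPrerenderStatus($document[0], 404);
//     }
//   });
// });
```

Clearing the tag first on every route change keeps a stale 404 from sticking around once the visitor clicks through to a real page.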
Arrays to the rescue! We decided to grab $location.path() when a page loads (as from a search engine link) and split it into an array. We knew the max number of items in this array should be 5, and knew what the last term should always be. From there, it was an easy check to make sure that the last item in the array === ‘ourKnownLastTermConstant’ and that array.length <= 5. Anything else gets a nice ‘not found’ message and the one HTML file gets the Prerender 404 meta tag added (which is then removed as the first step when clicking a different link, and the check starts all over again).
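Here is a minimal sketch of that check. The function name is invented and the constant is a stand-in for the real last term. One assumption to flag: this version throws away the empty strings that split('/') produces around leading and trailing slashes, so its count of 5 may not line up exactly with how we counted in the real app.

```javascript
// Stand-in values: the real last term and depth come from your own URL scheme.
const KNOWN_LAST_TERM = 'ourKnownLastTermConstant';
const MAX_SEGMENTS = 5;

// Hypothetical helper: returns true when a path (e.g. the value of
// $location.path()) has an acceptable depth and ends in the expected term.
function isKnownPath(path) {
  // Assumption: drop empty pieces caused by leading, trailing, or doubled slashes.
  const segments = path.split('/').filter(function (segment) {
    return segment.length > 0;
  });
  if (segments.length > MAX_SEGMENTS) {
    return false;
  }
  // For an empty path the last item is undefined, so this is false as well.
  return segments[segments.length - 1] === KNOWN_LAST_TERM;
}
```

Note that under this rule a bare / fails too, so a home page would need its own special case.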
It was really only necessary to do the check on a full refresh- no one will get to a ‘not found’ page by clicking links within the app itself- those are all pre-validated in our routing. So it shouldn’t be too big a hit on performance.
As is becoming a recurring theme on this blog, I’m sure this isn’t what the folks at Angular intended their tool to be used for. But it works! Different strokes for different folks, I guess.