Immersing yourself in the stream of discourse about SEO and success in the online space can be a dichotomous experience; on the one hand you can read testimonials from companies that have achieved success by providing an excellent user experience and finely tuned content, and on the other you’ll find horror stories of industries and niches dominated by manipulative tactics. You could be forgiven, then, for buying into the idea that the industry continues to be split into two camps, the White Hat and the Black Hat.
Whether you subscribe to this concept or (perhaps more realistically) assume that most practitioners fall somewhere in the middle, neither supposed camp would deny that links, whether gained editorially or otherwise, are one of the driving forces of search engine algorithms; they’d likely also agree that building or acquiring those links has become harder, and looks only to get harder still.
So if links are only going to get harder to build, and the difficulty increases along with the quality of the link, then as SEOs it’s surely of the utmost importance that we get the maximum benefit from every single one. From an external perspective this may mean ensuring that a link points to the most relevant page on your site, or that it uses optimised anchor text, but what other factors do we need to consider? Could there be deeper issues on your site that keep these hard-earned links from taking full effect? And could these issues be the difference between finding yourself on page 1 or page 10?
The short answer is yes, and a massive percentage of the sites that either you or I will ever work on – or visit in general – will suffer from some kind of architectural problem that is keeping the site from maximising its performance. This post is intended to detail some of the most common architectural mishaps and assist you in resolving them.
Index Bloat and Duplicated Content
It might not always seem this way, but Google and its rivals are not blessed with infinite capacity. The authority and market position of your site can have a major effect on how much of its resources a search engine is willing to devote to crawling and indexing your pages and content – this is often referred to as ‘crawl budget’. In theory this means that if your site consists of many pages but lacks authority in your niche, search engines could run out of crawl budget before reaching some of your content, which ultimately means it won’t be crawled or indexed, massively reducing its worth as an asset (at least organically).
But Matt Cutts keeps telling us to produce more content, doesn’t he? Well, let’s add the qualifier “quality” to content and this should start to make more sense. Quality content, when produced and seeded thoughtfully, should attract more authority to your site by generating links, citations, user interaction, all that good stuff – and that means you get a bigger crawl budget, which means Google can crawl and index more of your site.
But where index bloat really becomes an issue is on sites that are large by their very nature, such as e-commerce sites that require an extensive product catalogue to drive conversions. With a site like this you can lose the simple A->B->C->D structure that a smaller site may have; perhaps your CMS allows for products or pages to be reached via multiple paths through the site, or allows users to apply filters and parameters to their search. These options are tempting, as they offer a useful function to users, but they can be confusing to a search engine bot. If the bot finds itself in a labyrinthine mess of repeated products and pathways, it may fail to reach the end point and your product could end up being excluded from the index.
There are a few ways you can resolve this issue, direct the bots to where you want them to go, and improve the authority of the pages that you do want them to find. The simplest is to identify where you have multiple instances of a product or page and use the rel="canonical" tag to tell the bot that these pages are identical and should be treated as one and the same.
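As a minimal sketch of how that looks in practice (the URLs here are hypothetical), each duplicate version of the page carries a tag in its head pointing at the single preferred URL:

```html
<!-- Placed in the <head> of every duplicate version of the page,
     e.g. /shoes/red-trainer?ref=nav – the URLs are hypothetical examples -->
<link rel="canonical" href="https://www.example.com/shoes/red-trainer" />
```

Search engines treat this as a strong hint rather than a directive, but in most cases it consolidates the duplicates (and their link equity) into the canonical URL.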
If you are using parameters such as limit, order, sort or allowing users to filter via colour, size or any other factor, you should check the ‘URL Parameters’ section in Google Webmaster Tools to ensure that you’re instructing Google to ignore URLs amended by each of these query strings; otherwise you could end up blowing your crawl budget on multiple versions of what is effectively the same page. This is especially dangerous because even a relatively small number of filters can combine to create a tremendous number of URLs, and the resulting noise can dilute the PageRank of even your most powerful pages (the ones with all those brilliant links you spent the last six months painstakingly building), sending them plummeting in the rankings or seeing them swapped out for thin or inferior alternatives.
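To see how quickly filter combinations multiply, here’s a rough back-of-the-envelope sketch in Python; the filter names and values are invented purely for illustration:

```python
# Hypothetical filters offered on a single category page
filters = {
    "colour": ["red", "blue", "green", "black"],
    "size": ["s", "m", "l", "xl"],
    "sort": ["price_asc", "price_desc", "newest"],
}

# Each filter can be left unset or take one of its values, so the number
# of distinct URLs a crawler could encounter for this one page is the
# product of (options + 1) across all filters.
variants = 1
for values in filters.values():
    variants *= len(values) + 1

print(variants)  # 5 * 5 * 4 = 100 crawlable URLs for a single page
```

Three modest filters turn one page into a hundred URLs; add a price slider or pagination and the numbers quickly reach the thousands.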
You can also block bots from accessing certain pages altogether, which is useful if you have features intended for users but not search engines, such as print-friendly versions of content. Every site is set up differently, and you may need a combination of these strategies to ensure that bots are kept away from the pages you don’t want crawled.
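The usual mechanism for this kind of blocking is robots.txt; a minimal sketch might look like the below, where the paths are hypothetical examples rather than a recommendation for any particular site:

```
# robots.txt – paths are hypothetical examples
User-agent: *
Disallow: /print/      # print-friendly duplicates of content pages
Disallow: /*?sort=     # parameter-generated variants (Googlebot honours * wildcards)
```

Bear in mind that robots.txt prevents crawling, not indexing; for pages that must stay accessible but out of the index, a meta robots noindex tag is the safer tool.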
Illogical Internal Linking or Lack of Siloing
Internal linking might not be as hot a talking point as external links have been lately, but ignore it as a ranking factor, or approach it thoughtlessly, and you risk missing out on the benefits of your hard work. Let’s say you’ve done the research, ego-baited all the right influencers, produced a beautiful and exceedingly useful resource and launched it on your site to industry-wide acclaim – everybody loves it and, more importantly, links to it. Except for Google and Bing, who haven’t even bothered to index it. Are they just being jerks, or have you stuck it somewhere on your site that their bots can’t access? It’s probably the latter, unless you’ve ever fist-fought Sergey Brin (and won).
The bot that’s crawling through your URLs and deciding whether or not to index them? It’s not a mind reader; it just follows the links on your site – so if you’ve launched this awesome content on a page it can’t reach – an orphaned page – don’t expect to feel the impact of all those links any time soon.
That’s a worst-case scenario; it’s more than likely that you do link to all your pages from somewhere (a decent CMS will probably bully you into it one way or another), but it’s worth approaching this with care and thoughtfulness. There are many ways in which Google’s algorithm (and those of other search engines) establishes the theme and relevancy of your site and its constituent URLs, and internal linking is an integral part of this process. To rank highly your site must demonstrate a level of expertise and relevance that the search engine deems superior to your peers’; the external links you’ve worked hard to attain are one metric that contributes to this, but those signals are in turn delivered and disseminated through your site via its internal linking structure.
Haphazard internal linking can muddy the waters and prevent a search engine from accurately identifying the theme of your site. If, for example, you sell apparel and your shoe sub-categories (smart, casual, work, etc.) frequently link internally to pages in the hat sub-categories, you run the risk of diluting the thematic signal that tells the bot these categories are about shoes and hats respectively. Creating distinct ‘silos’ for these categories, on the other hand – an internal linking structure in which only the pages that are navigationally or contextually relevant link to the page in question – maintains their integrity and should maintain or increase the authority your site commands in this area.
You can reinforce this with your navigational structure, ensuring that all the pages within the silo are housed within an appropriate parent directory, and by using breadcrumb navigation. You should also ensure that on-page best practices are met: use relevant anchor text and craft page titles that reflect the specific theme of the silo. This reinforces to the search engine that these pages are semantically linked, and creates a strong relationship between all the pages within the silo. Authority is passed down from the main category page (often amongst the most powerful on your site, as it receives authority directly from the homepage), but it is also fed up from the bottom as your pages become more specific while retaining strong relevancy, establishing the spectrum of your expertise in that area.
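As an illustration of what a silo might look like on the apparel site above (all paths and names are hypothetical), the directory structure and breadcrumbs mirror one another, and internal links stay within the shoes silo:

```
/shoes/                        Category page – linked directly from the homepage
/shoes/casual/                 Sub-category   – breadcrumb: Home > Shoes > Casual
/shoes/casual/red-trainer/     Product page   – breadcrumb: Home > Shoes > Casual > Red Trainer
```

A product here would link up to its sub-category and across to closely related shoes, but not sideways into, say, /hats/ – that relationship belongs in the navigation, not the silo.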
Several different factors play into these two areas, and optimising them together gives you a much greater degree of control over the content on your site – effectively, you make it possible to isolate and empower the pages you consider most worthwhile. You’re literally and semantically connecting them to optimise the flow of authority through your site, and reinforcing your level of specialism every step of the way. Getting the kind of links you need to succeed in a tough vertical is undoubtedly becoming harder, so it’s important to ensure that your work isn’t wasted. Even if you work primarily as an off-site consultant, it’s crucial that you understand the architecture of the sites you work on, or you run the very real risk of great, campaign-defining links failing to deliver the results you had pinned your hopes on.