July 21, 2020
Over the years, JavaScript (JS) has become more and more integral to website design, whether that's improving the user experience, accelerating the buyer journey or providing helpful information to your users. Get it right and you'll reap the rewards. Get it wrong and it can turn into a wormhole of issues and lost rankings.
This guide is for those looking to get started with SEO on JavaScript websites. We’ll cover the fundamentals of what to look out for and what to avoid.
The use of JS on websites is increasing, and many sites are now built entirely with JavaScript frameworks and libraries such as React, Angular or Vue.
So how does this impact an SEO like you or me? Well, Google can render and index JavaScript websites. With this comes the need for technical SEO to make sure a JavaScript website can be correctly crawled and rendered to successfully rank in its index.
Put simply, if Google can’t “see” your website, you or your business will be missing out on a lot of traffic, leads and revenue.
Yes, they can, but only some of them. A test from Bartosz Góralewicz found that only Google and Ask were set up to crawl, render and index JavaScript content, as can be seen below:
Take this with a pinch of salt, as JavaScript websites need to be set up in a particular way for search engines to crawl them efficiently. There are several considerations to make, especially in how you decide to render the website, which we'll go into next.
Single-page applications are becoming increasingly popular with the aim of creating a seamless and fast experience. Typically, they are made using client-side rendering methods. Taking an excerpt from Progressive Coder:
In single-page web applications, when the browser makes the first request to the server, the server sends back the index.html. And that’s basically it. That’s the only time a HTML file is served. The HTML file has a script tag for .js file which is going to take control of the index.html page. All subsequent calls return just the data usually in JSON format. The application uses this JSON data to update the page dynamically. However, the page never reloads.
The client (and not the server) handles the job of transforming data to HTML once the application has started. Basically, most of the modern SPA frameworks have a templating-engine of sorts running in your browser to generate the HTML.
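To make that concrete, here is a minimal sketch of the pattern described in the excerpt; the /api/products endpoint, the data fields and the #app container are all hypothetical.

```js
// A minimal sketch of the SPA pattern: fetch JSON, then update the page in place.
async function showProducts() {
  const response = await fetch('/api/products'); // JSON only – no new HTML page is requested
  const products = await response.json();

  // The client turns the JSON into HTML and swaps it into the existing page.
  // The browser never performs a full page reload.
  document.getElementById('app').innerHTML = products
    .map(p => `<h2>${p.name}</h2><p>${p.description}</p>`)
    .join('');
}

showProducts();
```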
This is the traditional method of requesting and rendering a website. All of your resources are housed on a server: the user requests the HTML, the browser downloads the CSS and JS files it references, and the final render of the page appears for both the user and the search engine crawler.
In client-side rendering (CSR), rendering relies on JS being executed in the user's browser (the client) via a framework (we mentioned a few earlier in the post). The client requests the source code, which is a limited amount of HTML with a reference to a JS file. That JS file is then requested, and it contains the templates and logic used to build the HTML and render the website in the browser.
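In practice, the "limited amount of HTML" a crawler receives on that first request can be as sparse as the following (file names are illustrative):

```html
<!-- The entire initial response of a client-side rendered page -->
<!DOCTYPE html>
<html>
  <head>
    <title>Example Store</title>
    <link rel="stylesheet" href="/styles.css">
  </head>
  <body>
    <div id="app"></div>             <!-- empty until the JS below has run -->
    <script src="/app.bundle.js"></script>
  </body>
</html>
```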
At Google I/O 2018, Google unveiled its two-wave process for rendering and indexing JavaScript content. The first wave crawls and indexes the HTML and CSS, whilst the second wave, which may occur hours or even weeks later once rendering resources become available, returns to render and index the JS-generated content.
The two main issues here are that any content generated by JS is effectively invisible to Google until the second wave, and that the second wave can be delayed by hours or weeks, slowing the indexation of new or updated pages and the discovery of any links within that content.
If client-side rendering delays crawling and indexation, what can be done to speed this up? Well, two solutions are prerendering and Google’s preferred method of Isomorphic JavaScript.
Prerendering is a process whereby, when a search engine crawler accesses your website, it is served HTML and accompanying assets that have been generated in advance, helping the crawler to "see" the rendered version of the page.
This allows the user to enjoy the speed of client-side rendering whilst also serving the search engines a cached HTML version of the page to be indexed, avoiding the two-wave process.
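One common way to achieve this – shown here only as a simplified, hypothetical sketch rather than a production setup – is to detect crawler user agents at the server and return a cached HTML snapshot (services such as Prerender.io offer this as a hosted solution):

```js
// Hypothetical Express middleware: serve a cached HTML snapshot to known bots,
// and the normal client-side rendered app to everyone else.
const express = require('express');
const app = express();

const BOT_AGENTS = /googlebot|bingbot|yandex|baiduspider/i;
const snapshotCache = new Map(); // URL -> prerendered HTML (populated elsewhere)

app.use((req, res, next) => {
  const isBot = BOT_AGENTS.test(req.headers['user-agent'] || '');
  const snapshot = snapshotCache.get(req.path);
  if (isBot && snapshot) {
    res.send(snapshot);       // the crawler gets fully rendered HTML immediately
  } else {
    next();                   // users get the normal SPA and client-side rendering
  }
});

app.use(express.static('dist')); // serves index.html and the JS bundle
app.listen(3000);
```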
Isomorphic JS is recommended by Google and is adopted by Airbnb among others (their Engineering and Data Science team go into more detail on how they use this here).
In this method, both the user and the search engine receive prerendered versions of the website at the initial load. The JavaScript is then layered on top of this to allow quicker CSR performance.
It comes with a warning, however: implementation is difficult, and a lot of time and resource will need to be invested to get this method working successfully.
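As a rough illustration of the idea (and emphatically not Airbnb's implementation), the sketch below renders the same hypothetical React component on the server and then "hydrates" it in the browser:

```js
// server.js – send fully rendered HTML on the first request
const express = require('express');
const React = require('react');
const ReactDOMServer = require('react-dom/server');
const App = require('./App'); // hypothetical component shared with the client

const server = express();
server.get('*', (req, res) => {
  const markup = ReactDOMServer.renderToString(React.createElement(App));
  res.send(`<!DOCTYPE html>
<html>
  <body>
    <div id="root">${markup}</div>
    <script src="/client.bundle.js"></script>
  </body>
</html>`);
});
server.listen(3000);

// client.js (bundled to /client.bundle.js) – the same component "hydrates" the
// server-rendered markup so client-side rendering takes over from here:
//   ReactDOM.hydrate(React.createElement(App), document.getElementById('root'));
```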
If you care about SEO and you can't implement any form of prerendering solution, the only answer is to avoid building a purely JavaScript website, as the issues that come with an SPA will likely prevent your website from being correctly indexed.
Blocking JS files (and CSS files for that matter) in your robots.txt will make it more difficult for Google to properly analyse your website’s pages. The worst-case scenario for JavaScript websites is Google will see a blank screen when it tries to render. Here’s more from Google:
You can use robots.txt to block resource files such as unimportant image, script, or style files, if you think that pages loaded without these resources will not be significantly affected by the loss. However, if the absence of these resources make the page harder for Google’s crawler to understand the page, you should not block them, or else Google won’t do a good job of analyzing pages that depend on those resources.
So, the simple sense check here is: if a JS file is required to render your website or anything on any of its pages, don't block it in your robots.txt.
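For example, a rule like the first block below would stop Googlebot fetching the very files it needs to render your pages (the paths are illustrative):

```
# Risky: blocks the scripts and styles the page needs to render
User-agent: *
Disallow: /assets/js/
Disallow: /assets/css/

# Safer: keep render-critical resources crawlable and only block
# what genuinely doesn't affect how the page renders
User-agent: *
Allow: /assets/js/
Allow: /assets/css/
Disallow: /internal-reports/
```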
It is important to allow Google to crawl standard HTML links within your prerendered code, with each link pointing to a unique version of the requested page. Many a time, we see onClick events being used to push the user to a URL or load the desired JS page. The issue with this is that Google struggles with the onClick function and won't follow the URL. The image below shows what happened within two weeks of changing an onClick link to an HTML link:
*Note: the ranking pages were available via an interim HTML sitemap solution before the onClick issue was resolved.
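To illustrate the difference (the /mens-shoes URL is hypothetical):

```html
<!-- Hard for Google to follow: the destination only exists inside a JS handler -->
<span onclick="window.location.href = '/mens-shoes';">Men's shoes</span>

<!-- Crawlable: a real anchor with an href, which JS can still enhance -->
<a href="/mens-shoes">Men's shoes</a>
```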
As with HTML websites, it is always best practice to have user-friendly URLs targeted to the relevant keyword. It is also important that they are SEO friendly. By this, I mean your URLs should not change once the user or bot has visited the page – for example through pushState URLs used to control on-page features – otherwise the search engine will spend resources trying to figure out the canonical URL. If it can't do this, don't expect the page to be indexed, or at a minimum to rank particularly well.
Ultimately, each page should have its own unique URL to prevent duplication and indexation issues.
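As a sketch of what that looks like in practice – the routes and the renderView() helper are hypothetical – an SPA can use the History API so every view has its own real, crawlable URL:

```js
// Give each view its own URL via the History API, rather than mutating
// on-page state behind a single URL.
function navigateTo(path) {
  history.pushState({}, '', path); // e.g. /mens-shoes rather than /#view=shoes
  renderView(path);                // hypothetical function that draws the requested view
}

// Keep the URL and the content in sync when the user hits back/forward
window.addEventListener('popstate', () => renderView(location.pathname));
```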
Testing your website is absolutely critical, and you should ensure during the website build, or as a post-launch fix, that your website's page content appears in the DOM tree. What's a DOM tree? It shows the page's content structure and each element's relationship with the others.
To review this, you'll need to use the browser's Inspect tool to view the rendered HTML. You can use this to check that all the content on the page is populated within the HTML. If it is not, client-side rendering is likely taking place, and Google may not be able to "see" that part of the page, or the whole page.
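If you want to spot-check a single URL yourself before reaching for a crawler, a short script can compare the raw response with the rendered DOM. This sketch assumes Node with the puppeteer and node-fetch packages installed, and the URL is a placeholder:

```js
const puppeteer = require('puppeteer');
const fetch = require('node-fetch');

(async () => {
  const url = 'https://www.example.com/'; // placeholder URL

  // What "view source" (and the first indexing wave) sees
  const rawHtml = await (await fetch(url)).text();

  // What the DOM looks like after the JavaScript has run
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle0' });
  const renderedHtml = await page.content();
  await browser.close();

  // If key content only appears in renderedHtml, it is being client-side rendered
  console.log('Raw HTML length:', rawHtml.length);
  console.log('Rendered HTML length:', renderedHtml.length);
})();
```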
Tools such as Screaming Frog, Sitebulb and Deepcrawl can collect both the original HTML and the rendered HTML to allow you to compare and discover any potential issues on the website.
For spot checks, Google Search Console's URL Inspection Tool, the Mobile-Friendly Test and the Structured Data Testing Tool can be used. These tools use Google's evergreen rendering engine, so they render JS as Googlebot would. If any issues show up in these tools, it will be worth investigating further.
In SPA’s it’s very easy to neglect the basics of SEO, as we’ve seen with URLs and URL structure. Another area often overlooked is the page title within the <title> tag. As with all websites, this title should be targetted to your pages’ content and topical relevance. It still remains one of the most important ranking factors for on-page SEO.
Through keyword research and content analysis, you’ll be able to review, create and implement well-written page titles to assist with your website’s overall performance.
Ultimately, JavaScript will keep growing in popularity as long as it improves UX, strengthens interactivity and speeds up website builds. SEO will be a continuing need regardless of the code or platform a website is built on, and education and collaboration between developers and SEOs is a must.
Google is also continuing to invest resources in improving the speed of rendering and indexation of JS websites at scale – remember, at present, they say they will revisit a page in the second wave "when resources become available".
There is a tendency for SEOs to become overwhelmed when they hear a website is built using JS (I was one of them!), but if you look past the JS and understand its nuances, it's not all that different.
To further discuss the contents of this post, or for help with creating an SEO strategy for your JS site, get in touch!