We Heart It: A Brief History of Social Bookmarking

Hi! It's me again! I'm back with more social media goodness to share. This time round, I'm touching on the brief history of social bookmarking and the advent of the image bookmarking phenomenon, PLUS a list of 10 image bookmarking sites (and 2 more!) and the SEO benefits of image bookmarking. Bargain!

UPDATE 17th May: Rand Fishkin at SMX London has just confirmed that image ALT tags weigh more than H1 tags.

As SEOs we are very much aware of the benefits of using social bookmarking as part of link building. Sites like Digg, Reddit and StumbleUpon are considered mandatory: bookmarking your blog posts and websites not only helps increase traffic to your web page, it helps create a good mix of backlinks in your collection.

From Social to Viral

(The term viral here does not exclusively refer to videos that have generated a considerable number of hits in a short period of time; rather, it is an umbrella marketing term for the use of existing social networks to increase mentions of, and awareness about, a particular topic, brand or trend.)

Sites like Digg, especially, have the potential to make your bookmarked link go viral. Essentially, you're not just bookmarking a link, you are creating conversations around the topic in the link: Digg allows its users to comment on the link and share it with friends on Twitter and Facebook. It's no surprise that its popularity has spawned a great many Digg-clone sites, most of them using the Pligg tool to create their own social bookmarking sites. Not all of them are great, but some of them are getting there: you can check out this massive list of Digg-clone social bookmarking sites, sorted according to PageRank, Alexa rank, dofollow status and popularity: Social Bookmarking Sites Listed in Order of Pagerank, Alexa Rank, Popularity and DoFollow.

Now here's the thing: like directories, social bookmarking can be useful but also tedious and boring.
Going through that list of social bookmarking sites, you realize that not all of them have that sense of community. They try hard to emulate Digg and may succeed at its basic function, but the end result is just a mind-numbing collection of spammy-looking links. The other problem: how many real humans actually go through these sites to search for information and inspiration?

The Start of Image Bookmarking

Enter image bookmarking. I love image bookmarking. Everybody loves looking at images. They are colorful, beautiful, and they speak louder than a 500-word keyword-rich article on an article website nobody reads. Image bookmarking came about after the popularity of design blogs: people don't just want to rely on the sometimes infrequent updates of design blogs to get their daily dose of inspiration, they want to submit and share their own finds too.

A List of 10 Image Bookmarking Sites + 2 More

At the moment, I can only find 10 image bookmarking sites on the net. I am quite surprised this technique hasn't caught on yet.

WeHeartIt

A simple image bookmarking site, open to everyone. Simply create an account and start submitting. They have a special bookmarklet which you can drag and drop into your browser, so the next time you trawl the web and spot an amazing image, just click on it to submit it to the site. Members can 'heart' their favorite images from the pool; the more hearts an image gets, the more popular it is. Images here fall mostly into the photography category, the kind that is heavily filtered, warm-lensed and vintage-looking.

Vi.sualize.us

Supposedly the first ever image bookmarking website. The owner wanted to create a bookmarking site that is not elitist and is open to all, while maintaining its credibility as a truly inspirational visual website. Simply create an account and start posting. You can also download a plugin for your browser. Members can 'like' an image and even post comments about it.
Typeish

A closed bookmarking community, and for a good reason! This is an image bookmarking community that carefully selects the images it displays on the site. And you can tell: the images all fall into a sort of artistic/design theme. To join, you need to email them and ask/beg for an invite.

FFFFound!

Probably the premier image bookmarking site on the internet right now. It emerged after Vi.sualize.us and started off as a pretty simple and straight-to-the-point image bookmarking site that allowed you to register an account and post images. Its popularity forced it to close registrations, and now you can only join FFFFound if you have an invite. Images here fall strictly into the design, artistic and inspiration theme.

IMGFave

A simple WeHeartIt clone made on Tumblr.

Condense

A French image bookmarking site. Currently a closed community, but it intends to open registrations soon. Images fall strictly into the graphic design spectrum: typography, architecture, packaging and ads.

Picocool

Another closed-community image bookmarking site, but I wouldn't call it inspiring, really. The website looks bland in comparison to the rest I have mentioned here. You need an invite before you can even register, which is a downer.

Yayeveryday

One of THE BEST image bookmarking sites out there, except that the emphasis is on the artists themselves: original works/images made and submitted by the users. It is a community of artists, designers, photographers and the people who appreciate them. Users get dedicated profile pages that credit their work, websites, fans, etc. Members can comment on each other's submissions.

Enjoysthin.gs

Simply, a place to share and save things you enjoy. People submit their favorite images, and users can rate an image by 'enjoying' it. The more enjoys an image gets, the more popular it is.
And a few more similar ones:

Lookbook.nu

A fashion community site that allows users to submit images of themselves wearing fashionable or stylish items of clothing. Members can 'hype' a particular image and share it on Twitter and Facebook. This is a large, growing community, already with a Japanese version. The site cross-promotes each and every submission on its own various microsites and social profiles on Tumblr, Facebook, Twitter, etc.

Polyvore

Similar to Lookbook, except that you can also buy the looks. Users can create looks from items for sale on the site and images of their own, creating style inspirations called 'sets'.

deviantART

A community site that emerged during the LiveJournal craze. Oh man, I still remember when LiveJournal was awesome. Nostalgia. Anyway, deviantART is where users can create profile pages, and post, discuss, share and rate each other's submissions. It is one of the largest social networking sites for emerging, amateur and established artists and art enthusiasts, with more than 13 million registered users.

The SEO Benefits of Image Bookmarking

Image bookmarking has the added benefit of going viral quicker than a simple text link. This is because sites like those mentioned above don't just display your images; they save the link in them as well. We Heart It does not use the nofollow attribute on its links, and neither do Typeish and Enjoysthin.gs. All these sites are a minimum of PR 5, and FFFFound doesn't just keep your link; it saves the alt tags and the title of the post it was submitted from as well. The plus side is that you don't need to be an artist, designer or photographer to participate. As long as the image/content is interesting enough, you'll make the cut. This also inspires and motivates you to come up with interesting and unique ideas and ways to market your site/brand.
Also, if you are clever enough to replicate these websites, you will see how easy it is to get free content: semi-automatic, community-driven and daily at that. A great, simple and legit link-baiting technique! When a member submits an image that receives many hypes, likes or enjoys, they are sure to link back to the post from their own blog to show it off. People like to be popular, and people love it when they get good ratings. The backlinks for you will just keep pouring in. If you add a link (like your client's) with the image and it gets reblogged and goes viral, all you have to do is harvest the links that get generated. There is also the added bonus that these backlinks are all dofollow. I have also noticed that sites like these get a high PageRank quicker than normal blogs. (Some of the sites mentioned above, according to their whois records, were only created recently, between late 2007 and 2008.) Of course, the age-old argument that an image's alt tag does not weigh as much as the anchor text of a text link will surface, but at the end of the day, a link is still a link, and spiders can only read images as text if you leave the alt tags in. How do I know this works? Because I've tried it. Why create directories and bookmarking sites when you can create image bookmarking sites? 🙂


Technical SEO Audit Automation with Python

Technical SEO audits can be an extensive and often repetitive task – using Python to automate the essentials can be a massive help for any SEO team.

Fair warning – this is a bit of a technical blog post. If you have questions about the process, we're always happy to talk about the details of technical SEO – so get in touch!

Python is a programming language that was first released in 1991 and (helpfully) has a massive standard library of functions. The ‘core philosophy’ of Python is a natural fit with how we like to carry out SEO at Blueclaw:

  • Beautiful is better than ugly
  • Explicit is better than implicit
  • Simple is better than complex
  • Complex is better than complicated
  • Readability counts

As programming languages go, Python is straightforward to learn and has some immediate benefits when it comes to fetching, extracting and optimising HTML to supercharge your SEO. We recommend reading a few of the basic tutorials to get to grips with Python, but for the purposes of this blog post – let's dive right in!

Fetching the HTML

The project fetches the page using the Requests library and then extracts the data using BeautifulSoup4. These fantastic packages are included with Anaconda for Python 3.6, but if you don't have them installed, do so using pip.

$ pip install --upgrade beautifulsoup4
$ pip install --upgrade requests

Now we need to fetch the HTML of a page and then parse it using BeautifulSoup’s HTML parser.

import requests
from bs4 import BeautifulSoup
URL = 'https://www.blueclaw.co.uk'
response = requests.get(URL)
soup = BeautifulSoup(response.text, 'html.parser')

This works well, and we can already extract SEO data from this object (for example, the title can be found using `soup.title`), but we will make this much easier to expand on by using classes.
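For instance, running the same parser over a small inline snippet of HTML (used here instead of a live fetch so the example works without a network request):

```python
from bs4 import BeautifulSoup

# A tiny inline document, standing in for a fetched page.
html = '<html><head><title> Example Page </title></head><body></body></html>'
soup = BeautifulSoup(html, 'html.parser')

print(soup.title)                     # the whole tag, still wrapped in <title>
print(soup.title.get_text().strip())  # just the text: Example Page
```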

Extracting the important tags

class SeoAudit:
    def __init__(self, url):
        '''
        If the url isn't provided with a scheme, prepend `https://`.
        '''
        self.url = requests.utils.prepend_scheme_if_needed(url, 'https')
        self.domain = requests.utils.urlparse(self.url).netloc
        response = requests.get(self.url)
        self.soup = BeautifulSoup(response.text, 'html.parser')

    def get_title(self):
        '''
        The title comes still wrapped in <title> tags.
        '''
        title_tag = self.soup.title
        if title_tag is None:
            return title_tag
        # Use `get_text` and `strip` to remove the <title> tag and any
        # leading or trailing whitespace.
        return title_tag.get_text().strip()

By creating an instance of the SeoAudit class, initialised with our desired URL, we are able to work with the BeautifulSoup object.

page = SeoAudit('https://www.blueclaw.co.uk')
print(page.get_title())
# Expected output:
# Award-Winning UK SEO Company, Blueclaw Search Agency, Leeds

Now let's start to expand our class! We will write methods to pull out the h1, meta description and any links that are on the page, and use Python's regular expression module, `re`, to check whether the links on the page contain our domain.

import re
class SeoAudit:
    # ...
    def get_first_h1(self):
        h1_tag = self.soup.h1
        if h1_tag is None:
            return h1_tag
        return h1_tag.get_text().strip()

    def get_meta_description(self):
        meta_tag = self.soup.find('meta', attrs={
            'name': re.compile(r'(?i)description')
        })
        if meta_tag is None:
            return meta_tag
        return meta_tag.get('content')

    # Don't get too bogged down by the following RegExes - just
    # know that they are being used to classify the links!
    def find_links(self, link_type='all'):
        if link_type not in ['all', 'internal', 'external']:
            return []
        if link_type == 'all':
            # Don't extract scroll, telephone or email links.
            href_ex = re.compile(r'^(?!#|tel:|mailto:)')
        elif link_type == 'internal':
            # Only extract links which match the domain name or use
            # relative paths. `re.escape` escapes the dots in the domain.
            href_ex = re.compile(
                r'((https?:)?//(.+\.)?%s|^/[^/]*)' % re.escape(self.domain)
            )
        elif link_type == 'external':
            # Use the not_domain method below to only match absolute
            # paths which do not match the domain name.
            href_ex = self.not_domain
        a_tags = self.soup.find_all('a', attrs={'href': href_ex})
        return [tag.get('href') for tag in a_tags]

    def not_domain(self, href):
        '''
        Determine whether the href is a url that does not belong to
        the given domain.
        '''
        return href and (
            re.search(r'^(https?:)?//', href)
            and not re.search(re.escape(self.domain), href)
        )
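To sanity-check those patterns outside the class, here is a self-contained sketch of the same classification logic. The domain and example hrefs are hypothetical, chosen just for illustration, and `re.escape` is used to make the dots in the domain match literally:

```python
import re

domain = 'www.blueclaw.co.uk'  # hypothetical domain for illustration
escaped = re.escape(domain)

# Matches internal links: absolute URLs on our domain, or relative paths.
internal = re.compile(r'((https?:)?//(.+\.)?%s|^/[^/]*)' % escaped)
# Matches anything that isn't a scroll, telephone or email link.
non_link = re.compile(r'^(?!#|tel:|mailto:)')

hrefs = [
    'https://www.blueclaw.co.uk/blog/',  # internal, absolute
    '/contact/',                         # internal, relative
    'https://twitter.com/blueclaw',      # external
    'mailto:hello@blueclaw.co.uk',       # filtered out entirely
]

for href in hrefs:
    if not non_link.search(href):
        print(href, '-> skipped')
    elif internal.search(href):
        print(href, '-> internal')
    else:
        print(href, '-> external')
```

Note that BeautifulSoup applies a compiled-pattern attribute filter with `search`, which is why the internal pattern is allowed to match anywhere within an absolute URL.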

Stripping the title and h1 tags is becoming repetitive. Let’s write a decorator to make the process easier in the future and to improve readability.

def strip_tag(func):
    def func_wrapper(*args, **kwargs):
        rtn = func(*args, **kwargs)
        if rtn is None:
            return rtn
        return rtn.get_text().strip()
    return func_wrapper
class SeoAudit:
    # ...
    @strip_tag
    def get_title(self):
        return self.soup.title
    @strip_tag
    def get_first_h1(self):
        return self.soup.h1

Lovely.
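As an aside, the decorator isn't tied to BeautifulSoup at all: it works on anything with a `get_text` method. Here's a minimal stand-alone illustration, using a stub tag class invented purely for the example:

```python
def strip_tag(func):
    def func_wrapper(*args, **kwargs):
        rtn = func(*args, **kwargs)
        if rtn is None:
            return rtn
        return rtn.get_text().strip()
    return func_wrapper

class FakeTag:
    # Stub standing in for a BeautifulSoup tag, for illustration only.
    def get_text(self):
        return '  Award-Winning UK SEO Company  '

@strip_tag
def get_title():
    return FakeTag()

@strip_tag
def get_missing():
    return None  # e.g. the page has no <h1>

print(get_title())    # Award-Winning UK SEO Company
print(get_missing())  # None
```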

Finally, we will make our lives easier by writing another method to get this useful data into a Python dictionary.

class SeoAudit:
    # ...
    def get_seo_data(self):
        return {
            'title': self.get_title(),
            'metaDescription': self.get_meta_description(),
            'h1': self.get_first_h1(),
            'internalLinks': self.find_links('internal'),
            'internalLinksCount': len(self.find_links('internal')),
            'externalLinks': self.find_links('external'),
            'externalLinksCount': len(self.find_links('external')),
        }
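With all the methods in place, the resulting dictionary has a shape along these lines (illustrative values only, not real output):

```json
{
  "title": "Award-Winning UK SEO Company, Blueclaw Search Agency, Leeds",
  "metaDescription": "An example meta description",
  "h1": "An example h1",
  "internalLinks": ["/blog/", "https://www.blueclaw.co.uk/contact/"],
  "internalLinksCount": 2,
  "externalLinks": ["https://twitter.com/blueclaw"],
  "externalLinksCount": 1
}
```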


Making it user-friendly

In this demonstration, I will use a simple input to provide the script with a URL. There are many better ways of doing this! I will also write the results to a JSON file named seoData.json

import json
url = input('Enter a URL to analyze: ')
page = SeoAudit(url)
out_obj = page.get_seo_data()
with open('seoData.json', 'w') as f:
    json.dump(out_obj, f, indent=2)

We can configure this in any way we want: for example, passing the URL as a command-line argument, looping through a CSV or JSON file, or any number of other approaches. Plus, by using classes, we have made it easy to add new methods to extract even more insight from the page.
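For instance, a command-line variant could be sketched with Python's standard `argparse` module. This assumes the `SeoAudit` class from earlier in the post; `build_parser` and `main` are just names chosen for the example:

```python
import argparse
import json

def build_parser():
    parser = argparse.ArgumentParser(description='Run a basic SEO audit.')
    parser.add_argument('url', help='URL of the page to audit')
    parser.add_argument('-o', '--output', default='seoData.json',
                        help='file to write the JSON results to')
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    page = SeoAudit(args.url)  # SeoAudit as defined earlier in the post
    with open(args.output, 'w') as f:
        json.dump(page.get_seo_data(), f, indent=2)
```

Invoking the script as `python audit.py https://www.blueclaw.co.uk -o report.json` would then write the audit for that page to `report.json`.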

This project is intended to be a starter for a more complete technical SEO tool. Feel free to build on it to suit your needs – Happy coding!

Written by

Simran Gill
