- Does your hosting company want you to succeed?
So you’re finally becoming successful online. It’s taken months and months of effort and dedication to design, develop, and launch your website, and then many more months and months of effort and dedication to make that website successful. And it’s working! Traffic figures are booming and you’re generating more revenue every day! The problem is, back when you launched the website you chose a cheap, small hosting package. After all, money was scarce in those early days and you couldn’t afford a big dedicated hosting contract. You didn’t even know if your website was going to take off at all, let alone become a smashing success. So your small hosting package doesn’t suffice any more. Your website receives more traffic than the package was designed for, and your hosting company wants you to upgrade. Now, tell me, how would you prefer your hosting company to inform you of the need to upgrade? Do you prefer this approach? Or this approach? Hi there! First of all, congratulations! We’re seeing massive traffic coming in on your [domain.com] website, which can only mean your site is becoming a big success. Well done! However, we’re afraid we might have some bad news for you. The hosting package you have with us only allows for bandwidth usage of [X]GB a month, and right now you’re using [Y]GB a month. Additionally, all this extra traffic to your site is slowing down the other websites on your shared hosting environment, which we’re sure you’ll understand is not entirely fair on the owners of those websites. But no fear, we have a solution! We can migrate your website to a dedicated hosting environment, and ensure the switch will be as smooth and hassle-free as possible. That way your website can grow unhindered, and no other sites will be affected negatively by your continued online success! All we ask is that you sign up for our upgraded Dedicated Hosting package at $xx.xx a month (or one lump sum of $yy.yy for the whole year, saving you Z%), and we can guarantee that your website’s traffic can continue to grow undisturbed. Sign up now through your hosting control panel, or get in touch with our sales department at 123-4567890 to talk about what we can do to contribute to your website’s success. Kind regards, Hosting Company support No contest really, is it? A hosting company that unilaterally throttles your website when you become too successful is only interested in its own bottom line, and doesn’t care about its customers. They’ll find that an increasingly untenable business ethos. Find a hosting company that doesn’t kill your website at the first sight of a traffic spike, but instead tries to accommodate your success. Because such a hosting company knows that by enabling your success, they’ll be enabling their own.
- Grab Attention With A Great Headline
One of the issues I come across most often when reviewing the websites of our clients is that the website doesn’t immediately make clear what its purpose is. The average web user has a very short attention span. Your website has about 4 to 6 seconds to make an impact, or the user will click that back-button and go somewhere else. So you need to grab that user’s attention and not let go. The first thing a user will look for when he lands on your site’s homepage is the headline. That headline will tell the user what kind of site he’s on and if it fits with his current needs. If the headline falls short, chances are the user will go somewhere else. So the headline on your homepage is very important. It needs to describe what your site’s goal is and how a user can benefit from sticking around. Just putting your company slogan there is usually a bad idea, unless you have a really good slogan that perfectly nails your core business. Usually this is not the case. Let’s use an example. Imagine you’re looking for a special travel arrangement to take your spouse on for your 10th anniversary. You come across a website that has this headline: Simple Enjoyment This might work fine as a company slogan, but it doesn’t tell you anything about what that site’s purpose is and whether or not you can find what you’re looking for. Now let’s try a different headline: Great Travel Arrangements For A Unique Holiday That already tells you a great deal more. It’s obviously a travel site and it sells arrangements. Seems like this is a worthwhile site to take a closer look at. Don’t be afraid to experiment with different headlines to see which one works best. Many aspects of building a better website aren’t an exact science; it’s about trying out different things until you hit the right formula. There are many more factors that come into play when turning a user into a customer, but it starts with getting the user’s attention. And a great headline does just that. P.S. A good headline that contains relevant keywords will also help you with getting found in search engines.
- Google is under attack, Search Engine Land to the rescue
Yesterday the Wall Street Journal published a piece of investigative journalism [paywall] about the US Federal Trade Commission’s case against Google, which petered out with barely a sizzle in 2013. That’s what being the largest corporate lobbyist in Washington DC gets you. According to a leaked document, many people inside the FTC found Google to be engaging in fierce anticompetitive behaviour, and wanted to pursue the case further. Some examples of Google’s nefarious activities uncovered by the Wall Street Journal include blacklisting of competing websites, favouring of its own properties (well, duh), and illegally restrictive contract policies. The image below, from WSJ’s Twitter, illustrates some key elements where FTC staff found Google in breach of the law, but where the eventual settlement with Google failed to act decisively: Now that it’s revealed the FTC’s case against Google probably should have gone much further and the search engine was let off very lightly indeed, there is a strong case to be made for more far-reaching litigation against Google in Europe. But Google needn’t worry, because the waters are already being muddied by Google’s own propaganda machine, primarily in the form of its biggest cheerleading blog Search Engine Land and its sister blog Marketing Land. Greg Sterling’s initial piece on Search Engine Land starts casting doubt on the importance of the leaked FTC document straight in its subheader: The rest of the piece is fairly toothless, happily emphasising that the FTC refused to litigate against Google and instead settled the case. Unsurprisingly there’s no mention of the manifold objections to that settlement from various different parties, nor of Google’s abundant lobbying efforts in the nation’s capital. But Greg does make a point of quoting Google’s chief counsel, once again reiterating that the FTC decided not to pursue. Apparently thinking that Greg’s initial piece wasn’t pro-Google enough, Danny Sullivan then publishes a more in-depth piece on Marketing Land. The main headline starts encouragingly: But quickly Danny takes on his favourite role of Google defender and starts casting doubt on almost every aspect of the Wall Street Journal’s piece and the FTC document. In the process Danny tellingly reveals that he does not understand how antitrust investigations work, as he repeatedly says that what Google did was also being done by other search engines. Anyone with even a casual understanding of antitrust law will realise that this is entirely irrelevant: the rules change when you become a monopoly, which Google definitely is – even Eric Schmidt has had to admit that. What makes for acceptable (if immoral) competitive behaviour in a more egalitarian marketplace becomes illegal under antitrust law when you’re a monopoly. In all fairness, Danny probably understands this but still feels it important to point out that “Google wasn’t doing anything that rivals weren’t also doing”, thus casting unwarranted doubt on the FTC staff’s conclusions. Danny then goes on to link to and quote liberally from earlier posts he wrote about Google, all with his favoured pro-Google slant of course, and then adds several post-scripts to further clarify Google’s defense and make abundantly clear that no, really, the WSJ piece’s most damning evidence was just part of a ‘test’. He concludes by liberally paraphrasing Google’s hastily penned PR spin.
I have no doubt that when Google’s more polished official press release on this matter is released, probably in the course of today, Marketing Land and/or Search Engine Land will publish it almost entirely and make a big fuss of how it disproves the accusations made in the WSJ article and FTC document. Fortunately Search Engine Land and Marketing Land are just enthusiasts’ blogs rather than proper news organisations, so we can hope that few policymakers will actually read their distorted propaganda. But the SEO industry will lap it all up, as they’ve always done, which can help set the tone for future debates on this issue. I’d advise everyone not to rely on a single blog or news site to inform your opinions. Read multiple viewpoints from different trustworthy sources and make up your own damn mind.
- Who Will You Vote For? Whoever Google Wants You To.
Search Engine Optimisers have known for years that top rankings in Google are about more than just getting traffic to your website. Occupying the first spot in a competitive organic search result has additional and less tangible benefits, such as improved brand recognition, trust, and authority. All SERP click-through studies show that the top results get by far the majority of the clicks. This simple fact, taken for granted, actually has profound consequences. First of all we need to truly understand why the top results on Google are so dominant and why so few people bother scrolling down or, heaven forbid, go to the second page of results and beyond. There must be a strong sense of trust accorded to Google’s search results, in that the vast majority of people trust Google’s judgement and believe that the results shown present the best possible websites for that particular search query. It’s this inherent trust in Google’s search results that has such far-reaching and, until now, mostly unexplored repercussions. Research psychologist Robert Epstein is one of several researchers looking into the effects of search results on the human psyche. More specifically, he’s looking at how Google’s search results can impact elections. In a great essay published on Aeon, he shows how Google’s rankings can have an immensely powerful impact on how undecided voters view political candidates, to such an extent that the outcome of elections can be decided by which webpages Google decides to rank at the top of their results: On average, we were able to shift the proportion of people favouring any given candidate by more than 20 per cent overall and more than 60 per cent in some demographic groups. Even more disturbing, 99.5 per cent of our participants showed no awareness that they were viewing biased search rankings – in other words, that they were being manipulated. As a veteran of the SEO industry, I can’t say this really surprises me. We’ve known for years that achieving top rankings on Google carries a lot of weight beyond the immediate traffic boost. What few of us have ever bothered to think about, however, is exactly how potent a force Google’s results can be in the wider context of society’s dynamics. This is what Robert Epstein’s research is showing, and it’s a terrifying thought.
Algorithmic Bias
The immediate focus will be on how Google decides to rank webpages, and how the inherent bias of their ranking algorithms will impact on political viewpoints. Google’s defense will be based on algorithmic independence, but this is a thin shield as the algorithm itself is of course fully created by people. Google’s search engineers have their own fair share of biases and personal beliefs, and those could very well influence how the algorithm decides which webpages deserve to rank. Every tweak made to the algorithm to improve the quality of search results is, in essence, an editorial decision made by a Google engineer that a certain type of webpage deserves to outrank another type of webpage. Even if these tweaks are made on the basis of objective quality metrics, it’s very easy for political beliefs to creep into these algorithmic tweaks without engineers’ conscious awareness. After all, our beliefs and convictions influence everything we do on a subconscious level. Our behaviour and decisions are the end product of internal mental processes that we’re only superficially aware of – most human behaviour emerges from the unknown depths of the subconscious mind.
And in the context of Silicon Valley’s homogeneous environment, it seems logical to conclude that Google’s ranking algorithm will have some inherent bias towards a certain belief system that most of its engineers adhere to. So Google cannot honestly claim its algorithms are entirely objective and free of bias. Even without any conscious manipulation of search results, there’s an implicit prejudice built into the search engine.
Legal Obligations
Yet regardless of what Google claims about its ranking algorithms, there’s a second issue that’s much more important to discuss. This is Google’s legal obligation, as a publicly traded company, to maximise profit. Epstein refers to that incentive in his essay, when he discusses Facebook’s controversial political manipulation experiment: Is Facebook currently manipulating elections in this way? No one knows, but in my view it would be foolish and possibly even improper for Facebook not to do so. Some candidates are better for a company than others, and Facebook’s executives have a fiduciary responsibility to the company’s stockholders to promote the company’s interests. This hits the nail squarely on the head. Like Facebook, Google has an obligation to its shareholders to maximise profit. Combine that with the power to sway elections in favour of specific candidates, and you have a recipe for electoral manipulation that ensures candidates are elected who promote policies favouring Google’s interests. This is the topic of an Evgeny Morozov column, in which he discusses how Silicon Valley giants like Facebook and Uber are using their reach to directly impact on the democratic process: But Uber also added a De Blasio feature to its app – an unmissable “NO CARS – SEE WHY” sign placed on New York’s map. On clicking it, users were told Uber would look like this if De Blasio won. Users were encouraged to email the mayor and the city council with a handy “EMAIL NOW” link. Eventually, De Blasio capitulated. So far the attention has been on Facebook and Uber, who have already actively used their immense reach for political purposes. It’s time we expand our attention and include Google as well, now that there’s abundant research showing exactly how powerful a tool their search results can be when it comes to influencing public opinion. Such manipulation is almost entirely undetectable, which begs the question: do we simply trust these technology giants to be neutral and not abuse this enormous power they’ve gathered, or do we find some way to ensure that Google and Facebook do not have decisive influence on who gets to be the next President or Prime Minister? And it’s not just election outcomes at stake here. From Epstein’s essay: We have also learned something very disturbing – that search engines are influencing far more than what people buy and whom they vote for. We now have evidence suggesting that on virtually all issues where people are initially undecided, search rankings are impacting almost every decision that people make. They are having an impact on the opinions, beliefs, attitudes and behaviours of internet users worldwide – entirely without people’s knowledge that this is occurring. Silicon Valley companies want to be part of everything we do, all the time, to monetise every aspect of our daily lives. But beyond the immediate commercial gains, this grants Silicon Valley a very real and direct control over what we believe, who we trust, and how we behave.
In effect, with every search on Google, every post and like on Facebook, and every ride booked on Uber, we’re handing ever more power to a very small elite of men in California. I don’t know about you, but that thought makes me very uncomfortable indeed.
- Google: Screwed if They Do and Screwed if They Don’t
It’s not often I feel pity for Google. In fact, pity is one of the emotions I most rarely associate with Google. But sometimes I do feel a twinge of pity for their engineers and decision makers, because no matter what they do they’ll always be screwed. Take for example the travel industry. In a lengthy and heartfelt open letter, the owner of a small tourism company admonishes Google for pushing SMEs out of business by allowing Google search to be dominated by big brand aggregators: In its simplest form, it means the big bags of money control the opportunity not the product. You can’t necessarily find the right product anymore, you find what they want you to find. Google just facilitated the demise of countless businesses by search prominence on the main avenue to commerce. It sounds like supermarkets all over again, except it’s worse. I have to drive to a supermarket, but can see the other shops on the way, who interestingly are having a comeback for all these reasons. This situation is of course the inevitable end result of the manipulation of Google’s organic search algorithms. If a system can be manipulated for profit, inevitably it will be dominated by big companies who can afford to spend the most effort on manipulation. This is true for financial systems and commodities markets, as much as it is for Google’s search algos. The only way Google can level the playing field is to make their search algorithms smart enough to give SMEs the same authority as big brand websites, so that they can rank high for relevant searches. Google has been unable to do this, so they do the next best thing: they give small businesses an artificial leg up by introducing a Google-powered element of the SERPs that allows SMEs to claim some visibility on a relevant search result. This effort was called Google Local – since then renamed a few times, most recently called Google Places until a few weeks ago when the latest label was slapped on it: Google My Business. The trouble with that is that it pissed off the big brand aggregators. Because it’s a Google-powered system that now pushes the aggregator websites down the search results, these big aggregators feel Google is cheating and rigging the game in its own favour. So the big brands combine forces, form lobby groups like FairSearch.org, and convince antitrust regulators that Google needs to be muzzled and controlled. Google simply can’t win. Either they do their best to give small businesses an advantage in search, and piss off the big brand aggregators, or they give in to the big brands (and the Pigeon update certainly seems to indicate that) and the small business owners are left behind. Either way, Google gets all the flak and none of the credit. I only feel a tiny bit sorry for Google though. Because after all, despite this hassle, they do seem to be doing just fine.
- When it comes to Google+, SEOs are never wrong
Google, by way of John Mueller, said yesterday it’ll be killing off the rich author snippet in search results. The author snippet, enabled by implementing rel=author tags on your content, gave you a rich search result showing your Google+ profile photo and circle count: This snippet will now disappear, apparently because Google wants to “clean up the visual design of our search results”. Since the entire SEO community has spent a considerable amount of effort over the last year or two getting clients to adopt rel=author precisely to get these rich snippets, I want to extend my sincere fuck-yous to Google for this move. Fuck you very much. But we really shouldn’t be surprised. After all, despite abundant claims to the contrary, Google+ as a social sharing platform is, as TechCrunch put it, ‘walking dead‘. This author snippet removal is just another nail in Google+’s coffin. Yet many SEOs continue to adhere to the view that Google+ is here to stay, despite all the writing on the wall. The thing is, the way these SEOs frame the debate, they’re right. Because certain aspects of what the Google+ name stood for will definitely stick around. You see, Google+ is not just the social sharing platform – that’s what most of us think of when we say ‘Google+’, but it’s only a small part of the amalgamation of systems and services that Google smashed together to make the lovely fragrant potpourri that Google+ is. After all, Google+ starts with having a Google account. If you have a Google account, any Google account on any of Google’s platforms, you are in effect a Google+ user. And since Google accounts are most assuredly not going to disappear, these Google+ fanboy SEOs claim that Google+ is a massive success. It doesn’t take a genius to spot the flaw in that argument, of course. No matter how you twist it, Google+ as a social platform is a disaster. So when we talk about the ‘death of Google+‘, we mean the demise of that social platform. No amount of moving the goalposts is going to make that any less true. But since SEOs hate to be proven wrong, the goalposts will continue to move, and Google+ will continue to be redefined and reshaped in the collective minds of the fanboys, so that they can claim they were right all along and Google+ is here to stay. Because SEOs are never wrong, you see. Even when they are.
- Google: Guardian of the World Wide Web?
When Google was first launched as a search engine in 1998, it fulfilled a great purpose: it helped make the information stored on the world wide web accessible and easy to find. Google’s link-based ranking algorithm resulted in much more relevant and high quality search results than other search engines could produce at the time. This gave Google an advantage that it has capitalised on ever since. But Google was not content to just passively crawl and index all the content it could find. The moment Google realised that webmasters cared about top rankings in Google – cared a lot in fact – they also realised that webmasters would do their very best to adhere to whatever guidelines Google set out as ‘best practices’ for ranking highly in their search results. Over the years Google has made many recommendations to webmasters about how websites should be structured, how content should be marked up and optimised, how sitemaps should be used, and so on. And, as Google’s market share in search grew and the search engine began to dominate the web as the default start page for almost every conceivable query, webmasters have put more and more effort into adhering to Google’s every guideline. Here are some of the SEO guidelines Google has been proclaiming that have had a profound impact on the way websites are built and structured, and how content is being written and shared:
* No content embedded in Flash, JavaScript, images, etc. One of the oldest edicts of Google rankings has been to not ‘hide’ content inside code that Google can’t read. Things like Flash-based websites, JavaScript-powered navigation, and text embedded in images are big no-nos on the web nowadays, simply because Google can’t read and index that content.
* Websites need to load fast. For years Google has been saying that websites need to have a short load time, and that slow loading websites are likely to see some sort of negative impact on their search rankings. So webmasters devote a lot of effort to making their websites fast to load.
* Structured data. In recent years Google has been encouraging websites to implement structured data, especially schema.org, to mark up information. The reward for implementing structured data is so-called ‘rich snippets‘ in search results that could provide a competitive advantage.
* Responsive websites. For a long time a separate mobile site was deemed the best solution for enabling a website to work on smartphones, followed closely by separate mobile apps. But Google decided it preferred to see so-called responsive websites: sites that adapt their layout and structure to the size of the screen they’re being shown on. As a result responsive design has become the de facto standard of building websites.
* Human-readable URLs. One of the eternal areas of friction between developers and optimisers is the URL structure of a website. Parameter-driven URLs are often easier to implement, but Google prefers human-readable URLs (and preferably one URL per piece of unique content), which leads to all kinds of extra hoops for developers to jump through when building complicated websites.
* Nofollowed links. Since Google first introduced support for the rel=nofollow tag back in 2005, the recommendation of when to use it has significantly broadened in scope. Nowadays webmasters are encouraged to nofollow every link they don’t personally vouch for, and can see their sites penalised if they don’t.
* SSL on all sites. The latest guideline – as yet unconfirmed – is that Google wants to see all websites use SSL encryption, and that sites without SSL will be ranked lower in Google’s search results. If this becomes official policy, no doubt SSL certificate providers will be laughing all the way to the bank.
Nearly all of the guidelines listed above have had – or will have – a profound impact on standard practices in web design & development. And it would be fair to say that nearly all of these guidelines result in a better user experience for people on the web. But Google’s motives are not exactly altruistic. It is of course a company devoted to profit maximisation, and these guidelines almost always have a benefit for Google itself:
* Full indexing of all content: by ensuring websites do not ‘hide’ content in Flash or script languages, and that the content is marked up with structured data, Google can crawl and index more of the web and make sense of the content more easily.
* Faster crawling & indexing: fast-loading websites containing structured data, single human-readable URLs for all pieces of content, and no separate mobile sites all ensure that Google’s crawling & indexing systems can perform more efficiently and crawl the web faster while using fewer resources.
* Clean link graph: by encouraging webmasters to use the nofollow tag where there is doubt about the editorial value of a link, Google can outsource much of the filtering of the link graph to webmasters. The result is less spam in the link graph for Google to sort out.
The main issue with all of the above is how Google’s guidelines are becoming the standard way of doing things on the web. By virtue of its immensely dominant power in online search, few companies can afford to ignore Google and do things their own way. And even when a company decides to focus its attention on building a powerful offline brand, thus reducing their reliance on Google, the search engine still finds ways to capitalise on the brand’s popularity, as evidenced by these examples of brand name searches on Google: The inevitable end result is that Google’s ever-changing guidelines, no matter what their basis – be it improved usability or Google’s own profit-maximisation – will become web standards, and websites that fail to adhere will be ‘punished’ with lower rankings in Google’s search results. That in turn leads to lower traffic figures, sales, profits, etc. It’s a downward spiral with only one winner: Google itself. Obviously this is not how the web was envisioned. The world wide web as invented by Tim Berners-Lee was intended as a platform of liberation, of free flowing information and collaboration imbued with an ‘anything goes’ mentality that has enabled tremendous innovation. Increasingly, the web is now becoming enslaved to the whims of a few big corporations, with Google leading the pack. Governments are not alone in threatening the very foundation of the web (though they certainly do, albeit in a very different way). The world wide web is being forced to slavishly adhere to the mood-swings of a handful of mega corporations that serve as the portals to the vast wealth of content published on the web. The question is, are we content to let profit-seeking corporations decide for us how the web should be, or can we reclaim the web’s free spirit and anarchic roots to allow us to make our own destinies online?
- How to Find & Fix Crawl Optimisation Issues – #BrightonSEO
BrightonSEO is the largest and most popular SEO conference in the UK, and the April 2016 edition sold out in record time. I was fortunate enough to speak at this event, sharing a stage with the wonderful Dawn Anderson and Oliver Mason for the ‘crawl’ session. Dawn delivered a great talk about crawl rank, and Oliver showed us how to handle large server log files. My talk was about finding and fixing crawl optimisation issues on large websites, with examples from real client websites. Here are the slides from my talk: Judging from the feedback on Twitter, the session was a success and the attendees got plenty of actionable tips to help improve their SEO efforts. My talk got mentioned in several BrightonSEO roundup posts as well: Tecmark Impression Unwritten DeepCrawl Distilled
- What Every Web Developer Should Know About SEO
The problem with SEO is that it is often controlled by marketers. Marketers aren’t inherently bad people, but when you get a bad one (of which there are many) any information you receive around SEO is going to be filled with buzzwords and soft outcomes. From a development point of view, SEO is the concern of how well a robot can read and understand your content. As we will see, a robot being able to read your content easily is normally a good thing for humans too. The following sections are going to explain several topics that are clearly within the developer’s remit; a good understanding of their impact on both humans and robots will help in any project you work on.
Site Speed
How fast your site loads and is perceived to have loaded is a highly technical challenge. Assets need to be as small as possible for transmission while maintaining high quality. You should care about how many network requests are being made per page load. You need to care about perceived page load, so getting content onto the screen as quickly as possible. The order in which things come down the network is important. A global internet means not everyone is accessing your site on a broadband connection. Mobile internet means you can’t guarantee the transmission of data will even complete if it takes several cycles.
Why Site Speed is good for SEO
Site speed has been listed as one of Google’s ranking factors. Naturally, the faster the site, the higher the potential score you will get for this one part of their algorithm. According to Moz’s breakdown of website speed and ranking, the key factor is the time it takes for the first byte of information to come across the pipes. If a search engine’s crawlers can download the contents of your page quickly, they are going to do it more often than if it takes seconds per request. When people are researching an article they are writing, they are more likely to stick around and read a page that responded quickly. This means your content is being absorbed by more people and has a greater chance to be linked to by someone.
Why we should care about Site Speed anyway
Even if you don’t care about SEO, you can’t argue that slower is better; there are several studies showing that faster page loads are better for everyone. Take this KissMetrics writeup for example. Slow speeds can be an indicator that there is a query that is taking too long or a memory leak happening somewhere; if so, your site may not be using the resources on your server efficiently and you may be spending money on a package you don’t actually need.
Redirects
Redirects are the hoops that your server jumps through when a browser asks for a page at a particular URL but the server knows it lives at a different location. There are several things that need to be considered:
* Over the lifetime of your website, potentially thousands of other sites will link to pages that you had long since forgotten about.
* You can do redirects at various levels, and each one comes with maintainability issues.
* If done wrong, they can have a negative effect on your site.
* They can be broken for months before someone notices.
* Each redirect adds latency.
Why Redirects are good for SEO
Search engines like there to be one canonical place for everything, so if you have two paths that lead to the same content this is confusing for them. If instead you automatically redirect anyone who types https://www.mysite.com/my-page to https://mysite.com/my-page, then the search engine doesn’t have to worry about several places.
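As a rough illustration of that canonical redirect (a minimal sketch only, assuming an Apache server with mod_rewrite enabled and using the article’s mysite.com placeholder domain), the www-to-non-www rule could live in an .htaccess file:
# Permanently (301) redirect www.mysite.com to mysite.com,
# so there is only one canonical URL for each page.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.mysite\.com$ [NC]
RewriteRule ^(.*)$ https://mysite.com/$1 [R=301,L]
A 301 tells search engines the move is permanent, which is what allows any authority the old URL had earned to be consolidated on the new one.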
This comes into play heavily when content moves completely, perhaps between domains. Doing redirection well ensures that any past page authority is transferred to its new home.
Why we should care about Redirects anyway
Nobody likes dead links, and these can easily appear when something major about the structure of your site changes (domain name, internal structure). If a user goes to your site and gets a 404 they are not going to try subtle variations of the URL in order to get to the content, they will go on to the next site. Even if the link isn’t dead, people don’t like jumping between 5 different URLs before getting to the content. If done poorly this can result in multiple network requests, which is inefficient.
Status Codes
Status Codes are the codes returned from your server after a request has been made; as a developer you need to make sure you are returning the correct code at any given moment. If you return a status code of 500 but meaningful content is still returned, will a search engine index it? Will other services? Search engines care a lot about the 3xx redirection status codes. If you have used a CMS to build your site it sometimes isn’t apparent what codes are being used where.
Why Status Codes are good for SEO
The status code returned is one of the primary signals a search engine uses to know what to do next. If it gets a 3xx redirect notice it knows it needs to follow that path; if it gets a 200 it knows the page has been returned fine, and so on. Making sure all your content is returned with the 200 code and all your redirects appropriately use the 301 code means search engines will be able to efficiently spider and rank your content.
Why we should care about Status Codes anyway
We should care about status codes anyway because search engines are not the only thing that might care about the content on your site; browsers, plugins, and other sites (if you have built an API) all could potentially care about what code is returned. They will behave in ways you might not expect if you return invalid or incorrect codes.
Semantic Markup
Semantic Markup is markup that has inherent meaning associated with it; a simple example is knowing that the <h1> element is going to be the overarching heading for the section you are in. There are some very subtle things that should be considered when choosing markup:
* When content should use elements like <article>, <section>, <nav>, <aside>, etc.
* When it makes sense to additionally use semantic attributes, for example those suggested by schema.org.
* Be prepared to make CSS changes to accommodate the default styles; remember there is a difference between design and function.
* Don’t just use an element like <div> because you can, in place of a more meaningful one. You have to realise that all elements come with an inherent semantic value (even if that is to state “I have no semantic value”).
Why Semantic Markup is good for SEO
Semantic Markup is excellent for SEO because you are literally giving the content on your page meaning that a search engine can easily understand. When you use the schema.org suggestions for a review, search engines will know that when you say 3/5 at the end you mean you have scored it 3 out of 5, and will potentially show that number of stars on their search result page. Semantic markup lets you group and link content. The old way of thinking was that a page could have one <h1> element, and that was normally reserved for the name of the site. Now, because of the likes of <article> and <section>, we can have groupings that make sense.
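To make that grouping concrete, here is a small hypothetical sketch (illustrative markup only, not taken from the original article) of how sectioning elements can give each part of a page its own heading:
<article>
  <h1>Purple Beans: a review</h1>
  <section>
    <h1>Taste</h1>
    <p>…</p>
  </section>
  <aside>
    <!-- Related links; assistive technologies may treat this as outside the main content -->
  </aside>
</article>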
This means search engines can have a much easier time of parsing longer articles.
Why we should care about Semantic Markup anyway
We should care about this anyway because search engines are not the only things looking at our site. Assistive technologies such as screen readers can work with semantically marked up documents a lot more easily. For example, when you mark up content with an <aside> element, some assistive technologies know to leave it out of the main content when reading aloud to a visually impaired user. Maybe your user can’t concentrate on large articles with lots of information. By semantically breaking down this information they can clip what they need and view it how they like to view things. Search engines aren’t the only robots out there looking at your site. Other services could hit your site and look for the likes of a CV; if you have used the correct markup and semantics, that would be an easy task.
URL Structures
URL Structures are what you see when you look in the address bar, so they could be something like mysite.com/my-awesome-page/ or they could be mysite.com/?p=233432. Getting these structures right requires some thought and some technical knowledge:
* Do I want to have a deep structure like site.com/category/theme/page.html?
* Are the structures consistent across my site?
* Are the structures meaningful to anything but the site’s code?
* Is there a logic to them that a new developer could follow and add to?
Why URL Structures are good for SEO
A good URL structure is good for SEO because it is used as part of the ranking algorithm on most search engines. If you want a page to rank for “purple beans” and your URL is mysite.com/purple-beans/ then search engines will see that as a good sign that the page is going to be dedicated to the discussion of purple beans. The URL will appear in search results; if it makes sense, people are more likely to click on it than if it is a jumble of IDs and keywords. A good URL will serve as its own anchor text. When people share the link they will often just dump it out onto the page; if the structure makes sense, it will allow your page to rank for those terms even without someone setting it up correctly.
Why we should care about URL Structures anyway
Outside of the context of search engines, we encounter URLs all the time, and as users of the web we appreciate it when things are kept simple. Your users will appreciate it when they can look at a URL coming off your site that just makes sense; if they can look at a URL and remember why they have it in a list without needing to click into it, that is a big win. Over the lifetime of a website you will be surprised how much of your own admin you will need to do that involves looking at the structure of the URLs. If you have taken the time to do them right it will make your life much easier.
A note about JavaScript and SEO
I wanted to end by mentioning JavaScript very briefly. A lot of websites you will create are going to be JavaScript driven or at least rely on it very heavily. There are various schools of thought on whether this is a good thing or not, but the fact is JavaScript has happened! It used to be that search engines couldn’t even follow a link that was using a JavaScript onClick function; they have come a long way since then and can do an excellent job of ranking sites that are completely made in JavaScript.
That being said, search engines are not perfect at this task yet, so the current advice still has to be that if you want something to be seen by search engines, you should try to make sure there are as few things as possible blocking them from seeing it.
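Coming back to the URL Structures section above: human-readable URLs like mysite.com/my-awesome-page/ are usually not real folders on disk but are rewritten internally to whatever script actually renders the page. A minimal sketch for an Apache server (the index.php front controller and the page parameter here are hypothetical, purely for illustration):
# Internally map /my-awesome-page/ to a front controller, so visitors and
# search engines only ever see the readable URL.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z0-9-]+)/?$ index.php?page=$1 [L,QSA]
The two RewriteCond lines make sure real files and directories are still served as-is, which keeps assets like images and stylesheets working.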
- Protect your Staging Environments
A lot of web design agencies use online staging environments, where a development version of a website resides for clients to view and comment on. Some agencies use their own domain to host staging environments, usually on a subdomain like staging.agency.com. There is a risk involved with online staging environments: Google could crawl & index these subdomains and show them in its search results. This is a Bad Thing, as often these staging websites contain unfinished designs and incomplete content. Public access to these staging websites could even damage a business if it leads to premature exposure of a new campaign or business decision, and could get you into legal trouble. Today, whilst keeping tabs on some competitors of mine, I came across this exact scenario: The name has been redacted to protect the guilty – I’ve sent them an email to notify them of this problem, because I want to make sure their clients are protected. A business shouldn’t suffer because of an error made by their web agency.
How To Protect Your Staging Sites
Protecting these staging environments is pretty simple, so there really isn’t an excuse to get it wrong.
Robots.txt blocking
For starters, all your staging environments should block search engine access in their robots.txt file:
User-agent: *
Disallow: /
This ensures that the staging website will not be crawled by search engines. However, it doesn’t mean the site won’t appear in Google’s search results; if someone links to the staging site, and that link is crawled, the site could still appear in search results. So you need to add extra layers of protection. You could use the ‘noindex’ directive in your robots.txt as well:
User-agent: *
Disallow: /
Noindex: /
This directive basically means Google will not only be unable to access the site, it’ll also not be allowed to include any page in its index – even if someone else links to it. Unfortunately the ‘noindex’ directive isn’t 100% foolproof; tests have shown that Google doesn’t always comply with it. Still, it won’t hurt to include it in your robots.txt file.
Htaccess login
The next step I recommend is to put a password on it. This is easily done on Apache servers with an .htaccess login. Edit the staging site’s .htaccess file (or, if there isn’t one, create it first) and put the following text in the file:
AuthType Basic
AuthName "Protected Area"
AuthUserFile /path/to/.htpasswd
Require valid-user
Then create a .htpasswd file in the path you’ve specified in the .htaccess file. The .htpasswd file contains the username(s) and password(s) that allow you to access the secured staging site, in the [username]:[password] format. For example:
john:4ccEss123
However you probably want to encrypt the password for extra security, so that it can’t be read in plain text. A tool like the htpasswd generator will allow you to create encrypted passwords to include in your .htpasswd file:
john:$apr1$jRiw/29M$a4r3bNJbrMpPhtVQWeVu30
When someone wants to access the staging site, a username and password popup will appear: This will make your staging environment much more secure and will prevent unauthorised access.
IP address restriction
Lastly, as an additional layer of protection, you can restrict access to your staging sites to specific IP addresses. By limiting access to the staging sites to users coming from specific networks, such as your own internal office network and the client’s network, you can really nail the security down and make it impervious to access for all but the most determined crackers.
First of all you’ll want to know your office’s public IP address, as well as that of your client. This is pretty simple – you can just Google ‘what is my ip address’ and it’ll be shown straight in search results: Have your client do the same and get their office’s IP address from them. Check that you’re both using fixed IP addresses, though – if you’re on a dynamic IP address, yours could change and you’d lose access. Check with your internet service provider to make sure. Once you’ve got the IP addresses that are allowed access, you need to edit the staging website’s .htaccess file again. Simply add the following text to the .htaccess file:
order deny,allow
deny from all
allow from 123.456.789.012
This directive means that your webserver will deny access to everyone by default, and only allow access to the site for the specified IP addresses (and you can have as many allow lines as you want there, one per line). With those three security measures in place, your staging environments won’t be so easily found any more – and certainly not with a simple ‘site:’ command.
How To Remove Staging Sites from Google’s Index
Once you’ve secured your staging environments, you’ll also want to remove any staging websites from Google’s search results in case they already show up. There are several ways of doing this:
Use Google’s URL removal tool
In Google Search Console (formerly known as Webmaster Tools) you can manually enter specific URLs that you want removed from Google’s search index: Simply create a new removal request, enter the URL you want deleted from Google’s index, and submit it. Usually these requests are processed after a few days, though I’ve seen them handled within a few hours of submitting them. The downside of the URL removal tool is that you need to do it manually for every URL you want deleted. If entire staging sites are in Google’s index, this can be a very cumbersome process.
Noindex meta robots tag
Another way to get pages out of Google’s index is to include a so-called meta robots tag with the ‘noindex’ value in the HTML code of every page on your staging site. This meta tag is specifically intended for crawlers and can provide instructions on how search engines should handle the page. With the following meta robots tag you instruct all search engines to remove the page from their indices and not show it in search results, even if other sites link to it:
<meta name="robots" content="noindex">
When Google next crawls the staging site, it’ll see the ‘noindex’ tag and remove the page from its index. Note that this will only work if you have not blocked access in your robots.txt file – Google can’t see and act on the noindex tag if it can’t re-crawl the site.
X-Robots-Tag HTTP Header
Instead of adding the meta robots tag to your website – and running the risk of forgetting to remove it when you push the site live – you can also use the X-Robots-Tag HTTP header to send a signal to Google that you don’t want the site indexed. The X-Robots-Tag header is a specific HTTP header that your website can send to bots like Googlebot, providing instructions on how the bot is allowed to interact with the site. Again you can use the Apache .htaccess file to configure the X-Robots-Tag. With the following rule you can prevent Google from crawling and indexing your staging site:
Header set X-Robots-Tag "noindex, nofollow"
With this rule, your Apache webserver will serve the ‘noindex, nofollow’ HTTP header to all bots that visit the site.
By having this .htaccess rule active on your staging site, but not on your live site, you can prevent your staging websites from being crawled and indexed. Note that, like the meta noindex tag, the X-Robots-Tag header only works if bots are not blocked from accessing the site in the first place through robots.txt.
410 Gone status code
Finally, another approach is to serve a 410 HTTP status code. This code tells search engines like Google that the document is not there anymore, and that there is no alternative version, so it should be removed from Google’s index. The way to do this is to create a directive in your .htaccess file that detects the Googlebot user-agent and serves a 410 status code:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]
RewriteRule .* - [R=410,L]
This detects the Googlebot user-agent, and will serve it a 410 HTTP status code. Note that this also will only work if there’s no robots.txt blocking in place, as Google won’t quickly remove pages from its index if it doesn’t find the new 410 status code when trying to crawl them. So you might want to move the staging site to a different subdomain and secure it, then serve a 410 on the old subdomain. The 410 solution is a bit overkill, as Google will remove a page from its index after a few weeks if it keeps getting a 401 Unauthorized response and/or if it’s blocked in robots.txt, but it’s probably worth doing just to get the site out of Google’s index as soon as possible.
Security Through Obscurity = Fail
In summary, don’t rely on people not knowing about your staging servers to keep them safe. Be pro-active in securing your clients’ information, and block access to your staging sites for everyone except those who need it. These days, you simply can’t be too careful. Determined crackers will always find a way in, but with these security measures in place you’ll definitely discourage the amateurs and script kiddies, and will prevent possible PR gaffes that might emerge from a simple ‘site:’ search in Google.
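Putting the three measures together, a staging site’s .htaccess can carry all of them at once. The sketch below simply combines the snippets from this post (Apache 2.2-style access directives; the path and the IP address are the same placeholders used above):
# Ask every visitor for a username and password...
AuthType Basic
AuthName "Protected Area"
AuthUserFile /path/to/.htpasswd
Require valid-user
# ...only accept requests from known office IP addresses...
order deny,allow
deny from all
allow from 123.456.789.012
# ...and tell any bot that does get through not to index or follow anything.
Header set X-Robots-Tag "noindex, nofollow"
This layered approach means that even if one measure is misconfigured, the others still keep the staging site out of public view and out of Google’s index.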
- Open Letter to the SEO Industry
Dear SEO Industry, How are you all doing? I’ve now been a member of the industry for, well, most of my professional life, and I feel it’s time I share some things with you. You see, I’ve been thinking a lot about our industry in the past few years, and I’ve come to an important realisation: I love being an SEO. And the reason I love being an SEO is because of you, the industry. And I feel I need to explain why. Because this industry of ours, it’s not like any other industry. It’s something special. And it’s important for me to try and describe why it’s special. Maybe I’m trying to describe the smell of clouds, or how the colour purple makes me feel, but I’ll do my best nonetheless. SEO is an amazing industry filled with amazing people, and I want to pay tribute to that somehow. So consider this my love letter to the SEO industry. I love the SEO industry because… It’s never boring. This is such a fast-moving discipline that there is no such thing as routine. The way we do things now will be different in six months, and considered obsolete in a year or two at most. What we do is make things easily findable. Our primary purpose is to ensure that search mechanisms such as web search engines show our clients’ content first. And because search mechanisms change so rapidly, we have to change too. But we’re not a reactive industry. Yes, most of what we do is done in response to what search engines want, but often we can outsmart search engines and come up with novel and interesting ways to make content rank. Search engines don’t always like it when we do this. But I love it. We’re valuable. It always baffles me that there are so many people in other industries, from developers to classic marketers, that proclaim SEO to be useless. Yet these same people will use Google several times a day to find what they are looking for, and never realise the irony of their proclamations. SEO is important because, regardless of what some say, content doesn’t rank on its own merits. Content needs our help. Without SEO, search engines would struggle to find and index most of the web. Without SEO, most websites would struggle to find an audience. Without SEO, the web would be a smaller place, with fewer websites that would dominate it all. Every time I see a business grow and prosper because they improved how people found them and interacted with their content, I feel a sense of pride. We matter because what other people do matters, and we help them do it better and on a larger scale. In the grand scheme of things, what we do amounts to a very modest contribution to the world, but it’s a positive one nonetheless. Be proud of it. We’re a family. This, probably more than anything else, is what I love about being an SEO. We care about one another, like one huge, world-wide, slightly dysfunctional but ultimately very supportive family. It’s that sense of shared values and community that makes the SEO industry such a special family. We want other SEOs to succeed, even when we compete with them. We care about what happens to members of our SEO family, and will come together to support them when needed. Not everyone who says they’re an SEO are actually part of this family. We all recognise these wannabe-SEOs when we see them – in fact, we can smell them a mile away. The slick salesman who tries to steal your clients. The fly-by-night outfit that was a social media marketing firm only a week ago but gets a whiff of SEO and suddenly ‘pivots’. The self-proclaimed ‘leading SEO agency’ that no one has ever heard of. 
The ‘experts’ that never attend a conference, never contribute any insights, never say who they work for. Yes, we have plenty of people who say they are SEOs – but we know they’re not. They are not a part of this family. They’re not a part of it because they haven’t earned it. We have to earn it. Being an SEO is not something you just do on a whim. Yes, there are plenty of folks out there who take on the title and think they’re one of us. But we know better. Becoming an SEO is not something you do overnight. It’s something that has to be earned. And the way you have to earn it is another reason I love this industry. You earn your stripes as an SEO by being very good at it, by sharing your knowledge and expertise generously, and by supporting and mentoring other SEOs. It’s that generosity of time and knowledge that makes SEO so special. Most other professions are intensely competitive, where people jealously guard their secrets and see every other practitioner as an enemy. Yet in the SEO industry, we share our expertise, often without asking for anything in return. We help other practitioners solve problems, we share our experiences in blogs and at conferences, and we provide guidance and assistance as a matter of course. And often we do this for other SEOs that compete with us for the same client contracts or work in the same niche. Because they’re part of our family. And we support our family, no matter what. I didn’t choose SEO, it chose me. I never planned to become an SEO, it just sort of happened. But now that I’m a part of it, I can’t imagine working in any other industry. I’m proud to be an SEO, proud to be part of this family. And, from the bottom of my heart, thank you all for being so awesome. I love you, SEO industry. All the best, Barry
- Prevent Google From Indexing Your WordPress Admin Folder With X-Robots-Tag
I recently wrote an article for State of Digital where I lamented the default security features in WordPress. Since it is such a popular content management system, WordPress is targeted by hackers more than any other website platform. WordPress websites are subjected to hacking attempts every single day. According to Wordfence’s March 2017 attack report, there were over 32 million attempted brute force attacks against WordPress sites in that month alone. Out of the box, WordPress has some severe security flaws that leave it vulnerable to brute force attacks. One of these flaws is how WordPress prevents search engines like Google from crawling back-end administration files: through a simple robots.txt disallow rule.
User-agent: *
Disallow: /wp-admin/
While at first glance this may seem perfectly sensible, it is in fact a terrible solution. There are two major issues with the robots.txt disallow rule:
* Because a website’s robots.txt file is publicly viewable, a disallow rule points hackers to your login folder.
* A disallow rule doesn’t actually prevent search engines from showing blocked pages in their search results.
I don’t recommend using robots.txt blocking as a method to protect secure login folders. Instead there are other, more elegant ways of ensuring your admin folders are secure and cannot be crawled and indexed by search engines.
X-Robots-Tag HTTP Header
In the context of SEO, the most common HTTP headers people have heard of are the HTTP status code and the User-Agent header. But there are other HTTP headers which can be utilised by clever SEOs and web developers to optimise how search engines interact with a website, such as Cache-Control headers and the X-Robots-Tag header. The X-Robots-Tag is an HTTP header that informs search engine crawlers (‘robots’) how they should treat the page being requested. It’s this tag that can be used as a very effective way to prevent login folders and other sensitive information from being shown in Google’s search results. Search engines like Google support the X-Robots-Tag HTTP header and will comply with the directives given by this header. The directives the X-Robots-Tag header can provide are almost identical to the directives enabled by the meta robots tag. But, unlike the meta robots tag, the X-Robots-Tag header doesn’t require the inclusion of an HTML meta tag on every affected page on your site. Additionally, you can configure the X-Robots-Tag HTTP header to work for files where you can’t include a meta tag, such as PDF files and Word documents. With a few simple lines of text in your website’s Apache htaccess configuration file, we can prevent search engines from including sensitive pages and folders in their search results. For example, with the following lines of text in the website’s htaccess file, we can prevent all PDF and Word document files from being indexed by Google:
<FilesMatch "\.(pdf|doc|docx)$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
It’s always a good idea to configure your website this way, to prevent potentially sensitive documents from appearing in Google’s search results. The question is, can we use the X-Robots-Tag header to protect a WordPress website’s admin folder?
X-Robots-Tag and /wp-admin
The X-Robots-Tag doesn’t allow us to protect entire folders in one go. Unfortunately, due to Apache htaccess restrictions, the header only triggers on rules applying to file types and not for entire folders on your site.
Yet, because all of WordPress’s back-end functionality exists within the /wp-admin folder (or whichever folder you may have changed that to), we can create a separate htaccess file for that folder to ensure the X-Robots-Tag HTTP header is served on all webpages in that folder. All we need to do is create a new htaccess file containing the following rule:
Header set X-Robots-Tag "noindex, nofollow"
We then use our preferred FTP programme to upload this .htaccess file to the /wp-admin folder, and voila. Every page in the /wp-admin section will now serve the X-Robots-Tag HTTP header with the ‘noindex, nofollow’ directives. This will ensure the WordPress admin pages will never be indexed by search engines. You can also upload such an htaccess file configured to serve X-Robots-Tag headers to any folder on your website that you want to protect this way. For example, you might have a folder where you store sensitive documents you want to share with specific third parties, but don’t want search engines to see. Or if you run a different CMS, you can use this to protect that system’s back-end folders from getting indexed. To check whether a page on your site serves the X-Robots-Tag HTTP header, you can use a browser plugin like HTTP Header Spy [Firefox] or Ayima Redirect Path [Chrome], which will show you a webpage’s full HTTP response. I would strongly recommend you check several different types of pages on your site after you’ve implemented the X-Robots-Tag HTTP header, because a small error can result in every page on your website serving that header. And that would be a Bad Thing. To check if Google has indexed webpages on your site in the /wp-admin folder, you can do a search with advanced operators like this:
site:website.com inurl:wp-admin
This will then give a search result listing all pages on website.com that have ‘wp-admin’ anywhere in the URL. If all is well, you should get zero results: The X-Robots-Tag HTTP header is a simple and more robust approach to securing your WordPress login folders, and can also help optimise how search engines crawl and index your webpages. While it adds to your security, it’s by no means the only thing you need to do to secure your site. Always make sure you have plenty of security measures in place – such as basic authentication in addition to your CMS login – and install a plugin like Wordfence or Sucuri to add extra layers of protection. If you liked this post, please share it on social media. You might also like to read this post about protecting your staging environments.
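As a quick sketch of the ‘basic authentication in addition to your CMS login’ advice above (illustrative only – the .htpasswd path is a placeholder, and you would create the password file yourself), the WordPress login script can be wrapped in a Files block in the site’s root .htaccess:
# Require an extra username/password before WordPress even shows its login form.
<Files wp-login.php>
AuthType Basic
AuthName "Protected Area"
AuthUserFile /path/to/.htpasswd
Require valid-user
</Files>
Combined with the X-Robots-Tag header on /wp-admin, this keeps the back end both out of Google’s index and behind a second login.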