
Understanding Duplicate Content

Step 1 - Understand what you're dealing with

So what is this dreaded duplicate content?
Simply put, duplicate content is when search engines consider two pages of a website similar enough that they do not warrant being indexed separately.
NB - "Similar enough" means the pages do not have to be identical to be considered duplicates.
Many problems can arise from a page not being listed in Google's search index.
It cannot be found by searching, it cannot pass link reputation to other pages on your site, and it cannot pass PageRank to other pages.
Put simply, you are holding yourself back more than you need to.
What's worse, if you have one case of duplicate content, you will most likely have more constantly popping up.
The most common cases of duplicate content are on eCommerce stores, where products are listed too similarly.
As more products are added, more duplicate content arises and your link authority and PageRank slowly die, until you are left with a store that is non-existent in search engines.

Step 2 - Identify the Problem

So how do we identify this problem? OK, open up your website's homepage.
Look at the entire website.
See how it is made up of certain areas? Most websites are made up of a header at the top, navigation down the left, footer at the bottom and possibly a sidebar to the right.
Easy.
We are going to call this your template.
Obviously there are many variations on this; some websites have the navigation at the top (like we do), at the right, etc.
But in general, and particularly for stores, this is the standard layout.
This leaves the content area, right in the middle of the screen.
The area of the webpage that fills with content with each new page you visit.
Simple.
Now.
When you look at any page of your website, you focus immediately on the content area, right? You know the template doesn't change, so you focus on the obvious differences between each page.
A search engine spider doesn't do that.
On each page a spider crawls, it reads everything: all your navigation, all your header and footer, every time, before it even looks at your content area.
An aside - if your navigation has 200+ words within the template area, you will have major indexing problems.
In general, spiders only check the first 200 (ish) words of a page.
This means that every page will be read as a duplicate, as your hard-won content will never be seen.
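As a rough illustration of the aside above, the sketch below estimates how many content words would fall inside a spider's reading window. The 200-word window is this article's rough figure rather than a documented limit, and `content_words_seen` is a hypothetical helper that assumes all of the counted template text comes before the content area in the page source.

```python
def content_words_seen(words_before_content: int, content_words: int,
                       window: int = 200) -> int:
    """Estimate how many content words land inside the first `window` words.

    Assumes the template words counted in `words_before_content` all appear
    ahead of the content area in the page source.
    """
    room_left = max(0, window - words_before_content)
    return min(content_words, room_left)


# A 150-word navigation leaves room for only 50 of 300 content words.
print(content_words_seen(150, 300))  # 50
# A 220-word navigation pushes the content out of the window entirely.
print(content_words_seen(220, 300))  # 0
```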
Now remember, a search engine spider also cannot see pictures or Flash files.
So suppose you have two pages with the same template (containing 100 words in total), a unique picture each, and a few lines of text (20 words in total).
From a spider's perspective, each page contains 120 words, 100 of which are duplicated from various other pages.
Not good! We estimate that if a spider notices that two of your pages differ by less than 10%, it could well consider them duplicates.
Now you must be getting it.
You would need substantially more content in the content area to make the page look unique.
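To put a number on the example above, here is a minimal sketch that compares two pages word by word using Python's standard `difflib` module. It is only one possible way to measure difference; search engines do not publish how they compare pages, and the `word_difference` helper and the made-up page text are assumptions for illustration.

```python
from difflib import SequenceMatcher


def word_difference(page_a: str, page_b: str) -> float:
    """Return the fraction of words that differ between two pages (0.0 to 1.0)."""
    words_a = page_a.lower().split()
    words_b = page_b.lower().split()
    # ratio() is 2 * matching words / total words across both pages.
    similarity = SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()
    return 1.0 - similarity


# Made-up pages: a shared 100-word template plus 20 unique words each.
template = " ".join(f"nav{i}" for i in range(100))
page_a = template + " " + " ".join(f"red{i}" for i in range(20))
page_b = template + " " + " ".join(f"blue{i}" for i in range(20))

print(round(word_difference(page_a, page_b), 3))  # 0.167
```

By this measure the two example pages differ by roughly 17%, so even a modest amount of extra boilerplate would push them towards the 10% estimate mentioned above.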

Step 3 - How to solve it

We mentioned it briefly above.
You need to calculate how many words are in your template.
Remember, these are selectable words that the spiders can process.
Not words in pictures or words in Flash.
The easiest way to do this is to go to any page on your site, select all (Ctrl+A), copy (Ctrl+C), and then paste straight into Word (Ctrl+V).
Not exactly pretty.
Simply delete all the text from the content area and leave all the text in your template.
Once you've done that, check your word count.
Tada! That number is the number of words repeated on every page.
Make sure you write it down or get it tattooed somewhere memorable.
That same number is how many unique words you want in your content area.
That way each page is 50% unique text and 50% duplicated template text.
Obviously more is better, but if you aim for this, spiders should recognize the page as distinct from every other page listed.
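If you would rather not count by hand, the same check can be sketched in a few lines of Python. This assumes you have already pasted the template text and the content text into two separate strings; the helper names and the sample text are hypothetical.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated words, roughly as a word processor would."""
    return len(text.split())


def extra_words_needed(template_text: str, content_text: str) -> int:
    """Words still missing before the content is as long as the template."""
    return max(0, count_words(template_text) - count_words(content_text))


# Made-up sample text standing in for a real template and product page.
template_text = "Home Products About Contact Basket Checkout Terms Privacy"
content_text = "Hand-made oak table"

print(count_words(template_text))                       # 8
print(extra_words_needed(template_text, content_text))  # 5
```

A result of 0 means the content area already holds at least as many words as the template, which is the 50/50 target described above.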
You will also want to check your page titles.
They must be unique! Imagine being blind and having 100 pages of text to read, each with the same title. You'd skip them, wouldn't you? Not good! OK, a silly example, but it makes the point.
If you have any duplicate page titles Google could well skip over them.
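A quick way to spot repeated titles is to group your pages by title. The sketch below assumes you already have a mapping of page URL to page title (for example, exported from your store); the URLs, titles, and the `find_duplicate_titles` helper are made up for illustration. Titles are compared after ignoring case and extra spaces.

```python
from collections import defaultdict


def find_duplicate_titles(titles_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Group URLs by normalized title and keep only titles used more than once."""
    groups = defaultdict(list)
    for url, title in titles_by_url.items():
        # Collapse repeated whitespace and ignore case before comparing.
        normalized = " ".join(title.split()).casefold()
        groups[normalized].append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}


# Hypothetical pages from an example store.
pages = {
    "/chairs/oak": "Oak Chair | Example Store",
    "/chairs/pine": "Pine Chair | Example Store",
    "/chairs/oak-2": "oak chair |  Example Store",
}

print(find_duplicate_titles(pages))
# {'oak chair | example store': ['/chairs/oak', '/chairs/oak-2']}
```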
NB - Remember, page titles are one of the most important foundational factors of a website for SEO.
Keyword position in the page title is incredibly important.
The further to the left a keyword is, the higher priority it has in search engines.
Remember most people aren't going to be searching for your business name, so consider not having this at the highest priority in the title.
Look at the following titles.
All About Search Engine Optimization
Search Engine Optimization Home
They look very different, don't they? Well, yes they do.
But again, in our blind friend's eyes they could look identical.
There is a concept called stop words and the above examples are full of them.
A stop word is a word that search engines consider so unimportant and common that they don't keep it in their index.
They do this to save space and speed up searching.
As of Jan 2008 Google has stopped using stop words.
This means slight variations in page titles should now be noticed.
We still recommend avoiding common English words, as other search engines still filter out stop words.
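To see why the two example titles could look identical, here is a minimal sketch of stop word filtering. The `STOP_WORDS` set is a tiny made-up list chosen to match this article's example; it is not the list used by any real search engine.

```python
# Illustrative stop word list only; real search engines keep their own lists.
STOP_WORDS = {"a", "about", "all", "and", "home", "of", "the"}


def significant_words(title: str) -> list[str]:
    """Return the lowercased words of a title that are not stop words."""
    return [word for word in title.lower().split() if word not in STOP_WORDS]


title_a = "All About Search Engine Optimization"
title_b = "Search Engine Optimization Home"

print(significant_words(title_a))  # ['search', 'engine', 'optimization']
print(significant_words(title_a) == significant_words(title_b))  # True
```

Once the assumed stop words are removed, both titles reduce to the same three words, which is why varying only the common words does little to make a title unique.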
You should also be using the full length of the title tag to your advantage to maximize the uniqueness of each page.

