SEOmoz is a Seattle-based Search Engine
Optimization (SEO) firm and community resource for
those seeking knowledge in the SEO/M field. You can
learn more about SEOmoz here.
We provide a great variety of free information via a daily
blog, automated
tools and advanced
articles.
This article is offered as a resource to help
individuals, organizations, and companies
inexperienced with search engine optimization learn
the basics of how the service and process operate. It
is our goal to improve your ability to drive search
traffic to your site and debunk major myths about SEO.
We share this knowledge to help businesses,
government, educational, and non-profit organizations
benefit from being listed in the major search engines.
SEOmoz provides advanced
SEO services. If you are new to SEO, have read
through this document, and require an SEO firm's
assistance, you may learn
more about us here. Along with the optimization
services we provide, we also recommend a number of very
effective SEO firms who follow the best practices
described in this document.
SEO is the active practice of optimizing a web site
by improving internal and external aspects in order to
increase the traffic the site receives from search
engines. Firms that practice SEO can vary; some have a
highly specialized focus, while others take a more
broad and general approach. Optimizing a web site for
search engines can require looking at so many unique
elements that many practitioners of SEO (SEOs)
consider themselves to be in the broad field of
website optimization (since so many of those elements
intertwine).
This guide is designed to describe all areas of SEO
- from discovery of the terms and phrases that will
generate traffic, to making a site search engine
friendly, to building the links and marketing the
unique value of the site/organization's offerings.
Why does my company/organization/website need SEO?
The majority of web traffic is driven by the major
commercial search engines - Yahoo!, MSN, Google, and
AskJeeves (although AOL gets nearly 10% of searches, their
engine is powered by Google's results). If your site
cannot be found by search engines or your content
cannot be put into their databases, you miss out on
the incredible opportunities available to websites
provided via search - people who want what you have
visiting your site. Whether your site provides
content, services, products, or information, search
engines are a primary method of navigation for almost
all Internet users.
Search queries, the words that users type into the
search box which contain terms and phrases best suited
to your site, carry extraordinary value. Experience
has shown that search engine traffic can make (or
break) an organization's success. Targeted visitors to
a website can provide publicity, revenue, and exposure
like no other. Investing in SEO, whether through time
or finances, can have an exceptional rate of return.
Why can't the search engines figure out my site without SEO help?
Search engines are always working towards improving
their technology to crawl the web more deeply and
return increasingly relevant results to users.
However, there is and will always be a limit to how
search engines can operate. Whereas the right moves
can net you thousands of visitors and attention, the
wrong moves can hide or bury your site deep in the
search results where visibility is minimal. In
addition to making content available to search
engines, SEO can also help boost rankings so that
content that has been found will be placed where
searchers will more readily see it. The online
environment is becoming increasingly competitive, and
those companies who perform SEO will have a decided
advantage in visitors and customers.
How much of this article do I need to read?
If you are serious about improving search traffic
and are unfamiliar with SEO, I recommend reading this
guide front-to-back. There's a printable
MS Word version for those who'd prefer, and dozens
of linked-to resources on other sites and pages that
are worthy of your attention. Although this guide is
long, I've attempted to remain faithful to Mr.
Strunk's famous quote:
"A sentence should contain no
unnecessary words, a paragraph no unnecessary
sentences, for the same reason that a drawing should
have no unnecessary lines and a machine no
unnecessary parts."
Every section and topic in this report is critical
to understanding the best known and most effective
practices of search engine optimization.
Search engines have a short list of critical
operations that allows them to provide relevant web
results when searchers use their system to find
information.
- Crawling the Web
Search engines run automated programs, called
"bots" or "spiders", that use
the hyperlink structure of the web to
"crawl" the pages and documents that
make up the World Wide Web. Estimates are that of
the approximately 20 billion existing pages,
search engines have crawled between 8 and 10
billion.
- Indexing Documents
Once a page has been crawled, its contents can be
"indexed" - stored in a giant database
of documents that makes up a search engine's
"index". This index needs to be tightly
managed so that requests which must search and
sort billions of documents can be completed in
fractions of a second.
- Processing Queries
When a request for information comes into the
search engine (hundreds of millions do each day),
the engine retrieves from its index all the
documents that match the query. A match is
determined if the terms or phrase are found on the
page in the manner specified by the user. For
example, a search for car
and driver magazine at Google returns 8.25
million results, but a search for the same phrase
in quotes ("car
and driver magazine") returns only 166
thousand results. In the first system, commonly
called "Findall" mode, Google returned
all documents which had the terms "car",
"driver", and "magazine" (they
ignore the term "and" because
it's not useful to narrowing the results), while
in the second search, only those pages with the
exact phrase "car and driver magazine"
were returned. Other advanced operators (Google
has a list
of 11) can change which results a search
engine will consider a match for a given query.
- Ranking Results
Once the search engine has determined which
results are a match for the query, the engine's
algorithm (a mathematical equation commonly used
for sorting) runs calculations on each of the
results to determine which is most relevant to the
given query. They sort these on the results pages
in order from most relevant to least so that users
can make a choice about which to select.
Although a search engine's operations are not
particularly lengthy, systems like Google, Yahoo!,
AskJeeves, and MSN are among the most complex,
processing-intensive computers in the world, managing
millions of calculations each second and funneling
demands for information to an enormous group of users.
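The indexing and query-matching steps above can be
sketched in a few lines of Python. This is only a toy
illustration, not how any real engine is built: the
function names, the one-word stop list, and the sample
documents are all assumptions made for the example.

```python
from collections import defaultdict

STOP_WORDS = {"and"}  # assumption: a tiny stop-word list for illustration


def tokenize(text):
    """Lowercase the text and split it into words."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def find_all(index, query):
    """Ids of documents containing every query term (stop words ignored)."""
    terms = [t for t in tokenize(query) if t not in STOP_WORDS]
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


def find_phrase(docs, query):
    """Ids of documents containing the query as an exact phrase."""
    phrase = " " + " ".join(tokenize(query)) + " "
    return {
        doc_id
        for doc_id, text in docs.items()
        if phrase in " " + " ".join(tokenize(text)) + " "
    }


# Hypothetical documents standing in for crawled pages.
docs = {
    1: "Car and Driver magazine reviews new cars",
    2: "A magazine for the driver of a classic car",
    3: "Bicycle magazine",
}
index = build_index(docs)
print(sorted(find_all(index, "car and driver magazine")))
print(sorted(find_phrase(docs, "car and driver magazine")))
```

As in the Google example, the quoted-phrase search
matches fewer documents than the "find all" search,
because word order and adjacency now matter.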
Speed Bumps & Walls
Certain types of navigation may hinder or entirely
prevent search engines from reaching your website's
content. As search engine spiders crawl the web, they
rely on the architecture of hyperlinks to find new
documents and revisit those that may have changed. In
the analogy of speed bumps and walls, complex links
and deep site structures with little unique content
may serve as "bumps." Data that cannot be
accessed by spiderable links qualify as
"walls."
Possible "Speed Bumps" for SE Spiders:
- URLs with 2+ dynamic
parameters, e.g. http://www.url.com/page.php?id=4&CK=34rr&User=%Tom%
(spiders may be reluctant to crawl complex URLs
like this because they often result in errors with
non-human visitors)
- Pages with more than
100 unique links to other pages on the site
(spiders may not follow each one)
- Pages buried more
than 3 clicks/links from the home page of a
website (unless there are many other external
links pointing to the site, spiders will often
ignore deep pages)
- Pages requiring a
"Session ID" or Cookie to enable
navigation (spiders may not be able to retain
these elements as a browser user can)
- Pages that are split
into "frames" can hinder crawling and
cause confusion about which pages to rank in the
results.
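As a rough aid for spotting the first and fourth "speed
bumps" in the list above, the sketch below flags URLs
that carry two or more dynamic parameters or what looks
like a session ID. The function name and the list of
session parameter names are assumptions for
illustration; real sites use many other names.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: a few common session parameter names, not an exhaustive list.
SESSION_KEYS = {"sid", "sessionid", "phpsessid"}


def crawl_warnings(url):
    """Return a list of possible crawl "speed bumps" found in a URL."""
    query = urlsplit(url).query
    params = parse_qsl(query, keep_blank_values=True)
    warnings = []
    if len(params) >= 2:
        warnings.append(f"{len(params)} dynamic parameters")
    if any(key.lower() in SESSION_KEYS for key, _ in params):
        warnings.append("session ID in URL")
    return warnings


print(crawl_warnings("http://www.url.com/page.php?id=4&CK=34rr&User=%Tom%"))
print(crawl_warnings("http://www.url.com/page.php?id=4"))
```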
Possible "Walls" for SE Spiders:
- Pages accessible
only via a select form and submit button
- Pages requiring a
drop-down menu (an HTML select element) to access them
- Documents accessible
only via a search box
- Documents blocked
purposefully (via a robots meta tag or robots.txt
file - see more
on these here)
- Pages requiring a
login
- Pages that re-direct
before showing content (search engines call this
cloaking or bait-and-switch and may actually ban
sites that use this tactic)
The key to ensuring that a site's contents are
fully crawlable is to provide direct, HTML links to
each page you want the search engine spiders to index.
Remember that if a page cannot be accessed from the
home page (where most spiders are likely to start
their crawl), it is likely that it will not be indexed
by the search engines. A sitemap (which is discussed
later in this guide) can be of tremendous help for
this purpose.
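One way to check a site against this advice is to walk
its internal links from the home page and record how
many clicks each page is from it. The sketch below does
this with a breadth-first search over a hand-written
link map; the map, the page paths, and the function
name are hypothetical, and a real audit would build the
map by crawling the site.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: each page maps to the pages it links to.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/suits"],
    "/products/suits": ["/products/suits/armani"],
    "/products/suits/armani": ["/products/suits/armani/39s"],
    "/orphan": ["/"],  # links out, but nothing links to it
}

depths = click_depths(site, "/")
too_deep = sorted(p for p, d in depths.items() if d > 3)
unreachable = sorted(set(site) - set(depths))
print("More than 3 clicks from home:", too_deep)
print("Not reachable from home:", unreachable)
```

Pages in either list are candidates for a direct link
from a shallower page or from the sitemap.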
Measuring Relevance and Popularity
Modern commercial search engines rely on the
science of information retrieval (IR). That science
has existed since the middle of the 20th century, when
retrieval systems powered computers in libraries,
research facilities, and government labs. Early in the
development of search systems, IR scientists realized
that two critical components made up the majority of
search functionality:
Relevance - the degree
to which the content of the documents returned in a
search matched the user's query intention and terms.
The relevance of a document increases if the terms
or phrase queried by the user occurs multiple times
and shows up in the title of the work or in
important headlines or subheaders.
Popularity - the
relative importance, measured via citation (the act
of one work referencing another, as often occurs in
academic and business documents) of a given document
that matches the user's query. The popularity of a
given document increases with every other document
that references it.
These two items were translated to web search 40
years later and manifest themselves in the form of
document analysis and link analysis.
In document analysis, search engines look at
whether the search terms are found in important areas
of the document - the title, the meta data, the
heading tags, and the body of text content. They also
attempt to automatically measure the quality of the
document (through complex systems beyond the scope of
this guide).
In link analysis, search engines measure not only
who is linking to a site or page, but what they are
saying about that page/site. They also have a good
grasp on who is affiliated with whom (through
historical link data, the site's registration records,
and other sources), who is worthy of being trusted
(links from .edu and .gov pages are generally more
valuable for this reason), and contextual data about
the site the page is hosted on (who links to that
site, what they say about the site, etc.).
Link and document analysis combine and overlap
hundreds of factors that can be individually measured
and filtered through the search engine algorithms (the
set of instructions that tells the engines what
importance to assign to each factor). The algorithm
then determines scoring for the documents and
(ideally) lists results in decreasing order of
importance (rankings).
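To make the interplay of document analysis and link
analysis concrete, here is a deliberately simplified
scoring sketch. The field weights, the way relevance and
popularity are multiplied together, and the sample pages
are all assumptions chosen for illustration; real
engines combine hundreds of factors in ways they do not
publish.

```python
# Assumption: matches in the title count more than headings, then body.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def relevance(doc, query):
    """Weighted count of query terms across the document's fields."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = doc.get(field, "").lower().split()
        score += weight * sum(words.count(term) for term in terms)
    return score


def rank(docs, inbound_links, query):
    """Order matching documents by relevance multiplied by popularity."""
    scored = []
    for name, doc in docs.items():
        rel = relevance(doc, query)
        if rel == 0:
            continue  # not a match for the query at all
        popularity = 1 + inbound_links.get(name, 0)
        scored.append((rel * popularity, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical pages and inbound link counts.
docs = {
    "shop-a": {"title": "Red bicycles", "body": "we sell red bicycles and helmets"},
    "shop-b": {"title": "Bike shop", "body": "red bicycles"},
    "cars": {"title": "Blue cars", "body": "cars"},
}
inbound_links = {"shop-a": 4, "shop-b": 10}

print(rank(docs, inbound_links, "red bicycles"))
```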
Information Search Engines Can Trust
As search engines index the web's link structure
and page contents, they find two distinct kinds of
information about a given site or page - attributes of
the page/site itself and descriptives about that
site/page from other pages. Since the web is such a
commercial place, with so many parties interested in
ranking well for particular searches, the engines have
learned that they cannot always rely on websites to be
honest about their importance. Thus, the days when
artificially stuffed meta tags and keyword-rich pages
dominated search results (pre-1998) have vanished and
given way to search engines that measure trust via
links and content.
The theory goes that if hundreds or thousands of
other websites link to you, your site must be popular,
and thus, have value. If those links come from very
popular and important (and thus, trustworthy)
websites, their power is multiplied to even greater
degrees. Links from sites like NYTimes.com, Yale.edu,
Whitehouse.gov, and others carry with them inherent
trust that search engines then use to boost your
ranking position. If, on the other hand, the links
that point to you are from low-quality, interlinked
sites or automated garbage domains (aka link farms),
search engines have systems in place to discount the
value of those links.
The most well-known system for ranking sites based
on link data is the simplistic formula developed by
Google's founders - PageRank. PageRank, which relies
on a mathematical formula (based around finding a
given document in a random pattern of clicking on
links), is described
by Google in their technology section:
PageRank relies on the
uniquely democratic nature of the web by using its
vast link structure as an indicator of an individual
page's value. In essence, Google interprets a link
from page A to page B as a vote, by page A, for page
B. But, Google looks at more than the sheer volume
of votes, or links a page receives; it also analyzes
the page that casts the vote. Votes cast by pages
that are themselves "important" weigh more
heavily and help to make other pages
"important."
Google uses a PageRank “proxy” value, which
logarithmically translates the actual PageRank of a
document to a value between 0 and 10, to rank Web
sites listed in its directory
(which offers a PageRank order or an Alphabetical
order for listings) and in its toolbar (below).

Google's toolbar (available
here) includes an icon that shows a PageRank value
from 0-10
PageRank is, in essence, a rough system for
estimating the value of a given link based on the
links that point to the host page. Since PageRank's
inception in the late '90s, more subtle and
sophisticated link analysis systems have taken the
place of PageRank. Thus, in the modern era of SEO, the
PageRank measurement in Google's toolbar, directory,
or through sites that query the service is of limited
value. Pages with PR8 can be found ranked 20-30
positions below pages with a PR3 or PR4. In addition,
the toolbar numbers are updated only every 3-6 months
by Google, making the values even less useful. Rather
than focusing on PageRank, it's important to think
holistically about a link's worth.
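For readers curious about the "random pattern of
clicking on links" idea, the sketch below shows the
textbook power-iteration form of PageRank on a
three-page web. It is not Google's production system.
The damping factor of 0.85 is the commonly cited
default, and the link graph and function name are
assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a dict of page -> outlinks."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline from "random jumps".
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical web: A and C both link to B, and B links back to A.
ranks = pagerank({"A": ["B"], "C": ["B"], "B": ["A"]})
for page, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(value, 3))
```

B collects two votes and ranks highest; A ranks above C
because its single vote comes from the "important" page
B, which is the effect the Google quote describes.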
Here's a small list of the most important factors
search engines look at when attempting to value a
link:
- The Anchor Text of Link -
Anchor text describes the visible characters and
words that hyperlink to another document or
location on the web. For example, in the phrase
"CNN
is a good source of news, but I actually prefer the
BBC's take on events," two unique pieces
of anchor text exist - "CNN" is the
anchor text pointing to http://www.cnn.com,
while "the BBC's take on events" points
to http://news.bbc.co.uk. Search engines
use this text to help them determine the subject
matter of the linked-to document. In the example
above, the links would tell the search engine that
when users search for "CNN", SEOmoz.org
thinks that http://www.cnn.com is a
relevant site for the term "CNN" and
that http://news.bbc.co.uk is relevant to
"the BBC's take on events". If hundreds
or thousands of sites think that a particular page
is relevant for a given set of terms, that page
can manage to rank well even if the terms NEVER
appear in the text itself (for example, see the
BBC's explanation of why Google ranks certain
pages for the term "Miserable
Failure").
- Global Popularity of the Site -
More popular sites, as denoted by the number and
power of the links pointing to them, provide more
powerful links. Thus, while a link from SEOmoz may
be a valuable vote for a site, a link from
bbc.co.uk or cnn.com carries far more weight. This
is one area where PageRank (assuming it was
accurate) could be a good measure, as it's
designed to calculate global popularity.
- Popularity of Site in Relevant
Communities - In the example above, the
weight or power of a site's vote is based on its
raw popularity across the web. As search engines
became more sophisticated and granular in their
approach to link data, they acknowledged the
existence of "topical communities";
sites on the same subject that often interlink
with one another, referencing documents and
providing unique data on a particular topic. Sites
in these communities provide more value when they
link to a site/page on a relevant subject rather
than a site that is largely irrelevant to their
topic.
- Text Directly Surrounding the Link - Search
engines have been noted to weight the text directly
surrounding a link with greater importance and
relevance than the other text on the page. Thus, a
link from inside an on-topic
paragraph may carry greater weight than a link in
the sidebar or footer.
- Subject Matter of the Linking Page - The topical
relationship between the subject of
a given page and the sites/pages linked to on it
may also factor into the value a search engine
assigns to that link. Thus, it will be more
valuable to have links from pages that are related
to the site/page's subject matter than those that
have little to do with the topic.
These are only a few of the many factors search
engines measure and weigh when evaluating links. For a
more complete list, see SEOmoz's
search engine ranking factors article.
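The anchor text factor in the list above can be pictured
as a tally of what other pages "say" about a URL. The
sketch below counts anchor phrases per linked-to page;
the link records and the function name are hypothetical
and only mirror the CNN/BBC example.

```python
from collections import Counter, defaultdict


def anchor_text_profile(links):
    """Count how often each anchor phrase points at each target URL."""
    profile = defaultdict(Counter)
    for _source, target, anchor in links:
        profile[target][anchor.lower()] += 1
    return profile


# Hypothetical (linking page, linked-to page, anchor text) records.
links = [
    ("seomoz.org", "http://www.cnn.com", "CNN"),
    ("example.org", "http://www.cnn.com", "cnn"),
    ("seomoz.org", "http://news.bbc.co.uk", "the BBC's take on events"),
]

profile = anchor_text_profile(links)
for target, anchors in profile.items():
    print(target, dict(anchors))
```

A page whose profile is dominated by one phrase can be
associated with that phrase even if the words never
appear on the page itself.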
Link metrics are in place so that search engines
can find information to trust. In the academic world,
greater citation meant greater importance, but in a
commercial environment, manipulation and conflicting
interests interfere with the purity of citation-based
measurements. Thus, on the modern WWW, the source,
style, and context of those citations is vital to
ensuring high quality results.
The Anatomy of a HyperLink
A standard hyperlink in HTML code looks like this:
<a href="http://www.seomoz.org">SEOmoz</a>
SEOmoz
In this example, the code simply indicates that
the text "SEOmoz" (called the "anchor
text" of the link) should be hyperlinked to the
page http://www.seomoz.org. A search engine would
interpret this code as a message that the page
carrying this code believed the page http://www.seomoz.org
to be relevant to the text on the page and
particularly relevant to the term "SEOmoz".
A more complex piece of HTML code for a link may
include additional attributes such as:
<a href="http://www.seomoz.org"
title="Rand's Site" rel="nofollow">SEOmoz</a>
SEOmoz
In this example, new elements such as the link
title and rel attribute may influence how a search
engine views the link, despite its appearance on the
page remaining unchanged. The title attribute may
serve as an additional piece of information, telling
the search engine that http://www.seomoz.org, in
addition to being related to the term "SEOmoz",
is also relevant to the phrase "Rand's
Site". The rel attribute, originally designed
to describe the relationship between the linked-to
page and the linking page, has, with the recent
emergence of the "nofollow" descriptive,
become more complex.
"Nofollow" is a value of the rel attribute designed
specifically for search engines. When added to a
link's rel attribute, it tells the engine's
ranking system that the link should not be
considered an editorially approved "vote"
for the linked-to page. Currently, 3 major search
engines (Yahoo!, MSN, & Google) all support
"nofollow". AskJeeves, due to its unique
ranking system, does not support nofollow, and
ignores its presence in link code. For more
information about how this works, visit Danny
Sullivan's description of nofollow's inception
on the SEW blog.
Some links may be assigned to images, rather than
text:
<a href="http://www.seomoz.org/randfish.php"><img
src="rand.jpg" alt="Rand Fishkin of
SEOmoz"></a>

This example shows an image named "rand.jpg"
linking to the page - http://www.seomoz.org/randfish.php.
The alt attribute, designed originally to display in
place of images that were slow to load or on
voice-based browsers for the blind, reads "Rand
Fishkin of SEOmoz" (in many browsers, you can
see the alt text by hovering the mouse over the
images). Search engines can use the information in
an image-based link, including the name of the image
and the alt attribute to interpret what the
linked-to page is about.
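The three link forms shown so far can be read
mechanically. The sketch below uses Python's standard
html.parser module to pull out the pieces discussed
above: the href, the title, whether rel contains
"nofollow", and the anchor text (falling back to an
image's alt text). The LinkExtractor class is an
assumption written for this example, not part of any
search engine.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href, title, nofollow flag, and anchor text for each link."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            rel_values = (attrs.get("rel") or "").lower().split()
            self._current = {
                "href": attrs["href"],
                "title": attrs.get("title"),
                "nofollow": "nofollow" in rel_values,
                "text": "",
            }
        elif tag == "img" and self._current is not None:
            # For image links, the alt text plays the role of anchor text.
            self._current["text"] += attrs.get("alt") or ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = " ".join(self._current["text"].split())
            self.links.append(self._current)
            self._current = None


html = """
<a href="http://www.seomoz.org">SEOmoz</a>
<a href="http://www.seomoz.org" title="Rand's Site" rel="nofollow">SEOmoz</a>
<a href="http://www.seomoz.org/randfish.php"><img src="rand.jpg"
alt="Rand Fishkin of SEOmoz"></a>
"""

parser = LinkExtractor()
parser.feed(html)
for link in parser.links:
    print(link)
```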
Other types of links may also be used on the web,
many of which pass no ranking or spidering value due
to their use of re-direct, Javascript, or other
technologies. A link that does not have the classic
<a href="URL">text</a> format,
be it image or text, should be generally considered
not to pass link value via the search engines
(although in rare instances, engines may attempt to
follow these more complex style links).
<a href="redirect/jump.php?url=%2Fgro.zomoes.www%2F%2F%3Aptth"
title="http://www.seomoz.org/"
target="_blank" class="postlink">SEOmoz</a>
In this example, the redirect used scrambles the
URL by writing it backwards, but unscrambles it
later with a script and sends the visitor to the
site. It can be assumed that this passes no search
engine link value.
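To see what the redirect script has to do, the sketch
below decodes the scrambled href from the example: it
reads the "url" parameter, undoes the percent-encoding,
and reverses the characters. The function name is an
assumption; the actual jump.php script is not shown in
this guide.

```python
from urllib.parse import urlsplit, parse_qs


def unscramble(href):
    """Recover the destination hidden in a reversed, percent-encoded URL."""
    query = urlsplit(href).query
    scrambled = parse_qs(query)["url"][0]  # parse_qs also percent-decodes
    return scrambled[::-1]


href = "redirect/jump.php?url=%2Fgro.zomoes.www%2F%2F%3Aptth"
print(unscramble(href))  # prints http://www.seomoz.org/
```

A spider that only reads the href sees a link to
jump.php on the linking site, not a link to the
destination.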
<a href="redirectiontarget.htm">SEOmoz</a>
This sample shows the very simple piece of
Javascript code that calls a function referenced in
the document to pull up a specified page. Creative
uses of Javascript like this can also be assumed to
pass no link value to a search engine.
It's important to understand that, based on a
link's anatomy, search engines can (or cannot)
interpret and use the data therein. Whereas the right
sort of links can provide great value, the wrong sort
will be virtually useless (for search ranking
purposes). More detailed information on links is
available at this resource - anatomy
and deployment of links.
Keywords and Queries
Search engines rely on the terms queried by users
to determine which results to put through their
algorithms, order, and return to the user. But, rather
than simply recognizing and retrieving exact matches
for query terms, search engines use their knowledge of
semantics (the science of language) to construct
intelligent matching for queries. An example might be
a search for loan providers that also
returned results that did not contain that specific
phrase, but instead had the term lenders.
The engines collect data based on the frequency of
use of terms and the co-occurrence of words and
phrases throughout the web. If certain terms or
phrases are often found together on pages or sites,
search engines can construct intelligent theories
about their relationships. Mining semantic data
through the incredible corpus that is the Internet has
given search engines some of the most accurate data
about word ontologies and the connections between
words ever assembled artificially. This immense
knowledge of language and its usage gives them the
ability to determine which pages in a site are
topically related, what the topic of a page or site
is, how the link structure of the web divides into
topical communities, and much, much more.
Search engines' growing artificial intelligence on
the subject of language means that queries will
increasingly return more intelligent, evolved results.
This heavy investment in the field of natural language
processing (NLP) will help to achieve greater
understanding of the meaning and intent behind their
users' queries. Over the long term, users can expect
the results of this work to produce increased
relevancy in the SERPs (Search Engine Results Pages)
and more accurate guesses from the engines as to the
intent of a user's queries.
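The co-occurrence idea described above can be shown on a
very small scale. The sketch below counts how often
pairs of words appear in the same document and lists
the words most often seen alongside a given term. The
three sample documents and both function names are
assumptions; real engines work from billions of pages
and far more sophisticated statistics.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count, for each pair of words, how many documents contain both."""
    pairs = Counter()
    for text in docs:
        terms = sorted(set(text.lower().split()))
        pairs.update(combinations(terms, 2))
    return pairs


def related_terms(pairs, term, top=3):
    """Words that most often share a document with the given term."""
    scores = Counter()
    for (first, second), count in pairs.items():
        if first == term:
            scores[second] += count
        elif second == term:
            scores[first] += count
    return [word for word, _ in scores.most_common(top)]


# Hypothetical page texts.
docs = [
    "compare loan providers and lenders",
    "mortgage lenders and loan providers",
    "bicycle repair shop",
]

pairs = cooccurrence(docs)
print(related_terms(pairs, "lenders"))
```

Because "lenders" keeps appearing with "loan" and
"providers", a system like this has grounds to treat a
page about lenders as a candidate result for a "loan
providers" query.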
Sorting the Wheat from the Chaff
In the classic world of Information Retrieval, when
no commercial interests existed in the databases, very
simplistic algorithms could be used to return high
quality results. On the world wide web, however, the
opposite is true. Commercial interests in the SERPs
are a constant issue for modern search engines. With
every new focus on quality control and growth in
relevance metrics, there are thousands of individuals
(many in the field of SEO) dedicated to manipulating
these metrics in order to control the SERPs, typically
by aiming to list their sites/pages first.
The worst kind of results are what the industry
refers to as "search spam" - pages and sites
with little real value that contain primarily
re-directs to other pages, lists of links, scraped
(copied) content, etc. These pages are so irrelevant
and useless that search engines are highly focused on
removing them from the index. Naturally, the monetary
incentives are similar to email spam - although few
visit and fewer click on the links (which are what
provide the spam publisher with revenue), the sheer
quantity is the decisive factor in producing income.
Other "spam" results range from sites
that are of low quality or affiliate status that
search engines would prefer not to list, to high
quality sites and businesses that are using the link
structure of the web to manipulate the results in
their favor. Search engines are focused on clearing
out all types of manipulation and hope to eventually
achieve fully relevant and organic algorithms to
determine ranking order. So-called "search engine
spammers" engage in a constant battle against
these tactics, seeking new loopholes and methods for
manipulation, resulting in a never-ending struggle.
This guide is NOT about how to manipulate the
search engines to achieve rankings, but rather how to
create a website that search engines and users will be
happy to have ranking permanently in the top
positions, thanks to its relevance, quality, and user
friendliness.
Paid Placement and Secondary Sources in the Results
The search engine results pages contain not only
listings of documents found to be relevant to the
user's query, but other content, including paid
advertisements and secondary source results. Google,
for example, serves up ads from its well-known AdWords
program (which currently fuels more than 99% of
Google's revenues), as well as secondary content from
its local search, product search (called Froogle), and
image search results.
[Figure: screenshot of Google's search engine results
page, annotated with the source of each content area]
The sites/pages ranking in the "organic"
search results receive the lion's share of searcher
eyeballs and clicks - between 60% and 70%, depending on
factors such as the prominence of ads, relevance of
secondary content, etc. The practice of optimization
for the paid search results is called SEM, or Search
Engine Marketing, while optimizing to rank in the
secondary results requires unique, advanced methods of
targeting specific searches in arenas such as local
search, product search, image search, and others.
While all of these practices are a valuable part of
any online marketing campaign, they are beyond the
scope of this guide. Our sole focus remains on the
"organic" results, although links at the
bottom of this paper can help direct you to resources
on other subjects.
Keyword research is critical to the process of SEO.
Without this component, your efforts to rank well in
the major search engines may be mis-directed to the
wrong terms and phrases, resulting in rankings that no
one will ever see. The process of keyword research
involves several phases:
- Brainstorming - Thinking of what your customers/potential
visitors would be likely to type in to search
engines in an attempt to find the
information/services your site offers (including
alternate spellings, wordings, synonyms, etc).
- Surveying Customers - Surveying past or potential
customers is a great way to expand your keyword
list to include as many terms and phrases as
possible. It can also give you a good idea of
what's likely to be the biggest traffic drivers
and produce the highest conversion rates.
- Applying Data from KW Research Tools - Several
tools online (including Wordtracker and Overture,
both described below) offer information about
the number of times users perform specific
searches. Using these tools can offer concrete
data about trends in keyword selection.
- Term Selection - The next step is to create a
matrix or chart that analyzes the terms you
believe are valuable and compares traffic,
relevancy, and the likelihood of conversions for
each. This will allow you to make the best
informed decisions about which terms to target.
SEOmoz's KW
Difficulty Tool can also aid in choosing terms
that will be achievable for the site.
- Performance Testing and Analytics - After keyword
selection and implementation of targeting,
analytics programs (like Indextools and ClickTracks)
that measure web traffic, activity, and
conversions can be used to further refine keyword
selection.
Wordtracker & Overture
[Screenshots: Overture Keyword Selection Tool;
Wordtracker Simple Search Utility]
Currently, the two most popular sources of keyword
data are Wordtracker,
whose statistics come primarily from use of the
meta-search engine Dogpile
(which has ~1% of the share of searches performed
online) and Overture
(recently re-branded as Yahoo! Search Marketing),
which offers data collected from searches performed on
Yahoo!'s engine (with a 22-28% share). While neither's
data is flawless or entirely accurate, both provide
good methods for measuring comparative numbers. For
example, while Overture and Wordtracker may disagree
on numbers and say that "red bicycles" gets
240 vs. 380 searches per day (across all engines),
both will generally indicate that this is a more
popular term than "scarlet bicycles",
"maroon bicycles", or even "blue
bicycles."
In Wordtracker, which provides more detail but has
a considerably smaller share of data, terms and
phrases are separated by capitalization, plurality,
and word ordering. In the Overture tool, multiple
search phrases are combined. For example, Wordtracker
would independently show numbers for "car
loans", "Car Loans", "car
loan", and "cars Loan", whereas
Overture would give a single number that encompasses
all of these. The granularity of data can be more
useful for analyzing searches that may result in
unique results pages (plurals often do and different
word orders almost always do), but capitalization is
of less consequence as the search engines don't
deliver different results based on capitalization.
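If you export granular, Wordtracker-style counts and
want a combined, Overture-style view, you can group
variants yourself. The sketch below folds
capitalization, simple plurals, and word order into one
key. The search counts are made up for illustration, and
the plural rule (strip a trailing "s") is a naive
assumption that will mishandle many English words.

```python
from collections import Counter


def normalize(phrase):
    """Fold case, simple plurals, and word order into a single key."""
    words = []
    for word in phrase.lower().split():
        # Naive assumption: a trailing "s" on a longer word marks a plural.
        if word.endswith("s") and len(word) > 3:
            word = word[:-1]
        words.append(word)
    return " ".join(sorted(words))


# Hypothetical daily search counts for each exact variant.
granular = {
    "car loans": 120,
    "Car Loans": 15,
    "car loan": 60,
    "cars Loan": 5,
}

combined = Counter()
for phrase, searches in granular.items():
    combined[normalize(phrase)] += searches

print(dict(combined))
```

Keep the granular numbers as well, since plurals and
word order can lead to different results pages.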
Remember that Wordtracker and Overture are both
useful tools for relative keyword data, but can be
highly inaccurate when compared to the actual number
of searches performed. In other words, use the tools
to select which terms to target, but don't rely on
them for predicting the amount of traffic you can
achieve. If your goal is estimating traffic numbers,
use programs like Google's
Adwords and Yahoo!
Search Marketing to test the number of impressions
a particular term/phrase gets.
Targeting the Right Terms
Targeting the best possible terms is of critical
importance. This encompasses more than merely
measuring traffic levels and choosing the highest
trafficked terms. An intelligent process for keyword
selection will measure each of the following:
- Conversion Rate - The percentage
of users searching with the term/phrase who
convert (click an ad, buy a product, complete a
transaction, etc.)
- Predicted Traffic - An estimate
of how many users will be searching for the given
term/phrase each month
- Value per Customer - An average
amount of revenue earned per customer using the
term or phrase to search - comparing big-ticket
search terms vs. smaller ones.
- Keyword Competition - A rough
measurement of the competitive environment and the
level of difficulty for the given term/phrase.
This is typically measured by metrics that include
the number of competitors, the strength of those
competitors' links, and the financial motivation
to be in the sector. SEOmoz's Keyword
Difficulty Tool can assist in this process.
Once you've analyzed each of these elements, you
can make effective decisions about the terms and
phrases to target. When starting a new site, it's
highly recommended to target only one or possibly two
unique phrases on a single page. Although it is
possible to optimize for more phrases and terms, it's
generally best to keep separate terms on separate
pages, as you can provide individualized information
for each in this manner. As websites grow and mature,
gaining links and legitimacy with the engines,
targeting multiple terms per page becomes more
feasible.
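As a rough illustration, the first three measurements above can be multiplied together into an estimated monthly value for a term, which you can then weigh against its competition. The function name and all figures below are hypothetical and are not taken from any keyword tool:

```python
def estimated_monthly_value(predicted_traffic, conversion_rate, value_per_customer):
    """Rough monthly revenue estimate for one term.

    predicted_traffic  -- estimated searchers per month
    conversion_rate    -- fraction of those searchers who convert (0.0 to 1.0)
    value_per_customer -- average revenue earned per converting customer
    """
    return predicted_traffic * conversion_rate * value_per_customer


# Hypothetical figures, for illustration only.
broad = estimated_monthly_value(20000, 0.005, 40.0)
niche = estimated_monthly_value(1500, 0.04, 120.0)

print(f"broad term: ${broad:,.2f} per month")
print(f"niche term: ${niche:,.2f} per month")
```

With these assumed numbers, the lower-traffic term is worth more per month because its visitors convert more often and spend more, which is why traffic alone is a poor basis for choosing terms.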
The Long Tail of Search
The "long tail" is a concept pioneered by
Chris Anderson (the editor-in-chief of Wired magazine,
who runs the Long
Tail blog). From Chris's description:
The theory of the Long Tail is that our
culture and economy is increasingly shifting away
from a focus on a relatively small number of
"hits" (mainstream products and markets)
at the head of the demand curve and toward a huge
number of niches in the tail. As the costs of
production and distribution fall, especially online,
there is now less need to lump products and
consumers into one-size-fits-all containers. In an
era without the constraints of physical shelf space
and other bottlenecks of distribution,
narrowly-targeted goods and services can be as
economically attractive as mainstream fare.
This concept relates exceptionally well to keyword
search terms in the major engines. Although the
largest traffic numbers are typically for broad terms
at the "head" of the keyword curve, great
value lies in the thousands of unique, rarely used,
niche terms in the "tail." These terms can
provide higher conversion rates and more interested
and valuable visitors to a site, as these specific
terms can relate to exactly the topics, products, and
services your site provides.
For example:
| Keyword Term/Phrase | # of Searches per Month |
| men's suit | 27,770 |
| armani men's suit | 723 |
| italian men's suit | 615 |
| Jones New York Men's Suit | 424 |
| Men's 39S Suit | 310 |
| Gucci Men's Suit | 222 |
| Versace Men's Suit | 178 |
| Hugo Boss Men's Suit | 138 |
| Men's Custom Made Suit | 126 |
In the scenario in the table above, the traffic for
the term "men's suit" may be far greater,
but the value per visitor of the more specific terms
is higher. A
searcher for "Hugo Boss Men's Suit" is more
likely to make a purchase decision than one searching
for simply a "men's suit." There are also
thousands of other terms, garnering far fewer monthly
searches, that, when taken together, have a value
greater than the terms garnering the most searches.
Thus, targeting many dozens or hundreds of smaller
terms individually can be both easier (on a
competitive level) and more profitable.
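To make the arithmetic concrete, the short sketch below totals the eight specific terms listed in the table and compares them with the head term. The average of 50 searches per month for unlisted tail terms is purely an assumption for illustration:

```python
import math

# Monthly searches copied from the table above.
searches_per_month = {
    "men's suit": 27770,
    "armani men's suit": 723,
    "italian men's suit": 615,
    "Jones New York Men's Suit": 424,
    "Men's 39S Suit": 310,
    "Gucci Men's Suit": 222,
    "Versace Men's Suit": 178,
    "Hugo Boss Men's Suit": 138,
    "Men's Custom Made Suit": 126,
}

head_term = "men's suit"
head_total = searches_per_month[head_term]
tail_total = sum(n for term, n in searches_per_month.items() if term != head_term)

# Hypothetical assumption: unlisted tail terms average 50 searches a month.
assumed_average = 50
terms_to_match_head = math.ceil(head_total / assumed_average)

print(f"head term searches:        {head_total:,}")
print(f"listed tail term searches: {tail_total:,}")
print(f"tail terms needed to match the head: {terms_to_match_head}")
```

The eight listed terms alone add up to only a fraction of the head term's searches, so the long tail argument depends on the many hundreds of further terms that each draw a small number of searches.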
Sample Keyword Research Chart
The following chart diagrams how we conduct basic
keyword research at SEOmoz. You are welcome to copy
and use this format for your own keywords:
| Term/Phrase | KW Difficulty | Top 3 OV Bids | OV Mthly Pred. Traf. | WT Mthly Pred. Traf. | Relevance Score |
| San Diego Zoo | 63% | $0.41, $0.41, $0.40 | 116,229 | 42,360 | 25% |
| Joe Dimaggio | 51% | $0.28, $0.19, $0.11 | 5,847 | 7,590 | 10% |
| Starsky and Hutch | 53% | $0.16, $0.00, $0.00 | 19,769 | 16,950 | 30% |
| Art Museum | 77% | $0.51, $0.50, $0.25 | 19,244 | 7,410 | 5% |
| DUI Attorney | 52% | $1.63, $1.62, $1.60 | 13,923 | 3,960 | 60% |
| Search Engine Marketing | 83% | $4.99, $3.26, $3.25 | 1,183,633 | 74,430 | 40% |
| Microsoft | 89% | $0.69, $0.51, $0.32 | 1,525,265 | 256,620 | 10% |
| Interest Only Mortgage Loan | 50% | $4.60, $4.39, $4.39 | 3,745 | 8,910 | 75% |
Key
- KW Difficulty - The score
from SEOmoz's tool
- Top 3 OV Bids - The bid
amount from the top 3 listings in Yahoo!'s PPC
results
- Overture Monthly Predicted Traffic
- The amount of traffic estimated via Overture
for the previous month's data
- Wordtracker Monthly Predicted Traffic
- The amount of traffic estimated via
Wordtracker (note that you must add up all terms
in their database that match and multiply by the
number of days in the month - the
"exact/precise search" function can
help make this easier)
- Relevance Score - The % of
searchers using this term/phrase that you feel
are likely to be interested in your site's
products/services/offerings. Although this is a
subjective number, you can use conversion rates
or click-through rates from previous campaigns
to more accurately estimate this in the future.
In selecting final terms, those with lower
difficulty, higher relevance, and more traffic will
offer the greatest value.
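One way to turn that rule of thumb into a sortable number is sketched below. The scoring formula is an assumption made for illustration (relevant traffic, discounted by difficulty) and is not a formula SEOmoz publishes. The rows are copied from the chart above, using the Overture traffic column:

```python
# Rows copied from the chart above:
# (term, KW difficulty %, OV monthly predicted traffic, relevance %)
chart = [
    ("San Diego Zoo", 63, 116229, 25),
    ("Joe Dimaggio", 51, 5847, 10),
    ("Starsky and Hutch", 53, 19769, 30),
    ("Art Museum", 77, 19244, 5),
    ("DUI Attorney", 52, 13923, 60),
    ("Search Engine Marketing", 83, 1183633, 40),
    ("Microsoft", 89, 1525265, 10),
    ("Interest Only Mortgage Loan", 50, 3745, 75),
]


def term_score(difficulty_pct, monthly_traffic, relevance_pct):
    """Hypothetical score: relevant traffic discounted by difficulty."""
    relevant_traffic = monthly_traffic * relevance_pct / 100
    return relevant_traffic * (100 - difficulty_pct) / 100


ranked = sorted(
    chart,
    key=lambda row: term_score(row[1], row[2], row[3]),
    reverse=True,
)

for term, difficulty, traffic, relevance in ranked:
    print(f"{term_score(difficulty, traffic, relevance):>10,.0f}  {term}")
```

A simple product like this lets very high traffic outweigh high difficulty, so treat the ordering as a starting point for discussion and adjust the weighting to suit your own site and resources.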
Optimizing a Site
Each of the following components is a critical
piece of a site's ability to be crawled, indexed, and
ranked by search engine spiders. When properly used in
the construction of a website, these features give a
site/page the best chance of ranking well for targeted
keywords.
Accessibility
An accessible site is one that ensures delivery of
its content successfully as often as possible. The
functionality of pages, validity of HTML elements,
uptime of the site's server, and working status of
site coding and components all figure into site
accessibility. If these features are ignored or
faulty, both search engines and users will select
other sites to visit.
The biggest problems in accessibility that most
sites encounter fit into the following categories.
Addressing these issues satisfactorily will avoid
problems getting search engines and visitors to and
through your site.
- Broken Links - If an HTML link
is broken, the contents of the linked-to page may
never be found. In addition, some surmise that
search engines lower the rankings of sites and
pages with many broken links.
- Valid HTML & CSS - Although
arguments exist about the necessity for full
validation of HTML and CSS in accordance with W3C
guidelines, it is generally agreed that code
must meet minimum requirements of functionality
and successful display in order to be spidered and
cached properly by the search engines.
- Functionality of Forms and Applications
- If form submissions, select boxes, javascript,
or other input-required elements block content
from being reached via direct hyperlinks, search
engines may never find them. Keep data that you
want accessible to search engines on pages that
can be directly accessed via a link. In a similar
vein, the successful functionality and
implementation of any of these pieces is critical
to a site's accessibility for visitors. A
non-functioning page, form, or code element is
unlikely to receive much attention from visitors.
- File Size - With the exception
of a select few documents that search engines
consider to be of exceptional importance, web
pages greater than 150K in size are typically not
fully cached. This is done to reduce index size,
bandwidth, and load on the servers, and is
important to anyone building pages with
exceptionally large amounts of content. If it's
important that every word and phrase be spidered
and indexed, keeping file size under 150K is
highly recommended. As with any online endeavor,
smaller file size also means faster download speed
for users - a worthy metric in its own right.
- Downtime & Server Speed -
The performance of your site's server may have an
adverse impact on search rankings and visitors if
downtime and slow transfer speeds are common.
Invest in high quality hosting to prevent this
issue.
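Two of the checks above, broken links and file size, are easy to automate. The following is a minimal sketch using only Python's standard library. The names `find_broken_links` and `exceeds_size_guideline` are hypothetical, and `get_status` is a function you would supply yourself to return the HTTP status code for a URL (it is stubbed with a dictionary here so the example runs without a network connection):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_broken_links(page_url, html, get_status):
    """Return absolute URLs linked from the page whose status is 400 or above."""
    collector = LinkCollector()
    collector.feed(html)
    broken = []
    for href in collector.links:
        target = urljoin(page_url, href)  # resolve relative links
        if get_status(target) >= 400:
            broken.append(target)
    return broken


def exceeds_size_guideline(html, limit_bytes=150 * 1024):
    """True if the page is larger than the 150K guideline discussed above."""
    return len(html.encode("utf-8")) > limit_bytes


# Stubbed status lookup on a placeholder domain, for illustration only.
statuses = {
    "http://example.com/about.html": 200,
    "http://example.com/old-page.html": 404,
}
page = '<a href="/about.html">About</a> <a href="old-page.html">Old page</a>'

print(find_broken_links("http://example.com/", page, lambda url: statuses.get(url, 404)))
print(exceeds_size_guideline(page))
```

In practice, `get_status` would issue a real HTTP request for each link, and you would run the check across every page of the site on a regular schedule.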
URLs, Title Tags & Meta Data
URLs, title tags and meta tag components are all
information that describe your site and page to
visitors and search engines. Keeping them relevant,
compelling and accurate is key to ranking well. You
can also use these areas as launching points for your
keywords, and indeed, successful rankings require
their use.
The URL of a document should ideally be as
descriptive and brief as possible. If, for example,
your site's structure has several levels of files and
navigation, the URL should reflect this with folders
and subfolders. Individual pages' URLs should also be
descriptive without being overly lengthy, so that a
visitor who sees only the URL could have a good idea
of what to expect on the page. Several examples
follow:
Comparison of URLs for a Canon Powershot SD400 Camera
Amazon.com - http://www.amazon.com/gp/product/B0007TJ5OG/102-8372974-4064145?v=glance&n=502394&m=ATVPDKIKX0DER&n=3031001&s=photo&v=glance
Canon.com - http://consumer.usa.canon.com/ir/controller?act=ModelDetailAct&fcategoryid=145&modelid=11158
DPReview.com - http://www.dpreview.com/reviews/canonsd400/
With both Canon and Amazon, a user has virtually
no idea what the URL might point to. With DPReview's
logical URL, however, it is easy to surmise that a
review of a Canon SD400 is the likely topic of the
page.
In addition to the issues of brevity and clarity,
it's also important to keep URLs limited to as few
dynamic parameters as possible. A dynamic parameter is
a part of the URL that provides data to a database so
the proper records can be retrieved, e.g. n=3031001,
v=glance, fcategoryid=145, etc.
Note that in both Amazon's and Canon's URLs, the
dynamic parameters number 3 or more. In an ideal site,
there should never be more than two. Search engine
representatives have confirmed on numerous occasions
that URLs with more than 2 dynamic parameters may not
be spidered unless they are perceived as significantly
important (i.e. have many, many links pointing to
them).
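If you want to audit your own URLs against this guideline, counting the parameters in the query string is straightforward. The sketch below uses Python's standard `urllib.parse` module on the three example URLs from the comparison above (rejoined where they wrapped across lines). Counting distinct parameter names rather than every name=value pair is an assumption about how to apply the rule:

```python
from urllib.parse import parse_qsl, urlsplit


def count_dynamic_parameters(url):
    """Count the distinct parameter names in a URL's query string."""
    query = urlsplit(url).query
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    return len(names)


urls = {
    "Amazon.com": (
        "http://www.amazon.com/gp/product/B0007TJ5OG/102-8372974-4064145"
        "?v=glance&n=502394&m=ATVPDKIKX0DER&n=3031001&s=photo&v=glance"
    ),
    "Canon.com": (
        "http://consumer.usa.canon.com/ir/controller"
        "?act=ModelDetailAct&fcategoryid=145&modelid=11158"
    ),
    "DPReview.com": "http://www.dpreview.com/reviews/canonsd400/",
}

for site, url in urls.items():
    count = count_dynamic_parameters(url)
    verdict = "ok" if count <= 2 else "too many"
    print(f"{site}: {count} dynamic parameters ({verdict})")
```

Amazon's URL repeats the `n` and `v` parameters, so it contains six name=value pairs but four distinct names. Either way of counting puts it over the limit of two.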
Well written URLs have the additional benefit of
serving as their own anchor text when copied and
pasted as links in forums, blogs, or other online
venues. In the DPReview example, a search engine might
see the URL http://www.dpreview.com/reviews/canonsd400/
and give ranking credit to the page for terms in the
URL like dpreview, reviews, canon, sd, 400. The
parsing and breaking of terms is subject to the search
engine's analysis, but the chance of earning this
additional credit makes writing friendly, usable URLs
even more worthwhile.
Title tags, in addition to their invaluable use in
targeting keyword terms for rankings, also help drive
click-through-rates (CTRs) from the results pages.
Most of the search engines will use a page's title tag
as the blue link text and headline for a result (see
image below), and thus it is important to make them
informative and compelling without being overly "salesy".
The best title tags will make the targeted keywords
prominent, help brand the site, and be as clear and
concise as possible.
Examples and Recommendations for Title
Tags
Page on Red Pandas from the Wellington
Zoo:
- Current Title: Red Panda
- Recommended: Red Panda - Habitat, Features,
Behavior | Wellington Zoo
Page on Alexander Calder from the
Calder Foundation:
- Current Title: Alexander Calder
- Recommended: Alexander Calder - Biography of the
Artist from the Calder Foundation
Page on Plasma TVs from Tiger
Direct:
- Current Title: Plasma Televisions, Plasma TV,
Plasma Screen TVs, SONY Plasma TV, LCD TV at
TigerDirect.com
- Recommended: Plasma Screen & LCD Televisions
at TigerDirect.com
For each of these, the idea behind the
recommendations is to distill the information into the
clearest, most useful snippet while retaining the
primary keyword phrase as the first words in the tag.
The title tag provides the first impression of a web
page and can either serve to draw the visitor in or
compel him or her to choose another listing in the
results.
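For sites that generate pages from templates, the same pattern can be applied automatically. The helper below is a hypothetical sketch that follows the format of the Red Panda recommendation above: primary phrase first, optional descriptors next, and the brand last.

```python
def build_title(primary_phrase, descriptors=(), brand=""):
    """Assemble a title tag with the primary keyword phrase first."""
    title = primary_phrase
    if descriptors:
        title += " - " + ", ".join(descriptors)
    if brand:
        title += " | " + brand
    return title


print(build_title("Red Panda", ["Habitat", "Features", "Behavior"], "Wellington Zoo"))
```

The separators used here are a matter of taste. What matters is that the primary phrase leads and the result stays clear and concise.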
Meta Tag Recommendations:
Meta tags once held the distinction of being the primary
realm of SEO specialists. Today, the importance of meta
tags, particularly the meta keywords tag, has diminished
to the point that search engines no longer use them in
their ranking of pages. However, the meta description
tag can still be of some importance, as several search
engines use this tag to display the snippet of text
below the clickable title link in the results pages.
In the image to the left, an illustration of a
Google SERP (Search Engine Results Page) shows the use
of the meta description and title tags. It is on this
page that searchers generally make their decision as
to which result to click, and thus, while the meta
description tag may have little to no impact on where
a page ranks, it can significantly impact the number of
visitors the page receives from search engine traffic.
Note that meta tags are NOT always used on the SERPs,
but can be seen (at the discretion of the search
engine) if the description is accurate, well-written,
and relevant to the searcher's query.
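To review what a search engine has available for building a result listing, you can pull the title and meta description out of a page yourself. The following is a minimal sketch using Python's standard `html.parser` module. The class name `SnippetParser` and the sample description text are hypothetical:

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Extracts the <title> text and the meta description from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page source, for illustration only.
sample_page = """<html><head>
<title>Red Panda - Habitat, Features, Behavior | Wellington Zoo</title>
<meta name="description" content="A hypothetical description of the red panda page.">
</head><body><p>Page text.</p></body></html>"""

parser = SnippetParser()
parser.feed(sample_page)
parser.close()

print("Title:      ", parser.title)
print("Description:", parser.description)
```

Running a check like this across a site makes it easy to spot pages with missing, duplicated, or uninformative titles and descriptions.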
Search-Friendly Text
Making the visible text on a page
"search-friendly" isn't complicated, but it
is an issue that many sites struggle with. Text styles
that cannot be indexed by search engines include:
- Text embedded in a Java Application or
Macromedia Flash file
- Text in an image file - jpg, gif, png, etc.
- Text accessible only via a form submit or other
on-page action
If the search engines can't see your page's text,
they cannot spider and index that content for visitors
to find. Thus, making search-friendly text in HTML
format is critical to ranking well and getting
properly indexed. If you are forced to use a format
that hides text from search engines, try to use the
right keywords and phrases in headlines, title tags,
URLs, and image/file names on the page. Don't go
overboard with this tactic, and never try to hide text
(by making it the same color as the background or
using CSS tricks). Even if the search engines can't
detect this automatically, a competitor can easily
report your site for spamming and have you de-listed
entirely.
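A rough way to approximate what a text-only spider can read is to strip a page down to the text in its HTML. The sketch below uses Python's standard `html.parser` module. The function name `indexable_text` and the sample markup are hypothetical, and real search engines are certainly more sophisticated than this:

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, ignoring the contents of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the plain text found in the HTML source."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the banner image and Flash file contribute no text.
sample = """<body>
<h1>Canon SD400 Review</h1>
<img src="headline-banner.gif">
<object data="intro.swf"></object>
<p>Our hands-on review of this compact camera.</p>
</body>"""

print(indexable_text(sample))
```

Any wording that appears only inside the image or the Flash file is absent from the output, which is the point of the list above.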
Along with making text visible, it's important to
remember that search engines measure the terms and
phrases in a document to extract a great deal of
information about the page. Writing well for search
engines is both an art and a science (as SEOs are not
privy to the exact, technical methodology of how
search engines score text for rankings), and one that
can be harnessed to achieve better rankings.
In general, the following are basic rules that
apply to optimizing on-page text for search rankings:
- Make the primary term/phrase prominent
in the document - Measurements like
keyword density are useless (see kw
density myth thread), but general frequency
can help rankings.
- Make the text on-topic and high quality
- Search engines use sophisticated lexical
analysis to help find quality pages, as well as
teams of researchers identifying common elements
in high quality writing. Thus, great writing can
provide benefits to rankings, as well as visitors.
- Use an optimized document structure
- The best practice is generally to follow a
journalistic format wherein the document starts
with a description of the content, then flows from
broad discussion of the subject to narrow. The
benefits of this are arguable, but in addition to
any SEO value, the format provides a readable and
engaging informational document. Obviously, in
situations where this would be inappropriate, it's
not necessary.
- Keep text together - Many folks
in SEO recommend using CSS rather than table
layouts in order to keep the text flow of the
document together and prevent the breaking up of
text via coding. This can also be achieved with
tables - simply make sure that text sections
(content, ads, navigation, etc.) flow together
inside a single table or row and don't have too
many "nested" tables that make for
broken sentences and paragraphs.
Keep in mind that the text layout and keyword usage
in a document no longer carries high importance in
search engine rankings. While the right structure and
usage can provide a slight boost, obsessing over
keyword placement or layout will provide little
overall benefit.
Information Architecture
The document and link structure of a website can
provide benefits to search rankings when performed
properly. The keys to effective architecture are to
follow the rules that govern human usability of a
site:
- Make Use of a Sitemap - It's
wise to have the sitemap page linked to from every
other page in the site, or at the least from
important high-level category pages and the home
page. The sitemap should, ideally, offer links to
all of the site's internal pages. However, if more
than 100-150 pages exist on the site, a wiser
system is to create a sitemap that will link to
all of the category level pages, so that no page
in a site is more than 2 clicks from the home
page. For exceptionally large sites, this rule can
be expanded to 3 clicks from the home page.
- Use a Category Structure that Flows from
Broad > Narrow - Start with the
broadest topics as hierarchical category pages,
then expand to deep pages with specific topics.
Using the most on-topic structure tells search
engines that your site is highly relevant and
covers a topic in-depth.
For more information on segmenting document
structure and link hierarchies, see Dr. Garcia's
excellent guide
to on-topic analysis.
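The "clicks from the home page" rule described above can be checked with a breadth-first search over a site's internal links. The following is a minimal sketch. The link map is a hypothetical five-page site, and in practice you would build it by crawling your own pages:

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_beyond(links, home, max_clicks=2):
    """Pages that are unreachable or more than max_clicks from the home page."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages if depths.get(p, float("inf")) > max_clicks)


# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/sitemap.html", "/cameras/"],
    "/sitemap.html": ["/cameras/", "/lenses/"],
    "/cameras/": ["/cameras/compact/"],
    "/cameras/compact/": ["/cameras/compact/sd400.html"],
    "/lenses/": [],
}

print(click_depths(site, "/"))
print(pages_beyond(site, "/", max_clicks=2))
```

In this example the product page sits three clicks from the home page. Linking the sitemap to the `/cameras/compact/` category page would bring it back within the limit.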
Canonical Issues & Duplicate Content
One of the most common and problematic issues for
website builders, particularly those with larger,
dynamic sites powered by databases, is the issue of
duplicate content. Search engines are primarily
interested in unique documents and text, and when they
find multiple instances of the same content, they are
likely to select a single one as "canonical"
and display that page in their results.
If your site has multiple pages with the same
content, either through a content management system
that creates duplicates through separate navigation,
or because copies exist from multiple versions, you
may be hurting those pages' chances of ranking in the
SERPs. In addition, the value that comes from anchor
text and link weight, through both internal and
external links to the page, will be diluted by
multiple versions.
The solution is to take any current duplicate pages
and use a 301 re-direct (described
in detail here) to point all versions to a single,
"canonical" edition of the content.
One very common place to look for this error is on
a site's homepage - oftentimes, a website will have
the same content on http://www.url.com, http://url.com,
and http://www.url.com/index.html. That separation
alone can cause lost link value and severely damage
rankings for the site's homepage. If you find many
links outside the site pointing to both the non-www
and the www version, it may be wise to use a 301
rewrite rule that covers every page on one version
and points it to the other.
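The logic behind such a rule can be sketched as a function that maps every variant of a URL to its single canonical form. The function name, the choice of the www version as canonical, and the `index.html` file name are all assumptions for illustration. On a live site, this mapping would normally be written as a rewrite rule in the web server's configuration, with the server answering any non-canonical request with a 301 status and the canonical URL:

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, canonical_host="www.url.com", index_names=("index.html",)):
    """Map www/non-www and index-file variants of a URL to one canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()

    # Treat the bare domain and the www domain as the same site.
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host in (bare, "www." + bare):
        host = canonical_host

    # An empty path and a default index file both mean the directory itself.
    path = parts.path or "/"
    for name in index_names:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


for variant in ("http://www.url.com", "http://url.com", "http://www.url.com/index.html"):
    print(f"{variant} -> {canonical_url(variant)}")
```

All three homepage variants mentioned above resolve to the same address, so links pointing at any of them would be credited to a single page once the redirect is in place.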
One of the most important (and often overlooked)
subjects in SEO is building a site deserving of top
rankings at the search engines. A site that ranks #1
for a set of terms in a competitive industry or market
segment must be able to justify its value or risk
losing out to competitors who offer more. Search
engines' goals are to rank the best, most usable,
functional, and informative sites first. By
intertwining your site's content and performance with
these goals, you can help to ensure its long-term
prospects in the search engine rankings.
Usability
Usability represents the ease-of-use inherent in
your site's design, navigation, architecture, and
functionality. The idea behind the practice is to make
your site intuitive so that visitors will have the
best possible experience on the site. A whole host of
features figure into usability, including:
- Design
The graphical elements and layout of a website have
a strong influence on how easily usable the site
is. Standards like blue, underlined links, top and
side menu bars, logos in the top, left-hand corner
may seem like rules that can be bent, but
adherence to these elements (with which web users
are already familiar) will help to make a site
usable. Design also encompasses important topics
like visibility & contrast, affecting how easy
it is for users to read and view the text and image
elements of the site. Separation of unique
sections like navigation, advertising, content,
search bars, etc. is also critical, as users
follow design cues to help them understand a
page's content. A final consideration is
ensuring that critical elements in a site's design
(like menus, logos, colors, and layout) are used
consistently throughout the site.
- Information Architecture
The organizational hierarchy of a site can also
strongly affect usability. Topics and
categorization impact the ease with which a user
can find the information they need on your site.
While an intuitive, intelligently designed
structure will seamlessly guide the user to their
goals, a complex, obfuscated hierarchy can make
finding information on a site disturbingly
frustrating.
- Navigation
A navigation system that guides users easily
through both top-level and deep pages and makes a
high percentage of the site easily accessible is
critical to good usability. Since navigation is
one of a website's primary functions, provide
users with obvious navigation systems:
breadcrumbs, alt text for image links, and
well-written anchor text that clearly describes
what the user will get if he or she clicks a link.
Navigation standards like these can drastically
improve usability performance.
- Functionality
To create compelling usability, ensure that tools,
scripts, images, links, etc. all function as they
are intended and don't provide errors to
non-standard browsers, alternative operating
systems, or uninformed users (who often don't know
what/where to click).
- Accessibility
Accessibility refers primarily to the technical
ability of users to access and move through your
site, as well as the ability of the site to serve
disabled or impaired users. For SEO purposes, the
most important aspects are limiting code errors to
a minimum and fixing broken links, making sure
that content is accessible and visible in all
browsers and without special actions.
- Content
The usability of content itself is often
overlooked, but its importance cannot be
overstated. The descriptive nature of headlines,
the accuracy of information and the quality of
content all factor highly into a site's likelihood
to retain visitors and gain links.
Overall, usability is about gearing a site towards
the potential users. Success in this arena garners
increased conversion rates, a higher chance that other
sites will link to yours, and a better relationship
with your users (fewer complaints, lower incidence of
problems, etc.). For improving your knowledge of
usability and the best practices, I recommend Steve
Krug's exceptionally impressive book, "Don't
Make Me Think", possibly the best $30 you can
spend to improve your website.
Professional Design
Elegant, high quality, high impact design is
critical to gaining the trust of your users. If your
site appears "low budget" or only marginally
professional, it can hurt the chances of gaining a
link and, more importantly, the chances of engendering
trust in your visitors. The first impression of a
website by a user occurs in less than 7 seconds.
That's all the time you have to convey the importance
and authority of your company through the site's
design. I've prepared two examples below:
Workplace Office UK's Website
- Amateur Logo Styles
- Discordant Colors
- No Clear Navigation Element
- Basic Stock Photography
- Template-Like Layout
Haworth Furniture's Online Catalog
- Well-Defined Navigation
- Elegant Color Scheme
- Attractive Lines & Shading
- High-Quality Photography
- Design Creates Intuitive Flow to Information
Although the above examples are not perfect (note
that Haworth is missing a critical element - a search
bar, while Workplace Office UK has one), it's easy to
see why consumers visiting websites like these would
be more inclined to trust and buy from Haworth rather
than Workplace Office. The application of professional
design to sites can induce greater numbers of links
from visiting content creators, greater number of
users who return to the site, higher conversion rates,
and a better overall perception of your site by
visitors.
Although high quality, professional design is not
one of the factors directly ranked by search engines,
it indirectly influences many factors that do affect
the rankings (e.g. link-building, trust, usability,
etc.).
Authoring High Quality Content
Why Should a Search Engine Rank Your Site Above All
the Others in its Field?
If you cannot answer this question clearly and
precisely, the task of ranking higher will be
exponentially more difficult. Search engines attempt
to rank the very best sites with the most relevant
content first in their results, and until your site's
content is the best in its field, you will always
struggle against the engines rather than bringing them
to your doorstep.
It is in content quality that a site's true
potential shows through, and although search engines
cannot measure the likelihood that users will enjoy a
site, the vote-via-links system operates as a proxy
for identifying the best content in a market. With
great content, therefore, come great links and,
ultimately, high rankings. Deliver the content that
users need, and the search engines will reward your
site.
Content quality, however, like professional design,
is not always dictated by strict rules and guidelines.
What passes for "best of class" in one
sector may be below average in another market. The