Andreoni.com
Search Engines
Search Engine Optimization
By Spider Man
Traffic from Google has increased at an astonishing rate over the past year: Jakob Nielsen's search engine referrals to his Useit site confirm this, as do the unpublished reports from retail sites like Stylata. Google, once considered a niche site for nerds, is the Wall Street Journal's pick for best search engine on the Net, and the traffic numbers seem to agree.
Inktomi, the number two traffic generator, doesn't run its own search site. Instead, the company provides the technology behind MSN Search and AOL Search, two top referrers, as well as Hotbot and over a dozen more.
Portal sites like Excite, Lycos, and AltaVista still draw lots of traffic, but together Google and Inktomi outweigh the entire rest of the field. Add it up and it's pretty clear how to maximize your traffic for the least effort:
Make sure your site is thoroughly crawled by Google and Inktomi.
Get lots of links to your site from domains that a lot of other sites link to -- that's how Google and Inktomi determine relevance when ranking search results. Links Manager can set up a reciprocal links page for your site and facilitate all of the tasks associated with maintaining it.
For all other search engines, implement a blanket strategy that gets you reasonable results. By not chasing each one of them separately, you can put your company's time and money to more important uses.
All of this can be accomplished with one, three-step process. And it really is as easy as 1-2-3.
Step 1
There are quite a few things you can do to grab the attention of search engines and directories:
Clean Up Your URLs
Frames used to be the biggest roadblock to getting crawled, but no more: Both Google and Inktomi now crawl them (the section of Inktomi's support FAQ that claims this isn't so is out of date, according to the company). Instead, the problem with most e-commerce sites today is that their product pages are dynamically generated. While Google will crawl any URL that a browser can read, most of the other search engines balk at links with "?" and "&" characters that separate CGI variables (such as "artloop.com/store?sku=123&uid=456"). As a result, many individual product pages don't show up outside of Google.
One way to circumvent this difficulty is to create static versions of your site's dynamic pages for search engines to crawl. Unfortunately, duplicating your pages is a huge amount of extra work and a constant maintenance chore, plus the resulting pages are never quite up-to-date — all the headaches dynamic pages were designed to eliminate.
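Before deciding how much work is involved, it can help to see which of your URLs would trip up the pickier crawlers. Here is a minimal sketch, assuming the only test that matters is whether a URL carries CGI-style query variables; the function name and the second URL are made up for illustration.

```python
from urllib.parse import urlsplit

def is_crawler_unfriendly(url):
    """Return True if the URL carries CGI-style query variables ("?" and "&")."""
    # urlsplit needs a scheme to tell the host from the path; add one if missing.
    if "://" not in url:
        url = "http://" + url
    return bool(urlsplit(url).query)

urls = [
    "artloop.com/store?sku=123&uid=456",   # the dynamic URL quoted above
    "artloop.com/artists/picasso.html",    # hypothetical static URL
]
for u in urls:
    print(u, "->", "dynamic" if is_crawler_unfriendly(u) else "static")
```

Run against a full list of product URLs, a check like this gives a rough count of the pages that may never show up outside of Google.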
Many readers have written in to ask if the search engines will begin crawling and indexing Flash content soon. The answer, as you might guess, is no. Unlike PDF files, Flash files rarely contain information in text format. Search developers don't want to clutter up their indexes with a million "Skip Intro" pages.
Submit your Site
There are a lot of automated search engine submission services and software that you can use to submit your site to as many search engines as possible. The one most recommended by people I talked to is WebPosition Gold.
Don't Forget the Directories
Yahoo still offers free submissions, except for business categories, which cost $199. But even the fee doesn't guarantee they'll accept your site, just that they'll decide on it within a week — with free submissions, you don't even get the promise that they'll ever get around to evaluating it, given the incredible volume of submissions.
Once you've submitted your pages, be ready to wait a month, two, or three before they're crawled and indexed. It's frustrating, but processing a billion Web pages takes time — at a nonstop rate of one hundred per second, it would still take almost four months.
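The "almost four months" figure is easy to check. A quick sketch of the arithmetic, assuming a 30-day month (the variable names are mine):

```python
PAGES = 1_000_000_000      # "a billion Web pages"
RATE_PER_SECOND = 100      # "one hundred per second"

seconds = PAGES / RATE_PER_SECOND
days = seconds / (60 * 60 * 24)
months = days / 30         # assumption: a 30-day month

print(f"{days:.0f} days, about {months:.1f} months")
```

That works out to ten million seconds of nonstop crawling, or a little under 116 days.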
Make a Crawler Page
It isn't necessary to submit every page on your site to the search engines. Just make sure they can find all the pages that matter by hopping links from your front door. To do that, make a "crawler page" that contains nothing but a link to every page you want search engines to crawl. Use the page's TITLE info as the link text — this helps improve your site score. For an example, check out Artloop's crawler page.
Basically, the crawler page is a site map that lists all the pages on your site -- it may be a bit too big for humans to read through, but it will be no problem for a search engine. Add an obscure link to the crawler page on one of your site's top-level pages, using a small amount of text. MSN used to use 1x1 images for this trick, but the Google geeks warned us to avoid such obviously invisible tags. "Why not just label it 'site map?'" one asked. Search engine spiders will find the link as soon as they get to your site, and suck down all the pages they find on it.
Don't worry, the crawler page won't show up in search results. It does get pulled into the search engine's index, but because it has no text or tags to match a query, it isn't listed as a result. The pages it links to, however, will appear because the search engine's spider found them right after it visited the crawler page. WiredNews, for example, uses hierarchical sets of crawler pages to make sure every story ever published is crawlable from the top of the site.
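If your pages live in a database, a crawler page is simple to generate. The sketch below is one way it might be done, not a description of how Artloop or WiredNews build theirs: the function name, the "Site Map" page title, and the sample pages are all assumptions for illustration.

```python
from html import escape

def build_crawler_page(pages):
    """pages: iterable of (url, title) pairs; returns a bare HTML page of links."""
    links = "\n".join(
        # Each page's TITLE text is used as the link text, as suggested above.
        f'<a href="{escape(url, quote=True)}">{escape(title)}</a><br>'
        for url, title in pages
    )
    return (
        "<html><head><title>Site Map</title></head>\n"
        f"<body>\n{links}\n</body></html>"
    )

# Hypothetical pages; in practice the list would come from your own catalog.
pages = [
    ("/artists/picasso.html", "Pablo Picasso"),
    ("/artists/matisse.html", "Henri Matisse"),
]
print(build_crawler_page(pages))
```

Regenerating the page whenever the catalog changes keeps it from drifting out of date, and for a very large site the same function could be called once per section to produce a hierarchy of crawler pages.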
Pay to Play?
Not too long ago, in response to years of complaints from commercial site owners who demanded their pages be indexed and up to date, Inktomi announced a new service that lets site owners pay to have individual URLs crawled and indexed quickly. If you're wondering whether paid listings are worth it, I suggest trying just a couple of your URLs first — pick the ones you feel are poised to make the most money — to see if the return on investment meets your needs.
Remember that Inktomi will rank search results largely on the links to your page from other domains. And if no one is linking to you, expect to see your page appear at the end of the results list, not at the top. Links Manager can provide you with resources to locate webmasters willing to exchange links with you.
There are ways, however, to get your site moving up through the ranks.
Step 2: Get Ranked
Most people who are concerned with search engine optimization focus obsessively on keywords and HTML tags. But when it comes to getting ranked by search engines, the only tags that matter are TITLE, and the META tags KEYWORDS and DESCRIPTION. And you have to be very careful about how you handle each one.
TITLE tag
TITLE makes a big difference, especially with Google. It should be short (less than 40 characters seems to work best) and, most importantly, should match the search queries people will be using to find your site. This could lead to a struggle with the marketing managers: They'll want your site's page titles to contain the company name and/or a positioning statement. Ask them what good that will do if no one ever sees the pages.
This is a good TITLE tag that will generate traffic from people searching for "picasso":
<Title>Pablo Picasso</Title>
This is a mediocre one:
<Title>Artstuff: Pablo Picasso</Title>
This one will put you out of business:
<Title>Artstuff: Your Number One Online Resource for Fine Art Solutions!!!</Title>
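The two rules of thumb for titles (keep them under 40 characters, and make them match the query) can be turned into a rough check. This is only a sketch of those two heuristics with a made-up function name; no search engine publishes a test like this.

```python
def title_problems(title, query, max_len=40):
    """Return a list of complaints about a page title, based on two rules of thumb."""
    problems = []
    if len(title) >= max_len:
        problems.append(f"too long ({len(title)} characters)")
    if query.lower() not in title.lower():
        problems.append(f"does not contain the query {query!r}")
    return problems

titles = [
    "Pablo Picasso",
    "Artstuff: Pablo Picasso",
    "Artstuff: Your Number One Online Resource for Fine Art Solutions!!!",
]
for t in titles:
    print(repr(t), "->", title_problems(t, "picasso") or "no complaints")
```

Note that the check can't tell the good title from the mediocre one: both pass, even though the company name in the second dilutes the match. It only catches the kind of title that will put you out of business.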
META NAME="Keywords"
Keyword spamming is the number one favorite trick for search engine optimization. But many of the sites that stuff a zillion keywords into their pages are hoping to get clicks to their pages just to show ads -- they don't care if they get any repeat business. But if you want to draw real customers, focus on the keywords you think your users will be searching for.
For our Picasso page, something like this would work (note that uppercase letters don't matter):
<META NAME="keywords" content="Pablo Picasso, Pablo, Picasso, painting, cubist, painting, ceramics, collage, Spain, Guernica, Paris, 20th century, Girl Before a Mirror">
Repeating the most important keyword twice seems to work with some search engines, but repeating more than that will cause some of them to ignore the whole page.
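To catch accidental over-repetition before a page goes live, you could count the entries in the keywords list. A minimal sketch, assuming "more than twice" is the threshold and that entries are compared as whole comma-separated phrases (so "Pablo Picasso" and "Picasso" count as different entries); the function name is mine.

```python
from collections import Counter

def overused_keywords(content, max_repeats=2):
    """Return the keywords that appear more than max_repeats times in the list."""
    terms = [t.strip().lower() for t in content.split(",") if t.strip()]
    counts = Counter(terms)
    return {term: n for term, n in counts.items() if n > max_repeats}

content = ("Pablo Picasso, Pablo, Picasso, painting, cubist, painting, ceramics, "
           "collage, Spain, Guernica, Paris, 20th century, Girl Before a Mirror")
print(overused_keywords(content))               # nothing repeats more than twice
print(overused_keywords("art, art, art, picasso"))
```

Lowercasing the entries first matches the note above that uppercase letters don't matter.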
What keywords are people searching for? It's important to focus on the right ones. Zipf's Law predicts that traffic for any particular keyword on a search engine will be inversely proportional to its popularity rank. That is, the number of queries (and hence potential clickthroughs to your site) for the most popular keyword will be ten times greater than that for the tenth most popular term. And traffic to term #10 will be 1,000 times higher than traffic to term #10,000. Search engine logs don't quite match Zipf's curve, and they vary from one engine to the next. But the lesson remains: If you're not matching the top keywords, forget it.
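The ratios quoted above correspond to traffic falling off as 1/rank. Here is a small sketch of that idealized curve; the exponent of 1 is an assumption, and as noted, real search logs only roughly follow it.

```python
def zipf_relative_traffic(rank, exponent=1.0):
    """Traffic relative to the top-ranked keyword under an idealized Zipf curve."""
    return 1.0 / rank ** exponent

top = zipf_relative_traffic(1)
tenth = zipf_relative_traffic(10)
ten_thousandth = zipf_relative_traffic(10_000)

print("term #1 vs term #10:     ", round(top / tenth))
print("term #10 vs term #10,000:", round(tenth / ten_thousandth))
```

The first ratio comes out to 10 and the second to 1,000, which is why the long tail of obscure keywords is worth so little traffic per term.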
Where to find the top keywords? Two free resources are searchterms.com and a weekly emailing from Wordtracker. Keyword popularity varies from search engine to search engine, but across the Web (and according to a few well-placed contacts at search engines) these listings are close enough.
META NAME="Description"
This field gets used for the page summary on Inktomi and some other engines, so don't cram it with keywords: A scary-looking description on a search engine's results page could discourage people from clicking through to your page, even if it scores high. (We'll cover more on descriptions in Step 3.)
Page Text
It never hurts to have the search terms you want to match near the top of the page. But cramming in a list of spam-style keywords can also backfire -- Google will display them under the page title on its results page, and Inktomi will show them (as do many others) if there is no DESCRIPTION tag.
Stuffing long strings of repeated keywords into pages used to magically get them to the top of search engine results, but that was before the search engineers realized what was going on and learned how to prevent this from happening. Once in a while you'll see a "spamdexed" page near the top of your results, but this trick works less and less frequently these days.
Links from Other Domains
Look at the top results for the terms you most want to match. Will those sites link to you from their domain? If they do, some of their relevance will rub off on your pages. There are ways to use this dishonestly but usually sites only link to other sites they're comfortable being associated with.
Even if your site does manage to claw its way to a plum position in the search results, that doesn't guarantee that users will follow the link -- that still takes some convincing.
Step 3: Get Clicked
All of the work you've done to get your site crawled and ranked near the top is meaningless if you neglect the final step: Getting the searcher to click through to your site. These days, few users will click on a page described as "Pablo Picasso Pablo Picasso Pablo Picasso art art art art" in search engine results. But if you use TITLE to specify the most likely search term that matches the page, and DESCRIPTION to provide a quick (50 words max) synopsis of the info on the page, your site will attract a lot more clicks.
Don't Scare Them Away
This is where gateway pages, redirects, shadow domains, and other trickery often fail: The would-be customer gets to your site only to discover it contains confusing pages, poor navigation, gratuitous redirects, or exactly the same content as the last site they looked at — huh? When users find pages of such a dubious nature, do you think they're going to trust the site with their credit card number on, say, a $1400 order for two DJ turntables? I sure didn't: When I landed at a site like that recently, I immediately clicked Back and wound up dropping my money on a pair of pricey Technics decks at a site that looked like a real, honest company, rather than a network of sites designed to capture me.
Another mistake new Web marketers make is trying to stop search engines from sending users directly to individual pages on the site — something they huffily call "deep linking." They'll force their Webmaster to redirect anyone who hasn't come through the site's front door back to the home page, as if the site were a brick-and-mortar store. This is usually justified as "customer experience" and "branding," but all it really says is the site doesn't trust its customers to know what they want.
I'm guessing most sites abandon this practice once they look at their log files and see their would-be customers abandoning the site after being pulled away from a product they were ready to buy.
All that said, there are ways to beat the system, as long as you don't mind getting your hands a little dirty.
How to Cheat Honestly
As much as I talk up Google, their ranking system isn't foolproof. In short, it ranks individual URLs based on which other URLs link to them, which URLs link to those, and so on. That's the simplified explanation — you can read about eigenvectors and normal link matrices in this paper written by Google's creators.
While the system works better than old search engine rankings based on keywords and page content, it's not perfect. Links from popular sites can count more than they should, or not enough if the link comes from an obscure page.
But when Google's engineers read the original version of this article, they bristled at some of our suggestions — even though we'd tested them. Emails led to phone calls, and eventually we spent a caffeinated afternoon at the Googleplex in Mountain View, CA, using whiteboards and napkins to sketch out what actually raises your rankings, and what doesn't. We came away with some solid suggestions for where to invest your time wisely:
Make sure your dynamic pages are crawlable (see above), and make sure the URLs remain constant. If you use one URL on the site map, another for the dynamically generated page, and yet another after giving the user a cookie, the URLs other sites use to link to your pages may not be the same as the one Google indexes. URL inconsistency keeps your pages from being ranked as high as they should be.
Google crawls the Web in descending order of PageRank, meaning the highest ranked pages are crawled first and most often. So while a crawler page will make your pages findable, getting other sites to link to the individual pages will get them crawled more completely, and thus raise their scores.
Focus on getting pages that are considered the authority on the topic you cover to link to your pages. Notice we said pages, not sites. For example, I have a page that's listed by Yahoo, but it's on an obscure part of the directory that no one else links to, so it doesn't help me as much as that link from Dave Winer's blog.
Ranking trickles down through popular domains with lots of interpage links, raising the value of all pages on a popular site -- and hence any page it links to. This is something all bloggers have realized. For example, let's say a post on my blog gets Slashdotted. Not many Web pages will link to the actual Slashdot post, so you'd think it wouldn't do much for my site's scores. But the value of the many links to Slashdot's home page trickles down through to the navigable links inside the site, and eventually to the posting about my page.
Creating fake domains is a popular trick people use to try to raise their Google scores, hoping to make it appear that other domains are linking to them. The Google guys giggle at this obvious scam: If you understand how vectors work, spreading your pages across multiple domains, or building duplicate sites, does no better than if you'd simply added those pages to your original domain. That's because it's the number of inbound links from elsewhere on the Web that raises your overall score, and it's unlikely that fake domains will make that number go up. Google does make some score adjustments concerning URLs within the same domain to improve the overall results quality, but spreading your pages across ten domains won't do much. And according to Google's anti-spam cop, duplicate domains are the easiest scam to spot.
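To see how a score can trickle down through links, here is a toy version of the link-based ranking idea, computed by repeated passes over a tiny made-up web. It is a simplified sketch and not Google's production system: the damping value of 0.85, the iteration count, the handling of pages without outbound links, and the four-page graph are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each pass with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outbound links spreads its score evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical web: "hub" is linked from several pages and links on to "mine".
links = {"a": ["hub"], "b": ["hub"], "hub": ["mine"], "mine": ["hub"]}
for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(f"{page:5s} {score:.3f}")
```

In this toy graph only one page links to "mine", yet it ends up scoring far above "a" and "b" because that one link comes from the well-linked "hub". That is the trickle-down effect described above, in miniature.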
See? There are a lot of ways to improve your site ranking, and they're all relatively easy. So why on earth would you ever pay someone else to do it?
RESOURCES
WebPosition Gold - Demo available for download.
Links Manager - Free for 30 day trial.