Making Your Site Crawlable


Because it’s so important that search engines can read your content, it’s key to know what a search engine can and can’t do. There’s not much point building something awesome in the hope of getting organic visits if it can’t physically be ranked.

At the moment, there is still a good number of things that search engine crawlers, such as Googlebot, can’t read. Because I don’t work at Google, I can’t say for certain whether it can or can’t read the items below, but everyone else who doesn’t work at Google seems to be in the same camp as me.

When making your site crawlable, try to avoid building entire pages, or entire sites, with any of the items below if you want a chance of ranking.

What Does a Page Look Like to a Crawler?

A crawler sees a text-only version of each of your webpages. To get a good idea of how search engines see your page, type “cache:” (without the quotation marks) directly before the page URL in Google’s search box. You can do this for any page.

[Screenshot: text-only cache]

Now look at the grey box that has appeared at the top of the page, where you will see ‘text only version’. Click this to see the text version, which shows you whether the content you want crawled is actually being crawled.

[Screenshot: text-only version]
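If you would rather check a page’s HTML yourself, the idea of a text-only view can be roughly approximated in a few lines of Python. This is only a sketch using the standard library: it keeps the text in the markup and throws away anything inside script and style tags. It is not how Google builds its cache, and the class name, function name and sample page are all made up for illustration.

```python
from html.parser import HTMLParser


class TextOnlyParser(HTMLParser):
    """Collects the text in the markup, ignoring <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def text_only(html):
    """Return the text found in the HTML, one chunk per line."""
    parser = TextOnlyParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample page: the heading and paragraph are in the HTML,
# but the sentence written by JavaScript is not.
sample = """
<html><head><title>Demo</title><style>p {color: red}</style></head>
<body><h1>Hello</h1><script>document.write("added by JS")</script>
<p>Plain paragraph</p></body></html>
"""

print(text_only(sample))
```

Anything you expected to see that is missing from the output is worth a closer look, because it is probably being added by a script rather than sitting in the HTML itself.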

Can Search Engines Crawl AJAX?

I have a personal dislike of AJAX, not least because I can only barely get my head around it. It stands for ‘Asynchronous JavaScript and XML’, and I’ve seen a good number of client sites teeming with it in the wrong places.

I’ve found it’s most used with the Google Maps API – Google Maps is probably the most famous AJAX user, although it’s also used in Gmail.

AJAX will often be chosen because it can reduce load time: it works by only delivering content once a user requests it. This makes for a speedy site. However, it also causes problems, because a search engine would have to ‘click’ to request that content before it could read it, and crawlers generally don’t click.

In order to get AJAX ‘crawled’, you need to do a bit of fancy footwork with coding – coding which I’m far too early on in my TreeHouse course to fill you in on. Take a look at Google’s guide to making AJAX applications crawlable.
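As a rough illustration of what that footwork involved, the Python sketch below shows the URL mapping at the centre of Google’s AJAX crawling scheme, on the assumption that this is the guide being referred to. A ‘hashbang’ URL containing #! is rewritten by the crawler into one carrying an _escaped_fragment_ query parameter, and your server is expected to answer the rewritten URL with a ready-made HTML snapshot. The function name and example URLs are made up, the escaping is simplified, and Google has since deprecated the scheme, so treat this as a picture of the idea rather than something to build on.

```python
from urllib.parse import quote


def escaped_fragment_url(url):
    """Map a #! URL to the _escaped_fragment_ form a crawler would request.

    Simplified sketch: URLs without '#!' are returned unchanged.
    """
    base, marker, fragment = url.partition("#!")
    if not marker:
        return url
    # Append to an existing query string if there is one.
    joiner = "&" if "?" in base else "?"
    # Percent-encode characters such as '&' so they stay inside the parameter.
    return f"{base}{joiner}_escaped_fragment_={quote(fragment, safe='=/')}"


# Hypothetical example URLs.
print(escaped_fragment_url("http://www.example.com/ajax.html#!key=value"))
print(escaped_fragment_url("http://www.example.com/ajax.html#!a=1&b=2"))
```

The server-side half, actually generating the HTML snapshot for each rewritten URL, is the harder part and is not shown here.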

Can Search Engines Read JavaScript?

You will find that JavaScript crops up a great deal on websites these days, often used for things like dynamic navigation and exciting stuff like sliders and carousels. However, if your links only exist in JavaScript, search engines won’t be able to crawl them.

To find out what is going on, disable JavaScript in your browser. If the links in a drop-down nav don’t appear any more, that’s going to be a problem.

Or, you can look at a text only cache and see if the links have disappeared.
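The same check can be sketched in code. The Python below, using only the standard library, sorts the links in a chunk of HTML into those with a real href (which a crawler can follow) and those that rely on a javascript: URL, a bare # or a missing href. The rule of thumb it applies is an assumption made for illustration, not a statement of how any search engine classifies links, and the class name, function name and sample navigation are all hypothetical.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Sorts <a> tags into plain links and links that depend on JavaScript."""

    def __init__(self):
        super().__init__()
        self.crawlable = []
        self.script_only = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        is_plain = (
            href
            and not href.startswith("#")
            and not href.lower().startswith("javascript:")
        )
        if is_plain:
            self.crawlable.append(href)
        else:
            # Record whatever the link relies on instead of a real URL.
            self.script_only.append(attr.get("onclick") or href or "(no href)")


def audit_links(html):
    """Return (crawlable, script_only) lists for the given HTML."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.crawlable, parser.script_only


# Hypothetical drop-down navigation.
nav = """
<ul>
  <li><a href="/about">About</a></li>
  <li><a href="#" onclick="openMenu('services')">Services</a></li>
  <li><a href="javascript:go('contact')">Contact</a></li>
</ul>
"""

crawlable, script_only = audit_links(nav)
print("Crawlable:", crawlable)
print("JavaScript only:", script_only)
```

Links that land in the second list are the ones to rebuild with a normal href, even if JavaScript still handles the fancy behaviour on top.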

To turn off JavaScript in Chrome, go into settings and click ‘advanced settings’ at the bottom of the page. Under ‘Privacy’, click ‘content settings…’ and you can turn JavaScript off here.

[Screenshot: turning off JavaScript in Chrome]

Can Search Engines Read Flash?

This has long been a bone of contention for SEO analysts – people seem to be on the fence about whether it’s a good idea to build a site with Flash. However, while it does make sites look awesome, it is dreadful for SEO. Just dreadful.

All that lovely design and stuff just can’t be crawled, and you will find that a text only cache will probably be entirely blank. Only choose Flash if you don’t want the site to rank – it’s often used for subdomains and the like which are an extension of a main site to show off some art.

Can Search Engines Crawl Images?

Can Google crawl images? I’ve asked that question often and got a different answer each time.

Well, the answer is ‘kind of’.

If you have text on an image, Google can’t pull that text from the image. However, Google’s algorithm is able to pull relevant images into a Google image search by crawling the content on the page, and looking at the page title in order to make an educated guess on what the image contains.

For example, I have noticed my cat has started appearing in a Google image search for ‘seo cat’. However, if I wrote ‘Walrus cakes’ on the image and uploaded it as a .JPG, it wouldn’t ever start ranking for ‘Walrus cakes’ unless there were a load of links pointing to it with the exact match anchor text, or I mentioned ‘Walrus cakes’ repeatedly in the page content.

Can Google Crawl iFrames?

At the moment, no. Google Webmaster Forums says of iFrames, “frames can cause problems for search engines because they don’t correspond to the conceptual model of the web.”

This means, to paraphrase, that it’s confusing to crawlers when one page displays multiple URLs thanks to these frames.

“Google tries to associate framed content with the page containing the frames,” it continues, “but we won’t guarantee that we will.”

So take that as a no for now.
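To see how much of a page lives inside frames, you can list the URLs they pull in. The Python sketch below uses the standard library to collect the src of every iframe and frame tag in a page’s HTML; whatever sits at those URLs is content you shouldn’t count on being credited to the page itself. The class name, function name and sample markup are invented for the example.

```python
from html.parser import HTMLParser


class FrameFinder(HTMLParser):
    """Collects the src of every <iframe> and <frame> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def framed_urls(html):
    """Return the URLs that the page loads inside frames."""
    finder = FrameFinder()
    finder.feed(html)
    finder.close()
    return finder.sources


# Hypothetical page with one framed widget.
page = """
<body>
  <h1>Our reviews</h1>
  <iframe src="https://widgets.example.com/reviews" width="600"></iframe>
  <p>Text that sits on the page itself.</p>
</body>
"""

print(framed_urls(page))
```

If the content you most want to rank for only shows up at one of those framed URLs, consider putting it directly on the page instead.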




Written by Sarah Chalk

Sarah is an SEO Account Manager at 360i and has a keen interest in all things SEO. She has also written for a number of sites, including Vue cinema’s film blog and a number of tech websites.
