Because it’s so important that search engines can read your content, it’s key to know what a search engine can and can’t do. There’s not much point building something awesome in the hope of getting organic visits if it can’t physically be ranked.
At the moment, there are still a good number of things that search engines, like Google’s bot, can’t read. Because I don’t work at Google, I can’t say for certain whether they can or can’t read the items below, but everyone else who doesn’t work at Google seems to be in the same camp as me.
When making your site crawlable, avoid building entire pages, or entire sites, with any of the below if you want a chance of ranking.
What Does a Page Look Like to a Crawler?
A crawler sees a text-only version of each of your webpages. To get a good idea of how search engines see your page, type “cache:” (without the quotation marks) directly before your page’s URL in a Google search. You can do this for any page.
Now find the grey box that has appeared at the top, where you will see ‘text only version’. Click this to see the text version, which shows you whether the content you want crawled is actually being crawled.
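If you want a rough feel for what a text-only view keeps, here is a minimal Python sketch. It is not how Google builds its cache; it simply assumes that tags, scripts and styles are thrown away and the remaining text is kept. The names TextOnly, text_only and SAMPLE are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps the visible text of a page and drops script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore anything inside <script>/<style> and whitespace-only chunks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def text_only(html):
    """Return an approximation of a page's text-only version."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical page used only for illustration.
SAMPLE = """<html><head><title>SEO cat</title><style>p{color:red}</style></head>
<body><h1>My cat</h1><p>She likes <a href="/seo">SEO</a>.</p>
<script>document.write("hidden from the text view")</script></body></html>"""

print(text_only(SAMPLE))
```

If the text you care about is missing from the output, that is a hint it may be missing from a text-only cache too.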
Can Search Engines Crawl AJAX?
I’ve found AJAX is most used with the Google Maps API. Google Maps is probably the most famous AJAX user, although it’s also used in Gmail.
AJAX is often chosen because it can reduce load time: it works by delivering content only once a user requests it. This makes for a speedy site. However, because the search engine would have to ‘click’ to request that content before it could be read, this causes problems.
In order to get AJAX ‘crawled’, you need to do a bit of fancy footwork with coding – coding which I’m far too early on in my TreeHouse course to fill you in on. Take a look at Google’s guide to making AJAX applications crawlable.
Or, you can look at a text only cache and see if the links have disappeared.
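The ‘have the links disappeared?’ check can also be roughed out in Python. This sketch assumes the crawler reads only the HTML the server sends and never runs the JavaScript, which is the situation described above. LinkCollector, links_in_html, missing_links and the sample page are all hypothetical names for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag present in the served HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def links_in_html(html):
    """Return the links visible without running any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def missing_links(html, expected):
    """Return the expected links that are not in the raw HTML."""
    found = set(links_in_html(html))
    return [url for url in expected if url not in found]


# Hypothetical page: one ordinary link, one link added by a script.
SAMPLE = """<html><body>
<a href="/about">About</a>
<div id="menu"></div>
<script>
  var a = document.createElement("a");
  a.href = "/products";
  document.getElementById("menu").appendChild(a);
</script>
</body></html>"""

print(missing_links(SAMPLE, ["/about", "/products"]))
```

Any link this reports as missing only exists once the script has run in a browser, so it is worth checking against the text only cache.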
Can Search Engines Read Flash?
This has long been a bone of contention for SEO analysts – people seem to be on the fence about whether it’s a good idea to build a site with Flash. However, while it does make sites look awesome, it is dreadful for SEO. Just dreadful.
All that lovely design and stuff just can’t be crawled, and you will find that a text only cache will probably be entirely blank. Only choose Flash if you don’t want the site to rank – it’s often used for subdomains and the like which are an extension of a main site to show off some art.
Can Search Engines Crawl Images?
Can Google crawl images? I’ve asked that question often and got a different answer each time.
Well, the answer is ‘kind of’.
If you have text on an image, Google can’t pull that text from the image. However, Google’s algorithm is able to pull relevant images into a Google image search by crawling the content on the page, and looking at the page title in order to make an educated guess on what the image contains.
For example, I have noticed my cat has started appearing in a Google image search for ‘seo cat’. However, if I wrote ‘Walrus cakes’ on the image and uploaded it as a .JPG, it wouldn’t ever start ranking for ‘Walrus cakes’ unless there were a load of links pointing to it with the exact match anchor text, or I mentioned ‘Walrus cakes’ repeatedly in the page content.
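To see which text signals a page actually offers for its images, here is a small Python sketch. It only checks the two things mentioned above, the page title and the page content, and it makes no claim about how Google weighs them. PageSignals, phrase_signals and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Gathers the page title, the remaining page text and the image URLs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.body_text = []
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            self.images.append(dict(attrs).get("src") or "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self.body_text.append(data)


def phrase_signals(html, phrase):
    """Report where a phrase appears as text around the page's images."""
    parser = PageSignals()
    parser.feed(html)
    parser.close()
    needle = phrase.lower()
    return {
        "images": parser.images,
        "in_title": needle in parser.title.lower(),
        "mentions_in_text": " ".join(parser.body_text).lower().count(needle),
    }


# Hypothetical page: the words painted onto cat.jpg are invisible here.
SAMPLE = """<html><head><title>SEO cat</title></head>
<body><p>My SEO cat helps with audits.</p><img src="cat.jpg"></body></html>"""

print(phrase_signals(SAMPLE, "seo cat"))
print(phrase_signals(SAMPLE, "walrus cakes"))
```

Whatever is written on the image itself never shows up in these signals, which is why the surrounding text matters.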
Can Google Crawl iFrames?
At the moment, no. Google Webmaster Forums says of iFrames, “frames can cause problems for search engines because they don’t correspond to the conceptual model of the web.”
This means, to paraphrase, that it’s confusing to crawlers when one page displays multiple URLs thanks to these frames.
“Google tries to associate framed content with the page containing the frames,” it continues, “but we won’t guarantee that we will.”
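To find out which URLs a page pulls in through frames, you could use something like the Python sketch below. It lists the address of each frame so you can see that the framed content lives at a different URL from the page displaying it. FrameFinder, framed_urls and the example.com addresses are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameFinder(HTMLParser):
    """Collects the src of every <iframe> and <frame> tag on a page."""

    FRAME_TAGS = {"iframe", "frame"}

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.FRAME_TAGS:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def framed_urls(page_url, html):
    """Return the absolute URL of each frame embedded in the page."""
    finder = FrameFinder()
    finder.feed(html)
    finder.close()
    return [urljoin(page_url, src) for src in finder.sources]


# Hypothetical page with one relative and one absolute frame address.
SAMPLE = """<html><body>
<p>Find us on the map.</p>
<iframe src="/widgets/map.html"></iframe>
<iframe src="https://maps.example.org/embed"></iframe>
</body></html>"""

for url in framed_urls("https://example.com/blog/post", SAMPLE):
    print(url)
```

Each URL printed is a separate document, so anything you need to rank is safer placed directly on the page itself.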
So take that as a no for now.