22 Oct Dynamic Websites – Can Google index them?
There is a popular misconception that dynamic URLs cannot be crawled. This is outdated thinking! Google says it can crawl dynamic URLs and interpret their parameters. About 10 years ago, most search engines would see a URL like mysite.com/index.cfm?category=1&id=23 and, as soon as the spider hit the “?”, ignore everything after it, so only index.cfm would be indexed and not the individual pages. This has not been the case for many years, at least with Google. I believe this is one of the reasons why WordPress and other content management systems have become so popular. Per Google:
Much of the Internet is now dynamically generated, and being able to index that content effectively is one of the key factors in Google’s success. Google will index your dynamic website or web application by following links to the dynamically generated pages. The dynamically generated pages will have links to other dynamically generated pages, allowing Google to “crawl” your dynamically generated content.
This is Google’s official statement: “Google now indexes URLs that contain the “&id=” parameter. So if your site uses a dynamic structure that generates it, don’t worry about rewriting it — we’ll accept it just fine as is.”
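To make the old and new behavior concrete, here is a minimal Python sketch. It is only an illustration, not how any real search engine is implemented: the function names legacy_spider_view and modern_crawler_view are my own, and I have added an http:// scheme to the example URL so the standard library can parse it.

```python
from urllib.parse import urlsplit, parse_qs

# Example URL from this post, with an assumed http:// scheme added.
URL = "http://mysite.com/index.cfm?category=1&id=23"


def legacy_spider_view(url):
    """Mimic an old spider that ignores everything from the '?' onward."""
    return url.split("?", 1)[0]


def modern_crawler_view(url):
    """Keep the path and read each query parameter separately."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)


if __name__ == "__main__":
    print(legacy_spider_view(URL))   # http://mysite.com/index.cfm
    print(modern_crawler_view(URL))  # ('/index.cfm', {'category': ['1'], 'id': ['23']})
```

With the old view, every category and id collapses into the same index.cfm address. With the parameters kept, each combination is a distinct page.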
If that is the case, why do we need to worry about “pretty URLs”, such as changing “mysite.com/?page_id=21” to “mysite.com/services/”? The only real reason is for humans – computers don’t care.
Think about Amazon.com for a moment. Do you really think that someone builds every page of the Amazon website by hand in order to get the search engines to crawl it? Of course not! Here is a typical URL from Amazon:
and here is one from eBay:
If your pages link to one another with ordinary anchor tags such as:
<a href="http://mysite.com/index.cfm?id=1">Page 1</a>
<a href="http://mysite.com/index.cfm?id=2">Page 2</a> …etc.
then any modern search engine should be able to crawl that content.
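Here is a rough sketch of the link-following idea, using only Python's standard library. It is a simplified assumption of how a crawler discovers dynamic pages, not Google's actual method. The class name LinkCollector, the helper extract_links, and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page containing one absolute and one relative dynamic link.
SAMPLE_PAGE = """
<a href="http://mysite.com/index.cfm?id=1">Page 1</a>
<a href="index.cfm?id=2">Page 2</a>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE, "http://mysite.com/index.cfm"):
        print(link)
```

A real crawler would then fetch each discovered URL and repeat the process, which is how links between dynamically generated pages let it reach the rest of the site.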
Are there any downsides to dynamic websites? Of course – many analytics packages (including SmarterStats) ignore dynamic content, which can be frustrating if you are trying to track which pages of your site are getting the most traffic. Many “robot simulators” are old and will give you false readings of what Google sees. You need to watch for duplicate content, so you don’t get penalized by the search engines. And pages that are generated only after a visitor clicks a “submit” button are still not seen by search engines.
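One way duplicate content creeps into a dynamic site is when the same page is reachable under several query strings. The sketch below shows one possible way to spot that by reducing each URL to a canonical form. The list of ignored parameters is purely hypothetical, and the function name canonical_url is my own. Which parameters are safe to ignore depends entirely on your site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change the page content on this example site.
IGNORED_PARAMS = {"sessionid", "sort"}


def canonical_url(url):
    """Lower-case the host, drop ignored parameters, and sort the rest."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


if __name__ == "__main__":
    a = canonical_url("http://MySite.com/index.cfm?id=23&category=1&sessionid=abc")
    b = canonical_url("http://mysite.com/index.cfm?category=1&id=23")
    print(a)       # http://mysite.com/index.cfm?category=1&id=23
    print(a == b)  # True
```

If two different URLs on your site reduce to the same canonical form, they are likely serving the same content and are worth reviewing.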
As I said, I believe that the reason why WordPress and other content management systems have become so popular is the ability of search engines to crawl these sites. Combined with the ease with which website owners can update their own content, the benefits of dynamically generated content outweigh those of static content. But a good menu structure combined with a site map will still help the spiders navigate your site just like humans do, so it is a best practice to assist them when you can.
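If your site does not already produce a site map, here is a minimal sketch of generating one in the XML format described at sitemaps.org. The function name build_sitemap and the URL list are assumptions for illustration. On a real site, the list would come from your database or CMS.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url_element = ET.SubElement(urlset, "url")
        # ElementTree escapes '&' as '&amp;' when the tree is serialized.
        ET.SubElement(url_element, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, mixing a pretty URL and a dynamic one.
EXAMPLE_URLS = [
    "http://mysite.com/services/",
    "http://mysite.com/index.cfm?category=1&id=23",
]

if __name__ == "__main__":
    print(build_sitemap(EXAMPLE_URLS))
```

Note that dynamic URLs are perfectly valid in a site map. The only catch is that the “&” must be escaped in the XML, which the library handles here.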