{"id":1205,"date":"2007-09-14T04:17:28","date_gmt":"2007-09-14T12:17:28","guid":{"rendered":"http:\/\/www.krunk4ever.com\/blog\/2007\/09\/14\/sitemap\/"},"modified":"2007-09-14T04:33:56","modified_gmt":"2007-09-14T12:33:56","slug":"sitemap","status":"publish","type":"post","link":"https:\/\/www.krunk4ever.com\/blog\/2007\/09\/14\/sitemap\/","title":{"rendered":"Sitemap"},"content":{"rendered":"<p>So a few days ago, I was playing with my robots.txt and started to do some research. While reading the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Robots.txt\">Wikipedia entry<\/a>, I noticed that I could provide a <a href=\"http:\/\/en.wikipedia.org\/wiki\/Robots.txt#Sitemaps_auto-discovery\">Sitemap<\/a>, which apparently Google, Yahoo!, and MSN would read. A sitemap is basically a list of all your pages so search engine bots don&#8217;t have to slowly crawl to find every page. For some reason, both Yahoo! and MSN\/Live have problems indexing my <a href=\"http:\/\/www.hd-trailers.net\">HD-Trailers.net<\/a> site, so I thought maybe a sitemap would help. Google also has a problem indexing my <a href=\"http:\/\/gallery.krunk4ever.com\/\">Gallery<\/a>, as the indexed count only increases by about 100 pages a week, while still missing 2000+.<\/p>\n<p>So a quick search revealed that both WordPress and Gallery had automatic sitemap generators:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.arnebrachhold.de\/2005\/06\/05\/google-sitemaps-generator-v2-final\">Google Sitemap Generator for WordPress v2 Final<\/a><\/li>\n<li><a href=\"http:\/\/codex.gallery2.org\/Gallery2:Modules:sitemap\">Gallery2:Modules:sitemap<\/a><\/li>\n<\/ul>\n<p>Installing the plug-in\/module was rather simple and enabling either was just a few clicks. After the sitemaps were generated, I used the <a href=\"http:\/\/www.validome.org\/google\/\">Google Sitemap Validator<\/a> to see if there were any problems. Apparently the WordPress plug-in issues a priority of 1 instead of 1.0 which the validator didn&#8217;t like. 
I began looking at the code to see where I could fix it, but it seemed like more hassle than it was worth, as they had some weird calculation converting ints to strings and vice versa. I ended up just setting the homepage to 0.9 in the control, thinking 0.9 isn&#8217;t that much different than 1.0.<\/p>\n<p>Now I had to create a sitemap for my main <a href=\"http:\/\/www.hd-trailers.net\/\">HD-Trailers.net<\/a> page. The <a href=\"http:\/\/www.sitemaps.org\/protocol.php\">protocol<\/a> <a href=\"https:\/\/www.google.com\/webmasters\/tools\/docs\/en\/protocol.html\">documentation<\/a> was pretty helpful, and given that I already had 3 sitemaps as reference guides, I whipped up some code to create the sitemap for the main page.<\/p>\n<p>Reading on, it turns out that robots.txt has to be in the root directory and it only supports 1 sitemap per robots.txt. So given that there&#8217;s both a blog sitemap and the main page sitemap, I needed to merge the sitemaps into one, which wasn&#8217;t too difficult a task.<\/p>\n<p>However, I found out later that there&#8217;s also this sitemap index format which I could&#8217;ve used to point to multiple sitemaps instead of merging them. Maybe I&#8217;ll change that later. For now, it should do its job fine.<\/p>\n<p>After your sitemaps are ready, you can submit them to <a href=\"http:\/\/www.google.com\/webmasters\/\">Google<\/a> and <a href=\"http:\/\/siteexplorer.search.yahoo.com\/\">Yahoo!<\/a>. I couldn&#8217;t find one for MSN\/Live, but maybe they&#8217;ll be able to pick it up from my robots.txt.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So a few days ago, I was playing with my robots.txt and started to do some research. While reading the Wikipedia entry, I noticed that I could provide a Sitemap, which apparently Google, Yahoo!, and MSN would read. 
A sitemap is basically a list of all your pages so search engine bots don&#8217;t have to &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.krunk4ever.com\/blog\/2007\/09\/14\/sitemap\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Sitemap&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[11],"tags":[],"_links":{"self":[{"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/posts\/1205"}],"collection":[{"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/comments?post=1205"}],"version-history":[{"count":0,"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/posts\/1205\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/media?parent=1205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/categories?post=1205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.krunk4ever.com\/blog\/wp-json\/wp\/v2\/tags?post=1205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}