Recently I logged into Google Webmaster Tools and realized that the sitemap for one of my websites had not been updated since I first launched the project years ago. More than likely I launched the site, went to a sitemap generation website, downloaded the results, and left it at that. While that's perfectly fine, it isn't the ideal way of handling sitemaps. For one, the output will usually be in a different format than you might want, and some of the sitemap properties might not come out the way you want them either. And if you have a big website, it's going to take a while for those crawlers to hit every page, and they may even leave some out.
The following is a simple guide to creating your own sitemap in ASP.NET.
A Few Sitemap Guidelines
To start off, here are a few rules to follow if you want to create a good, bot-friendly sitemap for your website. These guidelines are from the Google Webmasters support pages.
- Follow the Sitemap protocol XML schema
- Your sitemap file must be UTF-8 encoded
- Begin with an opening <urlset> tag and end with a closing </urlset> tag
- Your sitemap must contain no more than 50,000 URLs and be no larger than 50MB uncompressed
- A sitemap that exceeds those limits can be split into several smaller sitemaps, listed in a sitemap index file
- Specify the namespace (protocol standard) within the <urlset> tag: xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
- Include a <url> entry for each URL, as a parent XML tag
- Include a <loc> child entry for each <url> parent tag
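To illustrate the size limit in the list above, here is a minimal sketch of splitting a large URL list into batches and building a sitemap index that points at the resulting files. The class name SitemapSplitter, its method names, and the sitemap-1.xml, sitemap-2.xml file naming are assumptions made for this example; it uses LINQ to XML (System.Xml.Linq) rather than the XmlDocument approach shown later in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapSplitter
{
    // Namespace required by the Sitemap protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Protocol limit on the number of <url> entries in one sitemap file.
    public const int MaxUrlsPerFile = 50000;

    // Splits the URLs into ordered batches that each fit in one sitemap file.
    public static List<List<string>> Split(IEnumerable<string> urls, int batchSize = MaxUrlsPerFile)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return urls
            .Select((url, index) => new { url, index })
            .GroupBy(x => x.index / batchSize)
            .Select(g => g.Select(x => x.url).ToList())
            .ToList();
    }

    // Builds a <sitemapindex> document with one <sitemap> entry per child file.
    public static XDocument BuildIndex(string baseUrl, int fileCount)
    {
        var index = new XElement(Ns + "sitemapindex");
        for (int i = 1; i <= fileCount; i++)
        {
            // Hypothetical naming scheme: sitemap-1.xml, sitemap-2.xml, ...
            index.Add(new XElement(Ns + "sitemap",
                new XElement(Ns + "loc", baseUrl.TrimEnd('/') + "/sitemap-" + i + ".xml")));
        }
        return new XDocument(new XDeclaration("1.0", "UTF-8", null), index);
    }
}
```

Each batch returned by Split would be written to its own sitemap file, and the index document is the one you would submit to the search engine.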
Tag Definitions For Your Sitemap
Tag | Required? | Description
<urlset> | Required | Encloses all information about the set of URLs included in the Sitemap.
<url> | Required | Encloses all information about a specific URL.
<loc> | Required | Specifies the URL. For images and video, specifies the landing page (aka play page, referrer page). Must be a unique URL.
<lastmod> | Optional | The date the URL was last modified, in YYYY-MM-DDThh:mmTZD format (the time value is optional).
<changefreq> | Optional | Provides a hint about how frequently the page is likely to change. Valid values are: always (use for pages that change every time they are accessed), hourly, daily, weekly, monthly, yearly, never (use for archived URLs).
<priority> | Optional | Describes the priority of a URL relative to all the other URLs on the site. Under the Sitemap protocol the value ranges from 0.0 (not important at all) to 1.0 (extremely important). It does not affect your site's ranking in Google search results: because the value is relative to other pages on your site, assigning a high priority, or the same priority to every URL, will not help your site's search ranking.
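The value formats in the table above are easy to get subtly wrong, so here is a small sketch of helpers that produce them. The class name SitemapValues and its method names are made up for this example, and the priority check uses the Sitemap protocol's 0.0 to 1.0 range. Formatting with the invariant culture keeps the output stable even if the server runs under a culture that uses a comma as the decimal separator.

```csharp
using System;
using System.Globalization;

public static class SitemapValues
{
    // The change frequency values the protocol accepts (lowercase only).
    private static readonly string[] ChangeFrequencies =
        { "always", "hourly", "daily", "weekly", "monthly", "yearly", "never" };

    // Date-only form of <lastmod>: YYYY-MM-DD.
    public static string FormatLastMod(DateTime date)
    {
        return date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    // Full form of <lastmod> with time and UTC offset, e.g. 2005-01-01T13:30:00-05:00.
    public static string FormatLastMod(DateTimeOffset moment)
    {
        return moment.ToString("yyyy-MM-dd'T'HH:mm:sszzz", CultureInfo.InvariantCulture);
    }

    // Returns true only for an exact match with one of the allowed values.
    public static bool IsValidChangeFreq(string value)
    {
        return Array.IndexOf(ChangeFrequencies, value) >= 0;
    }

    // Formats a priority with one decimal place and a '.' separator.
    public static string FormatPriority(double priority)
    {
        if (priority < 0.0 || priority > 1.0)
            throw new ArgumentOutOfRangeException(nameof(priority));
        return priority.ToString("0.0", CultureInfo.InvariantCulture);
    }
}
```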
Quick Example
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The Code
// Requires: using System; using System.Data.SqlClient; using System.Xml;
public void CreateSitemap()
{
    // Declare an XmlDocument object (from System.Xml)
    XmlDocument doc = new XmlDocument();

    // Create the XML declaration (<?xml version="1.0" encoding="UTF-8"?>).
    // It must be the first node in the file; the document is still empty here.
    XmlDeclaration xmldecl = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
    doc.AppendChild(xmldecl);

    // Create the urlset element and add its namespace attributes
    XmlNode urlset = doc.CreateNode(XmlNodeType.Element, "urlset", "");
    XmlAttribute att = doc.CreateAttribute("xmlns");
    att.Value = "http://www.sitemaps.org/schemas/sitemap/0.9";
    // The image namespace is only needed if you add image entries
    XmlAttribute att2 = doc.CreateAttribute("xmlns:image");
    att2.Value = "http://www.google.com/schemas/sitemap-image/1.1";
    urlset.Attributes.Append(att);
    urlset.Attributes.Append(att2);
I used XmlDocument (in the System.Xml namespace) to hold the sitemap in memory before it is saved to the server, but any approach that produces valid XML works. You could just as easily declare a string variable, stitch all of the data together, and write that out to a file. I wouldn't recommend that method, since you would have to escape special characters such as & yourself, but it is a valid approach. That's the one time-consuming part of working with XML in .NET: picking a solution and sticking with it.
    // Let's assume these cars have their own product pages on a website
    string strSql = "SELECT * FROM Car WHERE Active = 1 ORDER BY ID DESC";
    SqlDataReader reader = null;
    try
    {
        reader = GetDataReader(strSql); // some custom function that returns a data reader
        while (reader.Read())
        {
            // Declare each sitemap tag that we'll be using
            XmlNode node = doc.CreateNode(XmlNodeType.Element, "url", "");
            XmlNode loc = doc.CreateNode(XmlNodeType.Element, "loc", "");
            XmlNode lastmod = doc.CreateNode(XmlNodeType.Element, "lastmod", "");
            XmlNode changefreq = doc.CreateNode(XmlNodeType.Element, "changefreq", "");
            XmlNode priority = doc.CreateNode(XmlNodeType.Element, "priority", "");

            // DateTime object used to format our date later
            DateTime dt = DateTime.Parse(reader["PublishDate"].ToString());

            // Whichever URL format you may have
            loc.InnerText = string.Format("http://www.yoursite.com/{0}/{1}", reader["ID"].ToString(), reader["Name"].ToString());
            lastmod.InnerText = dt.ToString("yyyy-MM-dd"); // date must be in this format to be valid
            changefreq.InnerText = "daily";
            priority.InnerText = "0.9";

            // Append the children in the order the sitemap schema lists them
            node.AppendChild(loc);
            node.AppendChild(lastmod);
            node.AppendChild(changefreq);
            node.AppendChild(priority);
            urlset.AppendChild(node); // every url entry gets appended to our urlset element
        }
        doc.AppendChild(urlset); // urlset is appended to our main document

        // Sitemaps are normally found in the root of your website,
        // but feel free to save the sitemap in whatever location makes sense for your website
        doc.Save(Server.MapPath("~/sitemap.xml"));
    }
    catch (Exception ex)
    {
        // Log or handle the error here; an empty catch block hides failures silently
    }
    finally
    {
        // reader is still null if GetDataReader threw, so check before closing
        if (reader != null)
        {
            reader.Close();
        }
    }
}
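As one of the alternative approaches mentioned above, here is a rough sketch of the same idea using LINQ to XML (XDocument in System.Xml.Linq) instead of XmlDocument. This is not the method used in this article: the SitemapEntry and SitemapBuilder types are hypothetical, and the entries are assumed to have been loaded from the database already. Every element is created in the sitemap namespace directly, and text values such as URLs containing & are escaped when the document is written.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical shape for one sitemap entry.
public class SitemapEntry
{
    public string Url { get; set; }
    public DateTime LastModified { get; set; }
}

public static class SitemapBuilder
{
    // Namespace required by the Sitemap protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds a <urlset> document with one <url> element per entry.
    public static XDocument Build(IEnumerable<SitemapEntry> entries)
    {
        var urlset = new XElement(Ns + "urlset");
        foreach (var entry in entries)
        {
            urlset.Add(new XElement(Ns + "url",
                new XElement(Ns + "loc", entry.Url),
                new XElement(Ns + "lastmod",
                    entry.LastModified.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)),
                new XElement(Ns + "changefreq", "daily"),
                new XElement(Ns + "priority", "0.9")));
        }
        return new XDocument(new XDeclaration("1.0", "UTF-8", null), urlset);
    }
}
```

In a Web Forms page, saving the result could then look like SitemapBuilder.Build(entries).Save(Server.MapPath("~/sitemap.xml")), where entries is whatever collection you filled from your data reader.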
Just run this method whenever you have content updates and you'll have an updated sitemap to call your very own. It's much more rewarding than visiting a website, entering a URL, and hoping to get back accurate results.
Walter Guevara is a Computer Scientist, software engineer, startup founder and previous mentor for a coding bootcamp. He has been creating software for the past 20 years.