Google Sitemap Generation From A Custom SiteMapResult
There are a lot of classes associated with creating feeds within the System.ServiceModel.Syndication namespace, but we actually only need to borrow inspiration from 3 of them for Google sitemaps. I decided to go with creating a high level SiteMapFeed object, which contains a collection of SiteMapFeedItems and a Formatter to render the XML. The SiteMap Protocol defines the elements required for a valid site map.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mikesdotnetting.com/</loc>
    <lastmod>2009-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mikesdotnetting.com/Contact.aspx</loc>
    <priority>0.3</priority>
  </url>
</urlset>
As you can see, each <url> node consists of up to 4 elements: the URL itself (loc), the date the content was last modified (lastmod), an indication of how often the content at the URL changes (changefreq) and a "priority". This forms the basis for the SiteMapFeedItem class:
using System;

namespace MikesDotnetting.Models.SiteMap
{
    public class SiteMapFeedItem
    {
        private Double _priority = 0.5;

        public Uri Url { get; set; }
        public DateTime LastMod { get; set; }
        // Typed as ChangeFrequency so only the protocol's values can be assigned.
        public ChangeFrequency ChangeFreq { get; set; }

        public Double Priority
        {
            get { return _priority; }
            set
            {
                if (value < 0.0 || value > 1.0)
                {
                    throw new ArgumentOutOfRangeException("Priority", "Priority must be between 0.0 and 1.0");
                }
                _priority = value;
            }
        }
    }
}
The protocol specifies that the Priority value must be between 0 and 1. I have defaulted it to 0.5. Any values outside of that range will throw an exception. Bearing in mind that the protocol also specifies that there are a limited set of values that are acceptable for the changefreq element, I chose to make that property an Enumeration:
namespace MikesDotnetting.Models.SiteMap
{
    public enum ChangeFrequency
    {
        Always,
        Hourly,
        Daily,
        Weekly,
        Monthly,
        Yearly,
        Never
    }
}
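As a rough illustration of how these two constraints behave together, the sketch below uses simplified stand-ins for the item class and the enumeration, plus a small console harness. The `ConstraintDemo` class and its `ToProtocolValue` helper are assumptions made for this example and are not part of the original code.

```csharp
using System;

public enum ChangeFrequency { Always, Hourly, Daily, Weekly, Monthly, Yearly, Never }

// Simplified stand-in for SiteMapFeedItem: only the two constrained properties.
public class SiteMapFeedItem
{
    private double _priority = 0.5;

    public ChangeFrequency ChangeFreq { get; set; }

    public double Priority
    {
        get { return _priority; }
        set
        {
            if (value < 0.0 || value > 1.0)
            {
                throw new ArgumentOutOfRangeException("Priority", "Priority must be between 0.0 and 1.0");
            }
            _priority = value;
        }
    }
}

public static class ConstraintDemo
{
    // Hypothetical helper: turns an enum member into the lower-case text
    // that the changefreq element expects, e.g. Monthly becomes "monthly".
    public static string ToProtocolValue(ChangeFrequency frequency)
    {
        return frequency.ToString().ToLowerInvariant();
    }

    public static void Main()
    {
        var item = new SiteMapFeedItem { ChangeFreq = ChangeFrequency.Monthly };
        Console.WriteLine(ToProtocolValue(item.ChangeFreq)); // monthly

        try
        {
            item.Priority = 1.5; // outside the permitted 0.0 to 1.0 range
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("rejected"); // the setter refused the value
        }
    }
}
```

Because the setter throws before assigning, the item keeps its previous priority after a rejected value.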
The class for the SiteMapFeed is very simple:
using System.Collections.Generic;

namespace MikesDotnetting.Models.SiteMap
{
    public class SiteMapFeed
    {
        public List<SiteMapFeedItem> Items { get; set; }
    }
}
I could have done without this and just passed a List<SiteMapFeedItem> around, but at some stage, I might want to cater for different types of site map. Finally the Formatter:
using System.Linq;
using System.Xml;
using System.Xml.Linq;

namespace MikesDotnetting.Models.SiteMap
{
    public class GoogleSiteMapFormatter
    {
        private SiteMapFeed siteMap;

        public GoogleSiteMapFormatter(SiteMapFeed feedToFormat)
        {
            siteMap = feedToFormat;
        }

        public void WriteTo(XmlWriter writer)
        {
            XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
            var sitemap = new XDocument(
                new XDeclaration("1.0", "utf-8", "yes"),
                new XElement(ns + "urlset",
                    siteMap.Items.Select(item =>
                        new XElement(ns + "url",
                            new XElement(ns + "loc", item.Url),
                            new XElement(ns + "lastmod", item.LastMod.ToW3CDate()),
                            new XElement(ns + "changefreq", item.ChangeFreq.ToString().ToLower()),
                            new XElement(ns + "priority", item.Priority)
                        )
                    )
                )
            );
            sitemap.Save(writer);
        }
    }
}
This one is called a GoogleSiteMapFormatter. I may want to add other formatters specific to other outputs at some stage in the future. The constructor accepts a SiteMapFeed object, which is stored in a private field. When the WriteTo method is invoked, the items in this feed are iterated over and an XDocument is built according to the specification for the protocol.
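One detail of the XDocument construction that is easy to miss is that every child element is created with `ns + "..."`. The sketch below compares that approach with unqualified child names, which LINQ to XML serializes with an empty `xmlns=""` attribute. The `NamespaceDemo` class and the example.com URL are placeholders chosen for this example.

```csharp
using System;
using System.Xml.Linq;

public static class NamespaceDemo
{
    public static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Children share the parent's namespace, so only <urlset> carries an xmlns attribute.
    public static string Qualified()
    {
        var urlset = new XElement(Ns + "urlset",
            new XElement(Ns + "url",
                new XElement(Ns + "loc", "http://www.example.com/")));
        return urlset.ToString(SaveOptions.DisableFormatting);
    }

    // Children are in no namespace, so the serializer has to reset the
    // default namespace on <url> by writing xmlns="".
    public static string Unqualified()
    {
        var urlset = new XElement(Ns + "urlset",
            new XElement("url",
                new XElement("loc", "http://www.example.com/")));
        return urlset.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        Console.WriteLine(Qualified());
        Console.WriteLine(Unqualified());
    }
}
```

Running this prints the two serialized fragments one after the other, which makes the stray attribute in the second one easy to spot.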
Following a comment from Greg I amended the lastmod value to provide a valid W3C date via an extension method:
public static string ToW3CDate(this DateTime dt)
{
    return dt.ToUniversalTime().ToString("s") + "Z";
}
Prior to that, I was generating the value using the ToShortDateString() method, which produced warnings from Google but did not stop the sitemap from being indexed.
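To show what the extension method produces, here is a minimal console sketch. An extension method has to live in a static class; the `DateTimeExtensions` and `W3CDateDemo` class names are assumptions, as the original does not say where the method is declared.

```csharp
using System;

public static class DateTimeExtensions
{
    // Same body as the extension method above: convert to UTC, format with the
    // sortable "s" pattern (yyyy-MM-ddTHH:mm:ss) and append the UTC designator.
    public static string ToW3CDate(this DateTime dt)
    {
        return dt.ToUniversalTime().ToString("s") + "Z";
    }
}

public static class W3CDateDemo
{
    public static void Main()
    {
        // A UTC value is used so the result does not depend on the machine's time zone.
        var posted = new DateTime(2010, 5, 31, 20, 42, 0, DateTimeKind.Utc);
        Console.WriteLine(posted.ToW3CDate()); // 2010-05-31T20:42:00Z
    }
}
```

A DateTime whose Kind is Local or Unspecified is treated as local time by ToUniversalTime(), so the output for such values shifts by the server's UTC offset.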
The XDocument is then saved to the XmlWriter object that was passed into the WriteTo method, which is called in a custom ActionResult:
using System;
using System.Web.Mvc;
using System.Xml;
using MikesDotnetting.Models.SiteMap;

namespace MikesDotnetting.Models
{
    public class SiteMapResult : ActionResult
    {
        public SiteMapFeed Feed { private get; set; }

        public override void ExecuteResult(ControllerContext context)
        {
            if (context == null)
            {
                throw new ArgumentNullException("context");
            }

            context.HttpContext.Response.ContentType = "text/xml";
            var siteMapFormatter = new GoogleSiteMapFormatter(Feed);
            using (var writer = XmlWriter.Create(context.HttpContext.Response.Output))
            {
                siteMapFormatter.WriteTo(writer);
            }
        }
    }
}
This works in exactly the same way as the RssResult introduced in my previous article, and just like in the previous article, I added a method to my BaseXmlController that wraps a call to instantiate the SiteMapResult:
protected static SiteMapResult SiteMap(SiteMapFeed feed)
{
    return new SiteMapResult { Feed = feed };
}
The Action method within the XmlController (which inherits from BaseXmlController) looks like this:
public SiteMapResult SiteMapFeed()
{
    const string url = "http://www.mikesdotnetting.com/Article/{0}/{1}";
    var entries = repository.GetAllArticleTitles();
    var feed = new List<SiteMapFeedItem>();

    foreach (var entry in entries)
    {
        var item = new SiteMapFeedItem
        {
            ChangeFreq = ChangeFrequency.Monthly,
            LastMod = entry.DateAmended ?? entry.DateCreated,
            Url = new Uri(string.Format(url, entry.ID, entry.Head.ToCleanUrl()))
        };
        feed.Add(item);
    }

    feed.Add(new SiteMapFeedItem
    {
        ChangeFreq = ChangeFrequency.Always,
        LastMod = DateTime.Now,
        Priority = 1.0,
        Url = new Uri("http://www.mikesdotnetting.com")
    });
    feed.Add(new SiteMapFeedItem
    {
        ChangeFreq = ChangeFrequency.Never,
        LastMod = DateTime.Now,
        Priority = 0.3,
        Url = new Uri("http://www.mikesdotnetting.com/Contact")
    });
    feed.Add(new SiteMapFeedItem
    {
        ChangeFreq = ChangeFrequency.Yearly,
        LastMod = DateTime.Now,
        Priority = 0.3,
        Url = new Uri("http://www.mikesdotnetting.com/About")
    });

    var siteMap = new SiteMapFeed { Items = feed };
    return SiteMap(siteMap);
}
In this example, I have retrieved all the articles from the database and built a List<SiteMapFeedItem> from them. I have then added 3 static pages as SiteMapFeedItems as well. For relatively simple sites like this blog, that will work well. However, you wouldn't want to keep recompiling the site to add more static pages to the site map, so it would probably be a good idea to keep the static page information somewhere else, perhaps in the database or in an XML file, and read that into SiteMapFeedItems whenever the site map is generated.
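One way the XML file option might look is sketched below. The `<pages>`/`<page>` layout, its attribute names and the `StaticPageLoader` class are all assumptions invented for this example, and the item class is a simplified stand-in without the range check on Priority. The XML is held in a string so the sketch runs on its own; a real site would more likely load it from a file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public enum ChangeFrequency { Always, Hourly, Daily, Weekly, Monthly, Yearly, Never }

// Simplified stand-in for the SiteMapFeedItem class.
public class SiteMapFeedItem
{
    public Uri Url { get; set; }
    public DateTime LastMod { get; set; }
    public ChangeFrequency ChangeFreq { get; set; }
    public double Priority { get; set; }
}

public static class StaticPageLoader
{
    // Hypothetical layout: one <page> element per static page.
    public const string SampleXml =
        "<pages>" +
        "<page url=\"http://www.mikesdotnetting.com/Contact\" changefreq=\"never\" priority=\"0.3\" />" +
        "<page url=\"http://www.mikesdotnetting.com/About\" changefreq=\"yearly\" priority=\"0.3\" />" +
        "</pages>";

    public static List<SiteMapFeedItem> Load(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("page")
            .Select(page => new SiteMapFeedItem
            {
                Url = new Uri((string)page.Attribute("url")),
                // ignoreCase = true lets the lower-case protocol values map onto the enum.
                ChangeFreq = (ChangeFrequency)Enum.Parse(
                    typeof(ChangeFrequency), (string)page.Attribute("changefreq"), true),
                // Invariant culture so "0.3" parses the same way on every server.
                Priority = double.Parse(
                    (string)page.Attribute("priority"), CultureInfo.InvariantCulture),
                LastMod = DateTime.Now
            })
            .ToList();
    }

    public static void Main()
    {
        foreach (var item in Load(SampleXml))
        {
            Console.WriteLine("{0} {1}", item.Url, item.ChangeFreq);
        }
    }
}
```

The list returned by Load could then be appended to the article entries before the SiteMapFeed is created, so adding a static page only means editing the XML.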
Date Posted: 31 May 2010 20:42
Last Updated: 14 June 2010 21:10
Posted by: Mikesdotnetting

Comments
14 June 2010 17:37 from Greg
Great article, a quick question.
I've implemented your solution and have come across a problem. Google Webmaster Tools is saying that the sitemap doesn't properly declare the namespace. The initial Declaration is correct, but it looks like there are empty namespace attributes in the 'url' elements. i.e. (
1) Have you had any problems with submitting your sitemap to google?
2) I'm guessing that the WriteTo method in the SiteMapFormatter is the culprit, any ideas besides building it from scratch?
14 June 2010 18:21 from Mikesdotnetting
@Greg
Google objected to my sitemap too. I have made a couple of amendments to the WriteTo() method, which you correctly identified as the culprit. These are to ensure that the empty namespace declaration doesn't appear in the url element, and to format the date differently.
The good news is that it indexed all the resources but only showed the errors as warnings.
14 June 2010 21:40 from Anthony
Mike,
Could this be adapted for use in a Web Forms application project, where pages are being updated by users? I would like to be able to update the sitemap in response to new content being added.
Anthony :-)
14 June 2010 22:36 from Mikesdotnetting
@Anthony
Sure it can be used in web forms. You could put the SiteMapResult code (with a modification or two) in the ProcessRequest() method of an HttpHandler. The SiteMapFeed() code could go there too.