Google Sitemap Generation From A Custom SiteMapResult

My previous article showed how to use a custom ActionResult and the classes within WCF to generate an RSS feed. There are no similar classes to help build a valid Google Site Map for an ASP.NET MVC application, so here's how you can build your own.

There are a lot of classes associated with creating feeds within the System.ServiceModel.Syndication namespace, but we actually only need to borrow inspiration from 3 of them for Google sitemaps. I decided to go with creating a high level SiteMapFeed object, which contains a collection of SiteMapFeedItems and a Formatter to render the XML. The SiteMap Protocol defines the elements required for a valid site map.

 

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mikesdotnetting.com/</loc>
    <lastmod>2009-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mikesdotnetting.com/Contact.aspx</loc>
    <priority>0.3</priority>
  </url>
</urlset>

 

You can hopefully see that each <url> node consists of up to 4 elements: a URL, the date the content was last modified, an indication of how often the content at the URL changes, and a "priority". Only the URL is required; the other three are optional. This forms the basis for the SiteMapFeedItem class:

using System;


namespace MikesDotnetting.Models.SiteMap
{
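    public class SiteMapFeedItem
    {
        // 0.5 is also the default priority defined by the protocol.
        private Double _priority = 0.5;

        public Uri Url { get; set; }
        public DateTime LastMod { get; set; }
        // Typed as ChangeFrequency rather than the Enum base class,
        // so only the values the protocol allows can be assigned.
        public ChangeFrequency ChangeFreq { get; set; }
        public Double Priority
        {
            get { return _priority; }
            set
            {
                // Reject anything outside the range the protocol allows.
                if (value < 0.0 || value > 1.0)
                {
                    throw new ArgumentOutOfRangeException("value", "Priority must be between 0.0 and 1.0");
                }
                _priority = value;
            }
        }
    }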
}


The protocol specifies that the Priority value must be between 0.0 and 1.0. I have defaulted it to 0.5, and any attempt to set a value outside that range throws an ArgumentOutOfRangeException. Bearing in mind that the protocol also specifies a limited set of acceptable values for the changefreq element, I chose to make that property an enumeration:

namespace MikesDotnetting.Models.SiteMap
{
    public enum ChangeFrequency
    {
        Always,
        Hourly,
        Daily,
        Weekly,
        Monthly,
        Yearly,
        Never
    }
}

The class for the SiteMapFeed is very simple:

using System.Collections.Generic;

namespace MikesDotnetting.Models.SiteMap
{
    public class SiteMapFeed
    {
        public List<SiteMapFeedItem> Items { get; set; }
    }
}


I could have done without this and just passed a List<SiteMapFeedItem> around, but at some stage I might want to cater for different types of site map. Finally, the formatter:


using System.Linq;
using System.Xml;
using System.Xml.Linq;

namespace MikesDotnetting.Models.SiteMap
{
    public class GoogleSiteMapFormatter
    {
        private SiteMapFeed siteMap;

        public GoogleSiteMapFormatter(SiteMapFeed feedToFormat)
        {
            siteMap = feedToFormat;
        }

        public void WriteTo(XmlWriter writer)
        {
            XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
            // One <url> element per item, all in the sitemap namespace.
            var document = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
                    new XElement(ns + "urlset",
                         siteMap.Items
                         .Select(item => new XElement(ns + "url",
                                  // AbsoluteUri gives the escaped form of the URL.
                                  new XElement(ns + "loc", item.Url.AbsoluteUri),
                                  new XElement(ns + "lastmod", item.LastMod.ToW3CDate()),
                                  // The protocol expects lower-case values such as "monthly".
                                  new XElement(ns + "changefreq", item.ChangeFreq.ToString().ToLowerInvariant()),
                                  new XElement(ns + "priority", item.Priority)
                                )
                              )
                            )
                        );
            document.Save(writer);
        }
    }
}

This one is called a GoogleSiteMapFormatter. I may want to add other formatters specific to other outputs at some stage in the future. The constructor accepts a SiteMapFeed object, which is passed to a private field. When the WriteTo method is invoked, this feed is then iterated over and an XDocument is created according to the specification for the protocol.
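
The sample document near the top of the article leaves out lastmod and changefreq for the Contact page, whereas WriteTo always emits all four elements. One possible way to omit optional elements relies on the fact that XElement ignores null content. This is only a sketch: the OptionalElementsSketch class, the BuildUrl helper and its nullable parameters are hypothetical and not part of the code in this article.

```csharp
using System;
using System.Xml.Linq;

public static class OptionalElementsSketch
{
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Hypothetical helper: builds a single <url> element.
    // Only loc is required; a null argument means "leave that element out".
    public static XElement BuildUrl(Uri url, DateTime? lastMod, string changeFreq, double? priority)
    {
        return new XElement(Ns + "url",
            new XElement(Ns + "loc", url.AbsoluteUri),
            // XElement skips null content, so no empty elements are written.
            lastMod.HasValue
                ? new XElement(Ns + "lastmod", lastMod.Value.ToUniversalTime().ToString("s") + "Z")
                : null,
            changeFreq != null
                ? new XElement(Ns + "changefreq", changeFreq)
                : null,
            priority.HasValue
                ? new XElement(Ns + "priority", priority.Value)
                : null);
    }
}
```

Calling BuildUrl(new Uri("http://www.example.com/Contact"), null, null, 0.3) should produce a <url> element containing only loc and priority. Adopting this in the real formatter would mean making LastMod and ChangeFreq nullable on SiteMapFeedItem, which is a design change I have not made here.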

Following a comment from Greg I amended the lastmod value to provide a valid W3C date via an extension method:


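// Extension methods must live in a static class; the class name here is illustrative.
public static class DateTimeExtensions
{
    public static string ToW3CDate(this DateTime dt)
    {
        return dt.ToUniversalTime().ToString("s") + "Z";
    }
}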

Prior to that, I was generating the value with the ToShortDateString() method. That produced warnings from Google, but did not stop the sitemap from being indexed.
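
To make the format concrete, here is a small self-contained sketch of what the extension method returns. The protocol also accepts a date-only value (YYYY-MM-DD), as in the sample XML at the top of the article, so a date-only variant is included for comparison. The W3CDateExample class and the ToW3CDateOnly method are hypothetical names used for illustration only.

```csharp
using System;
using System.Globalization;

public static class W3CDateExample
{
    // Same body as the ToW3CDate extension method in this article.
    public static string ToW3CDate(this DateTime dt)
    {
        // "s" is the culture-invariant sortable pattern: yyyy-MM-ddTHH:mm:ss
        return dt.ToUniversalTime().ToString("s") + "Z";
    }

    // Hypothetical date-only variant, also valid for <lastmod>.
    public static string ToW3CDateOnly(this DateTime dt)
    {
        return dt.ToUniversalTime().ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    public static string[] Demo()
    {
        // DateTimeKind.Utc keeps the result independent of the server's time zone.
        var amended = new DateTime(2009, 1, 1, 13, 30, 0, DateTimeKind.Utc);
        return new[] { amended.ToW3CDate(), amended.ToW3CDateOnly() };
        // yields "2009-01-01T13:30:00Z" and "2009-01-01"
    }
}
```

One thing to watch: ToUniversalTime treats a DateTime whose Kind is Unspecified as local time, so values read from a database may shift by the server's UTC offset unless their Kind is set explicitly.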

The XDocument is then saved to the XmlWriter object that was passed into the WriteTo method, which is called in a custom ActionResult:

using System;
using System.Web.Mvc;
using System.Xml;
using MikesDotnetting.Models.SiteMap;

namespace MikesDotnetting.Models
{
    public class SiteMapResult : ActionResult
    {
        public SiteMapFeed Feed { private get; set; }

        public override void ExecuteResult(ControllerContext context)
        {
            if (context == null)
            {
                throw new ArgumentNullException("context");
            }
            context.HttpContext.Response.ContentType = "text/xml";

            var siteMapFormatter = new GoogleSiteMapFormatter(Feed);
            using (var writer = XmlWriter.Create(context.HttpContext.Response.Output))
            {
                siteMapFormatter.WriteTo(writer);
            }
        }
    }
}


This works in exactly the same way as the RssResult introduced in my previous article, and just like in the previous article, I added a method to my BaseXmlController that wraps a call to instantiate the SiteMapResult:

protected static SiteMapResult SiteMap(SiteMapFeed feed)
{
    return new SiteMapResult
    {
        Feed = feed
    };
}

The Action method within the XmlController (which inherits from BaseXmlController) looks like this:

public SiteMapResult SiteMapFeed()
{
    const string url = "http://www.mikesdotnetting.com/Article/{0}/{1}";
    var entries = repository.GetAllArticleTitles();

    var feed = new List<SiteMapFeedItem>();
    foreach (var entry in entries)
    {
        var item = new SiteMapFeedItem
                       {
                           ChangeFreq = ChangeFrequency.Monthly,
                           LastMod = entry.DateAmended ?? entry.DateCreated,

                           Url = new Uri(string.Format(url, entry.ID, entry.Head.ToCleanUrl()))
                       };
        feed.Add(item);
    }
    feed.Add(new SiteMapFeedItem
                 {
                     ChangeFreq = ChangeFrequency.Always, 
                     LastMod = DateTime.Now, 
                     Priority = 1.0, 
                     Url = new Uri("http://www.mikesdotnetting.com")
                 });
    feed.Add(new SiteMapFeedItem
                 {
                     ChangeFreq = ChangeFrequency.Never, 
                     LastMod = DateTime.Now, 
                     Priority = 0.3, 
                     Url = new Uri("http://www.mikesdotnetting.com/Contact")
                 });
    feed.Add(new SiteMapFeedItem
                 {
                     ChangeFreq = ChangeFrequency.Yearly, 
                     LastMod = DateTime.Now, 
                     Priority = 0.3, 
                     Url = new Uri("http://www.mikesdotnetting.com/About")
                 });
    var siteMap = new SiteMapFeed { Items = feed };
    return SiteMap(siteMap);
}


In this example, I have retrieved all the articles from the database and built a List<SiteMapFeedItem> from them. I have then added 3 static pages as SiteMapFeedItems as well. For relatively simple sites like this blog, that works well. However, you wouldn't want to recompile the site every time you add a static page to the site map, so it would probably be a good idea to keep the static page information somewhere else, perhaps in the database or in a separate XML file, and read it into SiteMapFeedItems whenever the site map is generated.
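
As a rough sketch of the XML file option, the following reads static page entries into SiteMapFeedItems with LINQ to XML. The file layout (a pages root with page elements carrying url, lastmod, changefreq and priority attributes) and the StaticPageReader class are assumptions made for illustration, and the enum and item class are cut-down stand-ins for the ones defined earlier so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-ins for the article's types, repeated so the sketch is self-contained.
public enum ChangeFrequency { Always, Hourly, Daily, Weekly, Monthly, Yearly, Never }

public class SiteMapFeedItem
{
    public Uri Url { get; set; }
    public DateTime LastMod { get; set; }
    public ChangeFrequency ChangeFreq { get; set; }
    public double Priority { get; set; }
}

public static class StaticPageReader
{
    // Assumed (hypothetical) file format:
    // <pages>
    //   <page url="http://www.mikesdotnetting.com/About" lastmod="2009-01-01"
    //         changefreq="yearly" priority="0.3" />
    // </pages>
    public static List<SiteMapFeedItem> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("page")
            .Select(p => new SiteMapFeedItem
            {
                Url = new Uri((string)p.Attribute("url")),
                // The explicit XAttribute conversions parse XML-style values
                // independently of the server's culture settings.
                LastMod = (DateTime)p.Attribute("lastmod"),
                // ignoreCase: true lets the file use the protocol's lower-case names.
                ChangeFreq = (ChangeFrequency)Enum.Parse(
                    typeof(ChangeFrequency), (string)p.Attribute("changefreq"), true),
                Priority = (double)p.Attribute("priority")
            })
            .ToList();
    }
}
```

In the action method, the three hard-coded feed.Add calls could then be replaced with something like feed.AddRange(StaticPageReader.Parse(xml)), where xml is the content of the file. Where that file lives and how it is loaded is left open here.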