RSS Feeds and Google Sitemaps for ASP.NET MVC with LINQ To XML

For the majority of personal web sites, two uses of XML are commonplace: creating an RSS feed and a Google Sitemap. Here, I look at how to create both of those using LINQ to XML for an ASP.NET MVC web site.

I've looked at both RSS feeds and Google Sitemaps before, using an XmlTextWriter object to generate the output, which can be a fairly laborious task. LINQ to XML requires far less code to achieve the same thing. Both documents will be generated on demand, that is, as a result of someone requesting the appropriate URL, so that they always incorporate the most up-to-date content as articles are added and amended. The content will be drawn from a database, and the Entity Framework is the data access technology I have chosen.

My full-blown Article class consists of the text of the article and a lot of meta-data that just isn't necessary for RSS or a sitemap. I have already created a couple of classes to cater for the retrieval of a small subset of Article data, so I shall use those. The actual data retrieval is in a class called ArticleRepository, which is in the Model area of the application. Starting with the RSS feed, I have created a method that lists the 20 most recently added items:


public IEnumerable<ArticleSummary> GetRSSFeed()
{
  // Newest articles first, projected to the lightweight summary type, limited to 20.
  return de.ArticleSet
           .OrderByDescending(a => a.DateCreated)
           .Select(a => new ArticleSummary
                          {
                            ID = a.ArticleID,
                            Head = a.Headline,
                            Intro = a.Abstract,
                            CreatedDate = a.DateCreated
                          })
           .Take(20)
           .ToList();
}

I also need to create a Controller to handle this (and the sitemap), so I shall call it XMLController:


using System;
using System.Linq;
using System.Web.Mvc;
using System.Xml.Linq;
using MikesDotnetting.Helpers;
using MikesDotnetting.Models;


namespace MikesDotnetting.Controllers
{
  public class XMLController : Controller
  {
    private IArticleRepository repository;

    public XMLController() : this(new ArticleRepository())
    {

    }

    public XMLController(IArticleRepository rep)
    {
      repository = rep;
    }


  }
}
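
The IArticleRepository interface and the ArticleSummary class are not shown in this article. As a rough sketch only, they might look something like the following. The property types are assumptions inferred from how the properties are used here, the sitemap method is left out, and InMemoryArticleRepository is a hypothetical stand-in included to show how a different implementation can be passed to the controller's second constructor:

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the summary class, inferred from the properties set in GetRSSFeed().
public class ArticleSummary
{
    public int ID { get; set; }              // assumption: the article key is an int
    public string Head { get; set; }
    public string Intro { get; set; }
    public DateTime CreatedDate { get; set; }
}

// Hypothetical interface: only the RSS method is shown in this sketch.
public interface IArticleRepository
{
    IEnumerable<ArticleSummary> GetRSSFeed();
}

// Hypothetical in-memory implementation, e.g. for testing without a database.
public class InMemoryArticleRepository : IArticleRepository
{
    public IEnumerable<ArticleSummary> GetRSSFeed()
    {
        return new List<ArticleSummary>
        {
            new ArticleSummary
            {
                ID = 1,
                Head = "Sample headline",
                Intro = "Sample abstract",
                CreatedDate = new DateTime(2009, 1, 1)
            }
        };
    }
}
```

Anything that implements the interface can be handed to the controller, which is what makes the data access layer swappable.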

This controller makes use of the Repository pattern: rather than instantiating the concrete ArticleRepository inside each action and calling it directly, I am programming against the IArticleRepository interface. If I get fed up with Entity Framework and choose to use LINQ to SQL, or even plain ADO.NET code for data access, I will only have to change the concrete class in one place (the parameterless constructor), rather than go through every method and unhook ArticleRepository from it. So now I need to add a method to generate the RSS feed, which is actually just a streamed XML document. We will look at the code that does that, then examine it:


public ContentResult RSS()
{
  const string url = "http://www.mikesdotnetting.com/Article/Show/{0}/{1}";
  var items = repository.GetRSSFeed();
  var rss = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
    new XElement("rss",
      new XAttribute("version", "2.0"),
        new XElement("channel",
          new XElement("title", "Mikesdotnetting News Feed"),
          new XElement("link", "http://www.mikesdotnetting.com/rss"),
          new XElement("description", "Latest additions to Mikesdotnetting"),
          new XElement("copyright", "(c)" + DateTime.Now.Year + ", Mikesdotnetting. All rights reserved"),
        from item in items
        select
        new XElement("item",
          new XElement("title", item.Head),
          new XElement("description", item.Intro),
          new XElement("link", String.Format(url, item.ID, UrlTidy.ToCleanUrl(item.Head))),
          new XElement("pubDate", item.CreatedDate.ToString("R"))

        )
      )
    )
  );
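  // XDocument.ToString() omits the XML declaration, so prepend it explicitly.
  return Content(rss.Declaration + Environment.NewLine + rss, "text/xml");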
}

If you compare LINQ to XML with using the XmlTextWriter, you can see that a lot less code is required. The savings mainly derive from not having to explicitly close elements in the document, and not having to write out the individual parts of an element using different methods such as WriteElementString(), WriteStartElement() etc. Not only that, but you can almost see the outline of the finished XML document from the code.

The link node within each item element has an odd-looking method applied to item.Head. UrlTidy.ToCleanUrl takes the existing article title and replaces spaces with dashes etc. to give a clean, SEO-friendly URL. The code for the method is available in this previous article (first block of code).

The result of the action is returned as a ContentResult, which allows for any type of data. In this case, the content type is specified as text/xml. I have seen some examples of RSS feeds that use application/rss+xml. This works for a large number of RSS readers, but cannot be relied upon all the time: application/rss+xml is NOT a standard MIME type.
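
To give an idea of what a helper like UrlTidy.ToCleanUrl does, here is a minimal sketch. It is not the code from the earlier article: the class name SlugSketch and the exact replacement rules are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the UrlTidy helper; the real one may behave differently.
public static class SlugSketch
{
    public static string ToCleanUrl(string title)
    {
        if (string.IsNullOrEmpty(title)) return string.Empty;

        // Replace every run of characters that is not an ASCII letter or digit with one dash.
        string slug = Regex.Replace(title.Trim(), @"[^A-Za-z0-9]+", "-");

        // Remove any dash left at the start or the end.
        return slug.Trim('-');
    }
}
```

With these rules, a title such as "RSS Feeds & Google Sitemaps!" becomes "RSS-Feeds-Google-Sitemaps".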

Now to the second action on the controller:


public ContentResult Sitemap()
{
  XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
  const string url = "http://www.mikesdotnetting.com/Article/Show/{0}/{1}";
  var items = repository.GetAllArticleTitles();
  var sitemap = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
      new XElement(ns + "urlset",
          from item in items
          select
          // Child elements need the namespace too, or they are written with xmlns="".
          new XElement(ns + "url",
            new XElement(ns + "loc", string.Format(url, item.ID, UrlTidy.ToCleanUrl(item.Head))),
            item.DateAmended != null ?
                new XElement(ns + "lastmod", String.Format("{0:yyyy-MM-dd}", item.DateAmended)) :
                new XElement(ns + "lastmod", String.Format("{0:yyyy-MM-dd}", item.DateCreated)),
            new XElement(ns + "changefreq", "monthly"),
            new XElement(ns + "priority", "0.5")
            )
          )
        );
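  // XDocument.ToString() omits the XML declaration, so prepend it explicitly.
  return Content(sitemap.Declaration + Environment.NewLine + sitemap, "text/xml");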
}

The main difference between this method and the previous one is the presence of a namespace in the XML. The RSS specification doesn't require one to be present, but the Sitemap 0.9 specification does:


<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

Lots of developers see a namespace in XML and treat it in the same way as attributes, with code similar to this:


string ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

var sitemap = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
    new XElement("urlset",new XAttribute("xmlns", ns),

And then they see an exception message along these lines:

The prefix '' cannot be redefined from '' to 'http://www.sitemaps.org/schemas/sitemap/0.9' within the same start element tag.

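A namespace is not an attribute. It is part of each element's name, so it needs to be defined as an XNamespace object and combined with the local name (ns + "urlset") when the element is created. The same applies to every child element: one created from a plain string belongs to no namespace, and LINQ to XML will write it out with an empty xmlns="" attribute.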
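
As a small standalone illustration of how XNamespace combines with element names, the sketch below builds a one-entry urlset. The class name NamespaceSketch and the example.com URL are placeholders, not part of the site's code.

```csharp
using System;
using System.Xml.Linq;

public static class NamespaceSketch
{
    public static XElement BuildUrlSet()
    {
        // The implicit conversion from string creates the XNamespace.
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // ns + "name" yields an XName in that namespace; it is used for the
        // root element and for every descendant.
        return new XElement(ns + "urlset",
            new XElement(ns + "url",
                new XElement(ns + "loc", "http://www.example.com/")));  // placeholder URL
    }
}
```

Because every element shares the namespace, LINQ to XML declares it once, as the default namespace on the root element, and the children are written without any xmlns attribute of their own.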

Now that the actions have been created, all that is needed is to register a couple of routes to point to them:


routes.MapRoute(
  "Sitemap", "sitemap",
  new { controller = "XML", action = "Sitemap" });
  
routes.MapRoute(
  "RSS", "rss",
  new { controller = "XML", action = "RSS" });

And we are done.

There are some improvements that could be made, such as caching the output or even generating a file that sits on disk. This would save regenerating the sitemap or RSS feed on every request when nothing has changed since the last one.

Also, building on the comments from Jim Wooley and John Sheehan below, the approach I have taken does technically result in the View being built within the Controller action, which is OK if you have a simple requirement. You could almost see LINQ to XML as acting like an HtmlHelper extension method in this case (which you would usually find in Views). However, as Jim points out, passing a strongly typed collection to a View and formatting the results there is a much better way of doing things if you intend to expose multiple feeds. John's approach, using the Argotic Syndication Framework to build a custom ViewResult, is particularly sweet and, like mine, means there is no View in the project.
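
For the caching improvement, one option is ASP.NET MVC's OutputCache action filter. The following is only a sketch: the controller name, the one-hour duration and the placeholder body are assumptions, and the right cache lifetime depends on how often articles are added or amended.

```csharp
using System.Web.Mvc;

// Hypothetical controller, used only to show where the attribute goes.
public class CachedXmlController : Controller
{
    // Serve the cached response for up to one hour (3600 seconds, an arbitrary choice)
    // instead of querying the database and rebuilding the XML on every request.
    [OutputCache(Duration = 3600, VaryByParam = "none")]
    public ContentResult Sitemap()
    {
        // Placeholder body; the real action would build the document with LINQ to XML.
        return Content("<urlset />", "text/xml");
    }
}
```

The trade-off is that a newly added article will not appear in the cached document until the cache entry expires.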
