Data Access Choices For WebMatrix

When WebMatrix was originally launched, it spurred a lot of activity from people who do not like seeing raw SQL statements in code files, or who perceived a problem working with dynamic objects. Some of these people threw their hands up in horror and tweeted or blogged their displeasure, while others felt the urge to do something about the "problem" as they saw it, and a number of open source data access projects have resulted. I've had a quick look at three projects to see what they might bring to the party.

First - a little context. WebMatrix data access is based on a library called WebMatrix.Data. It was originally called Microsoft.Data when the whole thing was in Beta. The main feature of WebMatrix.Data is that it works with raw SQL. Depending on which side of the fence you peer over, this is both a "good thing" and a "bad thing". One of the core principals behind the design of the Web Pages framework was to keep the "concept count" low. Microsoft knew through research that ASP.NET could be a forbidding experience to people who know little about server side web development. In the Web Forms framework, there are plenty of new things to learn which are foreign to all other frameworks, such as ViewState, the Page Lifecycle, an eventing model and so on. MVC is an advanced architecture designed for ease of testability, separation of concerns and what have you, which takes a lot of learning if all your sole programming experience up to now has been the odd function in Excel. If you look at classic ASP and PHP, for example, you get a totally different experience. At the simplest level, data access is often performed through direct calls to the database within code files which show a mixture of programming logic and HTML, and for relatively simple sites, this is acceptable. There are vast numbers of "simple" sites out there, and huge numbers of people who only ever want to build such simple sites. They do not need the architectural embellishments that come with MVC. Not every site is a "software engineering project". But the key thing is that if you are just learning how to build dynamic web sites, finding help on SQL is simple. There are millions of examples of SQL all over the web.

On the other hand, complex sites (applications, really) do need to be carefully architected. A lot of bloggers work full-time on such projects, and know the pitfalls of intermixing logic with HTML and inlining SQL in these kinds of projects. They know, for example, that they do not want to have to change code all over the application if a field name changes in the database. And they know that applying automated tests to this type of project is nearly impossible. Some of these people have forgotten (it seems to me) about the other side of web development, or what it was like when they first started out. They think that every web site is indeed a software engineering project, and all should follow "best practice", although what that is depends on your point of view. It's generally agreed, though, that having inline SQL in these types of projects is not a best practice.

The three projects I looked at are:

As of now, they are at varying levels of completeness. But they are all available via the Web Pages Package Manager. If you want to play with them yourself, you need to make sure that you choose "Default (All)" as your Source option. It would be interesting to see if any of the packages make it into the Default listing at some stage....

Each of the packages offers different solutions to different problems. Consequently, each one of them has their own set of features, and either embraces WebMatrix.Data, or replaces it completely. I'll look at Jeremy Skinner's library first.

WebMatrix.Data.StronglyTyped

The concern that Jeremy primarily wanted to address was that WebMatrix.Data returns a Dynamic object. The dynamic type is new in C# 4.0, and offers some huge benefits, but has some downsides. For one thing, you lose any compile time checking and IntelliSense. For most WebMatrix users, this won't be an issue. WebMatrix itself offers no features in this area (in v 1.0), but if you decide to use the Launch button in the main menu to open your site in Visual Web Developer or Visual Studio, a new world of information awaits you. As you hover over objects in code, tooltips give you more information about them, and dropdowns appear which allow you to pick properties and methods which the object is known to expose. When you run a query using WebMatrix.Data, objects are created dynamically from the resulting data. The object is populated with properties which are generated from the schema information that comes with the returned data. This is great, because you can treat database field names as if they are actually properties on a type. But since the properties don't exist until the database call has been made, when you are typing code, no help is available to tell you if the "property" you are currently referencing in your code will actually exist at runtime. Here's a typical WebMatrix.Data example:

@{
    var db = Database.Open("Northwind");
    var sql = "Select * From Products Where CategoryId = @0 Order By ProductName";
    var products = db.Query(sql, 4);
}

@foreach(var product in products){
    @product.ProductName<br />
}

There is a field in the Products table in the database called "ProductName", but when you type product.ProductName, WebMatrix (or the C# compiler, really) doesn't know this. You could just as easily type product.ProductsName (notice the extra "s"), and you wouldn't be told anything was wrong until you ran the page. All you are told at the time you type (in VS) is that ProductsName is a dynamic expression which will be resolved at runtime. WebMatrix.Data.StronglyTyped is a wrapper around WebMatrix.Data which enables you to convert the dynamic type to a strongly typed object. You need to define your strongly typed object in a class, but that is simple. Here's one that shows a Product class definition to match the database table:

using System;

/// <summary>
/// Summary description for Product
/// </summary>
public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public int SupplierID { get; set; }
    public Category Category { get; set; }
    public string QuantityPerUnit { get; set; }
    public decimal UnitPrice { get; set; }
    public Int16 UnitsInStock { get; set; }
    public Int16 UnitsOnOrder { get; set; }
    public Int16 ReorderLevel { get; set; }
    public bool Discontinued { get; set; }

}

Now when you use WebMatrix.Data.StronglyTyped, you can project the result of the query into a series of Product objects:

@{
    var db = Database.Open("Northwind");
    var products = db.Query<Product>("Select * From Products Where CategoryID = @0", 4);
}
        

@foreach(var product in products){
    @product.ProductName<br />
}

As you type the dot after product, IntelliSense provides a list of properties you have defined in the class that you may use. No more runtime errors resulting from typos. But there is another thing this library helps with. Let"s assume that you want to flag up all products that have fewer units in stock than their reorder level. Using WebMatrix.Data, there are a couple of ways to do this. You could change the SQL to perform a calculation on the field values and return an additional column of data which would become an additional property on your dynamic object - depending on the functions that your database supports. Or you could put a conditional in your Razor code:

 
@foreach(var product in products){
    if(product.UnitsInStock < product.ReorderLevel){
        @product.ProductName<br />
    }
}

Now that's OK, but what if the condition becomes more complex, or you need to apply this test in multiple pages? Your Razor code will start to become less and less manageable. WebMatrix.Data.StronglyTyped answers this problem because it allows you to define additional properties on your object, so you can enhance your Product class with other properties:

public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public int SupplierID { get; set; }
    public Category Category { get; set; }
    public string QuantityPerUnit { get; set; }
    public decimal UnitPrice { get; set; }
    public Int16 UnitsInStock { get; set; }
    public Int16 UnitsOnOrder { get; set; }
    public Int16 ReorderLevel { get; set; }
    public bool Discontinued { get; set; }
    public bool ShouldReorder { 
       get { return UnitsInStock < ReorderLevel ? true : false; } 
    }

 

When Product objects are created, they now have a ShouldReorder property which will be true or false, depending on the comparison between two other fields. And of course, that property appears in IntelliSense. If, after a while, you also want to take account of the number of units on order to determine whether the product has reached the point where a new order should be made, you only need to alter your code in one place.

WebMatrix.Data.StronglyTyped doesn't alter the fact that you are using WebMatrix.Data. You don't really need to learn anything new to use it. You still use SQL and the standard WebMatrix.Data.Database methods. That means that relationships can be navigated using familiar Joins in SQL. Notice that the Product class has a Category property? That is defined in a separate class:

/// <summary>
/// Summary description for ClassName
/// </summary>
public class Category
{
    public int CategoryID { get; set; }
    public string CategoryName { get; set; }
    public string Description { get; set; }
    public byte[] Image { get; set; }
    public int TotalProducts { get; set; }
}

This mirrors the Categories table in the database, but features an additional property called TotalProducts. If you wanted to get the total number of products by category, you simply do this:

@{
    var db = Database.Open("Northwind");
    var categories = db.Query<Category>(@"SELECT CategoryName, Count(Products.CategoryID) AS TotalProducts FROM Products
                    INNER JOIN Categories ON Categories.CategoryID = Products.CategoryID GROUP BY CategoryName");
}

Writing the results to the browser is equally simple:

@foreach(var category in categories){
    @:@category.CategoryName (@category.TotalProducts)<br />
}

Massive

Rob Conery has approached things slightly differently. The Massive library (it's one C# code file) doesn't use WebMatrix.Data. It replaces it. It is based on the Active Record pattern, which treats the underlying database tables as representing objects with properties based on the column names. The properties on the objects are generated dynamically when data is retrieved from the database. There is no formal definition of the object as was the case for WebMatrix.Data.StronglyTyped. So far, there is little difference between Massive and WebMatrix in terms of how objects and properties are generated, but Massive also provides a number of methods on those objects, such as All, Insert, Update, Delete and so on, which is much more characteristic of the Active Record pattern.

When using Massive, you need to define a connection string in your web.config file. The WebMatrix team have done a lot to shield beginners from web.config files, but anyone who has used one of the other flavours of ASP.NET will be familiar with them. They contain general settings and configurations for your application. When you add a new file to WebMatrix, pick the "Suggested" or "All" option and find Web.Config (4.0). This is what adding a connection string looks like. Notice that the connection string has been given a name:

<?xml version="1.0"?>
<configuration>
  <system.web>
    <compilation debug="true" targetFramework="4.0"/>
  </system.web>
  <connectionStrings>
    <add name="northwind" connectionString="Data Source=|DataDirectory|Northwind.sdf;" providerName="System.Data.SqlServerCe.4.0" />
  </connectionStrings>
</configuration>

The |DataDirectory| part is a substitution string which ASP.NET knows points by default to the App_Data folder within a web site.

You need to set up something else for Massive, and that is a definition for the objects that map to database tables. The following code maps to the Products table:

using Massive;
public class Products : DynamicModel {
    public Products() : base("northwind") {
        PrimaryKeyField = "ProductID";
    }
}

This is not as involved as WebMatrix.Data.StronglyTyped. It maps the primary key field, and passes the name of the connection string to Massive. As I type, there appears to be an update to Massive which means that the connection string won't be required. Massive will look for the first connection string in your web.config file, but that change hasn't been pushed up to NuGet yet. Your Products class inherits from DynamicModel (that's what the colon is for), which is what actually the object that contains definitions for the Insert, All, Update etc methods. In the first example, we got all products with a CategoryID of 4 using SQL. You can do exactly the same with Massive by using its Query method:

@{
    var Products = new Products();
    var products = Products.Query("Select * From Products Where CategoryId = @0 Order By ProductName", 4);
}

Alternatively, you can use The All method and pass in some optional parameters:

@{
    var Products = new Products();
    var products = Products.All(where: "CategoryID = @0", args: 4);
}

This will construct the SQL dynamically, using parameters to prevent any chance of the threat of SQL injection. The All method will return the entire table, so it's the equivalent of SELECT *. But each of the methods are available through IntelliSense, which can help save some time once you have learned what the methods do and what parameters they accept. There aren't that many, so it shouldn't take long. You can always fall back on an SQL statement if you only wanted to return some columns. Again, the latest update will introduce another parameter: columns, where you can pass just the columns that you would like to return. The return type is dynamic, so you get no IntelliSense or compile time type checking in Visual Studio. If you wanted to return the total number of products in each category, SQL is probably the best way to go:

@{
    var Categories = new Categories();
    var categories = Categories.Query(@"SELECT CategoryName, Count(Products.CategoryID) AS TotalProducts FROM Products
                    INNER JOIN Categories ON Categories.CategoryID = Products.CategoryID GROUP BY CategoryName");
}

Massive also supports transactions, which means that if you want to execute multiple commands consecutively, but only want them to succeed if all succeed, you can do so.

There seems little point in using Massive if you are happy using SQL. It's primary attraction is the set of methods that translate to SQL commands behind the scenes. If you want to move away from using SQL directly, it's a nice little tool.

Simple.Data

I'm going to start off by pointing out that this library is in its early stages, having just reached version 0.5. It is not complete. The library again follows the Active Record pattern, and attempts to provide a relatively easy way for beginners to access data without plugging SQL in their code, which Mark feels is just "wrong". It seems that Mark mistakenly believed that WebMatrix.Data did not offer any support for parameters (which is does), but despite that, it is possible to use WebMatrix in an unsafe way if you concatenate user input directly into your SQL. The result of Mark's work is a separate data access framework, which like Rob's doesn't rely on WebMatrix.Data.

Simple.Data works with dynamic types just like Massive and WebMatrix.Data. It is likely to appeal more to people who migrate from PHP to ASP.NET, in that they may be more familiar with the Active Record pattern and how it works. There are no options to fall back on SQL. Simple.Data supports a growing range of databases, not just SQL Server and SQL CE, and as a lightweight data access framework, it holds a fair amount of promise. I feel though that it is unlikely to appeal to people who know nothing about databases and SQL.

Simple.Data needs a connection string. Again, this can be defined in web.config, but must have a specific name (at least, at the moment):

<?xml version="1.0"?>
<configuration>
  <system.web>
    <compilation debug="true" targetFramework="4.0"/>
  </system.web>
  <connectionStrings>
    <add name="Simple.Data.Properties.Settings.DefaultConnectionString" 
         connectionString="Data Source=<full path to db>\Northwind.sdf;" 
         providerName="System.Data.SqlServerCe.4.0"/>
  </connectionStrings>
</configuration>

Alternatively, you can provide the connection string in the OpenFile method:

@using SD = Simple.Data;
@{
    var db = SD.Database.OpenFile(Server.MapPath("~/App_Data/Northwind.sdf"));
}

If you are using WebMatrix, WebMatrix.Data is referenced by default. That introduces a bit of a problem in that Simple.Data also defines a Database class. Therefore, any reference to the Database class will be ambiguous. The easiest way to resolve this is the reference Simple.Data using an alias. I've used SD as my alias for Simple.Data. You could of course simply use the fully qualified namespace: Simple.Data.Database.OpenFile() but that can become boring to type after a while. Beyond that, no configuration (class definition files) is required.

The Database class has an All method, just like Massive, but it also has a variety of Find methods: FindAll, FindBy, FindAllBy. The last of these allows you to append the filed name which acts as a filter, and pass in the value. It's like this that Simple.Data helps to avoid any SQL injection problems. Here's one way to get all products with a CategoryID of 4:

@{
    var db = SD.Database.OpenFile(Server.MapPath("~/App_Data/Northwind.sdf"));
    var products = db.Products.FindAllByCategoryID(4);
}
@foreach(var product in products){
    @product.ProductName<br />
}

Simple.Data also features a number of other commands for inserting, deleting and updating data. It is also capable of traversing relationships which are defined by foreign keys. Getting the count of products in each category is not as clear cut as it would be using SQL, and I found that the only way I could work out how to do this is not good on performance:

@{
    var categories = new List<string>();
    products = db.Products.All();
    foreach(var product in products){
        foreach (var category in product.Categories){
           categories.Add(category.CategoryName);
        }
    }
    var result = categories.GroupBy(c => c).Select(g => new { Category = g.Key, Count = g.Count() });

    foreach(var item in result){
        @:@item.Category (@item.Count)<br />
    }
}

I hope I've got this badly wrong, because what happens is that all products from the Products table are retrieved, and then the Category Name associated with the Product's CategoryID is retrieved within a loop, meaning that a new database query is fired for every product. All of the Category names are loaded in a List, and then that's subjected to some Linq to obtain the de-duplicated category names along with the total for each one.

So which should you use? If you are just starting out, I would strongly recommend that you use the standard WebMatrix.Data library. WebMatrix.Data.StronglyTyped certainly has a place if you want to move towards building your application around objects. Massive is a nice little utility but seems to be more of an experiment by the author around the whole C# dynamics thing. It will be interesting to see if and how it develops. Finally, Simple.Data promises a lot, but is not ready for Prime Time yet. According to Mark Rendle's blog, Simple.Data appears to have started out as a proof of concept and probably wasn't intended to grow as much as it has. A huge amount of work has already gone into Simple.Data, and it will be interesting again to see if this is developed further and in what way.

A download containing samples for each library, including WebMatrix.Data is available here