Parameterized IN clauses with ADO.NET and LINQ

Since my article on using parameters for Preventing SQL Injection In ASP.NET, I have had a few emails asking how to do the same with an IN clause expression. In this entry, I explore a couple of ways to do this.

Just in case you don't know, IN() takes a subquery or expression. The expression is a comma-separated list of values, so the following SQL:

SELECT * FROM Products WHERE CategoryID IN (1,4,8)

will return all Products that have a CategoryID of 1, or 4, or 8. It's a lot simpler to use than a series of OR statements such as

SELECT * FROM Products WHERE CategoryID = 1 OR CategoryID = 4 OR CategoryID = 8

Where it's particularly useful is if you allow the user to choose from a number of categories using CheckBoxes or similar. However, simply doing this:

SELECT * FROM Products WHERE CategoryID IN (@List)
Command.Parameters.AddWithValue("@List", values);

doesn't work, in that @List is a single value. SQL Server doesn't know it should parse whatever @List has as a value into a series of individual values, unless you tell it what to do. There are a large number of SQL Server-specific solutions for this, which are detailed and maintained by SQL Server MVP Erland Sommerskog here. But they all require that you have access to the SQL Server, and can create stored procedures etc. Oh, and that you are not using a database such as Access which doesn't support stored procedures containing control of flow code. What we really want is a solution that can be applied using purely application code.

Recapping the previous article on parameters, a parameterized query consists of a SQL statement (a string) and a collection of Parameter objects. Taking this into account, what we really need to do is construct something like the follow:

SELECT * FROM Products WHERE CategoryID IN (@p1, @p2, @p3)

Since an IN clause contains a variable number of values in its expression, it is impossible to know at design time how many parameters are needed, so these along with the SQL itself need to be constructed at runtime. And once you get your head around that, the rest should fall nicely into place.

I'm going to make this easy and illustrate how to do this with a series of CheckBoxes. Actually, this isn't a series of Checkboxes - the elements are raw HTML inputs of type checkbox:


<input type="checkbox" name="categories" value="1" />Beverages<br />
<input type="checkbox" name="categories" value="2" />Condiments<br />
<input type="checkbox" name="categories" value="3" />Confections<br />
<input type="checkbox" name="categories" value="4" />Dairy Products<br />
<input type="checkbox" name="categories" value="5" />Grains/Cereals<br />
<input type="checkbox" name="categories" value="6" />Meat/Poultry<br />
<input type="checkbox" name="categories" value="7" />Produce<br />
<input type="checkbox" name="categories" value="8" />Seafood<br />

The reason I have used HTML elements instead of CheckBoxes is that you can give all elements the same name attribute, and when they are posted to the server, all checked values are concatenated nicely into a comma delimited list. If you are binding a CheckBoxList server control, you will need to loop through the Items collection and build your own comma separated string. Moving on, in the Button_Click event of this page, the following code appears:


protected void Button1_Click(object sender, EventArgs e)
{
  var connect = WebConfigurationManager.ConnectionStrings["Northwind"].ConnectionString;
  var sql = "SELECT * FROM Products WHERE CategoryID IN ({0}) ORDER BY CategoryID";
  var values = Request.Form["categories"];
  var valuearray = values.Split(',');

  var parms = valuearray.Select((s, i) => "@p" + i.ToString()).ToArray();
  var inclause = string.Join(",", parms);

  using (var conn = new SqlConnection(connect))
  {
    var cmd = new SqlCommand(string.Format(sql, inclause), conn);
    for (var i = 0; i < valuearray.Length; i++)
    {
      cmd.Parameters.AddWithValue(parms[i], valuearray[i]);
    }
    conn.Open();
    var reader = cmd.ExecuteReader();
    GridView1.DataSource = reader;
    GridView1.DataBind();
  }
}

All I have done here is to take the comma-separated string from the Request.Form collection, and then used string.Split() to turn it into an array. Using a touch of LINQ, I've then constructed another array with formatted values, so that the elements now take the shape of @p1, @p2 etc. string.Join() is then used to convert the new array into a comma separated string holding the new values, which is then added to the SQL.

Finally, once the Command object has been declared, the initial array is looped through so that the values themselves can be applied to the parameters that are added to the collection.

If you are using LINQ To SQL, or LINQ with the Entity Framework, the answer is much easier. You simpy use the Contains() extension method:


protected void Button1_Click(object sender, EventArgs e)
{
  var db = new NorthwindDataContext();
  var temp = Request.Form["categories"].Split(',');
  var output = Array.ConvertAll(temp, s => int.Parse(s));

  var query = db.Products
                  .Where(p => output.Contains(p.CategoryID))
                  .OrderBy(p => p.CategoryID);

  GridView1.DataSource = query;
  GridView1.DataBind();
}

The SQL this generates is just as you would expect:

exec sp_executesql N'SELECT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], 
[t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], 
[t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued]
FROM [dbo].[Products] AS [t0]
WHERE ([t0].[CategoryID]) IN (@p0, @p1, @p2)
ORDER BY [t0].[CategoryID]',N'@p0 int,@p1 int,@p2 int',@p0=1,@p1=4,@p2=7

Woo hoo. You gotta love a bit of LINQ. Again, the Request.Form value is converted to an array of strings using the string.Split() method. To work with the type-safety of LINQ to SQL generated classes and their properties, this then needs to be converted to an array of ints. A nice little method Array.ConvertAll(TInput, TOutput) helps magnificently here. If you read the docs, you will see that it takes an array and a delegate pointing to the method that will perform the conversion from one type to the other. As soon as you see that a method takes a delegate as an argument, it's a candidate for a Lambda expression. This one is no different. The lambda takes each element of the string array and uses int.Parse() to convert it to an int.

One final thing to mention. If you try this yourself with the default schema of the Northwind database, you will probably get a complaint from the compiler around the Contains() method. The reason for this is that Products.CategoryID is by default a nullable int - arising from the fact that the default design of the Northwind database allows nulls in that particular column. The compiler is mightily satisfied if you explicitly cast p.CategoryID to an int:


.Where(p => output.Contains((int)p.CategoryID))

Or you can do what I did - amended the database column so that it does not allow nulls, single click CategoryID in the Products class in the LINQ to SQL designer, go to the properties panel and set Nullable to false.