11 December, 2008

The Secret Behind LINQ To SQL

I finally get it! It's all about expression trees!

It all finally clicked for me when I saw the declaration of IQueryable:

public interface IQueryable : IEnumerable
Type ElementType { get; }
Expression Expression { get; }
IQueryProvider Provider { get; }

See that? It's right there! How could I have been so blind for so long? But I'm getting ahead of myself. Let me tell you about expression trees.

What are expression trees? Well, they're a trees with expressions as nodes. What do they do? They provide a mechanism to convert code into data (expressions). Why is that useful? Because you may want to examine and/or modify you code before execution. In particular, it would be very useful, if we wanted to take say C# code and convert it to, oh... i dunno.., SQL statements.

Say then you created the following Expression:

Expression<Func<int, int, int>> expression =
(a,b) => a + b;

(There's a little magic going on in the above sample: the compiler knows how to take the lambda "(a, b) => a + b" and make a delegate out of it).

The above code would literally translate to a "+" plus node with two children nodes: "a" and "b". Obviously, I've simplified the tree for clarity. If you really want to see the tree structure, fire up VS2008 and take a look at the expression tree with the ExpressionTreeVisualizer plugin.

Anyhow, now that we have our C# code in a tree, we can easily parse that tree and convert the data to a string that SQL can understand. We can then take that string, send it across the wire to our SQL server and voila: we have LINQ to SQL. Cool, huh?

Now, here's something else to think about: IEnumerable offers most of the same methods IQueryable offers, yet the declaration is completely different:

public interface IEnumerable<T> : IEnumerable
IEnumerator<T> GetEnumerator();

Notice how IEnumerable does not have an Expression tree? That's a fundamental difference between the two interfaces. This means that IEnumerable won't do anything with your code but execute it. Therefore, if you order, filter, or project an IEnumerable collection, the action will execute in the process where the collection lives; it will not be sent to a beefy SQL box that can handle ordering large sets easily (Yes, I've made that mistake. In fact, that's what inspired this post).

Turns out my CS professors were right: trees really are useful data structures.


Mike Murray said...

Yeah, I had this same "Aha!" moment a few weeks back:


It was prompted by learning about a cool trick with IQueryable<> to do Lazy Loading with Pipes and Filters in a Repository pattern:


Post a Comment