Entity Framework 4.1: Deep Fetch vs Lazy Load (3)

This is part of a series of blog posts about Entity Framework 4.1.

In this article, I’ll cover the control of what is getting loaded in queries.

EF 4.1 is able to manage relations.  Now which relations get loaded when you do a query?  Every one “visible” to the object?  In some cases that might make sense (e.g. when you query an entity that has only one child entity), but in many cases you would end up loading a good portion of the database, or at least much more data than you would like.

By default, EF 4.1 loads only the entities in the query, but it supports two features to help you control what is loaded:

  • Deep Fetch
  • Lazy Load

—————————-

UPDATE:  Typically, deep fetch is referred to as “Eager loading”.  Sorry for inventing a term!

—————————-

I came up with the name deep fetch; I’m not sure there is an established one for that mechanism.  If there is, feel free to tell me.  Anyhow, this mechanism allows you to specify related entities you would like to be loaded along the way:

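    using (var context = new MyDomainContext())
    {
        // Load the matching orders together with their order details.
        var orders = from o in context.Orders.Include("OrderDetails")
                     where o.CustomerName == "Mac"
                     select o;

        // Enumerate orders here, while the context is still open.
    }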

Here I specify that I want to load certain orders (the ones whose customer name is Mac) and that I want the order details of those orders to be loaded along the way.

You can actually look at the generated SQL query:

Console.WriteLine(orders.ToString());

EF 4.1 doesn’t generate easy-to-read queries, and they quickly become impossible to decipher, but you should be able to read this one and see that the order details are loaded with the orders.

Now this brings up a general issue with deep fetch in EF 4.1:  query efficiency.  If you ask a graduate to write a query that retrieves the orders and their order details, chances are they’re going to write something functionally equivalent to the generated query:  a single join.  If you’re lucky they might be smarter and write a query returning two result sets:  one for the orders and one for the details.  This would be much more efficient, since you wouldn’t repeat all the order information for each order detail.  For some reason EF never supported that.  Probably because it isn’t supported across all database systems, and EF would be forced to generate two SQL queries for one LINQ query, which would probably open a can of worms.  Anyway, keep this in mind, as it can easily get ugly on the performance side.
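To make the repetition concrete, here is a rough sketch that uses plain LINQ to Objects instead of a database.  The JoinShapeDemo class, the FlattenedRowCount method and the Line property are made up for the illustration, and the real SQL that EF generates is shaped differently.  The point is only that a single flattened result set carries one copy of the order columns per order detail:

```csharp
using System;
using System.Linq;

public static class JoinShapeDemo
{
    // Hypothetical helper: builds in-memory orders and details, joins them
    // into one flat result set and returns the number of rows in it.
    public static int FlattenedRowCount(int orderCount, int detailsPerOrder)
    {
        var orders = Enumerable.Range(1, orderCount)
            .Select(id => new { OrderID = id, CustomerName = "Mac" })
            .ToList();

        var details = orders
            .SelectMany(o => Enumerable.Range(1, detailsPerOrder)
                .Select(n => new { o.OrderID, Line = n }))
            .ToList();

        // Every joined row repeats OrderID and CustomerName for its order.
        var rows = from o in orders
                   join d in details on o.OrderID equals d.OrderID
                   select new { o.OrderID, o.CustomerName, d.Line };

        return rows.Count();
    }

    public static void Main()
    {
        // 3 orders with 4 details each: the order columns travel 12 times.
        Console.WriteLine(FlattenedRowCount(3, 4)); // prints 12
    }
}
```

With two result sets, the same data would be 3 order rows plus 12 detail rows, and the order columns would travel only once per order.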

Anyhow, you can request that more than one sub-collection be brought along:

    // Chain Include calls to bring several sub-collections along.
    var orders = from o in context.Orders.Include("OrderDetails").Include("Businesses")
                 where o.CustomerName == "Mac"
                 select o;

The other feature you can use to control what’s brought along is lazy loading.  By default, lazy loading is enabled.  If you want to disable it, you need to do so explicitly.  The best place is the constructor of your db context:

    public MyDomainContext()
    {
        this.Configuration.LazyLoadingEnabled = false;
    }

Now lazy loading works as you would expect:  you request an entity-set, it gets loaded and if you try to access a sub-collection of an entity, it gets loaded on the fly, auto-magically!

How does EF know you’re trying to access a sub-collection?  Your collections are POCO collections (e.g. List<EntityType>), so no events are raised when you access them.  Anyone?  It generates a dynamic class deriving from your entity and overrides your sub-collection properties.  Yes it does.  This is why you need to mark your sub-collection properties with the keyword virtual in order for the magic to operate:

    public class Order
    {
        public int OrderID { get; set; }
        public string OrderTitle { get; set; }
        public string CustomerName { get; set; }
        public DateTime TransactionDate { get; set; }

        // virtual, so that the generated class can override them
        public virtual List<OrderDetail> OrderDetails { get; set; }
        public virtual List<Business> Businesses { get; set; }
    }
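To give an idea of what such a generated class does, here is a hand-written stand-in.  It is only a sketch of the technique:  OrderProxy, the loader delegate and the LazyProxySketch wrapper are hypothetical names, and the class EF really emits at run time is more involved.  The derived class overrides the virtual property and loads the collection the first time the getter is called:

```csharp
using System;
using System.Collections.Generic;

public static class LazyProxySketch
{
    public class OrderDetail
    {
        public int OrderDetailID { get; set; }
    }

    public class Order
    {
        public int OrderID { get; set; }
        public virtual List<OrderDetail> OrderDetails { get; set; }
    }

    // Hypothetical stand-in for the class generated at run time.
    public class OrderProxy : Order
    {
        private readonly Func<int, List<OrderDetail>> _loadDetails;
        private bool _loaded;

        public OrderProxy(Func<int, List<OrderDetail>> loadDetails)
        {
            _loadDetails = loadDetails;
        }

        public override List<OrderDetail> OrderDetails
        {
            get
            {
                // First access: fetch the details, then behave like a plain property.
                if (!_loaded)
                {
                    _loaded = true;
                    base.OrderDetails = _loadDetails(OrderID);
                }
                return base.OrderDetails;
            }
            set
            {
                _loaded = true;
                base.OrderDetails = value;
            }
        }
    }

    public static void Main()
    {
        int queries = 0; // counts simulated trips to the database

        Order order = new OrderProxy(id =>
        {
            queries++;
            return new List<OrderDetail> { new OrderDetail { OrderDetailID = 1 } };
        }) { OrderID = 42 };

        Console.WriteLine(queries);                  // prints 0: nothing loaded yet
        Console.WriteLine(order.OrderDetails.Count); // prints 1: first access loads
        Console.WriteLine(order.OrderDetails.Count); // prints 1: no second load
        Console.WriteLine(queries);                  // prints 1
    }
}
```

Without the virtual keyword the override would not compile, which is the same reason EF cannot intercept a non-virtual property.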

Let’s give some characteristics of the two mechanisms.  For deep fetch:

  • Reduces Latency (it fetches all data in one trip to the DB server)
  • You need to know in advance what you’re going to need and be explicit about it

Lazy Loading:

  • Very forgiving:  since it just loads the data on request, you do not need to plan in advance
  • Could kill performance because of latency (think of a loop on parent entities and lazy load of the children of one parent in the body of the loop)
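The loop scenario can be put into numbers with a small simulation.  This is a sketch with a made-up FakeDb class that only counts round trips; it does not use EF at all, and the method names are assumptions for the illustration.  Lazy loading inside a loop costs one trip for the parents plus one per parent, while deep fetch costs a single trip:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoundTripDemo
{
    // Hypothetical in-memory stand-in for a database that counts queries.
    public sealed class FakeDb
    {
        private readonly Dictionary<int, List<string>> _detailsByOrder;

        public int RoundTrips { get; private set; }

        public FakeDb(int orderCount)
        {
            _detailsByOrder = Enumerable.Range(1, orderCount)
                .ToDictionary(id => id, id => new List<string> { "detail of order " + id });
        }

        public List<int> QueryOrderIds()
        {
            RoundTrips++;
            return _detailsByOrder.Keys.ToList();
        }

        public List<string> QueryDetails(int orderId)
        {
            RoundTrips++;
            return _detailsByOrder[orderId];
        }

        public Dictionary<int, List<string>> QueryOrdersWithDetails()
        {
            RoundTrips++;
            return _detailsByOrder;
        }
    }

    // Lazy-load style: one query for the orders, then one per order in the loop.
    public static int LazyStyleRoundTrips(int orderCount)
    {
        var db = new FakeDb(orderCount);
        foreach (int orderId in db.QueryOrderIds())
        {
            db.QueryDetails(orderId);
        }
        return db.RoundTrips;
    }

    // Deep-fetch style: everything comes back in a single query.
    public static int EagerStyleRoundTrips(int orderCount)
    {
        var db = new FakeDb(orderCount);
        db.QueryOrdersWithDetails();
        return db.RoundTrips;
    }

    public static void Main()
    {
        Console.WriteLine(LazyStyleRoundTrips(100));  // prints 101
        Console.WriteLine(EagerStyleRoundTrips(100)); // prints 1
    }
}
```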

Now, when should we use which mechanism?  I’ll give you my guidelines here; feel free to bring up other ones.  I would use lazy loading except when you have loops with a lazy load in the body of the loop.  Outside of loops, lazy loading might cost 2-3 server queries instead of one, but that is still acceptable, especially given the shortcomings of the deep fetch query mechanism.
