Writing efficient queries with Entity Framework Code First (Part 2)

In this series, we will explore the Social Recipes sample application.  This is a simple application that is intended to show some of the common inefficient queries that can be generated using Entity Framework (EF) Code First.  The application is built using ASP.NET MVC4 and Entity Framework 5.

The application is site that allows users to post, rate, and review recipes, create and join groups, and share recipes with those groups. For more information on the domain, refer to Part 1 - Eager Loading.

Why is this page so slow?

In the last post, we explored the classic n+1 select problem. In this post, we are going to explore another example of the n+1 problem. Spoiler alert…in this example, eager loading is not enough to solve the problem.

Let’s take a look at the Groups page.

This page displays all of the Groups in the database. For each group, we display the number of members and the number of new recipes in that group.  New recipes are recipes that have been posted in the last 2 days.

The controller action for this page is very simple.  All it does it queries all the Groups in the database and passes the results to the view:

public ActionResult Index()
{

return View(_recipeContext.Groups.OrderBy(g => g.Name));
}

The view then iterates over each group and renders a div with the information for each group (including a count of the number of new recipes and a count of the number of users).

@model SocialRecipesMVC4.Domain.Group

<div class="post">
<h3>@Model.Name</h3>
<p class="post-info">
Members (@Model.Users.Count())</p>
<p>@Model.Description</p>
<p class="postmeta">
@Html.ActionLink("View Group", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
|
@Html.ActionLink("New Recipes (" + Model.Recipes.Count(r => r.PostedOn > DateTime.Now.AddDays(-2)) + ")", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
</p>
</div>

This was pretty easy to implement, but as we see when we turn the profiler on, this page is really very slow to load. The sample database only has 100 groups in it, and this page is taking almost 2 seconds to load.

As we dig into the details, we can see that 2 queries are being executed for each group that is displayed. We are running into the same problem as last time, but it is ~2x worse. In order to render the Groups page, we are executing 201 queries (2n+1). Clearly this will not scale well. Let’s see what happens when we apply the eager loading strategy.

public ActionResult Index()
{

return View(_recipeContext.Groups
.Include("Recipes")
.Include("Users").OrderBy(g => g.Name));
}

That reduced the number of queries from 201 to 1. Unfortunately, the eager loading strategy did nothing to help performance.  In fact, the page actually takes longer to load.  We went from ~1.7s to ~1.9s:

What happened?

It might not be obvious at first, but we almost loaded the entire database into memory! By querying all the Groups and including the Recipes and Users, we have loaded everything except the Comments into memory. Imagine a system with hundreds of thousands of Users and Recipes! Every request to the Groups page would take down the server.

Using LINQ Projections

Clearly, we are loading a lot more data than we actually need to render this page. We are going to use LINQ projections to tell Entity Framework to only load the data we need.

Let’s start by defining a class that will hold the data we need to display a single Group on the Groups page. This class is NOT part of our database context. Entity Framework knows nothing about this class.

public class GroupSummaryModel
{
public int Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public int NumberOfUsers { get; set; }
public int NumberOfNewRecipes { get; set; }
}

Next, we clean up our View to expect a GroupSummaryModel instead of a Group.

@model SocialRecipesMVC4.Models.GroupSummaryModel
<div class="post">
<h3>@Model.Name</h3>
<p class="post-info">
Members (@Model.NumberOfUsers)</p>
<p>@Model.Description</p>
<p class="postmeta">
@Html.ActionLink("View Group", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
|
@Html.ActionLink("New Recipes (" + Model.NumberOfNewRecipes + ")", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
</p>
</div>

Finally, we modify the controller action to use a LINQ Projection:

public ActionResult Index()
{

DateTime twoDaysAgo = DateTime.Now.AddDays(-2);
var groupSummaries = _recipeContext.Groups.OrderBy(g => g.Name)
.Select(g => new GroupSummaryModel{
Id = g.Id,
Name = g.Name,
Description = g.Description,
NumberOfUsers = g.Users.Count(),
NumberOfNewRecipes = g.Recipes.Count(r => r.PostedOn > twoDaysAgo)
});

return View(groupSummaries);
}

In the projection, we are telling Entity Framework to query the Groups, but instead of returning instances of Groups, we want instances of GroupSummaryModels. Entity Framework knows nothing about the GroupSummaryModel, but with the projection, we are able to tell Entity Framework how to populate it.

Let’s see if this helped…

Almost miraculously, the page renders in under 50ms and requires only a single query to fetch the data! This is pretty powerful stuff. Now, I am much more confident that the Groups page will scale well as the Social Recipes community grows.

What’s Next?

In the next part of this series, we will look at how to implement paging to avoid loading large lists.

Get the source Social Recipes on GitHub

More Reading

Part 1 - Eager Loading