Microsoft, .NET and Open Source

I recently returned from my first MVP summit and there was one somewhat surprising theme that was repeated in a number of the sessions I attended.

This was one big takeaway for me. The folks at Microsoft really are committed to open source. I’m not talking about the old MSPL type of open source you might have expected from Microsoft a few years ago. I’m talking about full-on open source. Microsoft has open sourced several notable projects under the Apache 2.0 license, including Entity Framework and ASP.NET MVC.

What does this mean?

It means you can (and should) contribute. These projects are fully open source and are actively accepting pull requests. Yes…YOU can contribute to the next release of Entity Framework or ASP.NET MVC. I look at Entity Framework as a great example of openness coming from the developer tools group at Microsoft. Meeting notes, design notes, thoughts on future direction, open issues, and discussion points are all posted on the Entity Framework CodePlex site. They even have a list of issues tagged as ‘UpForGrabs’, making it very easy for community members (that’s us!) to contribute.

Microsoft is also shipping its products using existing open source software. One great example that was referenced more than a few times at the summit is JSON.NET (by James Newton-King). Microsoft had their own JSON library, but JSON.NET was better. They either had to put resources towards improving their own JSON library, or make use of the existing library. They chose wisely and went with the open source JSON.NET. ASP.NET MVC 4 now ships with JSON.NET, and that is awesome. It’s awesome for everyone using MVC and it’s awesome for everyone involved in the open source community.

It gets even better. If Microsoft ships a product that includes an open source library, Microsoft is committed to supporting that open source library. That means if you are using MVC 4 and you are having trouble with jQuery or JSON.NET, you can call Microsoft support for help! There is an important distinction to make here though. For each of the open source projects at Microsoft, there is an open source version and there is an officially packaged and shipped Microsoft version that is built from the open source version. Only the Microsoft version is supported (and signed and all that other stuff).

What should you do?

You should contribute! Either start your own project, contribute to an existing community project, or contribute to a Microsoft project. The support that Microsoft is giving the open source community is very encouraging and it makes me want to be a bigger part of that community. I am now actively browsing through the Entity Framework site to see if there is any way I can contribute. It’s a great time to get involved in the .NET open source community! I have also started contributing to a small open source project called AngelaSmith.

What to expect from Microsoft

Don’t expect to see an open source version of Visual Studio or Windows or Office, but I think we will continue to see more openness from the developer groups at Microsoft (I’m talking about frameworks and libraries, not products like Visual Studio). This openness will come in the form of making more Microsoft libraries open source, as well as continuing to support community-driven open source projects.

Writing efficient queries with Entity Framework Code First (Part 3)

In this series, we will explore the Social Recipes sample application.  This is a simple application that is intended to show some of the common inefficient queries that can be generated using Entity Framework (EF) Code First.  The application is built using ASP.NET MVC4 and Entity Framework 5.

The application is a site that allows users to post, rate, and review recipes, create and join groups, and share recipes with those groups. For more information on the domain, refer to Part 1 - Eager Loading.

Loading too much data

In the last post, we explored using LINQ projections to generate SQL queries that only retrieve the data that is needed to display a particular page. Using LINQ projections, we were able to improve performance substantially. There is, however, one more problem to solve with this Groups page. 

We know the Groups page now performs well when we have 100 Groups in the database. You know what’s cooler than 100 Groups? Let’s see what happens when we have 1,000 groups on the page.

The page is taking 800ms to render. The problem is that we are trying to display ALL the Groups at once. Really, this is NOT a great idea. As the number of groups grows, the page will get slower and slower.

Paging

Luckily, we can implement a strategy called paging using the Skip and Take extension methods.

First, let’s change the controller action to add a page number parameter and use that parameter when we add the Skip and Take methods to our query.

public ActionResult Index(int pageNumber = 0)
{
    ViewBag.PageNumber = pageNumber;
    const int pageSize = 25;
    DateTime twoDaysAgo = DateTime.Now.AddDays(-2);
    var groupSummaries = _recipeContext.Groups.OrderBy(g => g.Name)
        .Select(g => new GroupSummaryModel
        {
            Id = g.Id,
            Name = g.Name,
            Description = g.Description,
            NumberOfUsers = g.Users.Count(),
            NumberOfNewRecipes = g.Recipes.Count(r => r.PostedOn > twoDaysAgo)
        })
        .Skip(pageSize * pageNumber)
        .Take(pageSize);

    return View(groupSummaries);
}
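The Skip and Take arithmetic can be tried out against an in-memory list, with no database involved. The sketch below is only an illustration using LINQ to Objects; the SkipTakeSketch class and GetPage method are hypothetical names and are not part of the Social Recipes source. It assumes the same zero-based page number as the controller action above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only (LINQ to Objects, not Entity Framework).
public static class SkipTakeSketch
{
    // Returns one page of an already ordered sequence.
    // pageNumber is zero-based, matching the controller action.
    public static List<int> GetPage(IEnumerable<int> orderedItems, int pageNumber, int pageSize)
    {
        return orderedItems
            .Skip(pageSize * pageNumber) // step over the earlier pages
            .Take(pageSize)              // keep at most one page of items
            .ToList();
    }
}
```

With 60 items and a page size of 25, page 0 holds items 1 to 25, page 2 holds the remaining 10 items, and page 3 is empty.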

Next, we update the Groups page to have links to the Next and Previous pages. To keep the example simple, I won’t check to see if we should actually be showing the Previous and Next links.

<div>
    @Html.ActionLink("Previous", "Index", new { pageNumber = ViewBag.PageNumber - 1 })
    @Html.ActionLink("Next", "Index", new { pageNumber = ViewBag.PageNumber + 1 })
</div>
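If you do want to decide whether the Previous and Next links should be shown, one possible approach is to compare the page number against the total number of groups. This is a minimal sketch under that assumption; PagingSketch and its methods are hypothetical and not part of the sample application, and the total count would have to come from a separate count query.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the Social Recipes source.
public static class PagingSketch
{
    // Number of pages needed to show totalItems, rounding up.
    public static int PageCount(int totalItems, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        return (totalItems + pageSize - 1) / pageSize;
    }

    // pageNumber is zero-based, so only page 0 has no previous page.
    public static bool HasPrevious(int pageNumber)
    {
        return pageNumber > 0;
    }

    // There is a next page only if the current page is not the last one.
    public static bool HasNext(int pageNumber, int totalItems, int pageSize)
    {
        return pageNumber + 1 < PageCount(totalItems, pageSize);
    }
}
```

With 1,000 groups and a page size of 25 there are 40 pages, so the Next link would be hidden on page 39 and the Previous link on page 0.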

Now, we are back to rendering the Groups page in under 50ms. The nice thing is that no matter how many groups are in the database, the page will always take approximately the same amount of time to load.

When we look at the generated SQL, we see that it uses TOP (25) together with a row_number() filter to load only 25 rows at a time.

DECLARE @p__linq__0 DateTime2 = '2013-02-16T17:08:19'

SELECT TOP (25)
[Project3].[Id] AS [Id],
[Project3].[Name] AS [Name],
[Project3].[Description] AS [Description],
[Project3].[C1] AS [C1],
[Project3].[C2] AS [C2]
FROM ( SELECT [Project3].[Id] AS [Id], [Project3].[Name] AS [Name], [Project3].[Description] AS [Description], [Project3].[C1] AS [C1], [Project3].[C2] AS [C2], row_number() OVER (ORDER BY [Project3].[Name] ASC) AS [row_number]
FROM ( SELECT
[Project2].[Id] AS [Id],
[Project2].[Name] AS [Name],
[Project2].[Description] AS [Description],
[Project2].[C1] AS [C1],
[Project2].[C2] AS [C2]
FROM ( SELECT
[Project1].[Id] AS [Id],
[Project1].[Name] AS [Name],
[Project1].[Description] AS [Description],
[Project1].[C1] AS [C1],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[RecipeGroups] AS [Extent3]
INNER JOIN [dbo].[Recipes] AS [Extent4] ON [Extent4].[Id] = [Extent3].[Recipe_Id]
WHERE ([Project1].[Id] = [Extent3].[Group_Id]) AND ([Extent4].[PostedOn] > @p__linq__0)) AS [C2]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Description] AS [Description],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[GroupUsers] AS [Extent2]
WHERE [Extent1].[Id] = [Extent2].[Group_Id]) AS [C1]
FROM [dbo].[Groups] AS [Extent1]
) AS [Project1]
) AS [Project2]
) AS [Project3]
) AS [Project3]
WHERE [Project3].[row_number] > 0
ORDER BY [Project3].[Name] ASC

What’s Next?

Get the source Social Recipes on GitHub

More Reading

Part 1 - Eager Loading

Part 2 – LINQ Projections

Writing efficient queries with Entity Framework Code First (Part 2)

In this series, we will explore the Social Recipes sample application.  This is a simple application that is intended to show some of the common inefficient queries that can be generated using Entity Framework (EF) Code First.  The application is built using ASP.NET MVC4 and Entity Framework 5.

The application is a site that allows users to post, rate, and review recipes, create and join groups, and share recipes with those groups. For more information on the domain, refer to Part 1 - Eager Loading.

Why is this page so slow?

In the last post, we explored the classic n+1 select problem. In this post, we are going to explore another example of the n+1 problem. Spoiler alert…in this example, eager loading is not enough to solve the problem.

Let’s take a look at the Groups page.

This page displays all of the Groups in the database. For each group, we display the number of members and the number of new recipes in that group.  New recipes are recipes that have been posted in the last 2 days.

The controller action for this page is very simple.  All it does is query all the Groups in the database and pass the results to the view:

public ActionResult Index()
{
    return View(_recipeContext.Groups.OrderBy(g => g.Name));
}

The view then iterates over each group and renders a div with the information for each group (including a count of the number of new recipes and a count of the number of users).

@model SocialRecipesMVC4.Domain.Group

<div class="post">
    <h3>@Model.Name</h3>
    <p class="post-info">
        Members (@Model.Users.Count())</p>
    <p>@Model.Description</p>
    <p class="postmeta">
        @Html.ActionLink("View Group", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
        |
        @Html.ActionLink("New Recipes (" + Model.Recipes.Count(r => r.PostedOn > DateTime.Now.AddDays(-2)) + ")", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
    </p>
</div>

This was pretty easy to implement, but as we see when we turn the profiler on, this page is really very slow to load. The sample database only has 100 groups in it, and this page is taking almost 2 seconds to load.

As we dig into the details, we can see that 2 queries are being executed for each group that is displayed. We are running into the same problem as last time, but it is ~2x worse. In order to render the Groups page, we are executing 201 queries (2n+1). Clearly this will not scale well. Let’s see what happens when we apply the eager loading strategy.
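The 2n+1 count can be reproduced with a small simulation. This is only a sketch: QueryLog, SimulatedGroup, and LazyLoadCostSketch are hypothetical stand-ins, not Entity Framework types, and each property access simply pretends to be one lazy-load round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simulation only: these types are hypothetical stand-ins, not EF classes.
public class QueryLog
{
    public int Count { get; private set; }
    public void Record() { Count++; }
}

public class SimulatedGroup
{
    private readonly QueryLog _log;
    public SimulatedGroup(QueryLog log) { _log = log; }

    // Each access stands in for one lazy-load query for the group's users.
    public IEnumerable<string> Users
    {
        get { _log.Record(); return new[] { "user" }; }
    }

    // Each access stands in for one lazy-load query for the group's recipes.
    public IEnumerable<DateTime> RecipePostDates
    {
        get { _log.Record(); return new[] { DateTime.Now }; }
    }
}

public static class LazyLoadCostSketch
{
    public static int CountQueries(int numberOfGroups)
    {
        var log = new QueryLog();
        log.Record(); // the single query that lists the groups
        var groups = Enumerable.Range(0, numberOfGroups)
            .Select(i => new SimulatedGroup(log))
            .ToList();

        DateTime twoDaysAgo = DateTime.Now.AddDays(-2);
        foreach (var group in groups)
        {
            // The view touches both collections once per group.
            group.Users.Count();
            group.RecipePostDates.Count(d => d > twoDaysAgo);
        }
        return log.Count;
    }
}
```

Calling LazyLoadCostSketch.CountQueries(100) gives 201, matching the number of queries observed for the 100 groups in the sample database.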

public ActionResult Index()
{
    return View(_recipeContext.Groups
        .Include("Recipes")
        .Include("Users")
        .OrderBy(g => g.Name));
}

That reduced the number of queries from 201 to 1. Unfortunately, the eager loading strategy did nothing to help performance.  In fact, the page actually takes longer to load.  We went from ~1.7s to ~1.9s:

What happened?

It might not be obvious at first, but we loaded almost the entire database into memory! By querying all the Groups and including the Recipes and Users, we have loaded everything except the Comments into memory. Imagine a system with hundreds of thousands of Users and Recipes! Every request to the Groups page could take down the server.

Using LINQ Projections

Clearly, we are loading a lot more data than we actually need to render this page. We are going to use LINQ projections to tell Entity Framework to only load the data we need.

Let’s start by defining a class that will hold the data we need to display a single Group on the Groups page. This class is NOT part of our database context. Entity Framework knows nothing about this class.

public class GroupSummaryModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int NumberOfUsers { get; set; }
    public int NumberOfNewRecipes { get; set; }
}

Next, we clean up our View to expect a GroupSummaryModel instead of a Group.

@model SocialRecipesMVC4.Models.GroupSummaryModel

<div class="post">
    <h3>@Model.Name</h3>
    <p class="post-info">
        Members (@Model.NumberOfUsers)</p>
    <p>@Model.Description</p>
    <p class="postmeta">
        @Html.ActionLink("View Group", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
        |
        @Html.ActionLink("New Recipes (" + Model.NumberOfNewRecipes + ")", "Details", "Group", new { id = Model.Id }, new { @class = "readmore" })
    </p>
</div>

Finally, we modify the controller action to use a LINQ Projection:

public ActionResult Index()
{
    DateTime twoDaysAgo = DateTime.Now.AddDays(-2);
    var groupSummaries = _recipeContext.Groups.OrderBy(g => g.Name)
        .Select(g => new GroupSummaryModel
        {
            Id = g.Id,
            Name = g.Name,
            Description = g.Description,
            NumberOfUsers = g.Users.Count(),
            NumberOfNewRecipes = g.Recipes.Count(r => r.PostedOn > twoDaysAgo)
        });

    return View(groupSummaries);
}

In the projection, we are telling Entity Framework to query the Groups, but instead of returning instances of Groups, we want instances of GroupSummaryModels. Entity Framework knows nothing about the GroupSummaryModel, but with the projection, we are able to tell Entity Framework how to populate it.

Let’s see if this helped…

Almost miraculously, the page renders in under 50ms and requires only a single query to fetch the data! This is pretty powerful stuff. Now, I am much more confident that the Groups page will scale well as the Social Recipes community grows.

What’s Next?

In the next part of this series, we will look at how to implement paging to avoid loading large lists.

Get the source Social Recipes on GitHub

More Reading

Part 1 - Eager Loading

Writing efficient queries with Entity Framework Code First (Part 1)

In this series, we will explore the Social Recipes sample application. This is a simple application that is intended to show some of the common inefficient queries that can be generated using Entity Framework (EF) Code First. The application is built using ASP.NET MVC4 and Entity Framework 5 Code First. To make things a little easier, we also use the following NuGet packages:

Install-Package Ninject.MVC3
Install-Package MiniProfiler.MVC3
Install-Package MiniProfiler.EF

The application is a site that allows users to post, rate, and review recipes, create and join groups, and share recipes with those groups.

Our domain model consists of 4 classes: User, Group, Recipe, and Comment

The database context contains 4 DbSets:

public class RecipeContext : DbContext
{
    public DbSet<User> Users { get; set; }
    public DbSet<Group> Groups { get; set; }
    public DbSet<Recipe> Recipes { get; set; }
    public DbSet<Comment> Comments { get; set; }
}

The classic n+1 select problem

Let’s start by exploring the My Recipes page.

This page displays all the recipes that I have posted. Each recipe is displayed on a card, which contains the title, a link to the recipe details, a count of the number of comments on the recipe and other basic information.

The controller for this page queries the context for the current user, then passes the current user’s recipes to the view.

public ActionResult Index()
{
    User currentUser = _recipeContext.Users
        .Single(u => u.Id.ToUpper() == User.Identity.Name.ToUpper());
    return View(currentUser.Recipes);
}

The view then iterates over each recipe and renders a div with the information for each recipe (including a count of the number of comments).

At first glance, this page appears to load in a reasonable amount of time. However, once we turn MiniProfiler on, we see that an unusually large number of SQL queries are executed to render this page.

What we have here is the classic N+1 select problem. First, there is 1 query to retrieve the current User. When we access the Recipes collection on that user, another query is executed to retrieve all the Recipes. This Recipes collection is not loaded until we actually try to iterate over the collection. This pattern is referred to as Lazy Loading. The real problem occurs when we try to get a count of the number of Comments for each Recipe. The Comments collection is also lazy loaded, so now we get an additional query for each Recipe in the collection. Where N is the number of Recipes, we end up with N+1 queries for the Recipes and their Comments, plus 1 more for the initial query that retrieves the User. In our simple example, the My Recipes page resulted in 27 queries. That’s 27 round trips to the database just to display this simple little webpage. Clearly this is not an optimal solution!
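The deferred loading described above can be illustrated with an analogy built on the standard System.Lazy<T> type. This is not how Entity Framework implements lazy loading; LazyUserSketch and its hard-coded recipe names are hypothetical and exist only to show that the load happens on first access rather than when the object is created.

```csharp
using System;
using System.Collections.Generic;

// Analogy only: a hypothetical class, not an EF entity or proxy.
public class LazyUserSketch
{
    private readonly Lazy<List<string>> _recipes;

    // Counts how many times the "query" for recipes has run.
    public int LoadCount { get; private set; }

    public LazyUserSketch()
    {
        // The factory is stored here but not executed yet.
        _recipes = new Lazy<List<string>>(() =>
        {
            LoadCount++; // stands in for one database round trip
            return new List<string> { "Pancakes", "Chili" };
        });
    }

    // The first access triggers the load; later accesses reuse the result.
    public List<string> Recipes
    {
        get { return _recipes.Value; }
    }
}
```

Creating a LazyUserSketch leaves LoadCount at 0. Reading Recipes for the first time raises it to 1, and reading it again leaves it at 1.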

How can we fix it?

Luckily, EF Code First provides a simple extension method called Include() that can be used to control when properties should be Eager Loaded (instead of the default Lazy Loaded). With eager loading, EF can generate a single query to load all the data we need.

Here’s all we need to do:

public ActionResult Index()
{
    User currentUser = _recipeContext.Users
        .Include("Recipes")
        .Include("Recipes.Comments")
        .Single(u => u.Id.ToUpper() == User.Identity.Name.ToUpper());
    return View(currentUser.Recipes);
}

Now, when we access the My Recipes page, we can see that only a single query is executed.

That’s a pretty huge improvement. The page loads almost twice as fast as it previously did, and we have reduced the number of database round trips from 27 to 1. Not bad for just calling an extension method!

What’s Next?

In the next part of this series, we will explore some situations where eager loading can get us into trouble and some more advanced strategies we can use to generate efficient queries with EF Code First.

Get the source – Social Recipes on GitHub

Part 2 - LINQ Projections
Part 3 - Paging

The Developer Movement

If you are a Canadian Resident and you are interested in building apps for Windows 8, Windows Phone 8, or Windows Azure, then you should check out the Developer Movement.  There is no cost to join, and all the developer tools are completely free.

Once you join, you get points for publishing apps and completing challenges.  You can use the points to get some pretty cool stuff.  For example, if you publish your first Windows Phone or Windows 8 app and use Azure, you will get 15,000 points.  That’s enough for a Sony 3D Blu-Ray Disc Player or a couple Xbox games.  Not a bad deal!

Check it out at http://www.developermovement.ca