.NET Tools How-To's

Getting Started With Entity Framework Core 5

With the .NET 5 release comes the newest Entity Framework Core version, unsurprisingly called Entity Framework Core 5 (EF Core 5). The .NET community got its first glimpse of Entity Framework (EF) on August 11th, 2008, with the release of .NET Framework 3.5 Service Pack 1. Let’s say a few things have changed since the initial release. For those unfamiliar with the technology, Entity Framework is an object-relational mapper (ORM), helping developers overcome the impedance mismatches between C# and a database instance’s schema.

Entity Framework has come a long way since its original inception. Conceived initially as a designer-centric tool heavily reliant on Visual Studio’s graphical user interface and SQL Server, EF’s current philosophy recognizes the need to be friendlier to ideas like distributed source-control, build servers, multiple database engines, and operating systems.

In this post, we’ll see how we can get started with EF Core 5. We’ll design a schema for entertainment productions using a code-first approach, migrate an SQLite database to our schema’s current incarnation, seed our database, and run some typical LINQ queries.

Getting started

To get started, I recommend folks install the latest version of the .NET 5 SDK. Technically, EF Core 5 can run on .NET Core 3.1, but aligning versions is always a good idea.

Starting with a brand new console application, we will need to install the following packages, making sure that all the versions are 5.0.0:

  • Microsoft.EntityFrameworkCore
  • Microsoft.EntityFrameworkCore.Design
  • Microsoft.EntityFrameworkCore.Sqlite

We can use the NuGet tool window to find and install all these packages into our project.

Additionally, we will want to install the dotnet-ef command-line tool (CLI). From within the newly created project directory type the following commands into Rider’s integrated terminal.

> dotnet new tool-manifest
> dotnet tool install dotnet-ef

When we finish setting up our project, we can run the dotnet-ef command, where the EF unicorn will greet us.

> dotnet ef

Great! We’re now ready to start modeling our problem domain.

Folks who want to follow along can clone a sample project from GitHub.

Code-first modeling

Every EF Core solution will have at least one DbContext implementation. What is DbContext? The DbContext is a specialized base class that tracks our in-memory operations, allowing us to write and execute queries, track changes that we can persist back to our database, and interact with our database engine indirectly or directly in exceptional circumstances.

To get started using EF Core 5, we need to define a class that derives from DbContext and contains our models. Typically, our classes will map one to one with our database tables, and relationships between tables are defined using navigational conventions. We’ll see how this works as we get into defining our classes.

Before diving into C# code, let’s discuss our topic. We’ll keep the subject simple, but we’ll also explore some important EF Core concepts. Our subject matter is Entertainment, both movies and television, which we’ll call Productions. A Production has characters and ratings. An Actor will always portray a Character. I hope as folks read these sentences, they start to formulate the relationships in their mind. Let’s look at our models, starting with a Production.

Table-per-hierarchy and inheritance

We had mentioned that both a Movie and a Series could be a Production. EF Core 5 applies an approach known as Table-per-hierarchy (TPH), where we store hierarchically related entities in a single table.

Each record’s type is distinguished by a Discriminator column, which usually holds a simplified C# type name. The Discriminator column is invisible to developers from C#, but we can see it in the database schema generated by EF Core 5.

public abstract class Production
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime Release { get; set; }
}

public class Movie : Production
{
    public int DurationInMinutes { get; set; }
    public double WorldwideBoxOfficeGross { get; set; }
}

public class Series : Production
{
    public int NumberOfEpisodes { get; set; }
}

The discriminator is an important mechanism that we’ll see used later in this post’s query section.

Relationships

When thinking about relational databases such as SQLite, SQL Server, PostgreSQL, and Oracle, we need to consider three kinds of relationships between our tables:

  1. Many to many
  2. One to many
  3. One to one

We have a many to many relationship between our Production and the Actors. We express this relationship through the Character entity. We also have a one to many relationship, where a Production can have many instances of a Rating.

public abstract class Production
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime Release { get; set; }

    public List<Character> Characters { get; set; } = new List<Character>();
    public List<Rating> Ratings { get; set; } = new List<Rating>();
}

public class Movie : Production
{
    public int DurationInMinutes { get; set; }
    public double WorldwideBoxOfficeGross { get; set; }
}

public class Series : Production
{
    public int NumberOfEpisodes { get; set; }
}

public class Rating
{
    public int Id { get; set; }
    public int ProductionId { get; set; }
    public Production Production { get; set; }
    public string Source { get; set; }
    public int Stars { get; set; }
}

public class Character
{
    public int Id { get; set; }
    public int ProductionId { get; set; }
    public Production Production { get; set; }
    public string Name { get; set; }
    public Actor Actor { get; set; }
    public int ActorId { get; set; }
}

public class Actor
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Character> Characters { get; set; } = new List<Character>();
}

Looking over our models, we can see some conventional patterns emerge. Naming is an essential part of EF Core modeling, where we define relationships by the Type of property definitions and their names. Let’s look at the Character definition.

We can see that the Character class has a one to one relationship to both an Actor entity and a Production entity. EF Core uses the naming convention of [Entity]Id to map values to the in-memory object instance. Developers can override EF Core conventions, but I recommend folks starting with EF Core use the default conventions.

Great, now that we have built our entities and relationships, let’s add them to our DbContext instance, which we’ll call EntertainmentDbContext. We’ll also want to define our database connection string for demo purposes.

 public class EntertainmentDbContext : DbContext
    {
        protected override void OnConfiguring(DbContextOptionsBuilder options)
            => options.UseSqlite("Data Source=entertainment.db");

        public DbSet<Production> Productions { get; set; }
        public DbSet<Movie> Movies { get; set; }
        public DbSet<Series> Series { get; set; }
        public DbSet<Rating> Ratings { get; set; }
        public DbSet<Character> Characters { get; set; }
        public DbSet<Actor> Actors { get; set; }
    }

    public abstract class Production
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public DateTime Release { get; set; }

        public List<Character> Characters { get; set; } = new List<Character>();
        public List<Rating> Ratings { get; set; } = new List<Rating>();
    }

    public class Movie : Production
    {
        public int DurationInMinutes { get; set; }
        public double WorldwideBoxOfficeGross { get; set; }
    }

    public class Series : Production
    {
        public int NumberOfEpisodes { get; set; }
    }

    public class Rating
    {
        public int Id { get; set; }
        public int ProductionId { get; set; }
        public Production Production { get; set; }
        public string Source { get; set; }
        public int Stars { get; set; }
    }

    public class Character
    {
        public int Id { get; set; }
        public int ProductionId { get; set; }
        public Production Production { get; set; }
        public string Name { get; set; }
        public Actor Actor { get; set; }
        public int ActorId { get; set; }
    }

    public class Actor
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public List<Character> Characters { get; set; } = new List<Character>();
    }
}

Seeding initial data

There are multiple ways to insert data into our database, but it’s a good idea to use the data seed mechanism of EF Core for demos and necessary look-up values. In EF Core, we can override the OnModelCreating method on our EntertainmentDbContext.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    #region seed data

    var movies = new Movie[] {
        new Movie { Id = 1, Name = "Avengers: Endgame", WorldwideBoxOfficeGross = 2_797_800_564, DurationInMinutes = 181, Release = new DateTime(2019, 4, 26) },
        new Movie { Id = 2, Name = "The Lion King", WorldwideBoxOfficeGross = 1_654_791_102, DurationInMinutes     = 118, Release = new DateTime(2019, 7, 19) },
        new Movie { Id = 3, Name = "Ip Man 4", WorldwideBoxOfficeGross = 192_617_891, DurationInMinutes = 105, Release = new DateTime(2019, 12, 25) },
        new Movie { Id = 4, Name = "Gemini Man", WorldwideBoxOfficeGross = 166_623_705, DurationInMinutes = 116, Release = new DateTime(2019, 11, 20) },
        new Movie { Id = 5, Name = "Downton Abbey", WorldwideBoxOfficeGross = 194_051_302, DurationInMinutes = 120, Release = new DateTime(2020, 9, 20 )}
    };

    var series = new Series [] {
        new Series { Id = 6 , Name = "The Fresh Prince of Bel-Air", NumberOfEpisodes = 148, Release = new DateTime(1990, 9, 10) },
        new Series { Id = 7 , Name = "Downton Abbey", NumberOfEpisodes = 52, Release = new DateTime(2010, 09, 26) },
        new Series { Id = 8 , Name = "Stranger Things", NumberOfEpisodes = 34 , Release = new DateTime(2016, 7, 11) },
        new Series { Id = 9 , Name = "Kantaro: The Sweet Tooth Salaryman", NumberOfEpisodes = 12, Release = new DateTime(2017,7, 14) },
        new Series { Id = 10, Name = "The Walking Dead", NumberOfEpisodes = 177 , Release = new DateTime(2010, 10, 31) }
    };

    var productions = movies
        .Cast<Production>()
        .Union(series)
        .ToList();

    modelBuilder.Entity<Movie>().HasData(movies);
    modelBuilder.Entity<Series>().HasData(series);

    // characters
    modelBuilder.Entity<Character>().HasData(new Character []
    {
        // movies
        new Character { Id = 1, Name = "Tony Stark", ProductionId = 1, ActorId = 1 },
        new Character { Id = 2, Name = "Steve Rogers", ProductionId = 1, ActorId = 2 },
        new Character { Id = 3, Name = "Okoye", ProductionId = 1, ActorId = 3 },
        new Character { Id = 4, Name = "Simba", ProductionId = 2, ActorId = 4 },
        new Character { Id = 5, Name = "Nala", ProductionId = 2, ActorId = 5 },
        new Character { Id = 6, Name = "Ip Man", ProductionId = 3, ActorId = 6 },
        new Character { Id = 7, Name = "Henry Brogan", ProductionId = 4, ActorId = 7 },
        new Character { Id = 8, Name = "Violet Crawley", ProductionId = 5, ActorId = 8 },
        new Character { Id = 9, Name = "Lady Mary Crawley", ProductionId = 5, ActorId = 9 },
        // television
        new Character { Id = 10, Name = "Will Smith", ProductionId = 6, ActorId = 7},
        new Character { Id = 11, Name = "Hilary Banks", ProductionId = 6, ActorId = 10 },
        new Character { Id = 12, Name = "Violet Crawley", ProductionId = 7, ActorId = 8 },
        new Character { Id = 13, Name = "Lady Mary Crawley", ProductionId = 7, ActorId = 9 },
        new Character { Id = 14, Name = "Eleven", ProductionId = 8, ActorId = 11 },
        new Character { Id = 15, Name = "Lucas", ProductionId = 8, ActorId = 12 },
        new Character { Id = 16, Name = "Joyce Byers", ProductionId = 8, ActorId = 13 },
        new Character { Id = 17, Name = "Jim Hopper", ProductionId = 8, ActorId = 14 },
        new Character { Id = 18, Name = "Ametani Kantarou", ProductionId = 9, ActorId = 15},
        new Character { Id = 19, Name = "Sano Erika", ProductionId = 9, ActorId = 16 },
        new Character { Id = 20, Name = "Daryl Dixon", ProductionId = 10, ActorId = 17 },
        new Character { Id = 21, Name = "Michonne", ProductionId = 10, ActorId = 3 },
        new Character { Id = 22, Name = "Carol Peletier", ProductionId = 10, ActorId = 18 }
    });

    // actors
    modelBuilder.Entity<Actor>().HasData(new Actor[]
    {
        new Actor { Id = 1, Name = "Robert Downey Jr." },
        new Actor { Id = 2, Name = "Chris Evans" },
        new Actor { Id = 3, Name = "Danai Guira" },
        new Actor { Id = 4, Name = "Donald Glover" },
        new Actor { Id = 5, Name = "Beyoncé" },
        new Actor { Id = 6, Name = "Donny Yen" },
        new Actor { Id = 7, Name = "Will Smith" },
        new Actor { Id = 8, Name = "Maggie Smith" },
        new Actor { Id = 9, Name = "Michelle Dockery" },
        new Actor { Id = 10, Name = "Karyn Parsons" },
        new Actor { Id = 11, Name = "Millie Bobby Brown" },
        new Actor { Id = 12, Name = "Caleb McLaughlin" },
        new Actor { Id = 13, Name = "Winona Ryder"},
        new Actor { Id = 14, Name = "David Harbour" },
        new Actor { Id = 15, Name = "Matsuya Onoe" },
        new Actor { Id = 16, Name = "Hazuki Shimizu"},
        new Actor { Id = 17, Name = "Norman Reedus" },
        new Actor { Id = 18, Name = "Melissa McBride" }
    });

    // let's generate lots of ratings
    var random = new Random();
    var size = 100;
    var sources = new [] {
        "Internet",
        "Newspaper",
        "Magazine",
        "App"
    };

    var ratings = productions
        .SelectMany((production, index) => {

            return Enumerable
                .Range(index * 100 + 1, size - 1)
                .Select(id => new Rating {
                    Id = id,
                    ProductionId = production.Id,
                    Stars = random.Next(1, 6),
                    Source = sources[random.Next(0, 4)]
                }).ToList();
        });

    modelBuilder.Entity<Rating>().HasData(ratings);

    #endregion

    base.OnModelCreating(modelBuilder);
}

There are a few caveats to seeding data that developers should be aware of:

  1. All identifiers need to be specified, even if the entity identifiers are auto-generated via the schema.
  2. We need to specify TPH data per type. In this case, one set of data for the Movie entity, and another for the Series entity.
  3. Our data must satisfy the requirements of our schema, including non-null columns.

Database migrations

Up to this point, we should have a schema expressed through the EntertainmentDbContext that we can apply to a database, although we’ve yet to apply any changes to our database. In our case, we are using an SQLite database so that folks can get up and running quickly with few external dependencies. Let’s initialize our database. From the Terminal tool window in JetBrains Rider, let’s run the following command.

>  dotnet ef migrations add InitialCreate --project Entertainment

By executing the command, we will build our current EF Core project and create our initial migrations. Our IntialCreate migration should have a timestamp prefix. Since migrations are chronological, the prefix is an essential tool for making sense of the changes occurring to our schema.

We should have some additional files in our project under the Migrations folder. These files are source control friendly, allowing multiple individuals to work on the EntertainmentDbContext without fear of irreparable conflicts or lost work. Since the dotnet-ef CLI tool generates and manages these files, we should resist the urge to modify these files via the IDE.

We can apply our migrations with the following command:

> dotnet ef database update --project Entertainment

Executing the database update command will apply our migrations up to the latest.

Build started...
Build succeeded.
Applying migration '20201023150600_InitialCreate'.
Done.

We can use Rider’s database tool window to make sure that our migrations were applied correctly.

Additionally, we can ask Rider to diagram our database tables, showing the relationships that we modeled during the code-first design phase of this tutorial.

database diagram tool in Rider

To test that our data has been seeded, we can use Rider’s built-in SQL query console. One of our favorite features is database introspection, which suggest SQL joins according to the current schema in the dialect of the target database engine.

sql query console

LINQ queries

We’ve done a lot of work up to this point to get to the reason developers choose an ORM in the first place, writing queries. This section will explore common queries we will write when dealing with our domain. Let’s look at a list of questions we can answer with our data.

  • What are the highest-grossing movies?
  • What are the highest-rated productions?
  • What are the sources of ratings by count?
  • Series with the least number of episodes?
  • What actors play multiple characters?

Before we answer these exciting questions, let’s talk about LINQ, also known as Language-Integrated Query syntax. In particular, we will use the Select method to project or results as anonymous objects. While we could use the types we’ve created in our EntertainmentDbContext, it’s essential to our app’s performance to return results we will use in our code. Like many things in life, the way we use EF Core’s LINQ interface depends on many factors, and developers should consider their use case and develop accordingly.

Highest grossing movies

We talked about how both a Movie and a Series is a Production. We store each entity type within the same Productions table, which EF Core manages for us. We can access the Movie types in one of two ways.

The first approach is to use the DbSet on our EntertainmentDbContext class.

var movies = database
    .Movies
    .OrderByDescending(x => x.WorldwideBoxOfficeGross);

We can also use the OfType method after accessing the Productions property to narrow down to a particular hierarchical subset. In our case, we want Movies.

var movies = database
    .Productions.OfType<Movie>()
    .OrderByDescending(x => x.WorldwideBoxOfficeGross);

Let’s take a look at the generated SQL from our LINQ statement. EF Core comes with a ToQueryString extension method to apply to IQueryable instances to retrieve the generated SQL string.

 SELECT "p"."Id", "p"."Discriminator", "p"."Name", "p"."Release",
 "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross"
 FROM "Productions" AS "p"
 WHERE "p"."Discriminator" = 'Movie'
 ORDER BY "p"."WorldwideBoxOfficeGross" DESC

We can see that EF Core utilized the Discriminator column, even though there is no explicit mention of it in our database model.

Highest-rated productions

Aggregation is one of the most important reasons to use a relational database, and with LINQ, using an aggregate function like Sum, Average, and Max is straightforward. Let’s find the highest-rated production by averaging the Star value on a Rating.

var highestRated = database
    .Productions
    .Select(x => new
    {
        id = x.Id,
        name = x.Name,
        avg = x.Ratings.Average(r => r.Stars),
        type = x.GetType().Name
    })
    .OrderByDescending(x => x.avg);

As we can see, we utilize the navigation property of Ratings to determine the average of the star values. What does this translate to in SQL?

 SELECT "p"."Id", "p"."Name", (
     SELECT AVG(CAST("r"."Stars" AS REAL))
     FROM "Ratings" AS "r"
     WHERE "p"."Id" = "r"."ProductionId"), "p"."Discriminator", "p"."Release",
 "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes"
 FROM "Productions" AS "p"
 ORDER BY (
     SELECT AVG(CAST("r0"."Stars" AS REAL))
     FROM "Ratings" AS "r0"
     WHERE "p"."Id" = "r0"."ProductionId") DESC

We can see that EF Core applies a sub-query to retrieve the average rating value of a Production. Additionally, it uses an additional sub-query to order our results.

EF Core users who want to write the query starting with the Ratings property cannot currently do so due to some EF Core query generator limitations. The EF team is still actively working through querying scenarios and are trying to resolve some of these issues. The issue appears when trying to select the production information from the rating’s row. In the next section, we’ll see that it’s possible to do aggregates on a single table without traversing the production relationship.

Sources of ratings by count

As mentioned in the previous section, aggregations are essential for providing summary data. In this example, we’ll look at the sources of ratings and how many exist in our database.

var sources = database
    .Ratings
    .GroupBy(x => x.Source)
    .Select(x => new { Source = x.Key, Count = x.Count() })
    .OrderByDescending(x => x.Count);

Unlike the previous query, this example only uses the Rating entity, which results in the following SQL.

SELECT "r"."Source", COUNT(*) AS "Count"
 FROM "Ratings" AS "r"
 GROUP BY "r"."Source"
 ORDER BY COUNT(*) DESC

Series with the fewest episodes

LINQ provides multiple approaches to pick the top record from a query, with methods like Take, First, and FirstOrDefault. In this example, let’s find the Series with the fewest number of episodes.

var episodes = database
    .Series
    .OrderBy(x => x.NumberOfEpisodes)
    .Select(x => new {
        x.Name,
        x.NumberOfEpisodes,
        x.Release
    })
    .Take(1);

An element to note in our LINQ query is the value of 1 passed to the Take method. This value exists in C# but does not exist in our SQL query unless we give it to our database engine. Let’s take a look at the generated SQL.

 .param set @__p_0 1

 SELECT "p"."Name", "p"."NumberOfEpisodes", "p"."Release"
 FROM "Productions" AS "p"
 WHERE "p"."Discriminator" = 'Series'
 ORDER BY "p"."NumberOfEpisodes"
 LIMIT @__p_0

As we can see, EF Core passes our value of 1 to the database as a parameter. Any constant we define in C# will be added as either a parameter or hardcoded into the SQL Query.

Here is an example where we find all productions with "The" in the name.

var productionsWithTheInName = database
    .Productions
    .Where(x => x.Name.Contains("The"));

The value of "The" finds its way into the generated SQL’s where clause.

 SELECT "p"."Id", "p"."Discriminator", "p"."Name", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes"
 FROM "Productions" AS "p"
 WHERE ('The' = '') OR (instr("p"."Name", 'The') > 0)

Actors playing characters in multiple productions

Likely the most complex query in this list, let’s find actors playing characters in different productions. We’ll need to start at the Actors entity and navigate through the Characters collection while including the Production information.

var multipleCharacters = database
    .Actors
    .Where(a => a.Characters.Count > 1)
    .Select(a => new {
        a.Name, 
        // some characters are both in TV and Movies
        Characters = a.Characters.Select(x => new {
            Name = x.Name,
            ProductionType = x.Production.GetType().Name
        })
        .OrderBy(x => x.Name)
        .ToList()
    });

As you can imagine, the SQL generated by EF Core will reflect the complexity of our request.

 SELECT "a"."Name", "a"."Id", "t"."Name", "t"."Id", "t"."Discriminator", "t"."Name0", "t"."Release", "t"."DurationInMinutes", "t"."WorldwideBoxOfficeGross",
 "t"."NumberOfEpisodes", "t"."Id0"
 FROM "Actors" AS "a"
 LEFT JOIN (
     SELECT "c"."Name", "p"."Id", "p"."Discriminator", "p"."Name" AS "Name0", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes", 
 "c"."Id" AS "Id0", "c"."ActorId"
     FROM "Characters" AS "c"
     INNER JOIN "Productions" AS "p" ON "c"."ProductionId" = "p"."Id"
 ) AS "t" ON "a"."Id" = "t"."ActorId"
 WHERE (
     SELECT COUNT(*)
     FROM "Characters" AS "c0"
     WHERE "a"."Id" = "c0"."ActorId") > 1
 ORDER BY "a"."Id", "t"."Name", "t"."Id0", "t"."Id"

The SQL statement is still very readable, even when we consider the answer we are returning from our database.

Conclusion

In this post, we modeled an entertainment problem domain using code-first design. In addition to modeling, we applied our schema to an SQLite database. Finally, we used the expressiveness of LINQ to generate SQL queries and return our result set.

From my previous experience with Entity Framework 6, I find the generated SQL of EF Core more readable and concise. EF Core still falters around some strange edge cases, but the EF team is actively working to support more complex LINQ to SQL translations. If you’re considering working with a database engine, I highly recommend looking at EF Core as an option to jumpstart your development.

You can play with these samples directly by cloning our sample repository.

Download Rider and give it a try!

image description