.NET Tools
Essential productivity kit for .NET and game developers
Getting Started With Entity Framework Core 5
With the .NET 5 release comes the newest Entity Framework Core version, unsurprisingly called Entity Framework Core 5 (EF Core 5). The .NET community got its first glimpse of Entity Framework (EF) on August 11th, 2008, with the release of .NET Framework 3.5 Service Pack 1. Let’s say a few things have changed since the initial release. For those unfamiliar with the technology, Entity Framework is an object-relational mapper (ORM), helping developers overcome the impedance mismatches between C# and a database instance’s schema.
Entity Framework has come a long way since its original inception. Conceived initially as a designer-centric tool heavily reliant on Visual Studio’s graphical user interface and SQL Server, EF’s current philosophy recognizes the need to be friendlier to ideas like distributed source-control, build servers, multiple database engines, and operating systems.
In this post, we’ll see how we can get started with EF Core 5. We’ll design a schema for entertainment productions using a code-first approach, migrate an SQLite database to our schema’s current incarnation, seed our database, and run some typical LINQ queries.
Getting started
To get started, I recommend folks install the latest version of the .NET 5 SDK. Technically, EF Core 5 can run on .NET Core 3.1, but aligning versions is always a good idea.
Starting with a brand new console application, we will need to install the following packages, making sure that all the versions are 5.0.0
:
Microsoft.EntityFrameworkCore
Microsoft.EntityFrameworkCore.Design
Microsoft.EntityFrameworkCore.Sqlite
We can use the NuGet tool window to find and install all these packages into our project.
Additionally, we will want to install the dotnet-ef
command-line tool (CLI). From within the newly created project directory type the following commands into Rider’s integrated terminal.
> dotnet new tool-manifest > dotnet tool install dotnet-ef
When we finish setting up our project, we can run the dotnet-ef
command, where the EF unicorn will greet us.
> dotnet ef
Great! We’re now ready to start modeling our problem domain.
Folks who want to follow along can clone a sample project from GitHub.
Code-first modeling
Every EF Core solution will have at least one DbContext
implementation. What is DbContext
? The DbContext
is a specialized base class that tracks our in-memory operations, allowing us to write and execute queries, track changes that we can persist back to our database, and interact with our database engine indirectly or directly in exceptional circumstances.
To get started using EF Core 5, we need to define a class that derives from DbContext
and contains our models. Typically, our classes will map one to one with our database tables, and relationships between tables are defined using navigational conventions. We’ll see how this works as we get into defining our classes.
Before diving into C# code, let’s discuss our topic. We’ll keep the subject simple, but we’ll also explore some important EF Core concepts. Our subject matter is Entertainment, both movies and television, which we’ll call Productions. A Production
has characters and ratings. An Actor
will always portray a Character
. I hope as folks read these sentences, they start to formulate the relationships in their mind. Let’s look at our models, starting with a Production
.
Table-per-hierarchy and inheritance
We had mentioned that both a Movie
and a Series
could be a Production
. EF Core 5 applies an approach known as Table-per-hierarchy (TPH), where we store hierarchically related entities in a single table.
Each record’s type is distinguished by a Discriminator
column, which usually holds a simplified C# type name. The Discriminator
column is invisible to developers from C#, but we can see it in the database schema generated by EF Core 5.
public abstract class Production { public int Id { get; set; } public string Name { get; set; } public DateTime Release { get; set; } } public class Movie : Production { public int DurationInMinutes { get; set; } public double WorldwideBoxOfficeGross { get; set; } } public class Series : Production { public int NumberOfEpisodes { get; set; } }
The discriminator
is an important mechanism that we’ll see used later in this post’s query section.
Relationships
When thinking about relational databases such as SQLite, SQL Server, PostgreSQL, and Oracle, we need to consider three kinds of relationships between our tables:
- Many to many
- One to many
- One to one
We have a many to many relationship between our Production
and the Actors
. We express this relationship through the Character
entity. We also have a one to many relationship, where a Production
can have many instances of a Rating
.
public abstract class Production { public int Id { get; set; } public string Name { get; set; } public DateTime Release { get; set; } public List<Character> Characters { get; set; } = new List<Character>(); public List<Rating> Ratings { get; set; } = new List<Rating>(); } public class Movie : Production { public int DurationInMinutes { get; set; } public double WorldwideBoxOfficeGross { get; set; } } public class Series : Production { public int NumberOfEpisodes { get; set; } } public class Rating { public int Id { get; set; } public int ProductionId { get; set; } public Production Production { get; set; } public string Source { get; set; } public int Stars { get; set; } } public class Character { public int Id { get; set; } public int ProductionId { get; set; } public Production Production { get; set; } public string Name { get; set; } public Actor Actor { get; set; } public int ActorId { get; set; } } public class Actor { public int Id { get; set; } public string Name { get; set; } public List<Character> Characters { get; set; } = new List<Character>(); }
Looking over our models, we can see some conventional patterns emerge. Naming is an essential part of EF Core modeling, where we define relationships by the Type
of property definitions and their names. Let’s look at the Character
definition.
We can see that the Character
class has a one to one relationship to both an Actor
entity and a Production
entity. EF Core uses the naming convention of [Entity]Id
to map values to the in-memory object instance. Developers can override EF Core conventions, but I recommend folks starting with EF Core use the default conventions.
Great, now that we have built our entities and relationships, let’s add them to our DbContext
instance, which we’ll call EntertainmentDbContext
. We’ll also want to define our database connection string for demo purposes.
public class EntertainmentDbContext : DbContext { protected override void OnConfiguring(DbContextOptionsBuilder options) => options.UseSqlite("Data Source=entertainment.db"); public DbSet<Production> Productions { get; set; } public DbSet<Movie> Movies { get; set; } public DbSet<Series> Series { get; set; } public DbSet<Rating> Ratings { get; set; } public DbSet<Character> Characters { get; set; } public DbSet<Actor> Actors { get; set; } } public abstract class Production { public int Id { get; set; } public string Name { get; set; } public DateTime Release { get; set; } public List<Character> Characters { get; set; } = new List<Character>(); public List<Rating> Ratings { get; set; } = new List<Rating>(); } public class Movie : Production { public int DurationInMinutes { get; set; } public double WorldwideBoxOfficeGross { get; set; } } public class Series : Production { public int NumberOfEpisodes { get; set; } } public class Rating { public int Id { get; set; } public int ProductionId { get; set; } public Production Production { get; set; } public string Source { get; set; } public int Stars { get; set; } } public class Character { public int Id { get; set; } public int ProductionId { get; set; } public Production Production { get; set; } public string Name { get; set; } public Actor Actor { get; set; } public int ActorId { get; set; } } public class Actor { public int Id { get; set; } public string Name { get; set; } public List<Character> Characters { get; set; } = new List<Character>(); } }
Seeding initial data
There are multiple ways to insert data into our database, but it’s a good idea to use the data seed mechanism of EF Core for demos and necessary look-up values. In EF Core, we can override the OnModelCreating
method on our EntertainmentDbContext
.
protected override void OnModelCreating(ModelBuilder modelBuilder) { #region seed data var movies = new Movie[] { new Movie { Id = 1, Name = "Avengers: Endgame", WorldwideBoxOfficeGross = 2_797_800_564, DurationInMinutes = 181, Release = new DateTime(2019, 4, 26) }, new Movie { Id = 2, Name = "The Lion King", WorldwideBoxOfficeGross = 1_654_791_102, DurationInMinutes = 118, Release = new DateTime(2019, 7, 19) }, new Movie { Id = 3, Name = "Ip Man 4", WorldwideBoxOfficeGross = 192_617_891, DurationInMinutes = 105, Release = new DateTime(2019, 12, 25) }, new Movie { Id = 4, Name = "Gemini Man", WorldwideBoxOfficeGross = 166_623_705, DurationInMinutes = 116, Release = new DateTime(2019, 11, 20) }, new Movie { Id = 5, Name = "Downton Abbey", WorldwideBoxOfficeGross = 194_051_302, DurationInMinutes = 120, Release = new DateTime(2020, 9, 20 )} }; var series = new Series [] { new Series { Id = 6 , Name = "The Fresh Prince of Bel-Air", NumberOfEpisodes = 148, Release = new DateTime(1990, 9, 10) }, new Series { Id = 7 , Name = "Downton Abbey", NumberOfEpisodes = 52, Release = new DateTime(2010, 09, 26) }, new Series { Id = 8 , Name = "Stranger Things", NumberOfEpisodes = 34 , Release = new DateTime(2016, 7, 11) }, new Series { Id = 9 , Name = "Kantaro: The Sweet Tooth Salaryman", NumberOfEpisodes = 12, Release = new DateTime(2017,7, 14) }, new Series { Id = 10, Name = "The Walking Dead", NumberOfEpisodes = 177 , Release = new DateTime(2010, 10, 31) } }; var productions = movies .Cast<Production>() .Union(series) .ToList(); modelBuilder.Entity<Movie>().HasData(movies); modelBuilder.Entity<Series>().HasData(series); // characters modelBuilder.Entity<Character>().HasData(new Character [] { // movies new Character { Id = 1, Name = "Tony Stark", ProductionId = 1, ActorId = 1 }, new Character { Id = 2, Name = "Steve Rogers", ProductionId = 1, ActorId = 2 }, new Character { Id = 3, Name = "Okoye", ProductionId = 1, ActorId = 3 }, new Character { Id = 4, Name = "Simba", ProductionId = 2, ActorId = 4 }, new Character { Id = 5, Name = "Nala", ProductionId = 2, ActorId = 5 }, new Character { Id = 6, Name = "Ip Man", ProductionId = 3, ActorId = 6 }, new Character { Id = 7, Name = "Henry Brogan", ProductionId = 4, ActorId = 7 }, new Character { Id = 8, Name = "Violet Crawley", ProductionId = 5, ActorId = 8 }, new Character { Id = 9, Name = "Lady Mary Crawley", ProductionId = 5, ActorId = 9 }, // television new Character { Id = 10, Name = "Will Smith", ProductionId = 6, ActorId = 7}, new Character { Id = 11, Name = "Hilary Banks", ProductionId = 6, ActorId = 10 }, new Character { Id = 12, Name = "Violet Crawley", ProductionId = 7, ActorId = 8 }, new Character { Id = 13, Name = "Lady Mary Crawley", ProductionId = 7, ActorId = 9 }, new Character { Id = 14, Name = "Eleven", ProductionId = 8, ActorId = 11 }, new Character { Id = 15, Name = "Lucas", ProductionId = 8, ActorId = 12 }, new Character { Id = 16, Name = "Joyce Byers", ProductionId = 8, ActorId = 13 }, new Character { Id = 17, Name = "Jim Hopper", ProductionId = 8, ActorId = 14 }, new Character { Id = 18, Name = "Ametani Kantarou", ProductionId = 9, ActorId = 15}, new Character { Id = 19, Name = "Sano Erika", ProductionId = 9, ActorId = 16 }, new Character { Id = 20, Name = "Daryl Dixon", ProductionId = 10, ActorId = 17 }, new Character { Id = 21, Name = "Michonne", ProductionId = 10, ActorId = 3 }, new Character { Id = 22, Name = "Carol Peletier", ProductionId = 10, ActorId = 18 } }); // actors modelBuilder.Entity<Actor>().HasData(new Actor[] { new Actor { Id = 1, Name = "Robert Downey Jr." }, new Actor { Id = 2, Name = "Chris Evans" }, new Actor { Id = 3, Name = "Danai Guira" }, new Actor { Id = 4, Name = "Donald Glover" }, new Actor { Id = 5, Name = "Beyoncé" }, new Actor { Id = 6, Name = "Donny Yen" }, new Actor { Id = 7, Name = "Will Smith" }, new Actor { Id = 8, Name = "Maggie Smith" }, new Actor { Id = 9, Name = "Michelle Dockery" }, new Actor { Id = 10, Name = "Karyn Parsons" }, new Actor { Id = 11, Name = "Millie Bobby Brown" }, new Actor { Id = 12, Name = "Caleb McLaughlin" }, new Actor { Id = 13, Name = "Winona Ryder"}, new Actor { Id = 14, Name = "David Harbour" }, new Actor { Id = 15, Name = "Matsuya Onoe" }, new Actor { Id = 16, Name = "Hazuki Shimizu"}, new Actor { Id = 17, Name = "Norman Reedus" }, new Actor { Id = 18, Name = "Melissa McBride" } }); // let's generate lots of ratings var random = new Random(); var size = 100; var sources = new [] { "Internet", "Newspaper", "Magazine", "App" }; var ratings = productions .SelectMany((production, index) => { return Enumerable .Range(index * 100 + 1, size - 1) .Select(id => new Rating { Id = id, ProductionId = production.Id, Stars = random.Next(1, 6), Source = sources[random.Next(0, 4)] }).ToList(); }); modelBuilder.Entity<Rating>().HasData(ratings); #endregion base.OnModelCreating(modelBuilder); }
There are a few caveats to seeding data that developers should be aware of:
- All identifiers need to be specified, even if the entity identifiers are auto-generated via the schema.
- We need to specify TPH data per type. In this case, one set of data for the
Movie
entity, and another for theSeries
entity. - Our data must satisfy the requirements of our schema, including non-null columns.
Database migrations
Up to this point, we should have a schema expressed through the EntertainmentDbContext
that we can apply to a database, although we’ve yet to apply any changes to our database. In our case, we are using an SQLite database so that folks can get up and running quickly with few external dependencies. Let’s initialize our database. From the Terminal tool window in JetBrains Rider, let’s run the following command.
> dotnet ef migrations add InitialCreate --project Entertainment
By executing the command, we will build our current EF Core project and create our initial migrations. Our IntialCreate
migration should have a timestamp prefix. Since migrations are chronological, the prefix is an essential tool for making sense of the changes occurring to our schema.
We should have some additional files in our project under the Migrations
folder. These files are source control friendly, allowing multiple individuals to work on the EntertainmentDbContext
without fear of irreparable conflicts or lost work. Since the dotnet-ef
CLI tool generates and manages these files, we should resist the urge to modify these files via the IDE.
We can apply our migrations with the following command:
> dotnet ef database update --project Entertainment
Executing the database update
command will apply our migrations up to the latest.
Build started... Build succeeded. Applying migration '20201023150600_InitialCreate'. Done.
We can use Rider’s database tool window to make sure that our migrations were applied correctly.
Additionally, we can ask Rider to diagram our database tables, showing the relationships that we modeled during the code-first design phase of this tutorial.
To test that our data has been seeded, we can use Rider’s built-in SQL query console. One of our favorite features is database introspection, which suggest SQL joins according to the current schema in the dialect of the target database engine.
LINQ queries
We’ve done a lot of work up to this point to get to the reason developers choose an ORM in the first place, writing queries. This section will explore common queries we will write when dealing with our domain. Let’s look at a list of questions we can answer with our data.
- What are the highest-grossing movies?
- What are the highest-rated productions?
- What are the sources of ratings by count?
- Series with the least number of episodes?
- What actors play multiple characters?
Before we answer these exciting questions, let’s talk about LINQ, also known as Language-Integrated Query syntax. In particular, we will use the Select
method to project or results as anonymous objects. While we could use the types we’ve created in our EntertainmentDbContext
, it’s essential to our app’s performance to return results we will use in our code. Like many things in life, the way we use EF Core’s LINQ interface depends on many factors, and developers should consider their use case and develop accordingly.
Highest grossing movies
We talked about how both a Movie
and a Series
is a Production
. We store each entity type within the same Productions
table, which EF Core manages for us. We can access the Movie
types in one of two ways.
The first approach is to use the DbSet
on our EntertainmentDbContext
class.
var movies = database .Movies .OrderByDescending(x => x.WorldwideBoxOfficeGross);
We can also use the OfType
method after accessing the Productions
property to narrow down to a particular hierarchical subset. In our case, we want Movies.
var movies = database .Productions.OfType<Movie>() .OrderByDescending(x => x.WorldwideBoxOfficeGross);
Let’s take a look at the generated SQL from our LINQ statement. EF Core comes with a ToQueryString
extension method to apply to IQueryable
instances to retrieve the generated SQL string.
SELECT "p"."Id", "p"."Discriminator", "p"."Name", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross" FROM "Productions" AS "p" WHERE "p"."Discriminator" = 'Movie' ORDER BY "p"."WorldwideBoxOfficeGross" DESC
We can see that EF Core utilized the Discriminator
column, even though there is no explicit mention of it in our database model.
Highest-rated productions
Aggregation is one of the most important reasons to use a relational database, and with LINQ, using an aggregate function like Sum
, Average
, and Max
is straightforward. Let’s find the highest-rated production by averaging the Star
value on a Rating
.
var highestRated = database .Productions .Select(x => new { id = x.Id, name = x.Name, avg = x.Ratings.Average(r => r.Stars), type = x.GetType().Name }) .OrderByDescending(x => x.avg);
As we can see, we utilize the navigation property of Ratings
to determine the average of the star values. What does this translate to in SQL?
SELECT "p"."Id", "p"."Name", ( SELECT AVG(CAST("r"."Stars" AS REAL)) FROM "Ratings" AS "r" WHERE "p"."Id" = "r"."ProductionId"), "p"."Discriminator", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes" FROM "Productions" AS "p" ORDER BY ( SELECT AVG(CAST("r0"."Stars" AS REAL)) FROM "Ratings" AS "r0" WHERE "p"."Id" = "r0"."ProductionId") DESC
We can see that EF Core applies a sub-query to retrieve the average rating value of a Production
. Additionally, it uses an additional sub-query to order our results.
EF Core users who want to write the query starting with the Ratings
property cannot currently do so due to some EF Core query generator limitations. The EF team is still actively working through querying scenarios and are trying to resolve some of these issues. The issue appears when trying to select the production information from the rating’s row. In the next section, we’ll see that it’s possible to do aggregates on a single table without traversing the production relationship.
Sources of ratings by count
As mentioned in the previous section, aggregations are essential for providing summary data. In this example, we’ll look at the sources of ratings and how many exist in our database.
var sources = database .Ratings .GroupBy(x => x.Source) .Select(x => new { Source = x.Key, Count = x.Count() }) .OrderByDescending(x => x.Count);
Unlike the previous query, this example only uses the Rating
entity, which results in the following SQL.
SELECT "r"."Source", COUNT(*) AS "Count" FROM "Ratings" AS "r" GROUP BY "r"."Source" ORDER BY COUNT(*) DESC
Series with the fewest episodes
LINQ provides multiple approaches to pick the top record from a query, with methods like Take
, First
, and FirstOrDefault
. In this example, let’s find the Series
with the fewest number of episodes.
var episodes = database .Series .OrderBy(x => x.NumberOfEpisodes) .Select(x => new { x.Name, x.NumberOfEpisodes, x.Release }) .Take(1);
An element to note in our LINQ query is the value of 1 passed to the Take
method. This value exists in C# but does not exist in our SQL query unless we give it to our database engine. Let’s take a look at the generated SQL.
.param set @__p_0 1 SELECT "p"."Name", "p"."NumberOfEpisodes", "p"."Release" FROM "Productions" AS "p" WHERE "p"."Discriminator" = 'Series' ORDER BY "p"."NumberOfEpisodes" LIMIT @__p_0
As we can see, EF Core passes our value of 1 to the database as a parameter. Any constant we define in C# will be added as either a parameter or hardcoded into the SQL Query.
Here is an example where we find all productions with "The" in the name.
var productionsWithTheInName = database .Productions .Where(x => x.Name.Contains("The"));
The value of "The" finds its way into the generated SQL’s where clause.
SELECT "p"."Id", "p"."Discriminator", "p"."Name", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes" FROM "Productions" AS "p" WHERE ('The' = '') OR (instr("p"."Name", 'The') > 0)
Actors playing characters in multiple productions
Likely the most complex query in this list, let’s find actors playing characters in different productions. We’ll need to start at the Actors
entity and navigate through the Characters
collection while including the Production
information.
var multipleCharacters = database .Actors .Where(a => a.Characters.Count > 1) .Select(a => new { a.Name, // some characters are both in TV and Movies Characters = a.Characters.Select(x => new { Name = x.Name, ProductionType = x.Production.GetType().Name }) .OrderBy(x => x.Name) .ToList() });
As you can imagine, the SQL generated by EF Core will reflect the complexity of our request.
SELECT "a"."Name", "a"."Id", "t"."Name", "t"."Id", "t"."Discriminator", "t"."Name0", "t"."Release", "t"."DurationInMinutes", "t"."WorldwideBoxOfficeGross", "t"."NumberOfEpisodes", "t"."Id0" FROM "Actors" AS "a" LEFT JOIN ( SELECT "c"."Name", "p"."Id", "p"."Discriminator", "p"."Name" AS "Name0", "p"."Release", "p"."DurationInMinutes", "p"."WorldwideBoxOfficeGross", "p"."NumberOfEpisodes", "c"."Id" AS "Id0", "c"."ActorId" FROM "Characters" AS "c" INNER JOIN "Productions" AS "p" ON "c"."ProductionId" = "p"."Id" ) AS "t" ON "a"."Id" = "t"."ActorId" WHERE ( SELECT COUNT(*) FROM "Characters" AS "c0" WHERE "a"."Id" = "c0"."ActorId") > 1 ORDER BY "a"."Id", "t"."Name", "t"."Id0", "t"."Id"
The SQL statement is still very readable, even when we consider the answer we are returning from our database.
Conclusion
In this post, we modeled an entertainment problem domain using code-first design. In addition to modeling, we applied our schema to an SQLite database. Finally, we used the expressiveness of LINQ to generate SQL queries and return our result set.
From my previous experience with Entity Framework 6, I find the generated SQL of EF Core more readable and concise. EF Core still falters around some strange edge cases, but the EF team is actively working to support more complex LINQ to SQL translations. If you’re considering working with a database engine, I highly recommend looking at EF Core as an option to jumpstart your development.
You can play with these samples directly by cloning our sample repository.
Download Rider and give it a try!