.NET Tools

Getting Started: Entity Framework Core 7 JSON Support

When working with databases, there are a few tenets to keep in mind to maximize your application’s performance. First, reduce the number of network calls. Second, reduce the complexity of your queries. Third, reduce the data retrieved in your result set.

If you’re an Entity Framework Core 7 user, you can strive to meet these using JSON columns. JavaScript Object Notation (JSON) is a language-independent format widely used across many technologies and ecosystems. However, while JSON support has been available in most database engines for quite some time, the EF Core team only recently released first-class support for EF Core 7 alongside the .NET 7 release.

This post will discuss the advantages of using JSON when data modeling, the benefits of storing JSON, and how to work with JSON data in your EF Core 7 applications.

Why store data as JSON in a database?

Relational data models rely on multiple tables to describe your problem domain. While in most cases this works, in some cases, it might be unnecessary to sprawl data across various tables. With numerous tables, you also have the overhead of foreign keys, column indexes, and the potential to have broken relationships leading to invalid data. At query time, this means join queries. You could spend hours, days, or weeks optimizing around these issues. There has to be a better way, right?

In the late 2000s, the NoSQL trend introduced developers to “documents”, a singular record that contains most of the necessary information required to fulfill a user experience. At a glance, documents were indistinguishable from JSON, and developers fell in love with the ideas NoSQL presented:

  1. More straightforward queries, typically by key.
  2. Denormalization of data.
  3. Document structures that are similar to an application’s objects.

While NoSQL is still popular today, it failed to overthrow the relational order, with many teams valuing the benefits of production-hardened database engines. 

The SQL-based ecosystem is robust, and teams typically had stakeholders invested in tooling. But it’s safe to say migrating from a product that works to an unknown and unproven promise, at the time, was difficult. So, in an attempt to meet the needs of developers, many database products began offering some element of JSON support, providing the desired development workflow without completely upending an organization’s established processes and practices.

All this to say, why do you want JSON? Well, it’s straightforward:

  1. Simpler schema designs.
  2. Fewer tables are involved in queries.
  3. Inserts and updates include fewer tables.
  4. Optimized result sets due to lack of repeated rows.

Depending on your use case, there are more benefits to storing data as JSON. But ultimately, as the NoSQL trend showed developers, reducing the complexity of your data-access layer can provide you with a much more enjoyable development experience.

What type of data can you store in JSON columns?

Like any data modeling technique, you should apply it where it makes the most sense and will provide you with the most value. 

The best candidates for conversion are typically “owned” entities. They are relatively complex entities that don’t make sense without the necessary context of their parent entity. Examples include an individual’s home address, tags on a blog post, and image metadata. If you query data with its parent, you have a good candidate for a JSON column. 

Other candidates for JSON columns are the denormalization of commonly-used data. These examples include an author’s name on a blog post, the latest records from a historical log, and cached third-party service data that typically relies on a service call.

The final case for storing values in a JSON column is for quick prototyping. Keeping essential data in a JSON column may make sense if you’re still discovering your domain’s problem. You can iterate on your application while looking for emerging usage patterns. Then, as your application’s features stabilize, you can migrate data from JSON into a traditional table structure.

Defining JSON columns with Entity Framework Core 7

For this post’s example, we will use a data context designed for a blogging platform. I know, “how original”, but it’s the no-frills way to see the benefit of storing values using JSON columns. So first, let’s have a look at our entities.

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public Author Author { get; set; }
    public string Permalink { get; set; }
    public DateTimeOffset CreateAt { get; set; } = DateTimeOffset.UtcNow;
    public DateTimeOffset UpdatedAt { get; set; } = DateTimeOffset.UtcNow;
    public List<Tag> Tags { get; set; } = new();
}

public class Tag
{
    public string Value { get; set; }
    public static implicit operator string(Tag tag) => tag.Value;
    public static implicit operator Tag(string tag) => new() { Value = tag };
}

public class Author
{
    public string Name { get; set; }
    public string ImageUrl { get; set; }
    public string SocialMediaUrl { get; set; }
}

We’ll store values for tags and the author’s information in JSON columns alongside top-level post information. Looking at the data objects, we see an essential requirement of using JSON columns in the Tag collection of our Post entity. Since EF Core provides object tracking of entities, all relationships, regardless of whether the values will exist in a separate table or within the same table, require an entity definition.

Unfortunately, the requirement means you can’t store a collection of primitives like int, string, or double directly. On the positive side, defining a Tag entity in your code allows for data expansion if your use case evolves past the need for simple string values.

If you’re a long-time user of EF Core, you will notice little difference in defining your entities. These definitions are almost identical to a table-based approach. The next step is to tell EF Core to treat these related entities like JSON, which occurs in the OnModelCreating method in your DbContext definition.

public class Database : DbContext
{
    public DbSet<Post> Posts => Set<Post>();

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .LogTo(Console.WriteLine, (_, level) => level == LogLevel.Information)
            .EnableSensitiveDataLogging()
            .UseSqlServer("server=localhost,11433;database=Blog;user=sa;password=Pass123!;Encrypt=false");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder
            .Entity<Post>()
            .OwnsOne(post => post.Author, builder => { builder.ToJson(); })
            .OwnsMany(post => post.Tags, builder => { builder.ToJson(); });
    }
}

You’ll notice the calls to ToJson when defining the parent-child relationships of Author and Tags. The model builder definition tells EF Core to serialize these columns and treat them like strings when generating SQL. The exact details of the generated SQL will depend on your database engine and SQL provider.

I’m using the SQL Server provider for EF Core in this post. Generating a migration shows that EF Core expects JSON columns to be nvarchar(max) types. The column type is configurable, but I recommend starting with nvarchar(max) if you’re initially unsure how large your JSON values will be.

migrationBuilder.CreateTable(
    name: "Posts",
    columns: table => new
    {
        Id = table.Column<int>(type: "int", nullable: false)
            .Annotation("SqlServer:Identity", "1, 1"),
        Title = table.Column<string>(type: "nvarchar(max)", nullable: false),
        Permalink = table.Column<string>(type: "nvarchar(max)", nullable: false),
        CreateAt = table.Column<DateTimeOffset>(type: "datetimeoffset", nullable: false),
        UpdatedAt = table.Column<DateTimeOffset>(type: "datetimeoffset", nullable: false),
        Author = table.Column<string>(type: "nvarchar(max)", nullable: false),
        Tags = table.Column<string>(type: "nvarchar(max)", nullable: true)
    },
    constraints: table =>
    {
        table.PrimaryKey("PK_Posts", x => x.Id);
    });

Using EF Core 7 JSON column properties

As you’d expect, using JSON-backed properties is transparent to you as a developer. You can write the same EF Core code you’ve always been writing; it should work! Let’s look at storing a new blog post using the same previously defined entities.

using EntityFrameworkCoreJson.Models;
using Microsoft.EntityFrameworkCore;

var db = new Database();

var blogPost = new Post
{
    Title = "Writing about .NET",
    Author = 
    {
        Name = "Khalid Abuhakmeh",
        ImageUrl = "/khalidabuhakmeh.jpg",
        SocialMediaUrl = "@khalidabuhakmeh@mastodon.social"
    },
    Tags = { "dotnet", "drop", "dot" }
};
db.Posts.Add(blogPost);
await db.SaveChangesAsync();

In all honesty, that’s what makes this approach great. It allows you to optimize your applications without affecting much, if any, of your business logic. 

So, now that you have a record in the database, let’s look at a standard query.

var post = await db.Posts
    .OrderByDescending(x => x.CreateAt)
    .FirstAsync();

You’ll notice one significant difference here compared to commonly written EF Core queries: the lack of Include statements. If Author and Tags were separate tables, you’d need to include them in our query. If you do have Include statements, EF Core will ignore them. In this case, you get each owned value as part of the row result. You can see the columns for Author and Tags referenced in the generated SQL Query.

SELECT TOP(1) [p].[Id], [p].[CreateAt], [p].[Permalink], [p].[Title], [p].[UpdatedAt], JSON_QUERY([p].[Author],'$'), JSON_QUERY([p].[Tags],'$')

What do you do when you want to update values? Well, it’s as simple as updating the values. Since EF Core tracks JSON-backed entities, it will be responsible for generating the necessary SQL to update the modified columns.

post.Author.Name = "Maarten Balliauw";
post.Author.ImageUrl = "/maartenballiauw.jpg";
post.Author.SocialMediaUrl = "@maarten@balliauw.be";
post.Tags.Add("Belgians");

The SQL update statements are just as you’d expect them to look.

[Parameters=[@p0='{"ImageUrl":"/maartenballiauw.jpg","Name":"Maarten Balliauw","SocialMediaUrl":"@maarten@balliauw.be"}' (Nullable = false) (Size = 99), @p1='[{"Value":"dotnet"},{"Value":"drop"},{"Value":"dot"},{"Value":"Belgians"}]' (Nullable = false) (Size = 74), @p2='1'], CommandType='Text', CommandTimeout='30']
      SET IMPLICIT_TRANSACTIONS OFF;
      SET NOCOUNT ON;
      UPDATE [Posts] SET [Author] = @p0, [Tags] = @p1
      OUTPUT 1
      WHERE [Id] = @p2;

Managing child collections has always been one of the more tedious aspects of working with EF Core. However, as you can see, the Tags collection can be modified in memory and stored in one database call. That’s a solid reason to consider JSON columns as an option.

In addition to creating and updating values, JSON-based entities can also participate in projections, helping you reduce the amount of data your query retrieves over the network.

var latest = await db.Posts
    .OrderByDescending(x => x.CreateAt)
    .Select(x => new { x.Title, x.Author.Name })
    .AsNoTracking()
    .FirstAsync();

Console.WriteLine($"{latest.Name} wrote \"{latest.Title}\"");

The generated SQL, in this case, uses SQL Server-specific syntax to limit the JSON pulled back from the database.

      SELECT TOP(1) [p].[Title], CAST(JSON_VALUE([p].[Author],'$.Name') AS nvarchar(max)) AS [Name]
      FROM [Posts] AS [p]
      ORDER BY [p].[CreateAt] DESC

Conclusion

EF Core 7 and JSON columns will help developers optimize their applications more than they previously could, with the added benefit of managing a lot less code. I started the article with tenets of improving performance in database-driven applications. JSON support in EF Core 7 is a great way to strive for a better user experience. You can perform all the meaningful operations you can with table-based relationships, and in most cases, the EF Core abstraction masks much of the complexity from you.

Currently, the SQL Server provider is the best option, but SQLite and PostgreSQL providers are being worked on for release soon. The EF Core team did a great job with this feature, and I’m excited for EF Core developers to explore the possibilities of storing more data as JSON.

Please comment below if you found this post helpful or have any questions.

References

image description