Diving Into NuGet History For Fun and Community Insights
In the dark ages of .NET, developers had to crawl the internet for dependencies, run installers, create bloated
lib folders, and possibly introduce assemblies into the Global Assembly Cache. Gross! Those who remember will agree it was terrible, but it’s better now!
We can all breathe a deep sigh of relief knowing that package management is central to most modern technology stacks, including our beloved .NET. Today, most developers instinctively reach for the catalog of packages on NuGet to solve problems ranging from the mundane to the difficult. That’s a good thing, but easy access to solutions can also lull us into taking our community for granted.
In this post, we’ll look at our .NET community’s history through the lens of NuGet data harvested using Maarten’s Azure Functions-based metadata crawler, which produced 2.7 million records and a 1.5 GB comma-delimited text file. We loaded the data into Elasticsearch and used Kibana to build a dashboard, which you’ll see later in this post.
APIs, data points, and structure
As of writing this post, there is no way to get a single database of NuGet information. Folks looking to recreate this post will need to crawl the NuGet APIs, which are available to everyone. Again, Maarten’s Azure Functions-based metadata crawler is a great place to start.
At first, retrieving the data had us worried. NuGet is critical community infrastructure that everyone relies on. Having multiple community members make millions of API calls at the same time could damage NuGet’s ability to deliver responses to clients. That said, we were given the “unofficial” go-ahead by NuGet team members to use the APIs.
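To give a feel for what crawling involves, here is a minimal Python sketch that turns one page of NuGet's catalog resource into publish records. It is not how Maarten's crawler works; the function name and the sample page are made up for illustration, and the field names (`@type`, `nuget:id`, `nuget:version`, `commitTimeStamp`) reflect our reading of the catalog format, so check them against the official API documentation. Fetching the page over HTTP is left out on purpose.

```python
from typing import Iterator


def published_versions(page: dict) -> Iterator[dict]:
    """Yield one record per package version published on a catalog page."""
    for item in page.get("items", []):
        # Pages also record deletions (nuget:PackageDelete); skip those.
        if item.get("@type") != "nuget:PackageDetails":
            continue
        yield {
            "id": item["nuget:id"],
            "version": item["nuget:version"],
            "commit_time": item["commitTimeStamp"],
            # The leaf document behind this URL holds tags, authors, and so on.
            "leaf_url": item["@id"],
        }


# Hand-written sample in the shape of a catalog page; not real NuGet data.
sample_page = {
    "items": [
        {
            "@id": "https://example.invalid/catalog/example.package.1.0.0.json",
            "@type": "nuget:PackageDetails",
            "commitTimeStamp": "2020-01-01T00:00:00Z",
            "nuget:id": "Example.Package",
            "nuget:version": "1.0.0",
        },
        {
            "@id": "https://example.invalid/catalog/removed.package.1.0.0.json",
            "@type": "nuget:PackageDelete",
            "commitTimeStamp": "2020-01-02T00:00:00Z",
            "nuget:id": "Removed.Package",
            "nuget:version": "1.0.0",
        },
    ]
}

records = list(published_versions(sample_page))
print(records)
```

A real crawl would walk every catalog page and throttle its requests, which is exactly the kind of load we were worried about.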
For our data, we only retrieved publishing information and not consumer information. For example, Amazon has published the
AWSSDK.APIGateway package over 396 times since its first release. Amazon’s release cycle is tame in comparison to other libraries. Paket, a popular package manager client, has published its packages over 2,413 times!
While the NuGet API does expose download counts, it’s cumbersome to gather this data for all packages. There is an open GitHub issue about exposing download counts in an easy-to-consume manner, but given the already enormous breadth of information, we decided against download counts (for now).
The information we were able to retrieve for published packages includes:
Publish Date, and
Target Frameworks.
Each record in our database is for a package version, helping us understand any particular package’s publishing frequency and lifespan.
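As a rough illustration of what one record per package version makes possible, the Python sketch below groups records by package and derives publishing frequency and lifespan. The field names (`id`, `published`) and the sample rows are assumptions for the example, not the actual schema of our data set.

```python
from collections import defaultdict
from datetime import datetime


def summarize(records):
    """Return version count, first/last publish date, and lifespan per package."""
    dates_by_package = defaultdict(list)
    for record in records:
        published = datetime.fromisoformat(record["published"])
        dates_by_package[record["id"]].append(published)

    summary = {}
    for package_id, dates in dates_by_package.items():
        first, last = min(dates), max(dates)
        summary[package_id] = {
            "versions": len(dates),              # publishing frequency
            "first": first,
            "last": last,
            "lifespan_days": (last - first).days,
        }
    return summary


# Made-up rows: one row per published package version.
sample_records = [
    {"id": "Example.A", "version": "1.0.0", "published": "2020-01-01T00:00:00"},
    {"id": "Example.A", "version": "1.1.0", "published": "2020-01-31T00:00:00"},
    {"id": "Example.B", "version": "0.1.0", "published": "2020-06-01T00:00:00"},
]

summary = summarize(sample_records)
print(summary["Example.A"]["versions"], summary["Example.A"]["lifespan_days"])
```

A package with a single version has a lifespan of zero days under this definition, which is worth keeping in mind when reading averages.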
We need to get a disclaimer out of the way before we start looking at information.
First, we’re assuming that folks tag their packages appropriately, and a significant portion of publishers likely do not. Missing tags could mean the results are incomplete. Additionally, while we can compute narrow slices based on tags, authors, and time frames, it gets harder to analyze larger wedges of data because the computation overwhelms the compute engine.
Finally, this post is about data. We’ll try not to draw conclusions about the state of .NET or OSS, or to offer other conjectures.
So let’s get started by looking at some data.
Started from the bottom…
*If the dashboard images are too small, use the context menu to open the images in a new tab.
All great things have humble beginnings, and NuGet is no different. On January 7th, 2011, David Bryon published
Agatha-rrsl to the NuGet registry. It was the first NuGet package, followed almost one second later by a flood of packages. On January 7th, there were 177 publish events, and publishers added 155 unique packages to the ecosystem.
Authors on that day include the Codeplex Foundation, Google, and Microsoft. Significant tags include
mocking. Seeing that ASP.NET MVC is one of the first projects to be open-sourced by Microsoft, it’s no surprise to see it amongst the first packages released.
Some folks may be wondering what
sl4 means regarding target frameworks. Well, that’s Silverlight, the discontinued web version of .NET. Don’t worry though, many of the lessons learned from Silverlight found their way into .NET Core and .NET 5.
From those small yet significant beginnings, we now have 229,336+ unique packages on NuGet.
Now we’re here!
Let’s look at the publishing of unique packages from the last 30 days. We’ll see a clear pattern emerging as compared to the first day of NuGet packages.
The consistent pattern is likely due to advances in continuous build and deployment processes. Products like TeamCity, AppVeyor, and GitHub Actions help developers ship consistently, even for minor changes.
Looking at a complete picture of the last 30 days, we can see service providers have been busy pushing commercial packages to NuGet: Amazon, Google, Microsoft, Uno Platform, and Syncfusion. Amazon has an overwhelmingly large lead in the tag count, with aws being present over 3,000 times.
What’s also interesting here is the target frameworks tag cloud. The target framework
netstandard2.1 looms large over other versions. While still present, legacy .NET has a comparatively small share. We can see that
net50 begins to appear as developers start to support the new version of .NET.
❤️ JetBrains loves OSS
It wouldn’t be any fun unless we looked at what JetBrains has done to support .NET OSS. For this section, let’s look exclusively at the
jetbrains author, which does not include individual JetBrains employees’ contributions to the ecosystem.
JetBrains has contributed 44 unique packages to NuGet, with a primary focus on our .NET tools like ReSharper, dotMemory, and dotCover. We also ship .NET packages to support our other product offerings, such as YouTrack and TeamCity. What’s visible in the dashboard above is our unique versioning approach, which uses years to denote the current release. For example, you’ll see the version prefix
2020.*.* used frequently this year. What’s also amazing is that JetBrains has been part of the .NET OSS ecosystem for over eight years.
Next, we were curious, outside of Microsoft, who in our community contributes the most to the tags
aspnetcore. Here is where things get interesting, as we see our Asian friends contributing a good portion of the 2,890 packages to the ASP.NET ecosystem.
Developers in China publish 50% of the packages with the
aspnetcore tags. Chinese is likely an uncommon language for native English-speaking developers, so these packages are probably used less frequently outside of China. These publishers are using the tag
applicationframework heavily, which leads us to think many of these packages are part of a philosophical approach to building web applications.
All this makes us wonder what other sub-cultures can be uncovered simply by looking at NuGet’s data?
Xamarin is a cross-platform approach to building mobile applications targeting operating systems like iOS, Android, UWP, and many more. Let’s see what the search query of
tags: xamarin* returns.
The mobile ecosystem is healthy and alive, with a head-spinning 6,393 packages. The trajectory of growth from 0 to over 3,000 published packages in a given timeframe is impressive. Additionally, what’s different about mobile development is the support for varying target frameworks, which almost borders on the absurd.
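For readers without Kibana at hand, the Python sketch below is a rough local stand-in for a wildcard tag filter like the one above, and it shows how publish events differ from unique packages. The record layout (a `tags` list per version) and the sample rows are assumptions for illustration; the real matching is done by Elasticsearch and may behave differently, for example with respect to casing.

```python
from fnmatch import fnmatchcase


def has_tag(record, pattern):
    """True if any tag matches the wildcard pattern, ignoring case."""
    return any(fnmatchcase(tag.lower(), pattern) for tag in record["tags"])


# Made-up rows: one row per published package version.
sample = [
    {"id": "Example.Forms", "version": "1.0.0", "tags": ["xamarin", "ui"]},
    {"id": "Example.Forms", "version": "1.1.0", "tags": ["xamarin", "ui"]},
    {"id": "Example.Maps", "version": "2.0.0", "tags": ["Xamarin.Forms", "maps"]},
    {"id": "Example.Web", "version": "1.0.0", "tags": ["aspnetcore"]},
]

matching = [record for record in sample if has_tag(record, "xamarin*")]

# Every matching row is a publish event; a package counts once however
# many versions it has shipped.
publish_events = len(matching)
unique_packages = len({record["id"] for record in matching})
print(publish_events, unique_packages)
```

The distinction matters when reading the dashboards: a single busy publisher can dominate publish events without adding many unique packages.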
Here’s the complete picture for the Xamarin community.
It’s hard to argue against C# being the dominant language in the .NET ecosystem, but that’s not to say that F# and VB.NET developers don’t love and champion their languages. Let’s take a look at the two sibling languages and their contributions to NuGet. Let’s start with F#. We’ll filter our results using the tag
What stands out immediately with the F# dashboard is its continued publishing momentum. This momentum is even more astounding, as it is driven mostly by community members, with Ryan Riley contributing a whopping 21.96% of unique packages.
In the last 30 days, the F# community has focused on releasing Fable and Fantomas. We recently invited core contributor Florian Verdonck to give a talk about Fantomas to JetBrains .NET Days attendees.
Well, what about VB.NET? Let’s filter by the tag of
vb.net and see what we get.
We can see that the Visual Basic packages come from software vendors like Evo PDF Software, HiQPDF Software, and ComponentPro Software. Since VB.NET has a following in enterprise development circles, it makes sense to see more business-focused packages.
The data we have is a little sparse when it comes to project licenses. The NuGet spec allows folks to set an arbitrary license URL, which makes determining the actual license difficult. For this section, we relied on a keyword in the URL string itself, so take this data with a healthy dose of skepticism. For example, if the license URL contains
MIT, we assume the project has chosen the MIT license.
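The keyword approach described above can be sketched in a few lines of Python. This is an illustration rather than the exact logic behind our dashboard: the keyword list, the matching order, and the `Unspecified` label for empty URLs are assumptions made for the example.

```python
def classify_license(license_url):
    """Guess a license family from keywords in the license URL."""
    if not license_url or not license_url.strip():
        return "Unspecified"  # the publisher left the license URL empty

    lowered = license_url.lower()
    # First keyword found wins; the order is an arbitrary choice.
    for keyword, label in (("mit", "MIT"), ("apache", "Apache"), ("bsd", "BSD")):
        if keyword in lowered:
            return label
    return "Custom"


print(classify_license("https://opensource.org/licenses/MIT"))
print(classify_license("http://www.apache.org/licenses/LICENSE-2.0"))
print(classify_license("https://example.invalid/eula"))
# A false positive: "limited" contains the letters "mit".
print(classify_license("https://example.invalid/limited-use-terms"))
```

The last call shows why skepticism is warranted: plain substring matching files an unrelated URL under MIT.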
Looking at the packages, we see that the MIT and Apache licenses are the most commonly specified. The Custom license could contain MIT, Apache, and BSD variants, but we chose not to follow the URLs to determine the exact values. We also see many publishers leave the license URL value empty altogether, which can be problematic for consumers trying to choose a legally-usable package.
We’ve just started to scratch the surface of the NuGet ecosystem. Sometimes trends are self-evident, and other times we need a starting point to query further. As we mentioned at the start of this post, this information is publisher-only, giving us an exciting yet admittedly incomplete picture of our community. What was most exciting to see is the broader .NET community’s sub-cultures, programming languages, platforms, and geography.
If you folks have any interesting questions you’d like to see answered by this data, please leave a comment below. If you see something in the information we missed, again, we’d love to hear from you. Thank you for reading, and please share this post and discuss it with your developer friends.