Getting started with ML.NET

Machine learning continues to be a hot topic among developers and non-developers. But, regardless of your opinion, our AI-powered companions are here to stay with applied instances of machine learning models such as ChatGPT, GitHub Copilot, and Midjourney, captivating millions of users worldwide. While it may seem like a mysterious black box of magic, many of these models operate on a combination of basic tenets of machine learning: data mining, algorithms, and optimization.

While I'm no machine learning expert, I've explored the space and have a working understanding of its application in .NET. In this post, I'll introduce you to ML.NET, a library developed by Microsoft to train, optimize, and deploy machine learning models based on your datasets.

Types of Machine Learning

Three recognized categories of machine learning models are Supervised, Unsupervised, and Semi-supervised. Understanding your ultimate goal can help you pick the appropriate approach for your needs.

With Supervised machine learning, you typically need to spend the majority of your effort curating a dataset. This learning method involves labeling and cleaning datasets to ensure the information you train your model on is as accurate as possible. The adage of "Garbage in, Garbage out" very much holds here.

In a supervised training session, you use some of your data to train the model while using another percentage to validate the prediction results. Finding the best fit is crucial; more accurate and clean data is typically better. It's common to see text-based datasets on the order of gigabytes.

Once your data is labeled, you can build neural networks, linear regressions, logistic regressions, random forest models, or other approaches. These models are what power the recommendation engines you'd see on your favorite streaming service or online shopping outlets.

You can use unsupervised learning to determine patterns in unlabeled datasets. Using these techniques, you can uncover information you weren't aware was there. You can use these models for pattern and image recognition. If you've ever had to prove you're not a robot, you've encountered (and trained) these models.

Finally, Semi-supervised learning mixes the previously mentioned approaches to provide an unsupervised learning environment with guard rails. The IBM documentation states:

"During training, it uses a smaller labeled data set to guide classification and feature extraction from a larger, unlabeled data set. Semi-supervised learning can solve the problem of not having enough labeled data for a supervised learning algorithm. It also helps if it's too costly to label enough data."

IBM documentation

Today's typical machine learning applications include speech recognition, image detection, chatbots, recommendation engines, and fraud detection. As a result, it's almost inevitable that you already use such models in your daily activities.

Regardless of what approach you ultimately decide on, you'll still have to do a lot of work with data and think critically about your models' output. Just because a machine does the work doesn't absolve you of the ethical implications of your model.

What is ML.NET?

ML.NET is an open-source machine learning framework for .NET applications. It is a set of APIs that can help you train, build, and ship custom machine learning models. In addition to building custom models, you can also use ML.NET to import models from other ecosystems, using formats such as the Open Neural Network Exchange (ONNX) specification, TensorFlow, or Infer.NET.

These ecosystems have rich pre-trained models for image classification, object detection, and speech recognition. Starting with an existing model and optimizing it is common practice in the machine learning space, and ML.NET makes that straightforward. Most teams will lack the resources to train current-generation models in these areas, so fine-tuning existing models lets them benefit from the knowledge those models have captured while adapting them to their own problem space.
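As a quick illustration (not part of the tutorial below), importing a pre-trained ONNX model into an ML.NET pipeline might look roughly like this sketch. The model file name is a placeholder, and ONNX support requires the additional Microsoft.ML.OnnxTransformer NuGet package (plus an ONNX Runtime package).

using Microsoft.ML;

var ctx = new MLContext();

// rough sketch: wire a pre-trained ONNX model into a transformation pipeline;
// "pretrained.onnx" is a placeholder file name
var onnxPipeline = ctx.Transforms.ApplyOnnxModel(modelFile: "pretrained.onnx");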

You can use ML.NET from C# and F# applications, in multiple host environments, including desktop, mobile, and web.

ML.NET also includes a utility named AutoML. AutoML lets you provide a dataset to a command-line interface, quickly choose your intent, train a model, and verify its predictive outcomes. Once complete, AutoML can also generate boilerplate code to consume your new model in an existing application.

Your First ML.NET Application

In this sample, you'll use ML.NET to perform sentiment analysis. I adapted this from the original tutorial on Microsoft Documentation, clarifying the steps required to train, fit, and use a model. You'll be using .NET 7 and Spectre.Console to build a neat predictive REPL.

A personal word of caution: ML.NET was written by data scientists for data scientists, so some code constructs may feel idiomatically strange to many C# developers.

Start by creating a new Console Application and adding the ML.NET and Spectre.Console dependencies. The packages are Microsoft.ML and Spectre.Console, respectively.

Every machine learning model starts with data. In this tutorial, you'll use 1,000 lines of Yelp reviews to build a model that predicts whether new reviews skew positively or negatively. Here's a piece of the tab-delimited Yelp sentiment dataset, which contains restaurant reviews.

Wow... Loved this place.	1
Crust is not good.	0
Not tasty and the texture was just nasty.	0

Download the Yelp review sentiment dataset and add it to your console application as a file to be copied to your output directory. The creators have already labeled each review as positive or negative, but feel free to look at the data in your favorite spreadsheet editor if compelled.

Once complete, your .csproj file should look like the following, although the version numbers may differ.

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net7.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
    </PropertyGroup>

    <ItemGroup>
      <PackageReference Include="Microsoft.ML" Version="2.0.0" />
      <PackageReference Include="Spectre.Console" Version="0.45.0" />
    </ItemGroup>

    <ItemGroup>
      <None Update="yelp_labelled.txt">
        <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      </None>
    </ItemGroup>

</Project>

Let's start writing some code. Every ML.NET application begins with an MLContext. If you're familiar with Entity Framework Core, you can think of this instance as your "Unit of Work". It will contain all your data, the trained model, and its prediction statistics.

At the beginning of the file, add the following lines.

using Microsoft.ML;
using Microsoft.ML.Data;
using Spectre.Console;
using Console = Spectre.Console.AnsiConsole;

var ctx = new MLContext();

The next step is to load the sentiment data from yelp_labelled.txt. Immediately below the context, add the following lines.

// load data
var dataView = ctx.Data
    .LoadFromTextFile<SentimentData>("yelp_labelled.txt");

By default, ML.NET expects tab-delimited files without a header row, but LoadFromTextFile has additional parameters for other formats and for header handling, as sketched below.
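For example, if your data were a comma-separated file with a header row, the call might look like the following sketch (the file name is hypothetical; this tutorial's dataset stays tab-delimited):

// hypothetical: load a CSV with a header row instead of the
// default tab-delimited, headerless format
var csvView = ctx.Data
    .LoadFromTextFile<SentimentData>(
        "reviews.csv",
        separatorChar: ',',
        hasHeader: true);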

Next, you'll need an object that maps the data from the text file into your .NET application. Add the following types to the end of your Program.cs file.

class SentimentData
{
    [LoadColumn(0)] public string? Text;
    [LoadColumn(1), ColumnName("Label")] public bool Sentiment;
}

class SentimentPrediction : SentimentData
{
    [ColumnName("PredictedLabel")] public bool Prediction { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}

You may have noticed these types include attributes. These attributes help ML.NET map each column to the correct member and, in the case of ColumnName, apply metadata that you'll use to train the model. There are a few conventional names, but in most cases you can explicitly pass arguments to change the training process, as sketched below.
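For instance, if you skipped the ColumnName("Label") attribute and kept the column name Sentiment, you could point the trainer at that column explicitly. A sketch only, not needed for this tutorial:

// hypothetical: name the label and feature columns explicitly
// instead of relying on the conventional "Label" column name
var trainer = ctx.BinaryClassification.Trainers.SdcaLogisticRegression(
    labelColumnName: "Sentiment",
    featureColumnName: "Features");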

The next step is to split the data into two parts: training and testing. ML.NET offers a data structure known as TrainTestData, which exposes two IDataView properties, TrainSet and TestSet. You can decide how much data each collection will contain, but let's use 20% test data for this sample.

// split data into testing set
var splitDataView = ctx.Data
    .TrainTestSplit(dataView, testFraction: 0.2);

Jodie Burchell, our resident data scientist, recommends that you create individual sets for training, testing, and validation as you begin to train models. Train sets are for training, validation for assessing the performance of multiple competing models, and test sets for checking how your model will perform in real-world settings. This ensures reproducible results.
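ML.NET doesn't ship a single three-way split helper as far as I know, but you can approximate one by calling TrainTestSplit twice. A minimal sketch (not part of this tutorial's code, with fractions chosen purely for illustration):

// carve off ~20% of the rows as a held-out test set
var trainTest = ctx.Data.TrainTestSplit(dataView, testFraction: 0.2);

// split the remaining ~80% again: roughly 60% train, 20% validation overall
var trainValidation = ctx.Data.TrainTestSplit(trainTest.TrainSet, testFraction: 0.25);

var trainSet = trainValidation.TrainSet;      // fit candidate models here
var validationSet = trainValidation.TestSet;  // compare candidate models here
var testSet = trainTest.TestSet;              // touch only once, for the final check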

Now that you have the test data, you'll need to build a training model. First, you should specify the training pipeline for your data. Since you're doing sentiment analysis, a binary classification trainer makes the most sense here: reviews are either positive or negative. Add the following code below the splitDataView variable.

// Build model
var estimator = ctx.Transforms.Text
    .FeaturizeText(
        outputColumnName: "Features",
        inputColumnName: nameof(SentimentData.Text)
    ).Append(ctx.BinaryClassification.Trainers.SdcaLogisticRegression(featureColumnName: "Features"));

Here you're setting up a processing pipeline to take the tab-delimited information and "featurize" it. Featurizing text extracts numeric values meant to represent the data. The simplest form of featurizing is taking all the words in the collection and recording how often each appears in a given piece of text; this is known as count vectorization. For example, with the vocabulary {loved, place, crust, good}, the review "Loved this place" would become the vector [1, 1, 0, 0]. Once featurized, you can pass the values over to be processed by a trainer.

You can choose from multiple trainers, and you should experiment to find the one with the best outcome. You'll use the SdcaLogisticRegression trainer in this sample because it yields the most accurate result.
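Swapping trainers is a one-line change. For example, the following sketch uses the L-BFGS logistic regression trainer instead; it is also a calibrated binary classifier, so the evaluation and prediction code later in this post would work unchanged:

// alternative pipeline: L-BFGS logistic regression instead of SDCA
var lbfgsEstimator = ctx.Transforms.Text
    .FeaturizeText(
        outputColumnName: "Features",
        inputColumnName: nameof(SentimentData.Text)
    ).Append(ctx.BinaryClassification.Trainers.LbfgsLogisticRegression(featureColumnName: "Features"));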

So you've set up your processing pipeline. Now it's time to fit the model to your data and test its prediction accuracy.

// Train model
ITransformer model = default!;

var rule = new Rule("Create and Train Model");
Console
    .Live(rule)
    .Start(console =>
    {
        // training happens here
        model = estimator.Fit(splitDataView.TrainSet);
        var predictions = model.Transform(splitDataView.TestSet);

        rule.Title = "🏁 Training Complete, Evaluating Accuracy.";
        console.Refresh();

        // evaluate the accuracy of our model
        var metrics = ctx.BinaryClassification.Evaluate(predictions);

        var table = new Table()
            .MinimalBorder()
            .Title("💯 Model Accuracy");
        table.AddColumns("Accuracy", "Auc", "F1Score");
        table.AddRow($"{metrics.Accuracy:P2}", $"{metrics.AreaUnderRocCurve:P2}", $"{metrics.F1Score:P2}");

        console.UpdateTarget(table);
        console.Refresh();
    });

You're using the TrainTestData instance to fit the model and test the prediction results. You can also ask the MLContext to evaluate the metrics of the classification. Finally, for visualization, you can write them to output using a Spectre.Console table. If you've followed along correctly, you can now run your application and see the following accuracy metrics.

  💯 Model Accuracy

  Accuracy │ Auc    │ F1Score
 ──────────┼────────┼─────────
  83.96%   │ 90.06% │ 84.38%

An accuracy of 83.96% could be better, but the dataset is limited to 1,000 rows, with roughly 800 records used to train the model. Your prediction accuracy may differ based on the randomization of the training and test data. That said, it's enough for the use case of this demo.

Now that you have a trained model, let's add the REPL experience. In the final step, add the following code after the previous code, outside the Start method.

while (true)
{
    var text = AnsiConsole.Ask<string>("What's your [green]review text[/]?");
    var engine = ctx.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);

    var input = new SentimentData { Text = text };
    var result = engine.Predict(input);
    var style = result.Prediction
        ? (color: "green", emoji: "👍")
        : (color: "red", emoji: "👎");

    Console.MarkupLine($"{style.emoji} [{style.color}]\"{text}\" ({result.Probability:P00})[/] ");
}

Rerunning the application, you can exercise the trained sentiment analysis model.

What's your review text? I love this Pizza
👍 "I love this Pizza" (100%)
What's your review text? This lettuce is bad
👎 "This lettuce is bad" (15%)

You'll get some false positives, but that's due to the limited dataset.

While you could re-train your model every time you need it, datasets are usually large and training takes time. To make interacting with your trained model more efficient and to speed up startup times, you can save the trained model to disk and load it later with a few lines of additional code:

// save to disk
ctx.Model.Save(model, dataView.Schema, "model.zip");


// load from disk; Load returns the model as an ITransformer
var loadedModel = ctx.Model.Load("model.zip", out var schema);

You can also save models via streams to remote storage services, such as Azure blob storage or AWS S3 buckets.
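ctx.Model.Save has an overload that writes to a Stream, so you don't have to go through the local file system first. A minimal sketch (the actual upload call depends on whichever cloud SDK you use and is left out here):

// save the trained model into an in-memory stream instead of a file on disk
using var modelStream = new MemoryStream();
ctx.Model.Save(model, dataView.Schema, modelStream);
modelStream.Position = 0;
// hand modelStream to your storage SDK's upload method from here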

I've included the full application here on GitHub and the complete Program.cs below.

using Microsoft.ML;
using Microsoft.ML.Data;
using Spectre.Console;
using Console = Spectre.Console.AnsiConsole;

var ctx = new MLContext();

// load data
var dataView = ctx.Data
    .LoadFromTextFile<SentimentData>("yelp_labelled.txt");

// split data into testing set
var splitDataView = ctx.Data
    .TrainTestSplit(dataView, testFraction: 0.2);

// Build model
var estimator = ctx.Transforms.Text
    .FeaturizeText(
        outputColumnName: "Features",
        inputColumnName: nameof(SentimentData.Text)
    ).Append(ctx.BinaryClassification.Trainers.SdcaLogisticRegression(featureColumnName: "Features"));

// Train model
ITransformer model = default!;

var rule = new Rule("Create and Train Model");
Console
    .Live(rule)
    .Start(console =>
    {
        // training happens here
        model = estimator.Fit(splitDataView.TrainSet);
        var predictions = model.Transform(splitDataView.TestSet);

        rule.Title = "🏁 Training Complete, Evaluating Accuracy.";
        console.Refresh();

        // evaluate the accuracy of our model
        var metrics = ctx.BinaryClassification.Evaluate(predictions);

        var table = new Table()
            .MinimalBorder()
            .Title("💯 Model Accuracy");
        table.AddColumns("Accuracy", "Auc", "F1Score");
        table.AddRow($"{metrics.Accuracy:P2}", $"{metrics.AreaUnderRocCurve:P2}", $"{metrics.F1Score:P2}");

        console.UpdateTarget(table);
        console.Refresh();
    });

while (true)
{
    var text = AnsiConsole.Ask<string>("What's your [green]review text[/]?");
    var engine = ctx.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);

    var input = new SentimentData { Text = text };
    var result = engine.Predict(input);
    var style = result.Prediction
        ? (color: "green", emoji: "👍")
        : (color: "red", emoji: "👎");

    Console.MarkupLine($"{style.emoji} [{style.color}]\"{text}\" ({result.Probability:P00})[/] ");
}

class SentimentData
{
    [LoadColumn(0)] public string? Text;
    [LoadColumn(1), ColumnName("Label")] public bool Sentiment;
}

class SentimentPrediction : SentimentData
{
    [ColumnName("PredictedLabel")] public bool Prediction { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}

Conclusion

Congratulations, you just wrote your first ML.NET-powered application. If you've run the application, you'll notice the accuracy could be better. Accuracy depends on your dataset, labels, and algorithms, and that's where "science" plays an essential part in building these models. It's always important to continuously test and verify your results and tune your models as you get new information. ML.NET makes testing easy, and deploying and consuming models is just as straightforward.

I'd love to hear more about your ML.NET journey as you try to build your custom models and deploy them in real-world settings. As always, thanks for reading, and feel free to leave a comment below.
