Train a deep learning image classification model with ML.NET and TensorFlow

This sample may be downloaded and built directly. However, for a successful run, you must first unzip assets.zip in the project directory and copy its subdirectories into the assets directory.

Understanding the problem

Image classification is a computer vision problem. Image classification takes an image as input and categorizes it into a prescribed class. This sample shows a .NET Core console application that trains a custom deep learning model using transfer learning, a pretrained image classification TensorFlow model and the ML.NET Image Classification API to classify images of concrete surfaces into one of two categories, cracked or uncracked.

 

Dataset

The datasets for this tutorial are from Maguire, Marc; Dorafshan, Sattar; and Thomas, Robert J., "SDNET2018: A concrete crack image dataset for machine learning applications" (2018). Browse all Datasets. Paper 48. https://digitalcommons.usu.edu/all_datasets/48

SDNET2018 is an image dataset that contains annotations for cracked and non-cracked concrete structures (bridge decks, walls, and pavement).

The data is organized in three subdirectories:

  • D contains bridge deck images
  • P contains pavement images
  • W contains wall images

Each of these subdirectories contains two additional prefixed subdirectories:

  • C is the prefix used for cracked surfaces
  • U is the prefix used for uncracked surfaces

In this sample, only bridge deck images are used.
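The naming scheme above can be illustrated with a small sketch. It assumes the label folders are named with the crack prefix followed by the surface letter (for example CD and UD, as they appear in the sample output later in this tutorial). DescribeLabel is a hypothetical helper and is not part of the sample.

```csharp
using System;

// Hypothetical helper (not part of the sample): decodes a two-letter folder name.
// Assumption: first letter is the crack prefix (C/U), second is the surface type (D/P/W).
static (string Condition, string Surface) DescribeLabel(string folderName)
{
    if (folderName is null || folderName.Length != 2)
        throw new ArgumentException("Expected a two-letter folder name such as \"CD\".", nameof(folderName));

    string condition = char.ToUpperInvariant(folderName[0]) switch
    {
        'C' => "cracked",
        'U' => "uncracked",
        _ => throw new ArgumentException("Unknown crack prefix.", nameof(folderName))
    };

    string surface = char.ToUpperInvariant(folderName[1]) switch
    {
        'D' => "bridge deck",
        'P' => "pavement",
        'W' => "wall",
        _ => throw new ArgumentException("Unknown surface type.", nameof(folderName))
    };

    return (condition, surface);
}

foreach (string name in new[] { "CD", "UD" })
{
    var (condition, surface) = DescribeLabel(name);
    Console.WriteLine($"{name}: {condition} {surface}");
}
```

Because only bridge deck images are used, the two labels the model sees in this sample would be CD and UD.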

Prepare Data

  1. Unzip the assets.zip file in the project directory.
  2. Copy the subdirectories into the assets directory.
  3. Define the image data schema containing the image path and category the image belongs to. Create a class called ImageData.
C#
 
class ImageData
{
    public string ImagePath { get; set; }

    public string Label { get; set; }
}
  4. Define the input schema by creating the ModelInput class. The only columns/properties used for training and making predictions are the Image and LabelAsKey. The ImagePath and Label columns are there for convenience to access the original file name and text representation of the category it belongs to respectively.
C#
 
class ModelInput
{
    public byte[] Image { get; set; }
    
    public UInt32 LabelAsKey { get; set; }

    public string ImagePath { get; set; }

    public string Label { get; set; }
}
  5. Define the output schema by creating the ModelOutput class.
C#
 
class ModelOutput
{
    public string ImagePath { get; set; }

    public string Label { get; set; }

    public string PredictedLabel { get; set; }
}

Load the data

  1. Before loading the data, it needs to be formatted into a list of ImageData objects. To do so, create a data loading utility method LoadImagesFromDirectory.
C#
 
public static IEnumerable<ImageData> LoadImagesFromDirectory(string folder, bool useFolderNameAsLabel = true)
{
    var files = Directory.GetFiles(folder, "*",
        searchOption: SearchOption.AllDirectories);

    foreach (var file in files)
    {
        if ((Path.GetExtension(file) != ".jpg") && (Path.GetExtension(file) != ".png"))
            continue;

        var label = Path.GetFileName(file);

        if (useFolderNameAsLabel)
            label = Directory.GetParent(file).Name;
        else
        {
            for (int index = 0; index < label.Length; index++)
            {
                if (!char.IsLetter(label[index]))
                {
                    label = label.Substring(0, index);
                    break;
                }
            }
        }

        yield return new ImageData()
        {
            ImagePath = file,
            Label = label
        };
    }
}
  2. Inside of your application, use the LoadImagesFromDirectory method to load the data.
C#
 
IEnumerable<ImageData> images = LoadImagesFromDirectory(folder: assetsRelativePath, useFolderNameAsLabel: true);
IDataView imageData = mlContext.Data.LoadFromEnumerable(images);
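To see why this sample passes useFolderNameAsLabel: true, the following sketch isolates the file-name branch of LoadImagesFromDirectory. LabelFromFileName is a hypothetical name for that branch, and the second path is an invented example; the first uses a file name taken from the sample output.

```csharp
using System;
using System.IO;

// Mirrors the else-branch of LoadImagesFromDirectory:
// keep only the leading letters of the file name.
static string LabelFromFileName(string filePath)
{
    string label = Path.GetFileName(filePath);
    for (int index = 0; index < label.Length; index++)
    {
        if (!char.IsLetter(label[index]))
        {
            return label.Substring(0, index);
        }
    }
    return label;
}

// File name from the sample output: it starts with a digit, so the label is empty.
string datasetLabel = LabelFromFileName("CD/7004-125.jpg");

// Hypothetical file name that does start with letters.
string prefixedLabel = LabelFromFileName("cat123.jpg");

Console.WriteLine($"Dataset file label: '{datasetLabel}'");
Console.WriteLine($"Prefixed file label: '{prefixedLabel}'");
```

Since the file names in this dataset begin with digits, the file-name branch would yield an empty label, so the parent folder name is the usable source of labels here.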

Preprocess the data

  1. Shuffle the data so that the rows are in random order before they are split into train, validation, and test sets.
C#
 
IDataView shuffledData = mlContext.Data.ShuffleRows(imageData);
  2. Machine learning models expect input to be in numerical format. Therefore, some preprocessing needs to be done on the data prior to training. First, the label or value to predict is converted into a numerical value. Then, the images are loaded as a byte[].
C#
 
var preprocessingPipeline = mlContext.Transforms.Conversion.MapValueToKey(
        inputColumnName: "Label",
        outputColumnName: "LabelAsKey")
    .Append(mlContext.Transforms.LoadRawImageBytes(
        outputColumnName: "Image",
        imageFolder: assetsRelativePath,
        inputColumnName: "ImagePath"));
  3. Fit the preprocessing pipeline to the data, then use it to transform the data.
C#
 
IDataView preProcessedData = preprocessingPipeline
                    .Fit(shuffledData)
                    .Transform(shuffledData);
  4. Create train/validation/test datasets to train and evaluate the model.
C#
 
TrainTestData trainSplit = mlContext.Data.TrainTestSplit(data: preProcessedData, testFraction: 0.3);
TrainTestData validationTestSplit = mlContext.Data.TrainTestSplit(trainSplit.TestSet);

IDataView trainSet = trainSplit.TrainSet;
IDataView validationSet = validationTestSplit.TrainSet;
IDataView testSet = validationTestSplit.TestSet;
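The two nested splits determine how much data each set receives. The arithmetic can be sketched as follows, assuming the second TrainTestSplit call uses ML.NET's default testFraction of 0.1 because none is passed. SplitFractions is a hypothetical helper, and the actual row counts are only approximate because rows are assigned at random.

```csharp
using System;

// Hypothetical helper: expected share of the full dataset in each set after two nested splits.
// Assumption: the second split uses the default testFraction of 0.1.
static (double Train, double Validation, double Test) SplitFractions(
    double firstTestFraction, double secondTestFraction)
{
    double train = 1.0 - firstTestFraction;                              // kept by the first split
    double validation = firstTestFraction * (1.0 - secondTestFraction);  // "train" side of the second split
    double test = firstTestFraction * secondTestFraction;                // "test" side of the second split
    return (train, validation, test);
}

var fractions = SplitFractions(0.3, 0.1);
Console.WriteLine($"Train: {fractions.Train:P0}");
Console.WriteLine($"Validation: {fractions.Validation:P0}");
Console.WriteLine($"Test: {fractions.Test:P0}");
```

Under that assumption, roughly 70% of the images are used for training, 27% for validation, and 3% for testing.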

Define the training pipeline

C#
 
var classifierOptions = new ImageClassificationTrainer.Options()
{
    FeatureColumnName = "Image",
    LabelColumnName = "LabelAsKey",
    ValidationSet = validationSet,
    Arch = ImageClassificationTrainer.Architecture.ResnetV2101,
    MetricsCallback = (metrics) => Console.WriteLine(metrics),
    TestOnTrainSet = false,
    ReuseTrainSetBottleneckCachedValues = true,
    ReuseValidationSetBottleneckCachedValues = true,
    WorkspacePath = workspaceRelativePath
};

var trainingPipeline = mlContext.MulticlassClassification.Trainers.ImageClassification(classifierOptions)
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

Train the model

Fit the training pipeline to the training set.

 
ITransformer trainedModel = trainingPipeline.Fit(trainSet);

Use the model

  1. Create a utility method to display predictions.
C#
 
private static void OutputPrediction(ModelOutput prediction)
{
    string imageName = Path.GetFileName(prediction.ImagePath);
    Console.WriteLine($"Image: {imageName} | Actual Value: {prediction.Label} | Predicted Value: {prediction.PredictedLabel}");
}

Classify a single image

  1. Make a prediction on a single image from the test set using the trained model. Create a utility method called ClassifySingleImage.
C#
 
public static void ClassifySingleImage(MLContext mlContext, IDataView data, ITransformer trainedModel)
{
    PredictionEngine<ModelInput, ModelOutput> predictionEngine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(trainedModel);

    ModelInput image = mlContext.Data.CreateEnumerable<ModelInput>(data,reuseRowObject:true).First();

    ModelOutput prediction = predictionEngine.Predict(image);

    Console.WriteLine("Classifying single image");
    OutputPrediction(prediction);
}
  2. Use the ClassifySingleImage method inside of your application.
C#
 
ClassifySingleImage(mlContext, testSet, trainedModel);

Classify multiple images

  1. Make predictions on the test set using the trained model. Create a utility method called ClassifyImages.
C#
 
public static void ClassifyImages(MLContext mlContext, IDataView data, ITransformer trainedModel)
{
    IDataView predictionData = trainedModel.Transform(data);

    IEnumerable<ModelOutput> predictions = mlContext.Data.CreateEnumerable<ModelOutput>(predictionData, reuseRowObject: true).Take(10);

    Console.WriteLine("Classifying multiple images");
    foreach (var prediction in predictions)
    {
        OutputPrediction(prediction);
    }
}
  2. Use the ClassifyImages method inside of your application.
C#
 
ClassifyImages(mlContext, testSet, trainedModel);

Run the application

Run your console app. The output should be similar to that below. You may see warnings or processing messages, but these messages have been removed from the following results for clarity. For brevity, the output has been condensed.

Bottleneck phase

text
 
Phase: Bottleneck Computation, Dataset used:      Train, Image Index: 279
Phase: Bottleneck Computation, Dataset used:      Train, Image Index: 280
Phase: Bottleneck Computation, Dataset used: Validation, Image Index:   1
Phase: Bottleneck Computation, Dataset used: Validation, Image Index:   2

Training phase

text
 
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  21, Accuracy:  0.6797619
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  22, Accuracy:  0.7642857
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  23, Accuracy:  0.7916667

Classification Output

text
 
Classifying single image
Image: 7001-220.jpg | Actual Value: UD | Predicted Value: UD

Classifying multiple images
Image: 7001-220.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-163.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-210.jpg | Actual Value: UD | Predicted Value: UD
Image: 7004-125.jpg | Actual Value: CD | Predicted Value: UD
Image: 7001-170.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-77.jpg | Actual Value: UD | Predicted Value: UD
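
As a quick sanity check on the condensed output above, the share of correct predictions can be computed by hand. This is only a sketch over the six rows shown; it is not a substitute for evaluating the model on the full test set.

```csharp
using System;
using System.Linq;

// (actual, predicted) pairs copied from the "Classifying multiple images" output above.
var results = new (string Actual, string Predicted)[]
{
    ("UD", "UD"),
    ("UD", "UD"),
    ("UD", "UD"),
    ("CD", "UD"),
    ("UD", "UD"),
    ("UD", "UD")
};

int correct = results.Count(r => r.Actual == r.Predicted);
double accuracy = (double)correct / results.Length;

Console.WriteLine($"{correct} of {results.Length} predictions are correct.");
Console.WriteLine($"Accuracy on the displayed rows: {accuracy:P1}");
```

Five of the six displayed predictions match the actual label; the one miss is a cracked image (CD) predicted as uncracked (UD).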

Improve the model

  • More Data: The more examples a model learns from, the better it performs. Download the full SDNET2018 dataset and use it to train.
  • Augment the data: A common technique to add variety to the data is to augment the data by taking an image and applying different transforms (rotate, flip, shift, crop). This adds more varied examples for the model to learn from.
  • Train for a longer time: The longer you train, the more tuned the model will be. Increasing the number of epochs may improve the performance of your model.
  • Experiment with the hyper-parameters: In addition to the parameters used in this tutorial, other parameters can be tuned to potentially improve performance. Changing the learning rate, which determines the magnitude of the updates made to the model during training, may improve performance.
  • Use a different model architecture: Depending on what your data looks like, the model that can best learn its features may differ. If you're not satisfied with the performance of your model, try changing the architecture.
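
To make the augmentation idea concrete, here is a minimal sketch of one transform, a horizontal flip, applied to a tiny grayscale pixel grid. It uses plain arrays for illustration only and is not an ML.NET API; FlipHorizontal is a hypothetical helper, and a real pipeline would apply such transforms to decoded images before training.

```csharp
using System;

// Hypothetical helper: returns a horizontally mirrored copy of a pixel grid (rows x columns).
static byte[,] FlipHorizontal(byte[,] pixels)
{
    int rows = pixels.GetLength(0);
    int cols = pixels.GetLength(1);
    var flipped = new byte[rows, cols];

    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < cols; c++)
        {
            // The leftmost column becomes the rightmost, and so on.
            flipped[r, cols - 1 - c] = pixels[r, c];
        }
    }

    return flipped;
}

byte[,] image = { { 1, 2, 3 }, { 4, 5, 6 } };
byte[,] mirrored = FlipHorizontal(image);

Console.WriteLine($"{mirrored[0, 0]} {mirrored[0, 1]} {mirrored[0, 2]}"); // prints "3 2 1"
Console.WriteLine($"{mirrored[1, 0]} {mirrored[1, 1]} {mirrored[1, 2]}"); // prints "6 5 4"
```

A flipped crack is still a crack, so each mirrored image can keep the label of its original and be added to the training set as an extra example.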

Ref: https://docs.microsoft.com/en-us/samples/dotnet/machinelearning-samples/mlnet-image-classification-transfer-learning/ 

ML.NET and Model Builder

ML.NET is an open-source, cross-platform machine learning framework for .NET developers. It enables integrating machine learning into your .NET apps without requiring you to leave the .NET ecosystem or even have a background in ML or data science.

We are excited to announce new versions of ML.NET and Model Builder!

In this post, we’ll cover the following items:

  1. Model Builder Preview
  2. ML.NET v1.5.5
  3. Virtual ML.NET Community Conference
  4. Feedback
  5. Get started and resources

Model Builder Preview

This preview brings a lot of big changes to Model Builder, and we’re excited to get your feedback on all the new features which include:

  • Config-based training with generated code-behind files
  • Restructured Advanced Data Options
  • Redesigned Consume step

You can sign up for the Preview at aka.ms/blog-mb-preview.

Config-based training with generated code-behind files

The Model Builder experience has been revamped! Now when you right-click on your project in Solution Explorer and select Add > Machine Learning, the Add New Item dialog opens, and you can add an ML.NET Model.

[Image: New Item Dialog in Visual Studio]

After adding your model, the Model Builder UI opens, and a new item (an *.mbconfig file) shows up in the Solution Explorer.

[Image: Model Builder UI in Visual Studio]

[Image: Close up of Solution Explorer in Visual Studio]

At any point when using Model Builder, if you close out of the UI, you can double click on the *.mbconfig in Solution Explorer, and it will open the UI again to your last saved state.

After training, three files are generated under the *.mbconfig file:

[Image: Solution Explorer expanded with mbconfig in Visual Studio]

  • Model.consumption.cs: This file contains the Model Input and Model Output schemas as well as the Predict function generated for consuming the model.
  • Model.training.cs: This file contains the training pipeline (data transforms, algorithm, algorithm hyperparameters) chosen by Model Builder to train the model. You can use this pipeline for re-training your model.
  • Model.zip: This is a serialized zip file which represents your trained ML.NET model.

Previously, these files were added as two new projects (a class library for model consumption code and a console app for the training pipeline). The new experience is similar to adding a new form in a Windows Forms application, where there are code-behind files behind the form and double clicking the form opens the designer.

If you open the *.mbconfig file, you can see that it is simply a JSON file with state information:

{
  "TrainingConfigurationVersion": 0,
  "TrainingTime": 10,
  "Scenario": {
    "ScenarioType": "Classification"
  },
  "DataSource": {
    "DataSourceType": "TabularFile",
    "FileName": "C:\\Desktop\\Datasets\\yelp_labelled.txt",
    "Delimiter": "\t",
    "DecimalMarker": ".",
    "HasHeader": true,
    "ColumnProperties": [
      {
        "ColumnName": "Comment",
        "ColumnPurpose": "Feature",
        "ColumnDataFormat": "String",
        "IsCategorical": false
      },
      {
        "ColumnName": "Sentiment",
        "ColumnPurpose": "Label",
        "ColumnDataFormat": "String",
        "IsCategorical": true
      }
    ]
  },
  "Environment": {
    "EnvironmentType": "LocalCPU"
  },
  "Artifact": {
    "Type": "LocalArtifact",
    "MLNetModelPath": "C:\\source\\repos\\ConsoleApp8\\ConsoleApp8\\MLModel1.zip"
  },
  "RunHistory": {
    "Trials": [
      {
        "TrainerName": "AveragedPerceptronOva",
        "Score": 0.8059,
        "RuntimeInSeconds": 4.4
      }
    ],
    "Pipeline": "[{\"EstimatorType\":\"MapValueToKey\",\"Name\":null,\"Inputs\":[\"Sentiment\"],\"Outputs\":[\"Sentiment\"]},{\"EstimatorType\":\"FeaturizeText\",\"Name\":null,\"Inputs\":[\"Comment\"],\"Outputs\":[\"Comment_tf\"]},{\"EstimatorType\":\"CopyColumns\",\"Name\":null,\"Inputs\":[\"Comment_tf\"],\"Outputs\":[\"Features\"]},{\"EstimatorType\":\"NormalizeMinMax\",\"Name\":null,\"Inputs\":[\"Features\"],\"Outputs\":[\"Features\"]},{\"LabelColumnName\":\"Sentiment\",\"EstimatorType\":\"AveragedPerceptronOva\",\"Name\":null,\"Inputs\":null,\"Outputs\":null},{\"EstimatorType\":\"MapKeyToValue\",\"Name\":null,\"Inputs\":[\"PredictedLabel\"],\"Outputs\":[\"PredictedLabel\"]}]",
    "MetricName": "MicroAccuracy"
  }
}
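
Because the *.mbconfig file is JSON, it can be inspected with any JSON parser. The sketch below reads the first trial from a trimmed copy of the configuration above using System.Text.Json. It assumes the field names shown in this post; the schema belongs to a preview and may change, and the trimmed string is for illustration only.

```csharp
using System;
using System.Text.Json;

// Trimmed copy of the *.mbconfig shown above; only the fields read below are kept.
string mbconfigJson = @"{
  ""Scenario"": { ""ScenarioType"": ""Classification"" },
  ""RunHistory"": {
    ""Trials"": [
      { ""TrainerName"": ""AveragedPerceptronOva"", ""Score"": 0.8059, ""RuntimeInSeconds"": 4.4 }
    ],
    ""MetricName"": ""MicroAccuracy""
  }
}";

string scenario;
string metricName;
string trainerName;
double score;

using (JsonDocument document = JsonDocument.Parse(mbconfigJson))
{
    JsonElement root = document.RootElement;
    JsonElement runHistory = root.GetProperty("RunHistory");
    JsonElement firstTrial = runHistory.GetProperty("Trials")[0];

    scenario = root.GetProperty("Scenario").GetProperty("ScenarioType").GetString();
    metricName = runHistory.GetProperty("MetricName").GetString();
    trainerName = firstTrial.GetProperty("TrainerName").GetString();
    score = firstTrial.GetProperty("Score").GetDouble();
}

Console.WriteLine($"Scenario: {scenario}");
Console.WriteLine($"Trainer: {trainerName}");
Console.WriteLine($"Metric: {metricName}");
Console.WriteLine($"Score: {score}");
```

In a real project, the string would be replaced by the contents of the *.mbconfig file on disk, for example read with File.ReadAllText.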

This new Model Builder experience brings many benefits. You can:

  • Specify the name of your model and generated code.
  • Have more than one Model Builder-generated model in a solution.
  • Save your state and come back to the last saved state. If you spend an hour training and close out of Model Builder, now you don’t have to start over and can just pick up where you left off.
  • Share the *.mbconfig file and collaborate on the same Model Builder instance via source control.
  • Use the same *.mbconfig file in Model Builder and the ML.NET CLI (coming soon!).

Restructured Advanced Data Options

In the last Model Builder release, we added advanced data options for data loading which gave you more control over column settings and data formatting.

In this release, we added several more options and reorganized the options to make selecting your column settings even easier:

  • Purpose: Choose whether the column is a Feature column, a Label column, or a column to Ignore during training.
  • Data type: Choose whether the data in the column is a String, Single, or Boolean.
  • Categorical: Choose whether the column is categorical or not.

[Image: Advanced Data Options in Model Builder]

Redesigned Consume Step

We have redesigned the consume step to make a smooth transition from training and evaluating a model to using that model to make predictions in an end-user application.

A code snippet has been provided in the UI which demonstrates how to set up the Model Input as well as how to use the generated Predict function to return the predicted output.

Each Model Input property is filled in with sample data from the first row of your dataset. You can use the copy button in the top right of the box to copy the entire code snippet; then once you paste this code into your end-user application, you can modify the Model Input fields to get real data to feed into your model.

[Image: Consume step in Model Builder]

Additionally, there is a new Sample project section which generates an application that uses your model and adds the project to your solution. In previous versions of Model Builder, a sample console app was automatically added to your solution; now you can choose whether you want to add a new project to use your model.

Currently, there is only the option to add a console app, but in the future, we plan to add support for Web APIs, Azure Functions, and more.

ML.NET v1.5.5

This release of ML.NET brings numerous bug fixes and enhancements as well as the following new features:

  • New API that accepts a double for the confidence level, which helps when you need higher precision than an int allows. Thank you @esso23 for your contributions!
  • Support for exporting the ValueMapping estimator to ONNX.
  • New API to specify whether the output from TensorFlow is batched (previously ML.NET always assumed the output was batched, which caused errors when it wasn't).

Check out the release notes for more details.

Virtual ML.NET Community Conference

On May 7th, the 2nd annual Virtual ML.NET Community Conference will kick off with 2 days of sessions on all things ML.NET, and we’re looking for speakers to talk about:

  • MLOps
  • Case studies and real-life use cases
  • Interactive computing with Jupyter
  • ML.NET interop (ONNX)
  • ML.NET and IoT devices
  • ML.NET in F#
  • Big Data and ML.NET
  • A journey from experimentation to production
  • Anything else ML.NET related you can think of!

This is a 100% free event, by the community, for the community.

Published By:

Bri Achtman - Program Manager, .NET

March 15th, 2021