What’s included in the Microsoft Modern Workplace?

As businesses evolve, their technology and processes can quickly become out of date, leading to inefficient operations that affect customer experience and profit margins.

The Workplace Evolution, published by Harvard Business Review, found that 78% of senior executives in enterprise businesses believe fostering a modern workplace strategy is essential. However, only 31% think their company is forward-thinking enough to do so.

Changing a fixed mindset into a forward-thinking one is no easy task for any business. Yet with the shift to remote working during the pandemic, many businesses have transformed rapidly over the past few months by implementing Modern Workplace technology. This gives their employees leading technology that helps them work smarter, be more productive, collaborate with remote teams, and reap the many other benefits that Microsoft 365 offers.

What’s included in the Microsoft Modern Workplace?

The Microsoft Modern Workplace is one that operates using the suite of Microsoft 365 technologies and productivity applications that harness the power of the Cloud.

Microsoft Modern Workplace applications improve employee productivity and satisfaction and create more seamless communication across the business whilst promoting collaboration and maintaining the security and integrity of systems and data.

 

What is a Modern Workplace?

A Modern Workplace is an operational setup that has been professionally designed to meet the physical and technological needs of both your business and its employees.

A Modern Workplace drives company-wide business transformation by utilising the latest Microsoft technology to power and streamline business operations and empower employees to do their best work around the clock.

What’s included within the Microsoft Modern Workplace?

Microsoft 365 includes Office 365, Windows 10 Enterprise, and Enterprise Mobility + Security, along with a variety of productivity and collaboration tools designed to support modern ways of working, help facilitate digital transformation, and, most importantly, keep your business secure.

How does the Microsoft Modern Workplace facilitate automation?

Microsoft 365 includes Microsoft Power Automate (formerly Flow), which enables you to implement both Digital and Robotic Process Automation (RPA) across the business. By automating tasks, you can quickly boost productivity, giving employees more time to focus on innovation rather than administrative tasks. You can also automate time-consuming tasks using the built-in AI capabilities and integrate with over 100 applications, such as Microsoft Dynamics 365, Twitter, MailChimp, and Google Analytics.

How does the Microsoft Modern Workplace promote collaboration?

When internal and external teams come together and collaborate to solve a problem, the magic happens. Facilitating collaboration can prove tricky, especially with remote or shift workers or multi-site offices. However, the collaboration applications included within the Microsoft Modern Workplace, such as Microsoft Teams, Microsoft Teams for Education, Office 365, and SharePoint, enable employees to easily work together on documents no matter their location or the time of day. Windows 10 Virtual Desktop takes this even further by allowing users to access their desktop from any device.

Microsoft Teams in particular is constantly evolving and adding new features to improve accessibility and user experience.

This boosts productivity, can raise morale, and enables people in different teams to come together. The easier it is for employees to collaborate, the more collaboration will occur.

How secure is the Microsoft Modern Workplace?

Within Microsoft 365, the security stack provides the insights needed to proactively defend against advanced threats such as malware, phishing, and zero-day attacks. It also provides identity, app, data, and device protection with Azure Active Directory, Microsoft Intune, and Windows Information Protection.

What are the reasons to adopt a Modern Workplace?

Here are some common problems with a traditional setup that lead businesses to adopt the Microsoft Modern Workplace:

Disparate communication channels – Employees communicate through a variety of channels, so messages get lost in translation, visibility is limited, and there is a lack of control.

Stand-alone platforms – Business management platforms have very limited integration and automation with each other. This raises the risk of error and increases workload.

Data/Team Silos – Teams work in silos with limited visibility of other parts of the business. Data is stored in numerous locations, and access is restricted by the setup.

Hardware – Desktop PCs are commonly used, and not all employees have access to laptops or tablets, which prevents remote working.

Technology – The technology you have implemented has limited scalability and does not completely meet the needs of your employees.

Security – There is limited or no device or cyber security management in place, putting your business at risk in the event of a breach.

 

 Ref: https://www.microsoft.com/en-us/itshowcase/microsoft-365 

 

1C:Enterprise in the cloud

The concept of cloud services for business applications is as simple as moving the application servers from the on-premises network to the Internet. The end users continue working with the same software (either the native client or the web client); the only thing required is an Internet connection. They no longer need to log on to the local enterprise network (directly or through VPN). Moreover, if the enterprise uses the SaaS model, the end users do not need to worry about software administration and updates any longer—the cloud service provider hosting your application servers will manage these tasks.

[Image: the "1C:Enterprise in the cloud" concept illustrated with simple objects: clouds, a banner, an aircraft, and a parachute.]

1C:Enterprise applications support both HTTP and HTTPS connections, making for a seamless transition of 1C:Enterprise application servers to the Internet. That's all you need to create a basic 1C:Enterprise cloud solution.
The only difference between this scenario and the on-premises installation is the location of the application server. To provide a service to a new company, one needs at least a new 1C:Enterprise infobase and a physical computer or a virtual machine running 1C:Enterprise. Therefore, application administration costs grow with the number of companies that utilize the service.

Multitenancy and data separation

To reduce application administration costs, you can now have multiple companies work with a single application instance. Of course, the application must be designed for shared work. And the application business logic in the shared work scenario must be identical to the business logic in the scenario where each company uses its own application.

This application architecture type is known as multitenancy. We can explain the multitenancy concept using the following example: a regular application is a house that provides its infrastructure (walls, roof, water system, heating system, etc.) to a family that lives there. A multitenant application is a house with multiple apartments, each apartment having access to the same shared infrastructure that is implemented at the basic house level.

In the simplest terms, the goal of multitenancy is reducing application maintenance costs by pushing the infrastructure maintenance to a higher level. It is similar to reducing costs by selling standard out-of-box applications (that might require some customization) instead of writing applications for specific customers from scratch. The difference is that out-of-box solutions reduce the development cost, while cloud solutions reduce the maintenance cost.

One of the multitenancy aspects is the separation of application data. An application stores the data of all companies in a single database; however, each company can only access its own data (such as business documents and employee lists). Some reference data (such as legislation and regulations) can be made available to all companies. 1C:Enterprise applications can use the data separation functionality provided by the 1C:Enterprise platform.
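
The data separation idea can be sketched in a few lines of C#. This is a minimal illustration under stated assumptions, not the data separation mechanism of the 1C:Enterprise platform: the `SeparatedStore` class, its methods, and the tenant identifiers are all hypothetical. Each private record is tagged with the company that owns it, while shared reference data is visible to every company.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of data separation in a multitenant store.
// Not the 1C:Enterprise platform API; all names are illustrative.
public sealed class SeparatedStore
{
    // Private records, each tagged with the tenant (company) that owns it.
    private readonly List<(string TenantId, string Document)> _separated = new();

    // Reference data shared by every tenant (for example, regulations).
    private readonly List<string> _shared = new();

    public void AddDocument(string tenantId, string document) =>
        _separated.Add((tenantId, document));

    public void AddSharedReference(string item) => _shared.Add(item);

    // A tenant sees its own documents plus the shared reference data,
    // and never the documents of other tenants.
    public IReadOnlyList<string> VisibleTo(string tenantId) =>
        _separated.Where(r => r.TenantId == tenantId)
                  .Select(r => r.Document)
                  .Concat(_shared)
                  .ToList();
}

public static class SeparatedStoreDemo
{
    public static void Main()
    {
        var store = new SeparatedStore();
        store.AddSharedReference("VAT rates");
        store.AddDocument("company-a", "Invoice A-1");
        store.AddDocument("company-b", "Invoice B-1");

        // Lists only company-a's invoice and the shared reference item.
        Console.WriteLine(string.Join(", ", store.VisibleTo("company-a")));
    }
}
```

The point of the sketch is that the filter lives in one place (`VisibleTo`), so application business logic can stay the same whether one company or many use the instance.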

Cloud services

Recently, a group of 1C:Enterprise developers faced the task of developing a cloud service for leasing 1C:Enterprise applications based on the SaaS model. Moreover, the service needed to be an out-of-box solution, a "cloud in the box" including everything the customer may need to deploy infrastructure for leasing 1C:Enterprise applications (or individual applications based on 1C:Enterprise).

So what is an ideal cloud service from the end user's point of view? A store with shelves filled with solutions: accounting, reporting, payroll and human resources, and so on. A customer fills their cart and pays at the cash desk (however, they pay a rent instead of a one-time payment). Then the customer decides which of their employees will have access to accounting, payroll, and HR, as well as other solutions.

What is an ideal cloud service from the service provider's point of view? Basically, it's a large store they own. They have to fill the shelves with goods (software), add new goods, and make sure the customers pay promptly. The service must also provide horizontal scalability, access to solution demos (test drive), and centralized user administration tools.

Of course one can implement all this functionality directly in the applications. However, this means duplicating a large amount of code. It is better to optimize the solutions by implementing their common functionality and administration tools in a product that will serve as an entry point for users of cloud services.
This is how 1cFresh technology was developed. 1C customers and partners use it in their SaaS services and private clouds. 1C Company has its own application lease service based on 1cFresh: 1cFresh.com and 1C:AccountingService (both in Russian).

The service functionality is divided between the following major components based on 1C:Enterprise and Java technologies:
  • Service website. A single entry point for all users.
  • Service manager. An administration and coordination tool governing all service components.
  • Application gateway. The component that provides horizontal scalability.
  • Service agent. The component that provides all utility functions, such as application version updates or backup creation.
  • Service forum. A forum for service users and service provider representatives.
  • Availability manager. The "Service temporarily unavailable" board that informs users when the service or some of its parts are unavailable. The board itself remains available even if central service components have failed.
[Figure: Simplified 1cFresh component chart (with some components omitted)]

Let us review the major service components in detail.

Service website

The site that provides the interface for service users is written in Java. It serves as a store shelf where users can choose applications to rent and try their demos. In addition to that, it is where users register, create application user accounts, read news and browse service online help. The 1cFresh.com page (in Russian) is exactly the "out-of-box" website, without any customizations.

Service manager

A service can include any number of 1C:Enterprise server clusters that run 1C:Enterprise applications. Each cluster is registered in the service manager. 1C:Enterprise servers can run on both Windows and Linux computers. For example, the service at 1cFresh.com uses both Windows servers (with MS SQL Server DBMS for storing application data) and Linux servers (with PostgreSQL).

The cloud service administrators access the service via the service manager user interface. They use it to add 1C:Enterprise servers and applications, update application versions, manage user accounts, and perform other administrative tasks. Some of the operations, such as application updates, are delegated to the service agent component. The service manager communicates with the service agent via a web service.

Service agent

The service agent is a 1C:Enterprise application. It performs administrative operations on the service infobases, which include application version updates, scheduled backups, and gathering service operation statistics.

Application gateway

The application gateway is written in Java. It is responsible for the horizontal scalability of the service. It redirects service users to appropriate application servers.

Service forum

The service forum is a location where service users and service provider representatives can discuss the service and the applications available in that service. It is written in Java.

Availability manager

Some service features or even the entire service might be temporarily unavailable to end users. For example, an application is usually unavailable to end users while it is being updated, or the entire service might be unavailable during maintenance hours. The availability manager is a 1C:Enterprise application that displays messages about the unavailability of service resources to website and forum users even if all other service components, including the central service manager component, are not available.

1C:Enterprise infobases

1C:Enterprise infobases store application data. New infobases are added to serve as parts of scalability units. Each scalability unit is deployed as a single module and contains the following parts:
  • 1C:Enterprise server cluster
  • DBMS server that stores infobase data
  • One or two web servers (two ensure fault tolerance) that process HTTP requests to infobases belonging to the scalability unit
A scalability unit failure only affects the customers that work with the infobases belonging to the unit.
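
The relationship between infobases, scalability units, and the gateway can be sketched as a simple lookup. This is an illustration under assumptions, not 1cFresh code: the `GatewaySketch` class, its methods, and the infobase and unit names are hypothetical. It shows why a failed unit only affects the infobases registered in it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: routing requests to scalability units.
// Not 1cFresh code; class, method, and unit names are illustrative.
public sealed class GatewaySketch
{
    private readonly Dictionary<string, string> _unitByInfobase = new();
    private readonly HashSet<string> _failedUnits = new();

    public void Register(string infobaseId, string unitName) =>
        _unitByInfobase[infobaseId] = unitName;

    public void MarkFailed(string unitName) => _failedUnits.Add(unitName);

    // Succeeds only when the infobase is known and its unit is healthy.
    // Infobases that belong to other units are not affected by a failure.
    public bool TryRoute(string infobaseId, out string unit)
    {
        if (_unitByInfobase.TryGetValue(infobaseId, out var found)
            && !_failedUnits.Contains(found))
        {
            unit = found;
            return true;
        }

        unit = string.Empty;
        return false;
    }
}

public static class GatewaySketchDemo
{
    public static void Main()
    {
        var gateway = new GatewaySketch();
        gateway.Register("infobase-1", "unit-1");
        gateway.Register("infobase-2", "unit-2");
        gateway.MarkFailed("unit-1");

        // infobase-1 cannot be routed; infobase-2 still reaches unit-2.
        Console.WriteLine(gateway.TryRoute("infobase-1", out _));
        Console.WriteLine(gateway.TryRoute("infobase-2", out var unit) ? unit : "unavailable");
    }
}
```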

More service facts
  • The service supports a technology similar to OpenID that allows storing user authentication data in a single database. Therefore, you can set up Single Sign-On for the service and its users will be able to access all their applications (for example, accounting or payroll calculation) and the forum with a single user name and password.
  • Users can transfer local 1C:Enterprise application data (for example, accounting records) to the cloud and back.
  • Users can create standalone workstations (file infobases stored on their local computers). They do not need an Internet or service connection to work with such infobases. At the same time, they can use the service data exchange functionality to synchronize their local data with the cloud.
  • Users can set up automatic data exchange between the applications published in the service (for example, between accounting and payroll calculation applications). This minimizes the efforts required to input data because data entered into one application becomes available for all other applications.
  • A backup creation system is available. A user can initiate backup creation at any time, or schedule daily, monthly, and yearly backups.
  • The 1cFresh technology includes the data delivery feature. The service manager stores reference data that is always up-to-date and provides this data to all applications.
  • A service can run multiple versions of any application. These applications can use multiple 1C:Enterprise platform versions.
  • Users can update the applications that they use to access their infobases.
  • 1cFresh features the following tools for error identification and analysis:
    • Gathering infobase error data.
    • Writing this data to the error log of the service manager infobase.
    • Viewing error details. A service administrator can view the entire error log or filter it by infobase or application.
  • 1cFresh includes the "showcase" feature, which provides the option to run multiple cloud services on a single platform. A showcase is an Internet resource that provides services. From the user’s point of view, a showcase is an independent website with business applications. For example, a single website platform that belongs to a service provider can run several sites located in different domains, one featuring a showcase of small business applications, another with applications for public institutions, the third one with applications for medical institutions, and so on. Also, service providers can advertise each resource as an independent service.
  • The service includes a Feedback center where users can submit feedback and feature requests. It is implemented as an application module that lets users post feedback and feature requests, browse the list of posts, vote for posts, and comment on them. To include this functionality in an application, an administrator simply enables the subsystem where it is implemented.
  • Subscribers of 1cFresh-based services pay a subscription fee to the service provider. Flexible pricing options are available.
  • The service provides a wide range of options for viewing its usage statistics. One can use the statistics to determine the service load, obtain average key indicators of stable service work (for future evaluation of possible deviations), determine the periods of minimum and maximum service load (for planning scheduled maintenance), and more.
  • The option to gather application business statistics is available. Application developers can use the statistics to improve their understanding of application usage scenarios and to identify bottlenecks.

Applications compatible with cloud services

To be able to run in the cloud, 1C:Enterprise applications must meet the SaaS mode requirements. For the detailed list of requirements, see 1cFresh documentation.

The requirements include the use of data separation functionality, as well as implementations of remote administration functions, data import and export, backup generation, and more. Cloud applications must behave identically in the thin client and the web client, and they must not include OS-dependent features (because a cloud server might run either Windows or Linux). Lengthy server calls are not recommended, and so on.

Cloud applications can have mobile application clients developed using the mobile 1C:Enterprise platform.

Cloud application customization

Thousands of users from hundreds of companies can work with a single cloud application instance. They might require custom application features. Thus, service providers need tools to customize applications for certain user groups.

1cFresh provides two kinds of application customization tools:
  • External reports and external data processors. These customization tools, well known to 1C:Enterprise application users, were enhanced for cloud operations.
  • Configuration extensions. Extensions are plug-ins that add functionality to applications without changing them. Currently, extensions do not support all configuration objects; however, we are working on making this function available.

Summary

We believe cloud service development to be a promising trend worthy of significant resource investment.
1cFresh cloud service fully complies with the cloud service definition provided by IDC:

[Figure: the IDC definition of a cloud service]
According to the Gartner definition, the 1cFresh service type is Application Platform as a Service (aPaaS): "Application Platform as a Service (aPaaS) is a form of PaaS that provides a platform to support app development, deployment and execution in the cloud."

 

1C Developer team

28.06.2019 

Train a deep learning image classification model with ML.NET and TensorFlow

This sample can be downloaded and built directly. However, for a successful run, you must first unzip assets.zip in the project directory and copy its subdirectories into the assets directory.


Understanding the problem

Image classification is a computer vision problem. Image classification takes an image as input and categorizes it into a prescribed class. This sample shows a .NET Core console application that trains a custom deep learning model using transfer learning, a pretrained image classification TensorFlow model and the ML.NET Image Classification API to classify images of concrete surfaces into one of two categories, cracked or uncracked.

 

Dataset

The datasets for this tutorial are from Maguire, Marc; Dorafshan, Sattar; and Thomas, Robert J., "SDNET2018: A concrete crack image dataset for machine learning applications" (2018). Browse all Datasets. Paper 48. https://digitalcommons.usu.edu/all_datasets/48

SDNET2018 is an image dataset that contains annotations for cracked and non-cracked concrete structures (bridge decks, walls, and pavement).

The data is organized in three subdirectories:

  • D contains bridge deck images
  • P contains pavement images
  • W contains wall images

Each of these subdirectories contains two additional prefixed subdirectories:

  • C is the prefix used for cracked surfaces
  • U is the prefix used for uncracked surfaces

In this sample, only bridge deck images are used.

Prepare Data

  1. Unzip the assets.zip archive in the project directory.
  2. Copy the subdirectories into the assets directory.
  3. Define the image data schema containing the image path and category the image belongs to. Create a class called ImageData.
C#
 
class ImageData
{
    public string ImagePath { get; set; }

    public string Label { get; set; }
}
  4. Define the input schema by creating the ModelInput class. The only columns/properties used for training and making predictions are Image and LabelAsKey. The ImagePath and Label columns are kept for convenience, to access the original file name and the text representation of its category, respectively.
C#
 
class ModelInput
{
    public byte[] Image { get; set; }
    
    public UInt32 LabelAsKey { get; set; }

    public string ImagePath { get; set; }

    public string Label { get; set; }
}
  5. Define the output schema by creating the ModelOutput class.
C#
 
class ModelOutput
{
    public string ImagePath { get; set; }

    public string Label { get; set; }

    public string PredictedLabel { get; set; }
}

Load the data

  1. Before loading the data, it needs to be formatted into a list of ImageData objects. To do so, create a data loading utility method LoadImagesFromDirectory.
C#
 
public static IEnumerable<ImageData> LoadImagesFromDirectory(string folder, bool useFolderNameAsLabel = true)
{
    var files = Directory.GetFiles(folder, "*",
        searchOption: SearchOption.AllDirectories);

    foreach (var file in files)
    {
        if ((Path.GetExtension(file) != ".jpg") && (Path.GetExtension(file) != ".png"))
            continue;

        var label = Path.GetFileName(file);

        if (useFolderNameAsLabel)
            label = Directory.GetParent(file).Name;
        else
        {
            for (int index = 0; index < label.Length; index++)
            {
                if (!char.IsLetter(label[index]))
                {
                    label = label.Substring(0, index);
                    break;
                }
            }
        }

        yield return new ImageData()
        {
            ImagePath = file,
            Label = label
        };
    }
}
  2. Inside of your application, use the LoadImagesFromDirectory method to load the data.
C#
 
IEnumerable<ImageData> images = LoadImagesFromDirectory(folder: assetsRelativePath, useFolderNameAsLabel: true);
IDataView imageData = mlContext.Data.LoadFromEnumerable(images);
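
To make the two labeling modes of LoadImagesFromDirectory concrete, the sketch below applies the same logic to a single path. The `LabelSketch` class and the file paths are hypothetical and exist only for illustration. Note that SDNET2018 file names such as 7001-220.jpg start with a digit, so the file-name mode yields an empty label for them, which is why the tutorial passes useFolderNameAsLabel: true.

```csharp
using System;
using System.IO;

// Mirrors the label logic of LoadImagesFromDirectory for a single path.
// The class name and the file paths used below are hypothetical examples.
public static class LabelSketch
{
    public static string LabelFor(string file, bool useFolderNameAsLabel)
    {
        if (useFolderNameAsLabel)
        {
            // Name of the directory that directly contains the file.
            return Path.GetFileName(Path.GetDirectoryName(file)) ?? string.Empty;
        }

        // Otherwise keep the leading run of letters in the file name.
        string name = Path.GetFileName(file);
        int index = 0;
        while (index < name.Length && char.IsLetter(name[index]))
        {
            index++;
        }
        return name.Substring(0, index);
    }

    public static void Main()
    {
        string deckImage = Path.Combine("assets", "CD", "7001-220.jpg");
        string namedImage = Path.Combine("assets", "cat001.jpg");

        // Folder mode: the label is the containing directory name.
        Console.WriteLine(LabelFor(deckImage, useFolderNameAsLabel: true));

        // File-name mode: the label is the leading letters of the name.
        Console.WriteLine(LabelFor(namedImage, useFolderNameAsLabel: false));

        // File-name mode on a name that starts with a digit: empty label.
        Console.WriteLine(LabelFor(deckImage, useFolderNameAsLabel: false).Length);
    }
}
```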

Preprocess the data

  1. Add variance to the data by shuffling it.
C#
 
IDataView shuffledData = mlContext.Data.ShuffleRows(imageData);
  2. Machine learning models expect input to be in numerical format. Therefore, some preprocessing needs to be done on the data prior to training. First, the label or value to predict is converted into a numerical value. Then, the images are loaded as a byte[].
C#
 
var preprocessingPipeline = mlContext.Transforms.Conversion.MapValueToKey(
        inputColumnName: "Label",
        outputColumnName: "LabelAsKey")
    .Append(mlContext.Transforms.LoadRawImageBytes(
        outputColumnName: "Image",
        imageFolder: assetsRelativePath,
        inputColumnName: "ImagePath"));
  3. Fit the preprocessing pipeline to the data, then use it to transform the data.
C#
 
IDataView preProcessedData = preprocessingPipeline
                    .Fit(shuffledData)
                    .Transform(shuffledData);
  4. Create train/validation/test datasets to train and evaluate the model.
C#
 
TrainTestData trainSplit = mlContext.Data.TrainTestSplit(data: preProcessedData, testFraction: 0.3);
TrainTestData validationTestSplit = mlContext.Data.TrainTestSplit(trainSplit.TestSet);

IDataView trainSet = trainSplit.TrainSet;
IDataView validationSet = validationTestSplit.TrainSet;
IDataView testSet = validationTestSplit.TestSet;
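
The two calls above split the data twice, so it helps to work out the resulting proportions. The sketch below does the arithmetic, assuming the second call relies on the default testFraction of 0.1 for TrainTestSplit because it passes none. The `SplitSketch` class is a hypothetical helper, and the real row counts will differ slightly because rows are assigned to the sets at random.

```csharp
using System;

// Expected share of rows in each set after the two TrainTestSplit calls.
// Assumes the second call uses the default testFraction of 0.1.
// Actual row counts vary slightly because rows are assigned at random.
public static class SplitSketch
{
    public static (double Train, double Validation, double Test) Fractions(
        double firstTestFraction, double secondTestFraction)
    {
        double train = 1.0 - firstTestFraction;      // kept for training
        double heldOut = firstTestFraction;          // split again below
        double test = heldOut * secondTestFraction;  // final test set
        double validation = heldOut - test;          // validation set
        return (train, validation, test);
    }

    public static void Main()
    {
        var (train, validation, test) = Fractions(0.3, 0.1);
        Console.WriteLine($"train={train:P0} validation={validation:P0} test={test:P0}");
    }
}
```

Under these assumptions roughly 70% of the rows go to training, 27% to validation, and 3% to testing.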

Define the training pipeline

C#
 
var classifierOptions = new ImageClassificationTrainer.Options()
{
    FeatureColumnName = "Image",
    LabelColumnName = "LabelAsKey",
    ValidationSet = validationSet,
    Arch = ImageClassificationTrainer.Architecture.ResnetV2101,
    MetricsCallback = (metrics) => Console.WriteLine(metrics),
    TestOnTrainSet = false,
    ReuseTrainSetBottleneckCachedValues = true,
    ReuseValidationSetBottleneckCachedValues = true,
    WorkspacePath=workspaceRelativePath
};

var trainingPipeline = mlContext.MulticlassClassification.Trainers.ImageClassification(classifierOptions)
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

Train the model

Train the model by fitting the training pipeline to the training set.

 
ITransformer trainedModel = trainingPipeline.Fit(trainSet);

Use the model

  1. Create a utility method to display predictions.
C#
 
private static void OutputPrediction(ModelOutput prediction)
{
    string imageName = Path.GetFileName(prediction.ImagePath);
    Console.WriteLine($"Image: {imageName} | Actual Value: {prediction.Label} | Predicted Value: {prediction.PredictedLabel}");
}

Classify a single image

  1. Make predictions on the test set using the trained model. Create a utility method called ClassifySingleImage.
C#
 
public static void ClassifySingleImage(MLContext mlContext, IDataView data, ITransformer trainedModel)
{
    PredictionEngine<ModelInput, ModelOutput> predictionEngine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(trainedModel);

    ModelInput image = mlContext.Data.CreateEnumerable<ModelInput>(data,reuseRowObject:true).First();

    ModelOutput prediction = predictionEngine.Predict(image);

    Console.WriteLine("Classifying single image");
    OutputPrediction(prediction);
}
  2. Use ClassifySingleImage inside of your application.
C#
 
ClassifySingleImage(mlContext, testSet, trainedModel);

Classify multiple images

  1. Make predictions on the test set using the trained model. Create a utility method called ClassifyImages.
C#
 
public static void ClassifyImages(MLContext mlContext, IDataView data, ITransformer trainedModel)
{
    IDataView predictionData = trainedModel.Transform(data);

    IEnumerable<ModelOutput> predictions = mlContext.Data.CreateEnumerable<ModelOutput>(predictionData, reuseRowObject: true).Take(10);

    Console.WriteLine("Classifying multiple images");
    foreach (var prediction in predictions)
    {
        OutputPrediction(prediction);
    }
}
  2. Use ClassifyImages inside of your application.
C#
 
ClassifyImages(mlContext, testSet, trainedModel);

Run the application

Run your console app. The output should be similar to that below. You may see warnings or processing messages, but these messages have been removed from the following results for clarity. For brevity, the output has been condensed.

Bottleneck phase

text
 
Phase: Bottleneck Computation, Dataset used:      Train, Image Index: 279
Phase: Bottleneck Computation, Dataset used:      Train, Image Index: 280
Phase: Bottleneck Computation, Dataset used: Validation, Image Index:   1
Phase: Bottleneck Computation, Dataset used: Validation, Image Index:   2

Training phase

text
 
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  21, Accuracy:  0.6797619
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  22, Accuracy:  0.7642857
Phase: Training, Dataset used: Validation, Batch Processed Count:   6, Epoch:  23, Accuracy:  0.7916667

Classification Output

text
 
Classifying single image
Image: 7001-220.jpg | Actual Value: UD | Predicted Value: UD

Classifying multiple images
Image: 7001-220.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-163.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-210.jpg | Actual Value: UD | Predicted Value: UD
Image: 7004-125.jpg | Actual Value: CD | Predicted Value: UD
Image: 7001-170.jpg | Actual Value: UD | Predicted Value: UD
Image: 7001-77.jpg | Actual Value: UD | Predicted Value: UD
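
As a quick check on the condensed output above, the share of correct predictions can be computed directly. The sketch below hard-codes the six actual/predicted pairs listed under "Classifying multiple images"; the `AccuracySketch` class is a hypothetical helper, not part of the sample. Six rows are far too few to judge the model, so treat this as an illustration of the calculation only.

```csharp
using System;
using System.Linq;

// Hypothetical helper: accuracy as the share of matching label pairs.
public static class AccuracySketch
{
    // Fraction of predictions whose label matches the actual label.
    public static double Accuracy((string Actual, string Predicted)[] results) =>
        results.Length == 0
            ? 0.0
            : (double)results.Count(r => r.Actual == r.Predicted) / results.Length;

    public static void Main()
    {
        // The six rows from the "Classifying multiple images" output.
        var results = new[]
        {
            (Actual: "UD", Predicted: "UD"),
            (Actual: "UD", Predicted: "UD"),
            (Actual: "UD", Predicted: "UD"),
            (Actual: "CD", Predicted: "UD"),
            (Actual: "UD", Predicted: "UD"),
            (Actual: "UD", Predicted: "UD"),
        };

        // Five of the six rows match.
        Console.WriteLine(Accuracy(results));
    }
}
```

The only miss in these rows is a cracked deck (CD) predicted as uncracked (UD).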

Improve the model

  • More Data: The more examples a model learns from, the better it performs. Download the full SDNET2018 dataset and use it to train.
  • Augment the data: A common technique to add variety to the data is to augment the data by taking an image and applying different transforms (rotate, flip, shift, crop). This adds more varied examples for the model to learn from.
  • Train for a longer time: The longer you train, the more tuned the model will be. Increasing the number of epochs may improve the performance of your model.
  • Experiment with the hyper-parameters: In addition to the parameters used in this tutorial, other parameters can be tuned to potentially improve performance. Changing the learning rate, which determines the magnitude of updates made to the model after each epoch, may improve performance.
  • Use a different model architecture: Depending on what your data looks like, the model that can best learn its features may differ. If you're not satisfied with the performance of your model, try changing the architecture.
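
As an illustration of the augmentation idea in the list above, the following sketch flips a small grayscale image horizontally. It is a minimal example under assumptions: the sample itself contains no augmentation code, the `AugmentSketch` class is hypothetical, and the image is represented as a plain [row, column] byte array rather than a decoded image file.

```csharp
using System;

// Minimal sketch of one augmentation: a horizontal flip of a grayscale
// image stored as a [row, column] array. Illustrative only; the sample
// itself does not include augmentation code.
public static class AugmentSketch
{
    public static byte[,] FlipHorizontal(byte[,] pixels)
    {
        int rows = pixels.GetLength(0);
        int cols = pixels.GetLength(1);
        var flipped = new byte[rows, cols];

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                // Column c of the output takes the mirrored input column.
                flipped[r, c] = pixels[r, cols - 1 - c];
            }
        }

        return flipped;
    }

    public static void Main()
    {
        var image = new byte[,] { { 1, 2, 3 }, { 4, 5, 6 } };
        var flipped = FlipHorizontal(image);

        // The first row is now reversed.
        Console.WriteLine($"{flipped[0, 0]} {flipped[0, 1]} {flipped[0, 2]}");
    }
}
```

A flipped copy keeps the original label, so each transform adds another labeled example to the training set.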

Ref: https://docs.microsoft.com/en-us/samples/dotnet/machinelearning-samples/mlnet-image-classification-transfer-learning/ 

ML.NET and Model Builder

ML.NET is an open-source, cross-platform machine learning framework for .NET developers. It enables integrating machine learning into your .NET apps without requiring you to leave the .NET ecosystem or even have a background in ML or data science.

We are excited to announce new versions of ML.NET and Model Builder!

In this post, we’ll cover the following items:

  1. Model Builder Preview
  2. ML.NET v1.5.5
  3. Virtual ML.NET Community Conference
  4. Feedback
  5. Get started and resources

Model Builder Preview

This preview brings a lot of big changes to Model Builder, and we’re excited to get your feedback on all the new features which include:

  • Config-based training with generated code-behind files
  • Restructured Advanced Data Options
  • Redesigned Consume step

You can sign up for the Preview at aka.ms/blog-mb-preview.

Config-based training with generated code-behind files

The Model Builder experience has been revamped! Now when you right-click on your project in Solution Explorer and Add > Machine Learning, the Add New Item Dialog opens, and you can add an ML.NET Model.

[Image: New Item Dialog in Visual Studio]

After adding your model, the Model Builder UI opens, and a new item (an *.mbconfig file) shows up in the Solution Explorer.

[Image: Model Builder UI in Visual Studio]

[Image: Close up of Solution Explorer in Visual Studio]

At any point when using Model Builder, if you close out of the UI, you can double click on the *.mbconfig in Solution Explorer, and it will open the UI again to your last saved state.

After training, the following files are generated under the *.mbconfig file:

Solution Explorer expanded with mbconfig in Visual Studio

  • Model.consumption.cs: This file contains the Model Input and Model Output schemas as well as the Predict function generated for consuming the model.
  • Model.training.cs: This file contains the training pipeline (data transforms, algorithm, algorithm hyperparameters) chosen by Model Builder to train the model. You can use this pipeline for re-training your model.
  • Model.zip: This is a serialized zip file which represents your trained ML.NET model.

Previously, these files were added as two new projects (a class library for model consumption code and a console app for the training pipeline). The new experience is similar to adding a new form in a Windows Forms application, where there are code-behind files behind the form and double clicking the form opens the designer.

If you open the *.mbconfig file, you can see that it is simply a JSON file with state information:

{
  "TrainingConfigurationVersion": 0,
  "TrainingTime": 10,
  "Scenario": {
    "ScenarioType": "Classification"
  },
  "DataSource": {
    "DataSourceType": "TabularFile",
    "FileName": "C:\\Desktop\\Datasets\\yelp_labelled.txt",
    "Delimiter": "\t",
    "DecimalMarker": ".",
    "HasHeader": true,
    "ColumnProperties": [
      {
        "ColumnName": "Comment",
        "ColumnPurpose": "Feature",
        "ColumnDataFormat": "String",
        "IsCategorical": false
      },
      {
        "ColumnName": "Sentiment",
        "ColumnPurpose": "Label",
        "ColumnDataFormat": "String",
        "IsCategorical": true
      }
    ]
  },
  "Environment": {
    "EnvironmentType": "LocalCPU"
  },
  "Artifact": {
    "Type": "LocalArtifact",
    "MLNetModelPath": "C:\\source\\repos\\ConsoleApp8\\ConsoleApp8\\MLModel1.zip"
  },
  "RunHistory": {
    "Trials": [
      {
        "TrainerName": "AveragedPerceptronOva",
        "Score": 0.8059,
        "RuntimeInSeconds": 4.4
      }
    ],
    "Pipeline": "[{\"EstimatorType\":\"MapValueToKey\",\"Name\":null,\"Inputs\":[\"Sentiment\"],\"Outputs\":[\"Sentiment\"]},{\"EstimatorType\":\"FeaturizeText\",\"Name\":null,\"Inputs\":[\"Comment\"],\"Outputs\":[\"Comment_tf\"]},{\"EstimatorType\":\"CopyColumns\",\"Name\":null,\"Inputs\":[\"Comment_tf\"],\"Outputs\":[\"Features\"]},{\"EstimatorType\":\"NormalizeMinMax\",\"Name\":null,\"Inputs\":[\"Features\"],\"Outputs\":[\"Features\"]},{\"LabelColumnName\":\"Sentiment\",\"EstimatorType\":\"AveragedPerceptronOva\",\"Name\":null,\"Inputs\":null,\"Outputs\":null},{\"EstimatorType\":\"MapKeyToValue\",\"Name\":null,\"Inputs\":[\"PredictedLabel\"],\"Outputs\":[\"PredictedLabel\"]}]",
    "MetricName": "MicroAccuracy"
  }
}
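Because the *.mbconfig file is plain JSON, any tool can read it. The sketch below is an illustration in Python, not part of Model Builder: the helper `summarize` is a hypothetical name, the embedded text is a trimmed copy of the configuration above with backslashes escaped so that it parses, and it assumes that a higher score is better (true for MicroAccuracy).

```python
import json

# Trimmed copy of the configuration shown above. A raw string is used so the
# JSON escape sequences (\\ and \t) reach the parser unchanged.
MBCONFIG = r'''
{
  "TrainingTime": 10,
  "Scenario": {"ScenarioType": "Classification"},
  "DataSource": {
    "FileName": "C:\\Desktop\\Datasets\\yelp_labelled.txt",
    "Delimiter": "\t",
    "ColumnProperties": [
      {"ColumnName": "Comment", "ColumnPurpose": "Feature"},
      {"ColumnName": "Sentiment", "ColumnPurpose": "Label"}
    ]
  },
  "RunHistory": {
    "Trials": [
      {"TrainerName": "AveragedPerceptronOva", "Score": 0.8059, "RuntimeInSeconds": 4.4}
    ],
    "MetricName": "MicroAccuracy"
  }
}
'''


def summarize(config_text):
    """Extract the scenario, label column and best trial from an mbconfig document."""
    config = json.loads(config_text)
    columns = config["DataSource"]["ColumnProperties"]
    label = next(c["ColumnName"] for c in columns if c["ColumnPurpose"] == "Label")
    # Assumption: the metric is "higher is better".
    best = max(config["RunHistory"]["Trials"], key=lambda trial: trial["Score"])
    return {
        "scenario": config["Scenario"]["ScenarioType"],
        "label": label,
        "best_trainer": best["TrainerName"],
        "metric": config["RunHistory"]["MetricName"],
        "score": best["Score"],
    }


summary = summarize(MBCONFIG)
print(summary)
```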

This new Model Builder experience brings many benefits. You can:

  • Specify the name of your model and generated code.
  • Have more than one Model Builder-generated model in a solution.
  • Save your state and come back to the last saved state. If you spend an hour training and close out of Model Builder, now you don’t have to start over and can just pick up where you left off.
  • Share the *.mbconfig file and collaborate on the same Model Builder instance via source control.
  • Use the same *.mbconfig file in Model Builder and the ML.NET CLI (coming soon!).

Restructured Advanced Data Options

In the last Model Builder release, we added advanced data options for data loading which gave you more control over column settings and data formatting.

In this release, we added several more options and reorganized the options to make selecting your column settings even easier:

  • Purpose: Choose whether the column is a Feature column, a Label column, or a column to Ignore during training.
  • Data type: Choose whether the data in the column is a String, Single, or Boolean.
  • Categorical: Choose whether the column is categorical or not.

Advanced Data Options in Model Builder

Redesigned Consume Step

We have redesigned the consume step to make a smooth transition from training and evaluating a model to using that model to make predictions in an end-user application.

A code snippet has been provided in the UI which demonstrates how to set up the Model Input as well as how to use the generated Predict function to return the predicted output.

Each Model Input property is filled in with sample data from the first row of your dataset. You can use the copy button in the top right of the box to copy the entire code snippet; then once you paste this code into your end-user application, you can modify the Model Input fields to get real data to feed into your model.

Consume step in Model Builder

Additionally, there is a new Sample project section which generates an application that uses your model and adds the project to your solution. In previous versions of Model Builder, a sample console app was automatically added to your solution; now you can choose whether you want to add a new project to use your model.

Currently, there is only the option to add a console app, but in the future, we plan to add support for Web APIs, Azure Functions, and more.

ML.NET v1.5.5

This release of ML.NET brings numerous bug fixes and enhancements as well as the following new features:

  • New API that accepts double type for the confidence level which helps when you need to have higher precision than an int will allow for. Thank you @esso23 for your contributions!
  • Support for exporting the ValueMapping estimator to ONNX.
  • New API to specify if the output from TensorFlow is batched or not (previously ML.NET always assumed it was a batch amount which caused errors when that wasn’t true).

Check out the release notes for more details.

Virtual ML.NET Community Conference

On May 7th, the 2nd annual Virtual ML.NET Community Conference will kick off with 2 days of sessions on all things ML.NET, and we’re looking for speakers to talk about:

  • MLOps
  • Case studies and real-life use cases
  • Interactive computing with Jupyter
  • ML.NET interop (ONNX)
  • ML.NET and IoT devices
  • ML.NET in F#
  • Big Data and ML.NET
  • A journey from experimentation to production
  • Anything else ML.NET related you can think of!

This is a 100% free event, by the community, for the community.

Published By:

Bri Achtman - Program Manager, .NET

March 15th, 2021

The Blockchain Explained to Web Developers, Part 3: The Truth

Chains

After exploring the blockchain theory and using it for real, we now have a better understanding of its strengths and weaknesses. Surprisingly, most of our conclusions are very different from what you will read in the blogosphere. Maybe it’s because we don’t blindly relay the fascination caused by the huge valuations of BitCoin and others. Maybe it’s because the hard truth about the blockchain is that it’s not ready yet. Read on to understand our take on the blockchain, based on strong evidence.

The Technology Is Not Mature Enough

As explained in detail in the previous post in this series, developing Decentralized Apps over a blockchain is a pain. The development community is small, available code snippets don’t work, public tutorials are outdated, the libraries are riddled with bugs, developer tooling is lacking, bugs are silent, etc.

It’s not that the Ethereum developers and community are bad; they’re amazing, and they’re pouring a lot of time and expertise into their tools. But building a blockchain framework is a huge amount of work, and they’re only halfway through. Ethereum hasn’t reached the point of usability yet. I’m confident that this will change in the future, but I don’t know if it’s a matter of months or years.

Tip: We haven’t developed DApps for Bitcoin, but I’ve heard it’s worse. Instead of using a JavaScript-like language (Solidity) for smart contracts, you must use an assembly-like language, which isn’t even Turing-complete. Yikes.

The consequence is that developers don’t want to work on blockchain projects - they find it very frustrating. If you force them to work with a technology they hate, they will leave. Since it’s extremely hard to find skilled developers these days, you should think twice before taking a chance on the blockchain.

The second consequence is that it’s impossible to estimate the time it will take to build a project on the blockchain. If you can’t estimate your costs, good luck building a Business Model on the blockchain.

Smart Contracts Can’t Call APIs

In our blockchain experimentation, everything a bit “smart” in the contract had to be moved to a plain old web service running outside of the blockchain, in a trusted environment. For instance, a smart contract can’t figure out if the person asking for an ad placement is the author of a pull request, because a smart contract can’t call the GitHub API. As a consequence, our smart contract keeps only a very minimal amount of logic, becoming, in fact, a dumb contract. It’s not because we wanted to, it’s because we couldn’t do otherwise.

By design, a blockchain is deterministic. That means that if you take the entire history of blocks, and replay it locally, you should end up with the same state as every other node. This forbids the call to external APIs, where responses may change over time, or according to who calls them.
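To make the determinism argument concrete, here is a minimal sketch in Python. It is a toy ledger, not how Ethereum stores state: the names `apply_block`, `replay`, `replay_with_api` and the sample history are all hypothetical. Two nodes replaying the same history reach the same state, but as soon as the replay depends on an external answer that differs between nodes, the states diverge.

```python
import hashlib
import json


def apply_block(state, block):
    """Return the new balances after applying every transaction in a block."""
    new_state = dict(state)
    for tx in block:
        new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
        new_state[tx["from"]] = new_state.get(tx["from"], 0) - tx["amount"]
    return new_state


def replay(blocks):
    """Rebuild the state from scratch by replaying the whole history."""
    state = {}
    for block in blocks:
        state = apply_block(state, block)
    return state


def replay_with_api(blocks, fetch_rate):
    """Same replay, but every amount is scaled by an externally fetched value."""
    state = {}
    for block in blocks:
        scaled = [dict(tx, amount=tx["amount"] * fetch_rate()) for tx in block]
        state = apply_block(state, scaled)
    return state


def state_hash(state):
    """Fingerprint of a state, so that nodes can compare results cheaply."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


HISTORY = [
    [{"from": "alice", "to": "bob", "amount": 5}],
    [{"from": "bob", "to": "carol", "amount": 2}],
]

# Deterministic replay: both nodes agree.
node_a = replay(HISTORY)
node_b = replay(HISTORY)
print(state_hash(node_a) == state_hash(node_b))

# Replay that depends on an outside answer: the stand-in "API" returned a
# different value to each node, so they no longer agree.
node_c = replay_with_api(HISTORY, lambda: 2)
node_d = replay_with_api(HISTORY, lambda: 3)
print(state_hash(node_c) == state_hash(node_d))
```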

Blockchains are walled gardens. You can execute a contract from the outside world, but a contract itself can’t require data from a source outside of the blockchain. If a smart contract needs external data, someone must push the data to the blockchain first. There is an effort to ease this process through a concept called Oracles. But Oracles need a reputation system and governance. So much for fully-automated contracts and disintermediation.

In the real world, very few applications work in isolation. All the applications we’ve built for the past 3 years relied on external APIs - for identity management, payment, live data source, image processing, storage, etc. The limited capabilities of smart contracts make them useless in real world situations.

You Need A PhD to Understand It

If you read through the first blog post of this series, you probably think that you have a good basic understanding of the blockchain. Now, go and read this article. I’m an average engineer with only 20 years of experience in Web Development, and I couldn’t understand anything after the Jurassic Park reference. Terms like “two-way-pegged blockchains”, “pre-determined Host Oracle Contract”, and sentences like “The M-S result, combined with our inability to feed (non-BB) a revelation mechanism, means that Oracles are out” make me feel like a first grader.

The blockchain concept is complex. Existing implementations rely on rare design patterns that you don’t learn in college. The blockchain vocabulary is kabbalistic.

Developing decentralized apps on top of blockchains requires understanding too many complicated concepts to fit in an average developer’s brain. My opinion is that there are not enough highly skilled programmers to support the revolution promised by the blockchain. And there will never be, as long as it’s so hard to understand.

As a consequence, most Decentralized apps are very buggy. A recent article stated that smart contracts contain 1 bug every 10 lines of code, making Ethereum “candy for hackers”. It wouldn’t be such a big deal if fixing bugs was easy. Unfortunately, as we explained in the previous post, you can’t update a smart contract. You have to create a new contract, transfer all the data and pointers from the old contract to the new one, and wait for the blockchain to propagate the change. The old buggy contracts and transactions remain in the blockchain forever.

Developer Power

The blockchain authors suggest using “code as law”. This also means “bugs as law”, as all software contains bugs. These bugs can be used by smart developers (criminals, the NSA, etc.) to avoid playing by the rules. Bugs are very common, even in popular open-source projects. Bitcoin, for instance, suffered several critical bugs leading to “cybertheft”. So leaving the keys to developers also means giving extraordinary power to the mean developers.

I don’t want to go all FUD (Fear, Uncertainty and Doubt) on you, but the possible scenarios of a society governed by machines don’t all finish with a happy ending in my mind.

When machines control the world

And even if we don’t consider mean developers, giving the power to good developers is dangerous, too. The problem is that developers are irresponsible (no harm intended - I’m a developer myself). It’s not that they’re childish, it’s that nobody ever taught them to write the law.

Also, developers are not elected by the people. If you don’t agree with the direction that Bitcoin takes (favoring speculation rather than practical applications), too bad for you - there is nothing you can do to change that. This is happening right now: the Bitcoin network is going through a severe crisis because of disagreements between a few core developers.

The decisions of half a dozen developers may cause the collapse of a billion dollar market capitalization. But nobody will hold them accountable in case of failure.

Waste of Resources

A blockchain is not cost-effective at all. In fact, it’s a huge waste of resources.

Take data replication for instance. The blockchain replicates all transactions across all nodes. Engineers long ago invented replication strategies with better space efficiency. Compare the Blockchain with RAID6 disk clustering for instance:

In a Blockchain network, 10 nodes of 1GB each allow for a total replicated data volume of 1GB. You can lose up to 9 nodes in the network, and still be able to recover all the data.

In a RAID6 pool, 10 hard disks of 1GB each allow for a total replicated data volume of 8GB. You can lose up to 2 HDDs in the pool, and still be able to recover all the data.
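The comparison above boils down to simple arithmetic, sketched here in Python. The function names are hypothetical, and the RAID 6 model is the usual simplification: the equivalent of two disks is spent on parity.

```python
def full_replication(nodes, size_gb):
    """Every node stores a complete copy of the data (the blockchain approach)."""
    return {
        "raw_gb": nodes * size_gb,
        "usable_gb": size_gb,
        "tolerated_failures": nodes - 1,
    }


def raid6(disks, size_gb):
    """RAID 6 spends the equivalent of two disks on parity."""
    return {
        "raw_gb": disks * size_gb,
        "usable_gb": (disks - 2) * size_gb,
        "tolerated_failures": 2,
    }


def efficiency(layout):
    """Share of the raw capacity that holds distinct data."""
    return layout["usable_gb"] / layout["raw_gb"]


blockchain = full_replication(nodes=10, size_gb=1)
pool = raid6(disks=10, size_gb=1)

print("blockchain:", blockchain, f"efficiency {efficiency(blockchain):.0%}")
print("RAID 6:    ", pool, f"efficiency {efficiency(pool):.0%}")
```

The blockchain layout buys a much higher failure tolerance, but at 10% space efficiency against 80% for the RAID 6 pool.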

Mining nodes require very expensive hardware, with high end GPU cards and a huge amount of memory.

And it’s not just about buying expensive hardware. 99.99% of the computing is just wasted. All miners compete to mine a block by running expensive Math challenges. In Bitcoin, only one node every 10 minutes wins, and is actually useful to the chain by creating a block. The computation done by all the other nodes is thrown away.
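Here is a toy illustration of such a challenge, in Python. It is a simplification and not Bitcoin’s actual algorithm: the rule used here (find a nonce so that the SHA-256 digest starts with a few zeros) and the names `mine` and `difficulty` are assumptions for the sake of the example. Every hash computed before the winning one is work that produces nothing.

```python
import hashlib


def mine(block_data, difficulty=3):
    """Try nonces one by one until the digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("block 42")
attempts = nonce + 1  # nonces start at 0
print(f"winning nonce {nonce}, digest {digest[:16]}...")
print(f"{attempts - 1} of {attempts} hashes were thrown away")
```

In a real network, this loss is multiplied by the number of miners racing for the same block, since only one of them wins.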

The Ethereum blockchain is trying to fix that: they plan to switch from a proof-of-work consensus algorithm to a proof-of-stake, which is much less resource intensive. But proof-of-stake also has drawbacks, such as giving more power to people or companies owning high amounts of cryptocurrency. Besides, it’s far from ready yet (expect it in at least a year from now).

This waste of storage, CPU and memory translates into a huge waste of energy. According to a bitcoin mining-farm operator, energy consumption totaled 240kWh per bitcoin in 2014 (the equivalent of 16 gallons of gasoline). Mining farms are a distributed engine turning electricity into heat. A blockchain is, in short, an expensive radiator. Energy efficiency is a big deal in a globally warming planet.

Very Expensive

Who pays for all the wasted energy? The companies that publish and use smart contracts. Yes, that’s you, if you intend to run a business on the blockchain. When you pay for a transaction on the blockchain, you also pay for 99.99% of the network running at full speed for nothing. That makes blockchain transactions expensive.

A million dollars in bank notes

An average Bitcoin transaction requires a fee of BTC 0.0002 ($0.11). This price is rising. It’s not really cheaper than a bank transaction fee (unless you consider a transfer across two countries with different currencies, of course).

For ZeroDollarHomepage, executing a 10-line method on Ethereum costs about one cent ($0.01). That’s insanely expensive. AWS Lambda, for instance, costs $0.0000002 per request (after the first million requests each month).
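To put these two prices side by side, here is the arithmetic as a short Python sketch. The two unit prices come from the paragraph above; the volume of one million executions is a hypothetical figure, and the Lambda free tier is ignored for simplicity.

```python
from decimal import Decimal
import math

ETHEREUM_COST = Decimal("0.01")      # USD per execution, from the text
LAMBDA_COST = Decimal("0.0000002")   # USD per request, from the text
EXECUTIONS = 1_000_000               # hypothetical volume

ratio = ETHEREUM_COST / LAMBDA_COST
ethereum_total = ETHEREUM_COST * EXECUTIONS
lambda_total = LAMBDA_COST * EXECUTIONS

print(f"Ethereum is {int(ratio)} times more expensive per call")
print(f"that is about {math.log10(ratio):.1f} orders of magnitude")
print(f"{EXECUTIONS} calls: ${ethereum_total:.2f} on Ethereum, ${lambda_total:.2f} on Lambda")
```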

It’s normal to pay for hosting costs when you use a platform, but the Blockchain costs are orders of magnitude higher than the most expensive PaaS.

Volatility and Speculation

You could say that the blockchain cost isn’t such a big deal, as long as people are willing to use the network and pay for transactions. It’s a question of supply and demand, and the demand for blockchain and cryptocurrencies is currently high enough to make it profitable. But this high demand leads to speculation, and therefore the price of computing and storage in a blockchain (any blockchain) is highly volatile.

BTC to USD

Some analysts compare Bitcoin to a Ponzi Scheme, and predict that the market value will collapse once general interest disappears.

If we build a business based on the Ethereum blockchain, most of our expenses will be in Ether. If we don’t mine it ourselves, we’ll have to pay for that Ether in real money. But since the USD value of Ether may vary tenfold within a year, our business can move from profitable to worthless in the same timeframe. And we can’t do anything about it. On the other hand, if we mine ourselves, what is currently affordable (running a small server to cover expenses in Ether) might become very expensive once very large mining farms move from Bitcoin to Ethereum.

The high volatility of cryptocurrencies forbids any long-term profitable business built on the blockchain - except speculation.

Slow As Hell

Compared to many other innovations based on computers and networks, the blockchain is very slow. Experts say that you should wait 6 blocks to make sure that a transaction is legit. This means more than 1 minute in Ethereum, or more than 1 hour in Bitcoin.
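The waiting time is simply the block interval multiplied by the number of confirmations, as this Python sketch shows. The 10-minute Bitcoin interval matches the figure given earlier in this post; the 15-second Ethereum interval is an assumption, not a number from this post, and real intervals fluctuate.

```python
CONFIRMATIONS = 6

# Average seconds between two blocks.
BLOCK_INTERVAL_SECONDS = {
    "Bitcoin": 10 * 60,  # one block every 10 minutes
    "Ethereum": 15,      # assumed average, for illustration only
}


def confirmation_delay(block_interval_seconds, confirmations=CONFIRMATIONS):
    """Average time to wait before a transaction is buried under N blocks."""
    return block_interval_seconds * confirmations


delays = {
    chain: confirmation_delay(interval)
    for chain, interval in BLOCK_INTERVAL_SECONDS.items()
}

for chain, seconds in delays.items():
    print(f"{chain}: {seconds} seconds ({seconds / 60:.1f} minutes)")
```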

In a traditional ad server, scheduling an ad takes about 100ms. If you’ve used our ZeroDollarHomepage Ad Server, you probably had a very different experience: Scheduling an ad takes about a minute. The network transport and replication accounts for a small share of that duration; most of the time is spent waiting for the network to mine the transaction, and add a few more blocks after that. But all in all, the Ethereum blockchain is several orders of magnitude slower than traditional computing.

For end users, every second counts. The Web Performance Optimization trend focuses on improving revenue by shaving one or two seconds off download time. Betting on a technology that requires a transaction to be acknowledged by the entire world isn’t the best way to do business.

Free Market and Anarchy

One of the promises of the blockchain is to liberate markets that still require an intermediary. No more lawyers, bankers, or bookmakers. A great opportunity for new businesses?

Except these intermediaries currently report criminal activities to the authorities (governments and law enforcement agencies). If you remove the intermediaries, you also remove the police, and you let criminals proliferate. The first bitcoin application at scale was called The Silk Road. It was an online marketplace for everything illegal: drugs, weapons, child pornography, etc. Not to mention the ability to use bitcoins for tax evasion.

Even the proponents of free market economy recognize that a certain level of regulation is necessary to avoid total chaos. Running a business in a land full of criminals with no police isn’t profitable - unless you’re a criminal, too. For instance, the Mt. Gox bankruptcy in 2014 cost Bitcoin users about $450 million.

Just like it took a long time for governments to control the Internet (which was, and remains, a haven for criminals), it will take a long time for our lawmakers to control the anarchy unleashed by blockchains. The blockchain may carry the promise of a better future in the long term, but for the near future, you’d better be armed.

Do You Really Need A Blockchain?

A large share of the hype around the blockchain comes from people who don’t really understand its shortcomings. They would probably use another solution if they were better informed. Here are a few bad reasons to choose the blockchain technology, and why you should probably think again.

You can use a private blockchain: Nearly 80% of the blockchain projects I hear about, especially in finance, are based on private blockchains. This completely defeats the main purpose, which is to get an agreement between non-trusted parties. If a project runs on a private blockchain, then only trusted parties can join it, and you don’t have a trust problem. In a trusted network, there are many, many other tools to share a ledger of facts - all much better optimized than the blockchain (for instance: a web service).

It offers a way to reach distributed consensus: It does, but only if this consensus can be written as code. For instance, a company working with music rights distribution recently contacted us to build an international platform for artist remuneration on the blockchain. Except that when two countries disagree on how to pay rights holders, they both have valid contracts. Only a court can decide which contract wins. No smart contract can replace that. You must have clear governance rules that already work before trying to automate them in a blockchain.

It’s secure: Asymmetric cryptography is one of the blockchain’s strengths. However, the blockchain technology, just like any other, is safe only until someone finds a vulnerability. It has already happened in the past. The computer science behind the blockchain is so complex that very few developers can contribute or review the code. Consider smart contracts and blockchains as relatively less secure than, say, TLS on the web (through HTTPS). Oh, and even if the software works perfectly, it doesn’t prevent fraud. Remember the double spend problem from our first post? It turns out people regularly try that in blockchains (see the latest 200 double spends in the Bitcoin blockchain).

It’s transparent: Granted, all transactions are public, and expose location and IP address. But no personal information ever transits - only anonymous hashes. Even the creator of Bitcoin is a mystery. So blockchain transparency doesn’t prevent crime or fraud. Also, transparency is usually an inconvenience for businesses. Are you willing to bet your business on a technology that lets everyone track all your transactions, and exposes your code to hackers?

Data is replicated and safe: Sure, but with the least cost-effective replication strategy. Amazon S3 replicates every bit of data at least 3 times with 100% uptime, for a fraction of the price. And if you actually need full transaction history, use an event store.

It connects anonymous peers: But if it’s only for shared storage (i.e. if you don’t need fact ordering), then regular peer-to-peer network protocols like BitTorrent are enough.

It’s hip: I can’t argue with that: yelling the word “blockchain” out loud is currently a great way to grab an innovation budget. However, many of the shining products that pretend to run on the blockchain are merely PowerPoint presentations. Besides, you’ll get better results with many other technologies. Not to mention that the word blockchain also evokes money laundering, tax fraud, and pornography.

If you want to build your business on the blockchain, be certain that you need it, and that it will be really useful for your use case.

Conclusion

Blockchains are a very smart idea, with huge possible implications. But are the current implementations ready to power the disruptive applications of the next decade?

On the technical side, some elementary features are simply not feasible. Blockchains are not efficient enough, not developer-friendly enough, and they give too much power to a small league of extraordinary developers without enough political and economic background.

On the business side, the blockchain is moving too fast, it’s expensive, and often overkill. Costs may vary tenfold for no reason. Building a business on such an unstable platform is incredibly risky.

My take is that we have to wait. The blockchain isn’t ready yet. It needs more maturity, a killer app other than a speculation engine, a larger developer community, and more ecological and economic responsibility. How long will it take? Maybe a year or two? Nobody can tell.

To be honest, this conclusion surprised me. Most of the publications about the blockchain suggest the opposite. They say “it’s time”, “don’t miss the train”, or “the giant businesses of the next decade are being built on the blockchain right now”. Maybe they are wrong, or maybe we are wrong. We’ve tried to back this analysis with strong evidence. If you have a different opinion, please leave a comment below.

We’ll be following the developments in the different blockchain projects closely. Make sure you follow this blog for related news!

The Blockchain Explained to Web Developers, Part 2: In Practice

Published on 20 May 2016 by Gildas Garcia and Kevin Maschtaler

Edmonton's first digital billboard?

Is the blockchain a revolution? The technology that powers Bitcoin sure has the potential to disrupt the entire Internet, as we explained in a previous post. But how can you, a developer, use the blockchain to build applications? Are the tools easy to use, despite the complexity of the underlying concepts? How good is the developer experience?

We wanted to find out, and there is no better tutorial than developing an app from scratch. So we’ve made a simple decentralized ad server called Zero Dollar Homepage, powered by blockchain. This is the story of our experience. Read on to learn how hard the blockchain is for developers today.

Application Concept

The blockchain shines when it replaces intermediaries. We chose to focus on Ad Platforms, which are intermediaries between advertisers (who buy visibility) and content providers (who sell screen real estate). Our project was to build a decentralized ad platform running on the blockchain.

Since the famous Million Dollar Homepage experiment, innovating in the field of paid ads can’t make you rich anymore.

Instead, we chose to build a tool that displays ads for free - a Zero Dollar Homepage. For free, but not for nothing: advertisers exchange ad visibility for open-source contributions. So we’ve built a decentralized app to manage how ads display on a particular page. Advertisers need to take up a coding challenge to be able to put their ads on this page.

User Workflow

In concrete terms, whenever we merge a Pull Request (PR) on one of marmelab’s open-source repositories, a GitHub bot comments on the PR, and invites the PR author to publish their ad on the ad platform admin panel.

Following the link contained in the comment, the PR author is asked to sign in with their GitHub credentials. Then, they can upload an ad - in fact, a simple image. This image is added to the list of images uploaded by other PR authors, in chronological order.

Each day at midnight, an automated script takes the next image in line (using a FIFO ordering), and displays it on http://marmelab.com/ZeroDollarHomepage/ for the next 24 hours.

Note: The entire process requires no intermediary, but in order to avoid the display of adult imagery on our site, we validate the uploaded images through the Google Vision API before putting them online.

Architecture

Here is how we separated responsibilities in each of the 4 use cases of an ad platform:

  1. Open-source contributor notification: Whenever an open-source PR gets merged on one of our repositories, GitHub notifies the admin app with the PR details. The app publishes a comment on the PR to notify the contributor. The comment contains a link back to the admin app, with the PR details.
  2. Claim and image upload: Following the comment link, the contributor goes to the admin app. They must sign in with their GitHub credentials to be authenticated. The admin app then calls GitHub to grab the PR details, and to check that the contributor is actually the PR author. If it’s OK, the admin app displays an image upload form. When the contributor uploads an image, the admin app pushes the PR id to the blockchain, and uploads the image to a CDN (named after the PR id). The admin app displays the approximate date of publication of the image based on the number of valid PRs with an image still waiting in the blockchain.
  3. Ad placement: Every 24 hours, a cron asks the blockchain for the next PR not yet displayed. The blockchain marks this PR as displayed and sends the id. The cron renames the image named after the PR id to “current image”.
  4. Ad display: Each time a visitor opens ZeroDollarHomepage, the page asks the CDN for the current image. It happens to be the latest published ad from the blockchain, which remains displayed for at least 1 day (and until another contributor claims a PR).

This might seem surprising, as the blockchain plays a very small part in the process. But we quickly realized that the entire code of the ad platform couldn’t live in the blockchain. In fact, blockchains have very limited capabilities in terms of connectivity to the Internet, and processing power. So we delegated only the crucial ad placement tasks to the blockchain:

  • Register a pull request by an authenticated contributor
  • Get the next pull request not yet displayed, and mark it as displayed
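These two operations amount to a first-in, first-out queue with a duplicate check. Here is a minimal model of that behavior in Python. It is not the Solidity contract, and the class and method names (`AdQueue`, `register`, `next_to_display`) are hypothetical; the real contract runs on the blockchain and reports errors differently.

```python
from collections import deque


class AdQueue:
    """In-memory model of the two tasks delegated to the blockchain."""

    def __init__(self):
        self._claimed = set()   # every PR id ever registered
        self._queue = deque()   # PR ids waiting to be displayed, oldest first

    def register(self, pull_request_id):
        """Register a PR claimed by an (already authenticated) contributor."""
        if pull_request_id <= 0:
            raise ValueError("invalid pull request id")
        if pull_request_id in self._claimed:
            raise ValueError("pull request already claimed")
        self._claimed.add(pull_request_id)
        self._queue.append(pull_request_id)

    def next_to_display(self):
        """Return the oldest PR not yet displayed and mark it as displayed."""
        if not self._queue:
            return None  # nothing is waiting
        return self._queue.popleft()


ads = AdQueue()
ads.register(12)
ads.register(7)
first = ads.next_to_display()
second = ads.next_to_display()
third = ads.next_to_display()
print(first, second, third)
```

Ads come out in the order they were claimed, and a PR can never be queued twice, which is all the logic the blockchain has to guarantee in this design.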

Other tasks ended up in the admin app, outside of the blockchain, for various reasons:

  • Register a pull request from a webhook: Registering a pull request before it’s been claimed is useless, since the contributor may never claim it. Besides, storing data in the blockchain isn’t free, so we store only what we have to store. The downside is that any PR on our public repositories, including those created before this experiment, is eligible for the next step.
  • Notify the user by posting a comment to GitHub: A smart contract can’t call an external API, so it’s just not possible. Instead, we delegated this task to the admin app.
  • Verify a claimed PR’s author: Again, a smart contract can’t call the GitHub APIs. Instead, we moved this logic to the admin app, and made it a prerequisite before calling the blockchain.
  • Store the image: In theory, you can store pretty much anything in the blockchain, including images. In practice, images cost a lot to store, and we didn’t manage to store more than one “table” (array of data) in our smart contract.
  • Update the displayed ad to the next in line: A blockchain has no equivalent of the setTimeout function, or cron jobs. You can, however, execute some code every x blocks, but that is not tied to wall-clock time. Instead, we used a cron-like library on our API.

Research, documentation and first attempts

As we explained in a previous post, there aren’t many good choices when choosing a blockchain network. So we chose Ethereum.

We quickly hit our first wall. Until a few weeks ago, you couldn’t play with the Ethereum blockchain without buying Ether first, even for simple tests. Also, Ethereum didn’t really allow private blockchains in its former version (named Frontier), which made development very complicated. Anyone accessing the Ethereum network might call your test contracts. More importantly, the documentation is a volunteer initiative, and was not in sync with the development state.

Note: Ethereum bumped their version since we developed the application, switching from Frontier to Homestead. The Ethereum community improved the documentation quality for Homestead.

Despite these shortcomings, we managed to register three nodes on the Ethereum network across Nancy, Paris and Dijon, and to share a ping between those nodes.

In the course of our documentation search, we eventually found the Eris documentation. Eris did an excellent job at explaining blockchains and contracts. Moreover, they built a layer on top of Ethereum, and open-sourced a bunch of tools to ease the process of developing smart contracts.

eris is a command-line tool you can use to initialize as many local blockchains as you need.

Smart Contract Implementation

A smart contract is very similar to an API. It has a few public functions which might be called by anyone registered on the blockchain network. Unlike an API, a smart contract cannot call external web APIs (a blockchain is a closed ecosystem). A smart contract may however call other smart contracts, provided it knows their address.

As with an API, the public functions are only the tip of the iceberg. A contract might be in fact composed of many private functions, variables, etc.

Smart contracts are hosted in the blockchain in an Ethereum-specific binary format, executable by the Ethereum Virtual Machine. Several languages and compilers are available to write contracts:

At marmelab, we code a lot in JavaScript, so we chose to use Solidity. Solidity contracts are stored in .sol files.

The Zero Dollar Homepage Contract

The Zero Dollar Homepage contract stores the claimed pull-requests, and a queue of requests to display. The first version of the Solidity contract looked like this:

// in src/ethereum/ZeroDollarHomePage.sol
contract ZeroDollarHomePage {
    uint constant ShaLength = 40;

    enum ResponseCodes {
        Ok,
        InvalidPullRequestId,
        InvalidAuthorName,
        InvalidImageUrl,
        RequestNotFound,
        EmptyQueue,
        PullRequestAlreadyClaimed
    }

    struct Request {
        uint id;
        string authorName;
        string imageUrl;
        uint createdAt;
        uint displayedAt;
    }

    // what the contract stores
    mapping (uint => Request) _requests; // key is the pull request id
    uint public numberOfRequests;
    uint[] _queue;
    uint public queueLength;
    uint _current;
    address owner;

    // constructor
    function ZeroDollarHomePage() {
        owner = msg.sender;
        numberOfRequests = 0;
        queueLength = 0;
        _current = 0;
    }

    // a contract must give a way to destroy itself once uploaded to the blockchain
    function remove() {
        if (msg.sender == owner){
            suicide(owner);
        }
    }

    // the following three methods are public contracts entry points

    function newRequest(uint pullRequestId, string authorName, string imageUrl) returns (uint8 code, uint displayDate) {
        if (pullRequestId <= 0) {
            // Solidity is a strongly typed language. You get compilation errors when types mismatch
            code = uint8(ResponseCodes.InvalidPullRequestId);
            return;
        }

        if (_requests[pullRequestId].id == pullRequestId) {
            code = uint8(ResponseCodes.PullRequestAlreadyClaimed);
            return;
        }

        if (bytes(authorName).length <= 0) {
            code = uint8(ResponseCodes.InvalidAuthorName);
            return;
        }

        if (bytes(imageUrl).length <= 0) {
            code = uint8(ResponseCodes.InvalidImageUrl);
            return;
        }

        // store new pull request details
        numberOfRequests += 1;
        _requests[pullRequestId].id = pullRequestId;
        _requests[pullRequestId].authorName = authorName;
        _requests[pullRequestId].imageUrl = imageUrl;
        _requests[pullRequestId].createdAt = now;

        _queue.push(pullRequestId);
        queueLength += 1;

        code = uint8(ResponseCodes.Ok);
        displayDate = now + (queueLength * 1 days);
        // no need to explicitly return code and displayDate as they are in the method signature
    }

    function closeRequest() returns (uint8) {
        if (queueLength == 0) {
            return uint8(ResponseCodes.EmptyQueue);
        }

        _requests[_queue[_current]].displayedAt = now;
        delete _queue[0];
        queueLength -= 1;
        _current = _current + 1;
        return uint8(ResponseCodes.Ok);
    }

    function getLastNonPublished() returns (uint8 code, uint id, string authorName, string imageUrl, uint createdAt) {
        if (queueLength == 0) {
            code = uint8(ResponseCodes.EmptyQueue);
            return;
        }

        var request = _requests[_queue[_current]];
        id = request.id;
        authorName = request.authorName;
        imageUrl = request.imageUrl;
        createdAt = request.createdAt;
        code = uint8(ResponseCodes.Ok);
    }
}

For this first attempt, we used the Eris JS libraries to communicate with our blockchain. Instantiating a contract from a Node.js file turned out to be as simple as:

import eris from 'eris';

function getContract(url, account) {
    const address = // Read address file stored on disk by the eris CLI;
    const abi = // Read abi file stored on disk by the eris CLI;
    const manager = eris.newContractManagerDev(url, account);
    return manager.newContractFactory(abi).at(address);
}

And calling it wasn’t difficult either:

function* newRequest(pullrequestId, authorName, imageUrl) {
    const contract = getContract(url, account);
    // First gotcha: when a function returns several named variables, they are returned as an array
    // Second gotcha: numbers are returned as instances of BigNumber, do not forget to convert them when standard numbers are expected
    const [codeAsBigNumber, displayDateAsBigNumber] = yield contract.newRequest(pullrequestId, authorName, imageUrl);
    const code = codeAsBigNumber.toNumber();

    if (code !== 0) {
        throw new Error(getErrorMessageFromCode(code));
    }

    // Return the displayDate for UI confirmation screen
    return displayDateAsBigNumber.toNumber();
}

For more information about the Eris JS binding libraries, please refer to Eris documentation.

Unit Testing Contracts

We love Test Driven Development, and one of the first questions we had was: how can we test a Solidity smart contract?

The Eris guys made a tool for that, too: sol-unit. It runs a new local blockchain network for each test, in a docker container (which ensures each test runs in a clean environment), and executes the test. Tests are written as a contract, too. Neat!

Well, not so fast. sol-unit is an npm package, and to use the testing functions (assertions, etc.), we had to import the contract supplied by this package in our testing contracts. For that, there is a simple Solidity syntax:

import "../node_modules/sol-unit/.../Asserter.sol";

So far so good… or not. We hit a strange case when compiling our contracts. Apparently, you can’t import contracts with such a path. We ended up adding a command in our test makefile target to copy those sol-unit contracts in the same folder as ours. After that, running sol-unit was simple and we started coding.

copy-sol-unit:
	@cp -f ./node_modules/sol-unit/contracts/src/* ./src/ethereum/

compile-contract:
	solc --bin --abi -o ./src/ethereum ./src/ethereum/ZeroDollarHomePage.sol ./src/ethereum/ZeroDollarHomePageTest.sol

test-ethereum: copy-sol-unit compile-contract
	./node_modules/.bin/solunit --dir ./src/ethereum

Running a Test Blockchain

Running a blockchain and deploying our contract to it was as simple as following the Eris documentation. We resolved the few problems we ran into with a bunch of commands that we integrated in our makefile. The whole process of running a new blockchain with our contract looks like this:

  • Reset any running eris docker containers, and remove some temporary files
  • Start the eris key service
  • Generate our account key, and store its address in a convenient file to be loaded later by the JS API
  • Generate the genesis.json, which is the “block 0” of the blockchain
  • Create and start the new blockchain
  • Upload the contract to the blockchain and save its address in order to call it when we need it

After a few days of work, we were able to run the contracts on a local Eris blockchain.

From Eris to Ethereum

At this point, we wanted to try out our contracts on a local Ethereum blockchain.

To communicate with contracts inside the Ethereum blockchain, we had to use the Web3 libraries. We learned a lot while trying to use them. We realized that eris was hiding a lot of the underlying complexity.

First, our initial assumption that a contract is similar to an API was not correct. We had to distinguish functions that were only reading data from the blockchain, and functions that were writing data to it.

The first kind (read-only functions) would return the resulting data asynchronously, just like an API would do. The second kind (write functions) would only return a transaction hash. The expected side effects of a write function (changes inside the blockchain) wouldn’t be effective until the corresponding blocks were mined, which could take some time (from ten seconds to one minute in the worst case). Moreover, we weren’t able to make those writing functions return values, so we had to change our Solidity code to call a write function first, then call a read function to get the results.
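
To illustrate the "write, wait, then read" pattern, here is a minimal sketch of a helper that polls until a transaction has been mined. It is not tied to any library: getReceipt is an injected function, which is an assumption of this sketch. With web3 it could wrap web3.eth.getTransactionReceipt, which returns null as long as the transaction is pending. The name waitForReceipt and its options are hypothetical:

```javascript
// Polls `getReceipt(txHash)` until it yields a receipt, i.e. until the
// transaction has been included in a mined block.
// `getReceipt` may return a value or a promise; a falsy result means "still pending".
function waitForReceipt(getReceipt, txHash, { intervalMs = 1000, maxAttempts = 60 } = {}) {
    return new Promise((resolve, reject) => {
        let attempts = 0;
        const poll = () => {
            attempts += 1;
            Promise.resolve()
                .then(() => getReceipt(txHash))
                .then(receipt => {
                    if (receipt) {
                        resolve(receipt);
                    } else if (attempts >= maxAttempts) {
                        reject(new Error(`Transaction ${txHash} not mined after ${attempts} attempts`));
                    } else {
                        // not mined yet, try again later
                        setTimeout(poll, intervalMs);
                    }
                })
                .catch(reject);
        };
        poll();
    });
}
```

Once the promise resolves, the side effects of the write function are in the blockchain, and a read function can safely be called to get the result.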

We also discovered events, which can be used to be notified when something happens in a smart contract. The smart contract is responsible for triggering the events. They look like this with solidity:

event PullRequestClaimed(uint pullRequestId, uint estimatedDisplayDate);

And they can be triggered from any of the smart contract functions, like this:

PullRequestClaimed(pullRequestId, estimatedDisplayDate);

Those events are stored permanently in the blockchain. That means we could use the blockchain as an event store. It might be the easiest way to determine if a call to a function has been successfully executed: the smart contract may trigger an event at the end of its process with failure reasons, results of computation, etc… It’s worth noting that some integration packages for Meteor are already available.
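
To show what "using the blockchain as an event store" could mean in practice, here is a minimal sketch that rebuilds the list of claimed pull requests by replaying events. It assumes the events have already been fetched and decoded into plain objects of the form { event, args }, with BigNumber values converted to plain numbers. The function name replayEvents is hypothetical:

```javascript
// Rebuilds the claimed pull requests (id => estimated display date)
// from a list of decoded events, in the order they were emitted.
function replayEvents(logs) {
    const claimed = new Map();
    for (const log of logs) {
        if (log.event === 'PullRequestClaimed') {
            const { pullRequestId, estimatedDisplayDate } = log.args;
            // keep the first claim only, later duplicates are ignored
            if (!claimed.has(pullRequestId)) {
                claimed.set(pullRequestId, estimatedDisplayDate);
            }
        }
    }
    return claimed;
}
```

Because the events are the source of truth, the same replay can be run at any time to recompute the state from scratch.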

Eventually, we refactored our smart contracts to be a lot simpler in order to get almost the same features. We had to get rid of the mappings (which we haven’t been able to use - our transactions weren’t mined by the Ethereum network for some reason).

Although the Solidity language may look close to JavaScript, it is still very young and incomplete. Arrays don’t have the functions we’re used to working with in JavaScript (not even indexOf), and strings don’t have any functions. This might be addressed by some community efforts in the near future.

The Ethereum implementation looks like this:

// in src/ethereum/ZeroDollarHomePage.sol
contract ZeroDollarHomePage {
    event InvalidPullRequest(uint indexed pullRequestId);
    event PullRequestAlreadyClaimed(uint indexed pullRequestId, uint timeBeforeDisplay, bool past);
    event PullRequestClaimed(uint indexed pullRequestId, uint timeBeforeDisplay);
    event QueueIsEmpty();

    bool _handledFirst;
    uint[] _queue;
    uint _current;
    address owner;

    function ZeroDollarHomePage() {
        owner = msg.sender;
        _handledFirst = false;
        _current = 0;
    }

    function remove() {
        if (msg.sender == owner){
            suicide(owner);
        }
    }

    function newRequest(uint pullRequestId) {
        if (pullRequestId <= 0) {
            InvalidPullRequest(pullRequestId);
            return;
        }

        // Check that the pr hasn't already been claimed
        bool found = false;
        uint index = 0;

        while (!found && index < _queue.length) {
            if (_queue[index] == pullRequestId) {
                found = true;
            } else {
                index++;
            }
        }

        if (found) {
            PullRequestAlreadyClaimed(pullRequestId, (index - _current) * 1 days, _current > index);
            return;
        }

        _queue.push(pullRequestId);
        PullRequestClaimed(pullRequestId, (_queue.length - _current) * 1 days);
    }

    function closeRequest() {
        if (_handledFirst && _current < _queue.length - 1) {
            _current += 1;
        }

        _handledFirst = true;
    }

    function getLastNonPublished() constant returns (uint pullRequestId) {
        if (_current >= _queue.length) {
            return 0;
        }

        return _queue[_current];
    }
}

The process for claiming a pull request and returning the estimated display date evolved to become:

// make a [transaction](https://github.com/ethereum/wiki/wiki/JavaScript-API#web3ethsendtransaction) call to our smart-contract write function
contract.newRequest.sendTransaction(pullrequestId, {
    to: client.eth.coinbase,
}, (err, tx) => {
    if (err) {
        throw err;
    }

    // wait for it to be mined using [code](https://github.com/ethereum/web3.js/issues/393) from [@croqaz](https://github.com/croqaz)
    return waitForTransationToBeMined(client, tx)
        .then(txHash => {
            if (!txHash) throw new Error('Transaction failed (no transaction hash)');

            // get its receipt, which might contain information about the events triggered by the contract's code
            // this function might also check whether the transaction was successful by analyzing the receipt for ethereum specific error cases (insufficient funds, etc.)
            return getReceipt(client, txHash);
        })
        .then(receipt => {
            // parse those logs to extract only event data
            return parseReceiptLogs(receipt.logs, contractAbi);
        })
        .then(logs => {
            if (logs.length === 0) {
                throw new Error('Transaction failed (Invalid logs)');
            }

            const log = logs[0];

            if (log.event === 'PullRequestClaimed') {
                // timeBeforeDisplay is a BigNumber instance
                return log.args.timeBeforeDisplay.toNumber();
            }

            if (log.event === 'PullRequestAlreadyClaimed') {
                const number = log.args.timeBeforeDisplay;

                if (log.args.past) {
                    // timeBeforeDisplay is a BigNumber instance
                    return number.negated().toNumber();
                }

                // timeBeforeDisplay is a BigNumber instance
                return number.toNumber();
            }

            if (log.event === 'InvalidPullRequest') {
                throw new Error('Invalid pull request id');
            }
        });
})

And with this code, our decentralized app worked in a local Ethereum network.

Deployment to Production

Running our application in a local environment was a challenge, but deploying it to production, in the real Ethereum network, was a battle.

There are a few gotchas to be aware of. The most important one is that contracts are immutable in code. This means that:

  • A contract that you deploy to the blockchain stays there forever. If you find a bug in your contract, you can’t fix it - you have to deploy a new contract.
  • When you deploy a new version of an existing contract, any data stored in the previous contract isn’t automatically transferred - unless you voluntarily initialize the new contract with the past data. In our case, fixing a bug in the contract actually wipes away recorded PRs (whether already advertised, or waiting for ad display).
  • Every contract version has an id (for instance, the current ZeroDollarHomepage contract is 0xd18e21bb13d154a16793c6f89186a034a8116b74). Since past versions may contain data, keep track of past contract ids if you don’t want to lose the data (this happened to us, too).
  • As you can’t update a contract, you can’t rollback an update either. Make really sure that your contract works before redeploying it.
  • When you deploy a new version of an existing contract, the old (buggy) contract can still be called. Any system outside of the blockchain referencing the contract (like our Node admin app in Zero Dollar Homepage) must be updated to point to the new contract. We forgot to do it a few times, and scratched our heads desperately to understand why our new code didn’t run.
  • Contract authors can kill their contract if they include a suicide call in the code. But all the existing transactions of the contract remain in the blockchain - forever. Also, make sure that the kill switch deals with the remaining ether in the contract if you don’t want it to disappear.

Another gotcha is that every contract deployment and write operation in the blockchain costs a variable amount of ether. We managed to get 5 ETH (more about getting ether below), but we had no idea how much we would need to deploy our contract, or to call a transaction. It’s harder to test when each failed test costs money.

For the Node.js part, we decided to run it on an AWS EC2 instance, like most of our projects. To do so, we had to:

  • Run an Ethereum node on the server
  • Download the entire blockchain to this server
  • Unlock an account with some Ether on the node
  • Deploy our application and link it to the node
  • Register our smart contract into the blockchain through the node

Make sure your blockchain node server has plenty of storage. The current size of the blockchain is about 15GB. The default volume size on an EC2 instance is 8GB (sigh). We ran into a lot of trouble because we hadn’t downloaded the entire chain (but we didn’t realize it immediately). For instance, we had an account with 5 ETH, but for a long time the system responded as if we hadn’t unlocked our account, or as if we had no ether. Downloading the rest of the chain fixed this issue.

Likewise, unlocking our precious account containing 5 ETH was not an easy task. We did not want to hardcode our passphrase in the application, and we wanted to run the node with supervisord to ease the deployment. We finally found a way that allowed us to change the configuration without exposing our passphrase, with the following supervisord configuration:

[program:geth]
command=geth --ipcdisable --rpc --fast --unlock 0 --password /path/to/our/password/in/a/file
autostart=false
autorestart=true
startsecs=10
stopsignal=TERM
user=ubuntu
stdout_logfile=/var/log/ethereum-node.out.log
stderr_logfile=/var/log/ethereum-node.err.log

One final security note: The Remote Procedure Call (RPC) port of the blockchain is 8545. Do not open this port on your EC2 instance! Anyone knowing the instance IP could control your Ethereum node, and steal your ether.

Ether and Gas

As explained in our first post on the blockchain, deploying and calling a contract in the Ethereum blockchain isn’t free. Since a blockchain is expensive to run, any write operation must be paid for. In Ethereum, the price of calling a write contract method depends on the complexity of the method. Ethereum comes with a list of Gas Fees, which tells you how much Ether you should add to a contract call to have it executed.

In practice, that’s a very low amount of Ether, a fraction of a single Ether. The Ethereum blockchain introduced another currency for running contracts: Gas.

1 Gas = 0.00001 ETH
1 ETH = 100,000 Gas

The Gas to Ether conversion rate will vary in the future according to the supply of computing power, and the computation demand.

Charging a fee to process a transaction isn’t compulsory, but recommended. The Ethereum documentation says: “Miners are free to ignore transactions whose gas price is too low”. However, a mined block always gives 5 ETH to the successful miner.

To call our own contracts, the Ethereum blockchain requires between 0.00045 and 0.00098 Ether (the price depends on the gas price and the gas used by the transaction).
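
As a worked example of how the fee of a call is derived, the sketch below multiplies the gas used by the gas price. The only external fact it relies on is that 1 ether is 10^18 wei. The gas price is taken from the conversion rate quoted above, and the gas amounts (45 and 98) are simply what the two bounds imply at that rate, not measured values. The function names are hypothetical:

```javascript
const WEI_PER_ETHER = 10n ** 18n; // 1 ether = 10^18 wei

// Fee of a transaction in wei: gas consumed times the price paid per unit of gas.
function transactionFeeInWei(gasUsed, gasPriceInWei) {
    return BigInt(gasUsed) * BigInt(gasPriceInWei);
}

// Formats a wei amount as a decimal ether string, without floating point rounding.
function weiToEther(wei) {
    const whole = wei / WEI_PER_ETHER;
    const fraction = (wei % WEI_PER_ETHER).toString().padStart(18, '0').replace(/0+$/, '');
    return fraction ? `${whole}.${fraction}` : `${whole}`;
}

// gas price derived from the rate quoted above: 0.00001 ETH per unit of gas
const gasPrice = WEI_PER_ETHER / 100000n;
console.log(weiToEther(transactionFeeInWei(45, gasPrice))); // 0.00045
console.log(weiToEther(transactionFeeInWei(98, gasPrice))); // 0.00098
```

Working in wei with integers avoids the rounding surprises that floating point numbers bring when dealing with such small fractions of an ether.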

How do you get Ether and Gas? You can buy Ether (mostly by exchanging Bitcoins), or you can mine it. In France, where we live, buying Bitcoins or Ether requires almost the same procedure as opening a bank account. It’s slow (a few days), painful, and depends on exchange rates fixed by supply and demand.

Mining Ether

So we decided to mine our Ether. That’s a good way to see if mining is profitable on Ethereum or not. We spawned a heavy Amazon EC2 instance, with strong GPU computing power (a g2.2xlarge instance). The price of this instance is $17 per day. We installed Ethminer, and started our node. We quickly had to beef up the instance even more, because of high memory and storage requirements. The first thing a node does when it joins a blockchain is to download the entire history of past transactions. That takes a huge amount of storage: over 14GB for the blockchain’s history, and about 3GB for the Ethash Proof of Work.

Once our Ethereum node started, we had to mine for 3 days to create a valid block.

As a reminder, the Ethereum blockchain mines one block every 10 seconds. Mining a block brings up 5 Ether, which sell for roughly $55. The running cost for our beefy EC2 instance for these 3 days was about $51. All in all, it was cheaper to mine Ether on AWS than to buy it. But we were very lucky: since we mined our block, the network’s difficulty was multiplied by three.

How long can we run the ZeroDollarHomePage with 5 Ether? Let’s make the computation.

The Zero Dollar Homepage workflow implies one transaction per day, plus one transaction per claimed PR. Supposing contributors claim one PR per day, the yearly price in ether for running the platform would be at most 365 * 2 * 0.00098 = 0.72 ETH. With 5 ETH, we should normally be able to run the platform for almost seven years.
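
The same computation, written out as a small script so the figures can be adjusted. All the inputs come from the paragraph above; the variable names are only illustrative:

```javascript
const transactionsPerDay = 2;   // one daily rotation plus one claimed PR
const maxFeeInEther = 0.00098;  // upper bound of the fee per transaction
const budgetInEther = 5;        // what mining one block brought us

const yearlyCostInEther = 365 * transactionsPerDay * maxFeeInEther;
const yearsOfRuntime = budgetInEther / yearlyCostInEther;

console.log(yearlyCostInEther.toFixed(2)); // 0.72
console.log(yearsOfRuntime.toFixed(1));    // 7.0
```

Doubling the number of claimed PRs per day to two would raise the yearly cost to about 1.07 ETH and shorten the runtime to less than five years, so the estimate is quite sensitive to contributor activity.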

As you see, running a contract in Ethereum isn’t free, but at the current price of Ether, it’s still cheap. However, the Ether value varies a great deal. Since mining Bitcoin is becoming less and less profitable, some large Bitcoin mining farms switch to Ethereum. This makes mining harder, and makes Ether more expensive every day.

Final Surprise

Finally, our smart contract ended up working fine in our real world Ethereum node hosted on EC2.

But by the time we got there, Ethereum released their Homestead version, which brought a lot of new things and broke our code entirely. It took us about a week to understand why, through trial and error, and fix the code that wasn’t compatible anymore for obscure reasons.

Tip: The Homestead release documents a hidden Ethereum feature, private networks, to ease development. The lack of private networks was one of our reasons to use Eris in the first place.

The ZeroDollarHomePage platform is now up and running again. You can use it by opening a pull request on one of marmelab’s open-source repositories on GitHub, see the ads currently displayed on http://marmelab.com/ZeroDollarHomepage/, or browse the code of the application on marmelab/ZeroDollarHomePage. Yes, we’re open-sourcing the entire ad platform, so you can see in detail how it works, and reproduce it locally.

Debugging

The Ethereum developer experience is very bad. Imagine that you have no logs and no debug tools. Imagine that the only way to discover why a program fails is to echo “I’m here” strings every line to locate the problem. Imagine that sometimes (e.g. in Solidity contracts), you can’t even do that. Imagine that a program that works perfectly in the development environment (where you can add debug statements) fails silently in the production environment (where you can’t). That’s the developer experience in Ethereum.

If you store data in your smart contract, there is no built-in way to visualize the current state of this data after a transaction. That means you need to build your own visualisation tool to be able to troubleshoot errors.

The tools available to track Ethereum contracts and transactions are:

For instance, here is how our contract looks in etherscan:

Each transaction (call to a contract method) is logged there, together with a trace of the contract execution… in machine language. Apart from making sure your call actually gets to the contract, you can’t use it for debugging.

Also, these tools can only monitor the public Ethereum network. Unfortunately, you can’t use them to debug a local blockchain.

If you have ever seen Bitcoin transaction auditing sites, don’t expect the same level of sophistication for Ethereum. Besides, the bitcoin network only has one kind of transaction, so it’s easier to monitor than a network designed to run smart contracts.

Documentation

And that’s not all: the Ethereum documentation is not in sync with the code (at least in the Frontier version), so most of the time we had to look at the libraries to try to understand how we’re expected to code. Since the libraries in question use a language that no one uses (Solidity), good luck figuring out how they work. Oh, and don’t expect help from Stack Overflow, either. There are too few people like us who dared to implement something serious to have good community support.

Let’s be clear: we are not criticizing the Ethereum community for their lack of efforts. Once again, there is a tremendous momentum behind Ethereum, and things improve at a rapid pace. Kudos to all the documentation contributors for their work. But by the time we developed our application, the documentation state was clearly not good enough for a new Ethereum developer to start a project.

You can find a few tutorials here and there, but most of the time, copy-pasted code from these tutorials simply doesn’t work.

Here are a few resources worth reading if you want to start developing smart contracts yourself:

Conclusion

After 4 weeks of work by 2 experienced developers, we managed to make our code work in the public Ethereum network with lots of effort. Regressions and compatibility breaks in the Ethereum libraries between Frontier and Homestead versions didn’t help. Check the project source code at marmelab/ZeroDollarHomePage for a detailed understanding of the inner workings. Please forgive the potential bugs in the code, or the inaccuracies in this post - we have limited experience in the matter. Feel free to send us your corrections in GitHub, or in the comments.

We didn’t enjoy the party. Finding our way across bad documentation and young libraries isn’t exactly our cup of tea. Fighting to implement simple features (like string manipulation) with a half-baked language isn’t fun either. Realizing that, despite years of programming experience in many scripting languages, we are not able to write a simple solidity contract is frustrating. Most importantly, the youth of the Ethereum ecosystem makes it completely impossible to forecast the time to implement a simple feature. Since time is money, it’s currently impossible to determine how much it will cost to develop a Decentralized App.

In time and resources, ZeroDollarHomepage represents a development cost of more than €20,000 - even if it’s a very simple system. As compared to the tools we use in other projects (Node.js, Koa, React.js, PostgreSQL, etc.), developing on the blockchain is very expensive. It’s a great disappointment for the dev team, and a strong signal that the ecosystem isn’t ready yet.

Is this bad experience sufficient to make up our mind about the blockchain? How come many startups showcase their blockchain services as successful innovations? What’s the real cost of building a DApp? Read the last post in this series to see what we really think about the blockchain phenomenon.

The Blockchain Explained to Web Developers, Part 1: The Theory

Published on 28 April 2016 by Francois Zaninotto

The blockchain is the new hot technology. If you haven’t heard about it, you probably know Bitcoin. Well, the blockchain is the underlying technology that powers Bitcoin. Experts say the blockchain will cause a revolution similar to what the Internet provoked. But what is it really, and how can it be used to build apps today? This post is the first in a series of three, explaining the blockchain phenomenon to web developers. We’ll discuss the theory, show actual code, and share our learnings, based on a real world project.

To begin, let’s try to understand what blockchains really are.

What Is A Blockchain, Take One

Although the blockchain was created to support Bitcoin, the blockchain concept can be defined regardless of the Bitcoin ecosystem. The literature usually defines a blockchain as follows:

A blockchain is a ledger of facts, replicated across several computers assembled in a peer-to-peer network. Facts can be anything from monetary transactions to content signature. Members of the network are anonymous individuals called nodes. All communication inside the network takes advantage of cryptography to securely identify the sender and the receiver. When a node wants to add a fact to the ledger, a consensus forms in the network to determine where this fact should appear in the ledger; this consensus is called a block.

I don’t know about you, but after reading these definitions, I still had trouble figuring out what this is all about. Let’s get a bit deeper.

Ordering Facts

Decentralized peer-to-peer networks aren’t new. Napster and BitTorrent are P2P networks. Instead of exchanging movies, members of the blockchain network exchange facts. Then what’s the real deal about blockchains?

P2P networks, like other distributed systems, have to solve a very difficult computer science problem: the resolution of conflicts, or reconciliation. Relational databases offer referential integrity, but there is no such thing in distributed systems. If two incompatible facts arrive at the same time, the system must have rules to determine which fact is considered valid.

Take for instance the double spend problem: Alice has $10, and she sends $10 to both Bob and Charlie. Who will have the $10 eventually? To answer this question, the best way is to order the facts. If two incompatible facts arrive in the network, the first one to be recorded wins.
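
Here is a minimal sketch of the "first recorded wins" rule, assuming the facts have already been put in a single agreed order. The function applyInOrder and the shape of the transfers are hypothetical, chosen only to illustrate the idea:

```javascript
// Applies transfers in the order they were recorded; a transfer that the
// sender can no longer afford is rejected.
function applyInOrder(balances, transfers) {
    const state = { ...balances };
    const rejected = [];
    for (const { from, to, amount } of transfers) {
        if ((state[from] || 0) >= amount) {
            state[from] -= amount;
            state[to] = (state[to] || 0) + amount;
        } else {
            rejected.push({ from, to, amount });
        }
    }
    return { state, rejected };
}

const { state, rejected } = applyInOrder({ alice: 10 }, [
    { from: 'alice', to: 'bob', amount: 10 },
    { from: 'alice', to: 'charlie', amount: 10 },
]);
console.log(state);    // { alice: 0, bob: 10 }
console.log(rejected); // [ { from: 'alice', to: 'charlie', amount: 10 } ]
```

The rule itself is trivial once the order is known. The hard part, in a network where nobody trusts anybody, is getting every node to agree on that order.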

In a P2P network, two facts sent roughly at the same time may arrive in different orders in distant nodes. Then how can the entire network agree on the first fact? To guarantee integrity over a P2P network, you need a way to make everyone agree on the ordering of facts. You need a consensus system.

Consensus algorithms for distributed systems are a very active research field. You may have heard of Paxos or Raft algorithms. The blockchain implements another algorithm, the proof-of-work consensus, using blocks.

Blocks

Blocks are a smart trick to order facts in a network of non-trusted peers. The idea is simple: facts are grouped in blocks, and there is only a single chain of blocks, replicated in the entire network. Each block references the previous one. So if fact F is in block 21, and fact E is in block 22, then fact E is considered by the entire network to be posterior to fact F. Before being added to a block, facts are pending, i.e. unconfirmed.
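
The "each block references the previous one" idea can be sketched in a few lines with the Node.js crypto module. This is a simplified model, not the data structure of any real blockchain: the field names (index, previousHash, facts) and the use of a single SHA-256 over a JSON string are assumptions made for illustration:

```javascript
const crypto = require('crypto');

const sha256 = data => crypto.createHash('sha256').update(data).digest('hex');

// Hash of a block, covering its position, its link to the past, and its facts.
const hashBlock = block => sha256(JSON.stringify([block.index, block.previousHash, block.facts]));

// Returns a new chain with one more block holding `facts`;
// the block stores the hash of its predecessor.
function addBlock(chain, facts) {
    const previous = chain[chain.length - 1];
    const block = {
        index: chain.length,
        previousHash: previous ? previous.hash : null,
        facts,
    };
    block.hash = hashBlock(block);
    return [...chain, block];
}

// A chain is consistent when every block hashes correctly
// and points to the block right before it.
function isConsistent(chain) {
    return chain.every((block, i) =>
        block.hash === hashBlock(block) &&
        block.previousHash === (i === 0 ? null : chain[i - 1].hash)
    );
}
```

Because each hash covers the previous hash, changing a fact in an old block invalidates that block and breaks the link from every block after it, which is what makes the ordering tamper-evident.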

Mining

Some nodes in the chain create a new local block with pending facts. They compete to see if their local block is going to become the next block in the chain for the entire network, by rolling dice. If a node makes a double six, then it earns the ability to publish their local block, and all facts in this block become confirmed. This block is sent to all other nodes in the network. All nodes check that the block is correct, add it to their copy of the chain, and try to build a new block with new pending facts.

But nodes don’t just roll a couple dice. Blockchain challenges imply rolling a huge number of dice. Finding the random key to validate a block is very unlikely, by design. This prevents fraud, and makes the network safe (unless a malicious user owns more than half of the nodes in the network). As a consequence, new blocks get published to the chain at a fixed time interval. In Bitcoin, blocks are published every 10 minutes on average.

In Bitcoin, the challenge involves a double SHA-256 hash of a string made of the pending facts, the identifier of the previous block, and a random string. A node wins if their hash contains at least n leading zeroes.

// a losing hash for Bitcoin
787308540121f4afd2ff5179898934291105772495275df35f00cc5e44db42dd
// a winning hash for Bitcoin if n=10
00000000009f766c17c736169f79cb0c65dd6e07244e9468bc60cde9538b551e

Number n is adjusted every once in a while to keep block duration fixed despite variations in the number of nodes. This number is called the difficulty. Other blockchain implementations use special hashing techniques that discourage the usage of GPUs (e.g. by requiring large memory transfers).
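
To get a feel for the challenge, here is a toy version of the mining loop written with the Node.js crypto module. It is a simplified sketch, not Bitcoin's actual algorithm: the way the input string is assembled is an assumption, the "random string" is replaced by an incrementing counter, and the difficulty is counted in leading zero hexadecimal digits as in the sample hashes above:

```javascript
const crypto = require('crypto');

// Double SHA-256, returned as a hex string.
function doubleSha256(data) {
    const first = crypto.createHash('sha256').update(data).digest();
    return crypto.createHash('sha256').update(first).digest('hex');
}

// Tries successive nonces until the hash starts with `difficulty` zero hex digits.
function mine(pendingFacts, previousBlockId, difficulty) {
    const prefix = '0'.repeat(difficulty);
    for (let nonce = 0; ; nonce += 1) {
        const hash = doubleSha256(`${previousBlockId}|${JSON.stringify(pendingFacts)}|${nonce}`);
        if (hash.startsWith(prefix)) {
            return { nonce, hash };
        }
    }
}
```

Each extra zero digit multiplies the expected number of attempts by 16: a difficulty of 3 takes a few thousand tries on average, while a difficulty of 10 would take about a trillion. Checking a proposed block, on the other hand, takes a single hash computation.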

The process of looking for blocks is called mining. This is because, just like gold mining, block mining brings an economic reward - some form of money. That’s the reason why people who run nodes in a blockchain are also called miners.

Note: By default, a node doesn’t mine - it just receives blocks mined by other nodes. It’s a voluntary process to turn a node into a miner node.

Money and Cryptocurrencies

Every second, each miner node in a blockchain tests thousands of random strings to try and form a new block. So running a miner in the blockchain consumes a huge amount of computer resources (storage and CPU). That’s why you must pay to store facts in a blockchain. Reading facts, on the other hand, is free: you just need to run your own node, and you’ll retrieve the entire history of facts issued by all the other nodes. So to summarize:

  • Reading data is free
  • Adding facts costs a small fee
  • Mining a block brings in the money of all the fees of the facts included in the block

We’re not talking about real money here. In fact, each blockchain has its own (crypto-)currency. It’s called Bitcoin (BTC) in the Bitcoin network, Ether (ETH) on the Ethereum network, etc. To make a payment in the Bitcoin network, you must pay a small fee in Bitcoins - just like you would pay a fee to a bank. But then, where do the first coins come from?

Miners receive a reward for keeping the network working and safe. Each time they successfully mine a block, they receive a fixed amount of cryptocurrency. In Bitcoin this reward is 25 BTC per block; in Ethereum it’s 5 ETH per block. That way, the blockchain generates its own money.
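Putting the two sources of income together, what a miner earns for one block is the fixed amount plus the fees of the facts in that block. The sketch below uses the 25 BTC figure quoted above; the fee values and the function name are made up for the example.

```python
from decimal import Decimal

BLOCK_REWARD = Decimal("25")  # BTC per block, the figure quoted in the text

def miner_revenue(fees, block_reward=BLOCK_REWARD):
    """Fixed amount for the block plus the fees of every fact it contains."""
    return block_reward + sum(fees, Decimal("0"))

# Hypothetical fees attached to three pending facts
fees = [Decimal("0.0001"), Decimal("0.0002"), Decimal("0.0005")]
print(miner_revenue(fees))  # 25.0008
```

`Decimal` is used instead of `float` so that small fee amounts add up exactly.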

Lastly, cryptocurrencies rapidly became convertible to real money. Their face value is only determined by supply and demand, so it’s subject to speculation. At the time of writing, mining Bitcoins still costs slightly less in energy and hardware than you can earn by selling the coins you discovered in the process. That’s why people add new miners every day, hoping to turn electricity into money. But fluctuations in the BTC value make it less and less profitable.

BTC to USD

Contracts

So far we’ve mostly mentioned fact storage, but a blockchain can also execute programs. Some blockchains allow each fact to contain a mini program. Such programs are replicated together with the facts, and every node executes them when receiving the facts. In Bitcoin, this can be used to make a transaction conditional: Bob will receive 100 BTC from Alice if and only if today is February 29th.

Other blockchains allow for more sophisticated contracts. In Ethereum for instance, each contract carries a mini-database, and exposes methods to modify the data. As contracts are replicated across all nodes, so are their databases. Each time a user calls a method on the contract and therefore updates the underlying data, this command is replicated and replayed by the entire network. This allows for a distributed consensus on the execution of a promise.
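The replication idea can be sketched without any blockchain library. In the Python sketch below, `CounterContract` and `Node` are hypothetical names: every node holds its own copy of the contract, and ends up with the same data because it replays the same method calls in the same order. Real Ethereum contracts are written in a dedicated language and run in a virtual machine, which this sketch does not model.

```python
class CounterContract:
    """Hypothetical contract: a mini-database plus one method to modify it."""

    def __init__(self):
        self.storage = {"count": 0}

    def increment(self, by):
        self.storage["count"] += by


class Node:
    """Each node keeps its own copy of the contract and replays every call."""

    def __init__(self):
        self.contract = CounterContract()

    def replay(self, calls):
        for method, args in calls:
            getattr(self.contract, method)(*args)


# The calls broadcast to the network, in the order fixed by the chain
calls = [("increment", (2,)), ("increment", (5,))]

nodes = [Node() for _ in range(3)]
for node in nodes:
    node.replay(calls)

print([node.contract.storage for node in nodes])
# [{'count': 7}, {'count': 7}, {'count': 7}]
```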

This idea of pre-programmed conditions, interfaced with the real world, and broadcast to everyone, is called a smart contract. A contract is a promise that signing parties agree to make legally-enforceable. A smart contract is the same, except with the word “technically-” instead of “legally-”. This removes the need for a judge, or any authority acknowledged by both parties.

Imagine that you want to rent out your house for a week for $1,000, with a 50% upfront payment. You and the tenant sign a contract, probably written by a lawyer. You also need a bank to receive the payment. At the beginning of the week, you ask for a $5,000 deposit; the tenant writes a check for it. At the end of the week, the tenant refuses to pay the remaining 50%. You also realize that they broke a window, and that the deposit check was drawn on an empty account. You’ll need a lawyer to help you enforce the rental contract in court.

Smart contracts in a blockchain allow you to get rid of the bank, the lawyer, and the court. Just write a program that defines how much money should be transferred in response to certain conditions:

  • two weeks before beginning of rental: transfer $500 from tenant to owner
  • cancellation by the owner: transfer $500 from owner to tenant
  • end of the rental period: transfer $500 from tenant to owner
  • proof of physical degradation after the rental period: transfer $5,000 from tenant to owner

Upload this smart contract to the blockchain, and you’re all set. At the times defined in the contract, the money transfers will occur. And if the owner can provide a predefined proof of physical degradation, they get the $5,000 automatically (without any need for a deposit).
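The four rules above could be written along these lines. This is only a sketch in plain Python, not code for an actual blockchain platform: the class, the event names, and the way transfers are recorded are all assumptions, and the person renting the house is called `tenant` in the code.

```python
from dataclasses import dataclass, field

@dataclass
class RentalContract:
    owner: str
    tenant: str
    half_rent: int = 500        # 50% of the $1,000 rent
    damage_payment: int = 5000  # owed if degradation is proven
    transfers: list = field(default_factory=list)

    def handle(self, event: str) -> None:
        """Record the transfer triggered by an event as (sender, receiver, amount)."""
        rules = {
            "two_weeks_before_rental": (self.tenant, self.owner, self.half_rent),
            "cancelled_by_owner": (self.owner, self.tenant, self.half_rent),
            "end_of_rental": (self.tenant, self.owner, self.half_rent),
            "proof_of_degradation": (self.tenant, self.owner, self.damage_payment),
        }
        if event not in rules:
            raise ValueError(f"unknown event: {event}")
        self.transfers.append(rules[event])

contract = RentalContract(owner="alice", tenant="bob")
for event in ["two_weeks_before_rental", "end_of_rental", "proof_of_degradation"]:
    contract.handle(event)
print(contract.transfers)
# [('bob', 'alice', 500), ('bob', 'alice', 500), ('bob', 'alice', 5000)]
```

In a real deployment the events would come from the chain itself (timestamps) or from trusted external inputs, and the transfers would move cryptocurrency rather than append to a list.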

You might wonder how to build a proof of physical degradation. That’s where the Internet of Things (IoT) kicks in. In order to interact with the real world, blockchains need sensors and actuators. The Blockchain revolution won’t happen unless the IoT revolution comes first.

Such applications relying on smart contracts are called Decentralized Apps, or DApps.

Smart contracts naturally extend to smart property, and a lot more smart things. The thing to remember is that “smart” means “no intermediaries”, or “technically-enforced”. Blockchains are a new way to disintermediate businesses - just like the Internet disintermediated music distribution.

Disintermediation

What Is A Blockchain, Take Two

In my opinion, the best way to understand the blockchain is to look at it from various angles.

What it does: A blockchain lets you securely share and/or process data between multiple parties over a network of non-trusted peers. Data can be anything, but the most interesting uses concern information that currently requires a trusted third party to exchange. Examples include money (requires a bank), a proof of property (requires a lawyer), a loan certificate, etc. In essence, the blockchain removes the need for a trusted third party.

How it works: From a technical point of view, the blockchain is an innovation relying on three concepts: peer-to-peer networks, public-key cryptography, and distributed consensus based on the resolution of a random mathematical challenge. None of these concepts is new. It’s their combination that allows a breakthrough in computing. If you don’t understand it all, don’t worry: very few people know enough to be able to develop a blockchain on their own (which is a problem). But not understanding the blockchain doesn’t prevent you from using it, just like you can build web apps without knowing about TCP slow start and Certificate Authorities.

What it compares to: See the blockchain as a database replicated as many times as there are nodes and (loosely) synchronized, or as a supercomputer formed by the combination of the CPUs/GPUs of all its nodes. You can use this supercomputer to store and process data, just like you would with a remote API. Except you don’t need to own the backend, and you can be sure the data is safe and processed properly by the network.

Practical Implications

Facts stored in the blockchain can’t be lost. They are there forever, replicated as many times as there are nodes. Even more, the blockchain doesn’t simply store a final state, it stores the history of all past states, so that everyone can check the correctness of the final state by replaying the facts from the beginning.
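Replaying the history to check a final state can be illustrated as follows. The fact format (sender, receiver, amount) and the starting balance are assumptions made for this sketch.

```python
def replay(facts, initial=None):
    """Rebuild balances by applying every fact in order, from the beginning."""
    balances = dict(initial or {})
    for sender, receiver, amount in facts:
        if balances.get(sender, 0) < amount:
            raise ValueError(f"{sender} cannot pay {amount}")
        balances[sender] -= amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return balances

history = [("alice", "bob", 30), ("bob", "carol", 10)]
claimed_final_state = {"alice": 70, "bob": 20, "carol": 10}

# Anyone holding the full history can check the claimed state independently.
print(replay(history, {"alice": 100}) == claimed_final_state)  # True
```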

Facts in the blockchain can be trusted, as they are verified by a technically enforceable consensus. Even if the network contains black sheep, you can trust its judgement as a whole.

Storing data in the blockchain isn’t fast, as it requires a distributed consensus.

Tip: If you have 20 spare minutes to get a deeper understanding, watch this excellent introduction video about Bitcoin, which also explains the blockchain:

Why It’s a Big Deal

«The Blockchain is the most disruptive technology I have ever seen.» Salim Ismail

«The most interesting intellectual development on the Internet in the last five years.» Julian Assange

«I think the fact that within the Bitcoin universe an algorithm replaces the functions of [the government] … is actually pretty cool.» Al Gore

These smart people have seen a huge potential in the blockchain. It concerns disintermediation. The blockchain can potentially replace all the intermediaries required to build trust. Let’s see a few example applications, most of which are just proofs of concept for now:

  • Monegraph lets authors claim their work, and set their rules (and fees) for use
  • La Zooz is a decentralized Uber. Share your car, find a seat, without Uber taking a fee.
  • Augur is an online bookmaker. Bet on outcomes, and get paid.
  • Storj.io is a peer-to-peer storage system. Rent your unused disk space, or find ultra cheap online storage.
  • Muse is a distributed, open, and transparent database tailored for the music industry
  • Ripple enables low cost cross-border payments for banks

Blockchain use cases

Many successful businesses on the Internet today are intermediaries. Think about Google for a minute: Google managed to become the intermediary between you and the entire Internet. Think about Amazon: they became the intermediary between sellers and buyers for any type of good. That’s why a technology that makes it possible to remove intermediaries can potentially disrupt the entire Internet.

Will it benefit end users, who won’t need third parties to exchange goods and services anymore? It’s far from certain. The Internet held the same promise of heavy disintermediation. Yet Google built the largest market capitalization worldwide as an intermediary. That’s why it’s crucial to invest in the blockchain quickly, because the winners and losers of the next decade are being born right now.

You Won’t Build Your Own Blockchain

The technology behind the blockchain uses advanced cryptography, custom network protocols, and performance optimizations. This is all too sophisticated to be redeveloped each time a project needs a blockchain. Fortunately, aside from Bitcoin, there are several open-source blockchain implementations. Here are the most advanced:

  • Ethereum: an open-source blockchain platform by the Ethereum Foundation
  • Hyperledger: another open-source implementation, this time by the Linux Foundation. The first proposal was published very recently.
  • Eris Industries: Tools helping to manipulate Ethereum, Bitcoin or totally independent blockchains, mostly to build private networks. Their tutorials and explainers are a great starting point for an overview of the blockchain technology.

Ethereum

The maturity of these implementations varies a lot. If you have to build an application now, we’d advise:

  • Eris for a closed Blockchain, or to discover and play with the technology
  • Ethereum for a shared Blockchain

Also, Bitcoin isn’t a good choice to build an application upon. It was designed for money transactions and nothing else, although you can program pseudo-smart contracts (but you have to love assembly). The network is currently going through a serious growth crisis: transactions wait in line for up to one hour to get inserted in a block. Miners often select the transactions with the highest fees, so money transfers in Bitcoin are becoming more expensive than they are in a bank. The developer community is at war, and speculation on the cryptocurrency makes its face value move too much.

Numbers

How big are blockchains today? Let’s see some numbers.

Bitcoin:

Ethereum:

Ethereum stats

Conclusion

Blockchain technology is both intriguing and exciting. Could it be the revolution that gurus predict? Or is it just a speculative bubble based on an impractical idea? After reading a lot on the matter, we couldn’t form a definitive opinion.

When we face uncertainty, we know a great way to lift it: trying. That’s what we decided to do. Read the next post in this series to see what we’ve learned by building a real world app running on the blockchain.