Create an ML Pipeline
Select "Pipelines" in the left pane of Azure ML Studio and click "Create a new pipeline using classic prebuilt components".
This creates a blank canvas on which the pipeline's components can be placed and connected. Click "ph_and_hr_dataset", or another dataset that you have uploaded, and drag it onto the canvas. Note that the "Data" tab lists all uploaded datasets and the "Components" tab lists all ML components available for use in the pipeline.
Before proceeding further, set up a compute instance by following the tutorial on "Create Compute Instance".
Right click on the dataset block and click "Preview data".
The preview shows a sample of the data and also offers some visualisations of it.
Click on the Profile tab as shown in the image below.
The profile view shows a histogram for each feature in the data.
Click on any feature row to display the boxplot of that feature.
Click on the next feature and observe the boxplot visualisation.
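The quantities behind the histogram and boxplot views can also be computed by hand. The following is a minimal sketch using only the Python standard library; the values in `ph_values` are toy numbers invented for illustration (not taken from the dataset), and the helper names are hypothetical.

```python
import statistics

# Toy values invented for illustration; not taken from the tutorial's dataset.
ph_values = [7.31, 7.35, 7.38, 7.40, 7.41, 7.42, 7.44, 7.45, 7.47, 7.60]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the quantities a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    if width == 0:
        # All values are identical, so everything lands in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


summary = five_number_summary(ph_values)
counts = histogram_counts(ph_values, bins=5)
print("min, Q1, median, Q3, max:", summary)
print("histogram counts:", counts)
```

The Studio may use different bin counts and quartile conventions, so the numbers it displays need not match this sketch exactly.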
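Select the components and connect them together as shown in the image below. Besides the dataset, the steps that follow use the "Split Data", "Linear Regression", "Train Model" and "Evaluate Model" components, plus a scoring component that produces the "Scored dataset" output.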
Double click the "Linear Regression" component and set the hyperparameters of the Linear Regression algorithm. For the Solution method, choose "Online Gradient Descent" instead of "Ordinary Least Squares". The other parameters can be left at their default values for this experiment.
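To see how the two solution methods relate, the sketch below fits a one-feature linear model twice: once with the closed-form least-squares formula and once with plain batch gradient descent on the mean squared error. It is a simplified illustration, not the component's implementation; the data points, learning rate and epoch count are assumptions chosen for the example.

```python
# Toy (x, y) pairs invented for illustration; they lie near y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]


def fit_ols(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_gradient_descent(xs, ys, learning_rate=0.05, epochs=5000):
    """Batch gradient descent on the mean squared error (assumed settings)."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(epochs):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


ols_slope, ols_intercept = fit_ols(xs, ys)
gd_slope, gd_intercept = fit_gradient_descent(xs, ys)
print("OLS:             ", ols_slope, ols_intercept)
print("Gradient descent:", gd_slope, gd_intercept)
```

On a small, well-conditioned problem like this one, both approaches arrive at essentially the same line; gradient descent becomes attractive mainly when the data is too large for a closed-form solve.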
Double click the "Split Data" component and change the "Fraction of rows in the first output dataset" field to 0.8, so that 80% of the rows go to the first output and the remaining 20% to the second.
Set the "Random seed" value to 1.
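The effect of the fraction and random seed settings can be sketched in a few lines of Python. This is an illustrative stand-in, assuming a simple shuffle-then-cut strategy; the hypothetical `split_rows` helper will not select the same rows as the "Split Data" component, but it shows why a fixed seed makes the split repeatable.

```python
import random


def split_rows(rows, fraction=0.8, seed=1):
    """Shuffle rows with a fixed seed, then cut them into two parts."""
    rng = random.Random(seed)  # a fixed seed makes the shuffle repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in row indices, not real data
train_rows, test_rows = split_rows(rows)
print("first output: ", train_rows)
print("second output:", test_rows)
```

Running the function again with the same seed returns the same two lists, which is what makes results comparable between pipeline runs.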
Next, double click "Train Model" and click on "Edit column".
Click the "Enter column name" field and enter the name of the outcome column, e.g. pH for the pH dataset.
Type "pH" and click "Save".
Click "Submit".
After you click "Submit", a pop-up will ask whether to create a new "Experiment" or select "Existing Experiment". Click "Create new".
Click the "Enter new experiment name" field and type "predicting-ph-from-hr".
Click "Submit".
Once the job has been submitted, select "Pipelines" in the left pane of Azure ML Studio and open the submitted pipeline under the "Pipeline jobs" tab.
Right click "Scored dataset" and select "Preview data".
Observe the actual and predicted values side-by-side.
Right click on the "Evaluate Model" component, choose "Preview data" and then click "Evaluation results".
Observe the error metrics.
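For reference, common regression error metrics such as mean absolute error, root mean squared error and the coefficient of determination can be computed directly from the actual and predicted columns. The sketch below assumes small toy lists invented for illustration; the `regression_metrics` helper is hypothetical and is not part of Azure ML.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, RMSE and R^2 from paired actual and predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Toy values invented for illustration; not results from the pipeline.
actual = [7.30, 7.40, 7.50, 7.60]
predicted = [7.35, 7.40, 7.45, 7.65]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE and RMSE indicate predictions closer to the actual values, while a coefficient of determination closer to 1 indicates that the model explains more of the variance in the outcome.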