Module 3: Advanced Data Science
Overview
After completing this module you will be able to use advanced data operation techniques as well as create complex AI models and charts in LOGIBLOX.
In the Advanced Data Massaging section you will learn advanced data transformation techniques in LOGIBLOX. We will dive into the power of columns as the key structural elements for understanding data.
Next, in Advanced Predictions you will learn how to build regressions in LOGIBLOX, including non-linear regression, and you will create charts based on the results. These are the first steps into Machine Learning: regression is one of the most widely used ML techniques.
Finally, in Advanced AI you will build AI models, train them and use them for predictions.
Skills you will learn during this tutorial
Flow Builder:
Advanced Work with LOGIBLOX Flow Builder
Advanced Data Massaging - Column Excellence
- Fill Columns
- Manage Time Elements
- Transpose Columns
- Find & Replace in Columns
- Perform Mathematical Operations on Columns
- Concatenate Columns
Advanced Predictions - Regression Excellence
- Building and enhancing Regression Analysis
- Building Advanced & Insightful Charts
- Managing Predictions
Advanced AI
- Build AI Models
- Train AI Models
- Use Powerful Classifications
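In LOGIBLOX the column operations listed under Advanced Data Massaging are performed in the Flow Builder. To show what they amount to, here is a minimal sketch in plain Python. It is not LOGIBLOX code. The two rows reuse values from the Transactions excerpt shown later in this module. The derived column names (ORDER_YEAR, ORDER_MONTH, LINE_TOTAL, CONTACT), the UNKNOWN placeholder and the NYC replacement are assumptions chosen purely for illustration.

```python
from datetime import datetime

# Two rows with values taken from the Transactions excerpt (subset of columns).
rows = [
    {"ORDERNUMBER": 10107, "QUANTITYORDERED": 30, "PRICEEACH": 95.7,
     "ORDERDATE": "2/24/2003 0:00", "CITY": "NYC", "STATE": "NY",
     "CONTACTFIRSTNAME": "Kwai", "CONTACTLASTNAME": "Yu"},
    {"ORDERNUMBER": 10121, "QUANTITYORDERED": 34, "PRICEEACH": 81.35,
     "ORDERDATE": "5/7/2003 0:00", "CITY": "Reims", "STATE": "",
     "CONTACTFIRSTNAME": "Paul", "CONTACTLASTNAME": "Henriot"},
]

for row in rows:
    # Fill Columns: replace empty cells with a placeholder (assumed value).
    if not row["STATE"]:
        row["STATE"] = "UNKNOWN"

    # Manage Time Elements: parse the text date and extract year and month.
    order_date = datetime.strptime(row["ORDERDATE"], "%m/%d/%Y %H:%M")
    row["ORDER_YEAR"] = order_date.year
    row["ORDER_MONTH"] = order_date.month

    # Find & Replace in Columns: expand an abbreviation (assumed rule).
    row["CITY"] = row["CITY"].replace("NYC", "New York")

    # Mathematical Operations on Columns: quantity times unit price.
    row["LINE_TOTAL"] = round(row["QUANTITYORDERED"] * row["PRICEEACH"], 2)

    # Concatenate Columns: first name and last name into one column.
    row["CONTACT"] = f'{row["CONTACTFIRSTNAME"]} {row["CONTACTLASTNAME"]}'

# Transpose Columns: turn the list of rows into a column -> values mapping.
columns = {name: [row[name] for row in rows] for name in rows[0]}

print(columns["CONTACT"])
print(columns["LINE_TOTAL"])
```

Each step changes or adds one column, which mirrors the column-by-column way of working this module focuses on.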
Missions
Missions help you apply the LOGIBLOX tools in practice.
1. Advanced Data Massaging
2. Advanced Predictions
- Mission 3: Creating Heatmap
- Mission 4: Creating Regression
- Mission 5: Modifying The Regression to Non-Linear
- Mission 6: Creating Advanced Chart With Regression
3. Advanced AI
- Mission 7: Feature Selection on the Dataset
- Mission 8: Training Time Series Model
- Mission 9: Using Prediction Model to Create Charts
- Mission 10: Training Classification Model
- Mission 11: Filling Empty Cells in the Dataset Using Classification Model
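Missions 4 and 5 move from a linear regression to a non-linear one. As a rough idea of what that step means, the sketch below fits both variants in plain Python using the QUANTITYORDERED and SALES values of the ten rows in the Transactions excerpt. This is an illustration under assumptions, not how LOGIBLOX computes regressions. The helper names `fit_line` and `r_squared` are hypothetical, and the non-linear variant simply regresses SALES on the squared quantity, which is only one of many possible non-linear forms.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# QUANTITYORDERED and SALES from the ten excerpt rows.
quantity = [30, 34, 41, 45, 49, 36, 29, 48, 22, 41]
sales = [2871, 2765.9, 3884.34, 3746.7, 5205.27,
         3479.76, 2497.77, 5512.32, 2168.54, 4708.44]

# Linear regression: SALES = slope * QUANTITYORDERED + intercept.
slope, intercept = fit_line(quantity, sales)
linear_r2 = r_squared(quantity, sales, slope, intercept)

# Non-linear variant: SALES = a * QUANTITYORDERED**2 + b.
# The curve is non-linear in the quantity but still fitted by least squares.
quantity_sq = [q ** 2 for q in quantity]
a, b = fit_line(quantity_sq, sales)
quadratic_r2 = r_squared(quantity_sq, sales, a, b)

print(f"linear fit:    R^2 = {linear_r2:.3f}")
print(f"quadratic fit: R^2 = {quadratic_r2:.3f}")
```

Comparing the two R^2 values is one simple way to judge whether the non-linear form describes the data better. With only ten rows the comparison is indicative at best.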
Data Used
In Module 3 we will use the following dataset:
Name used in tutorials (Name of the file) - Transactions.xlsx
Typical questions: What information is provided in the dataset? How are tables and fields related to each other? How should the tables be connected to generate business insights? What columns exist? What is the time reference / order context?
This is a small excerpt of the main data table for the upcoming exercises:
| ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | STATUS | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | ADDRESSLINE2 | CITY | STATE | COUNTRY | TERRITORY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10107 | 30 | 95.7 | 2 | 2871 | 2/24/2003 0:00 | Shipped | 1 | 2 | 2003 | Motorcycles | 95 | S10_1678 | Land of Toys Inc. | 2125557818 | 897 Long Airport Avenue | | NYC | NY | USA | NA | Yu | Kwai | Small |
| 10121 | 34 | 81.35 | 5 | 2765.9 | 5/7/2003 0:00 | Shipped | 2 | 5 | 2003 | Motorcycles | 95 | S10_1678 | Reims Collectables | 26.47.1555 | 59 rue de l'Abbaye | | Reims | | France | EMEA | Henriot | Paul | Small |
| 10134 | 41 | 94.74 | 2 | 3884.34 | 7/1/2003 0:00 | Shipped | 3 | 7 | 2003 | Motorcycles | 95 | S10_1678 | Lyon Souveniers | +33 1 46 62 7555 | 27 rue du Colonel Pierre Avia | | Paris | | France | EMEA | Da Cunha | Daniel | Medium |
| 10145 | 45 | 83.26 | 6 | 3746.7 | 8/25/2003 0:00 | Shipped | 3 | 8 | 2003 | Motorcycles | 95 | S10_1678 | Toys4GrownUps.com | 6265557265 | 78934 Hillside Dr. | | Pasadena | CA | USA | NA | Young | Julie | Medium |
| 10159 | 49 | 100 | 14 | 5205.27 | 10/10/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Corporate Gift Ideas Co. | 6505551386 | 7734 Strong St. | | San Francisco | CA | USA | NA | Brown | Julie | Medium |
| 10168 | 36 | 96.66 | 1 | 3479.76 | 10/28/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | | Burlingame | CA | USA | NA | Hirano | Juri | Medium |
| 10180 | 29 | 86.13 | 9 | 2497.77 | 11/11/2003 0:00 | Shipped | 4 | 11 | 2003 | Motorcycles | 95 | S10_1678 | Daedalus Designs Imports | 20.16.1555 | 184, chausse de Tournai | | Lille | | France | EMEA | Rance | Martine | Small |
| 10188 | 48 | 100 | 1 | 5512.32 | 11/18/2003 0:00 | Shipped | 4 | 11 | 2003 | Motorcycles | 95 | S10_1678 | Herkku Gifts | +47 2267 3215 | Drammen 121, PR 744 Sentrum | | Bergen | | Norway | EMEA | Oeztan | Veysel | Medium |
| 10201 | 22 | 98.57 | 2 | 2168.54 | 12/1/2003 0:00 | Shipped | 4 | 12 | 2003 | Motorcycles | 95 | S10_1678 | Mini Wheels Co. | 6505555787 | 5557 North Pendale Street | | San Francisco | CA | USA | NA | Murphy | Julie | Small |
| 10211 | 41 | 100 | 14 | 4708.44 | 1/15/2004 0:00 | Shipped | 1 | 1 | 2004 | Motorcycles | 95 | S10_1678 | Auto Canal Petit | (1) 47.55.6555 | 25, rue Lauriston | | Paris | | France | EMEA | Perrier | Dominique | Medium |
The file business_transaction contains raw sales transaction records. However, some columns need to be transformed and completed to build the right baseline for further analytics, starting with formatting and missing information.
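To get a feel for the "missing information" part before starting, it helps to count the empty cells per column. The sketch below does this in plain Python for four rows of the excerpt. It is an illustration outside LOGIBLOX. The helper name `count_missing` is hypothetical, and the sketch assumes that the blank cells in the excerpt belong to the ADDRESSLINE2 and STATE columns.

```python
from collections import Counter

# Subset of columns for four excerpt rows; "" marks an empty cell.
# Assumption: the blanks in the excerpt sit in ADDRESSLINE2 and STATE.
sample = [
    {"ORDERNUMBER": "10107", "ADDRESSLINE2": "", "STATE": "NY", "COUNTRY": "USA"},
    {"ORDERNUMBER": "10121", "ADDRESSLINE2": "", "STATE": "", "COUNTRY": "France"},
    {"ORDERNUMBER": "10134", "ADDRESSLINE2": "", "STATE": "", "COUNTRY": "France"},
    {"ORDERNUMBER": "10145", "ADDRESSLINE2": "", "STATE": "CA", "COUNTRY": "USA"},
]


def count_missing(rows):
    """Return a mapping of column name -> number of empty cells."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            # Treat None and whitespace-only text as missing.
            if value is None or str(value).strip() == "":
                missing[column] += 1
    return dict(missing)


missing_per_column = count_missing(sample)
print(missing_per_column)
```

Columns that are empty in every row (here ADDRESSLINE2) are candidates for removal, while partially filled columns (here STATE) are candidates for filling.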
Preparation
To prepare for Module 3, create a new project folder named Module3.