Module 2: Foundational Data Science + Analytics
Overview
After completing this module, you will be able to use the data handler and flow builder on LOGIBLOX with ease. In the section Data Preparation and Massaging, you will learn the basics of using LOGIBLOX to create folders, import datasets, and use them in logic sheets. Next, in Data Workflow and Automation, you will get to know some of the tools (logic BLOX) and how logic sheets work. Finally, in Data Analytics and Exploration, you will create informative graphs using more advanced techniques.
Skills you will learn during this tutorial
Data Manager:
- Work with LOGIBLOX Data Manager
- Load Data
- View Data
Flow Builder:
- Work independently with Flow Builder
- Join Data
- Split Data
- Aggregate Data
- Sum Data
- Visualize Data
Missions
Missions help you apply the LOGIBLOX tools in practice.
1. Data Preparation and Massaging
- Mission 1: Import training data into the data section of LOGIBLOX and study it
- Mission 2: Build your first logic and create a selective view of the previously imported data
- Mission 3: Split one column in the training data so that its parts can be processed later, then delete the original column. Store the result as a new table.
- Mission 4: Retrieve supplier names from the reference table and add them to the main data table based on supplier ID
- Mission 5: Create Versions of Flows
2. Data Workflow and Automation
- Mission 6: Concatenate category reference tables into one table
- Mission 7: Filter out unnecessary records from the dataset
- Mission 8: Automate your Flows
3. Data Analytics and Exploration
- Mission 9: Aggregate the 'Amount' column, filter by the Supplier column, and create a bar chart
- Mission 10: Create a pie chart using data operations
- Mission 11: Apply statistical operations on the new Amount column
- Mission 12: Create ABC Analysis
- Mission 13: Create XYZ Analysis
- Mission 14: Create XYZ Analysis
Data Used
In this module we will use the following datasets:
What information is provided in these tables? How are the tables connected? How should the tables be connected to generate business insight? What columns are there?
This is a small excerpt of the main data table for the following exercises:
| Document Number | Document Date | Supplier ID | Category | Posting Date | Account | Cost Centre | ID: Category Level 1 | ID: Category Level 2 | ID: Category Level 3 | ID: Category Level 4 | DB | split_this_column |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 49112921 | 5/4/2021 | 83500207 | Expenses Proj. 1B | 5/3/2021 | 4383 | Branch: New Scott | 44000000 | 44010000 | 44019000 | 44019090 | Operations | 44 CHF |
| 63028710 | 8/16/2021 | 81007530 | Expenses Proj. 1B | 8/23/2021 | 1898 | Branch: South Rebecca | 45000000 | 45010000 | 45010100 | 45010190 | Finance | 132 CHF |
| 49114865 | 8/31/2021 | 81001008 | Expenses Proj. 1B | 9/5/2021 | 6265 | Branch: Angelaborough | 42000000 | 42110000 | 42110100 | 42110101 | Real Estate | 1003 CHF |
| 49118806 | 5/4/2021 | 83500207 | Expenses Proj. 1B | 5/3/2021 | 3153 | Branch: North Bryanbury | 44000000 | 44010000 | 44019000 | 44019090 | Operations | 124 CHF |
| 63047041 | 10/2/2021 | 83500207 | Expenses Proj. 1B | 10/2/2021 | 7165 | Branch: Port Tammyhaven | 44000000 | 44010000 | 44019000 | 44019090 | Operations | 55 CHF |
| 48092044 | 12/5/2021 | 81000703 | Expenses Proj. 1B | 12/14/2021 | 5478 | Branch: Middletonport | 41000000 | 41010000 | 41010100 | 41010102 | Retail | 44 CHF |
| 61800452 | 11/1/2021 | 83500207 | Expenses Proj. 1B | 11/1/2021 | 4313 | Branch: Trevorborough | 44000000 | 44010000 | 44019000 | 44019090 | Operations | 267 CHF |
business_transaction contains the raw transaction data. As you can see, this table refers to suppliers and commodity groups only by ID.
It contains some errors in both content and formatting. Resolving these errors is part of the upcoming exercises.
The other five tables (lookup1, lookup2, lookup3, lookup4 and lookup_suppliers) map the supplier and category IDs to their respective names.
The categories follow a certain hierarchy where category 1 is the highest level and category 4 the lowest level.
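
To make the relationship between the tables more concrete, here is a minimal Python sketch of two steps from the missions: splitting split_this_column into an amount and a currency, and adding supplier names by supplier ID. It is only an illustration outside LOGIBLOX, where these steps are built with logic BLOX rather than code. The three sample rows are copied from the excerpt above; the supplier names, the contents of lookup_suppliers, and the column names Currency and Supplier Name are assumptions made up for this example.

```python
# Sample rows copied from the excerpt above (only the columns needed here).
transactions = [
    {"Document Number": "49112921", "Supplier ID": "83500207", "split_this_column": "44 CHF"},
    {"Document Number": "63028710", "Supplier ID": "81007530", "split_this_column": "132 CHF"},
    {"Document Number": "49114865", "Supplier ID": "81001008", "split_this_column": "1003 CHF"},
]

# Hypothetical stand-in for the lookup_suppliers table.
# The names are invented for illustration; the real table is not shown here.
lookup_suppliers = {
    "83500207": "Supplier A (hypothetical)",
    "81007530": "Supplier B (hypothetical)",
}


def split_amount(value):
    """Split a string such as '44 CHF' into a numeric amount and a currency code."""
    amount_text, currency = value.strip().split(maxsplit=1)
    return float(amount_text), currency


def enrich(rows, suppliers):
    """Return new rows with Amount, Currency and Supplier Name columns.

    The original split_this_column is dropped. Rows whose supplier ID is
    missing from the lookup are kept and get None as the supplier name.
    """
    result = []
    for row in rows:
        amount, currency = split_amount(row["split_this_column"])
        new_row = {k: v for k, v in row.items() if k != "split_this_column"}
        new_row["Amount"] = amount
        new_row["Currency"] = currency
        new_row["Supplier Name"] = suppliers.get(row["Supplier ID"])
        result.append(new_row)
    return result


enriched = enrich(transactions, lookup_suppliers)
for row in enriched:
    print(row)
```

Keeping rows without a matching supplier, instead of dropping them, makes gaps in the reference table visible, which is useful when the source data is known to contain errors.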