Mission 2 - Advanced Splitting
Estimated time for completing this mission: 20 mins
Learning Objective
Understand how to use LOGIBLOX to extract meaningful data from columns with regular expressions.
Scenario
The column STATUS contains valuable information regarding the status of each order. However, the value is "hidden" in messy data.
Your mission is to use regular expressions and column operations to derive the desired value, which can then be used for classification or regression.
BLOX used in this mission:
- Basics/Start
- MyData/CleanedData
- Database/Split Column
- Database/Delete Column
- Database/Rename
- Database/Table
- Database/Save
Data
This exercise uses the previously saved data. In the "MyData" section, choose the "Module3" folder and then choose the "CleanedData" database.
How To Guide
Please refer to the Navigation Guide to perform the steps below.
Flow Builder:
Extracting useful information from columns
- In the Module3 folder, press the green plus button to create a new logic named Advanced Splitting
- Drag and drop the BLOX that will be used for this mission, including the dataset CleanedData
- Connect "Start" BLOX to the "MyData" BLOX
- Connect the output from the "MyData" BLOX to a "Split Column" BLOX, where you specify which column to split and on which character (in our case, the column is STATUS and the character is "_")
- This produces a table with three new columns, named "0", "1" and "2", of which only column "1" needs to be kept
- Hence, the next step is to connect the output of the "Split Column" BLOX to a "Delete Column" BLOX and delete columns "STATUS", "0" and "2"
- Next, connect the output from the "Delete Column" BLOX to a "Rename" BLOX, where you specify which column to rename (in our case, column "1") and the new name (in our case, STATUS)
- Finally, connect the result of the "Rename" BLOX to the "Save" BLOX, specify the folder (Module3) and name (preferably "FinalData")
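For readers who want to see what the flow above does to the data, here is a minimal Python sketch of the same Split Column, Delete Column and Rename steps. It is not how LOGIBLOX is implemented. The sample rows, the ORDERNUMBER column and the helper functions are hypothetical, and treating the separator as a regular expression pattern is an assumption.

```python
import re

# Hypothetical sample rows; the real STATUS values in CleanedData may differ.
rows = [
    {"ORDERNUMBER": 10107, "STATUS": "xx_Shipped_01"},
    {"ORDERNUMBER": 10121, "STATUS": "ab_Cancelled_17"},
]

def split_column(row, column, pattern):
    """Add one new column per piece, named "0", "1", "2", ..."""
    out = dict(row)
    for index, piece in enumerate(re.split(pattern, row[column])):
        out[str(index)] = piece
    return out

def delete_columns(row, columns):
    """Drop the listed columns and keep everything else."""
    return {key: value for key, value in row.items() if key not in columns}

def rename_column(row, old, new):
    """Rename a single column, leaving the others untouched."""
    return {(new if key == old else key): value for key, value in row.items()}

final_data = []
for row in rows:
    row = split_column(row, "STATUS", "_")            # Split Column BLOX
    row = delete_columns(row, {"STATUS", "0", "2"})   # Delete Column BLOX
    row = rename_column(row, "1", "STATUS")           # Rename BLOX
    final_data.append(row)

print(final_data)
```

Each helper corresponds to one BLOX, so the loop reads in the same order as the connections in the Flow Builder.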
Results and Summary
Challenge
To test your understanding of splitting, try this extra challenge: split the ORDERDATE column into year, month and day.
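As a hint for the logic behind this challenge, the sketch below splits a date string into three parts in Python. The date format (year-month-day separated by "-" or "/"), the sample value and the output column names YEAR, MONTH and DAY are all assumptions. Check the actual ORDERDATE values in your data before choosing the split character.

```python
import re

# Hypothetical ORDERDATE value; the real format in CleanedData may differ.
row = {"ORDERDATE": "2003-02-24"}

def split_date(row, column="ORDERDATE", pattern=r"[-/]"):
    """Split a date string into YEAR, MONTH and DAY (assumes year-month-day order)."""
    year, month, day = re.split(pattern, row[column])
    out = {key: value for key, value in row.items() if key != column}
    out.update({"YEAR": int(year), "MONTH": int(month), "DAY": int(day)})
    return out

result = split_date(row)
print(result)
```

If the dates in your data also carry a time component or use a different field order, the pattern and the unpacking order would need to change accordingly.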


