Mission 2: Cleanse Data with AI
Estimated time: 10 minutes
Learning Objective
Now that you've learned how to create data pipelines, it's time to apply those skills to clean and refine a dataset. In this mission, you will build a four-step pipeline that replaces placeholder values, removes redundant text, converts Yes/No values to 1/0, and fills empty fields, so the data is consistent and ready for use.
Dataset
Download the required dataset: Churn.xlsx
Prerequisites
Please refer to the Navigation Guide to familiarize yourself with the platform interface.
Tip
If you're unsure about any steps, review Mission 1 first to understand the basics of the Data Transformer.
Data Cleansing Goals
We want to perform the following transformations:
- "medium_of_operation" - Replace ? with Unknown
- "membership_category" - Remove the word Membership
- "offer_application_preference", "used_special_discount", "past_complaint" - Convert Yes/No to 1/0
- "points_in_wallet" - Fill empty fields with 0
Step-by-Step Instructions
1. Import the Dataset
In the "Module 1" folder, click the "Add Item" button, select Add Data, choose Excel, and import the Churn dataset.
2. Open the Data Transformer
Right-click the Churn dataset and select Transform Data to launch the AI Data Transformer.
3. Build Your Cleansing Pipeline
Create a multi-step cleansing pipeline by entering each prompt in the search bar. You can try creating the pipeline yourself, or follow the detailed steps below.
Step 1: Replace Unknown Values
Replace placeholder characters with meaningful text:
replace the ? with the word Unknown in "medium_of_operation"
Press Enter and click Add Step to create the next transformation.
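For readers who want to see what this prompt amounts to outside the platform, here is a minimal pandas sketch. It is only an illustration, not the code the Data Transformer generates, and the sample rows are hypothetical stand-ins for the real Churn.xlsx values.

```python
import pandas as pd

# Hypothetical sample rows; the real Churn.xlsx values may differ.
df = pd.DataFrame({"medium_of_operation": ["Desktop", "?", "Smartphone", "?"]})

# Replace cells that are exactly "?" with "Unknown"; other values are untouched.
df["medium_of_operation"] = df["medium_of_operation"].replace("?", "Unknown")

print(df["medium_of_operation"].tolist())
```

Because the replacement matches whole cell values, a "?" that appears inside a longer string would be left alone.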
Step 2: Remove Text from Category
Clean up category names by removing redundant text:
remove the word "Membership" from "membership_category"
Press Enter and click Add Step.
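A rough pandas equivalent of this step might look like the sketch below. The category names are hypothetical examples, and trimming the leftover space is an assumption about the desired result rather than confirmed platform behavior.

```python
import pandas as pd

# Hypothetical category values; the real dataset may use different names.
df = pd.DataFrame(
    {"membership_category": ["Gold Membership", "No Membership", "Basic Membership"]}
)

# Remove the literal word "Membership", then trim the space left behind.
df["membership_category"] = (
    df["membership_category"]
    .str.replace("Membership", "", regex=False)
    .str.strip()
)

print(df["membership_category"].tolist())
```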
Step 3: Convert Yes/No to Binary
Standardize categorical data by converting Yes/No values to binary format across multiple columns:
turn Yes and No into 1 and 0 in "offer_application_preference" "used_special_discount" and "past_complaint"
Press Enter and click Add Step.
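As an illustration of what this conversion does, the following pandas sketch applies one mapping to all three columns. The sample rows are hypothetical, and the code is not taken from the platform itself.

```python
import pandas as pd

# Hypothetical sample rows using the three column names from this mission.
df = pd.DataFrame(
    {
        "offer_application_preference": ["Yes", "No", "Yes"],
        "used_special_discount": ["No", "No", "Yes"],
        "past_complaint": ["Yes", "Yes", "No"],
    }
)

flag_columns = [
    "offer_application_preference",
    "used_special_discount",
    "past_complaint",
]
yes_no_to_binary = {"Yes": 1, "No": 0}

# Map each Yes/No column to 1/0. Values outside the mapping would become NaN.
for column in flag_columns:
    df[column] = df[column].map(yes_no_to_binary)

print(df)
```

Using a single mapping for every column keeps the encoding consistent, which is the point of standardizing these fields together.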
Step 4: Fill Missing Values
Handle missing data by filling empty fields with default values:
fill empty fields with 0 in "points_in_wallet"
Press Enter.
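Filling empty fields can be sketched in pandas as shown below. The numbers are made-up sample values, and the sketch assumes that empty cells are read as missing values (NaN) when the file is loaded.

```python
import pandas as pd

# Hypothetical wallet balances; None stands in for an empty cell.
df = pd.DataFrame({"points_in_wallet": [781.75, None, 500.69, None]})

# Fill every missing value with the default 0.
df["points_in_wallet"] = df["points_in_wallet"].fillna(0)

print(df["points_in_wallet"].tolist())
```

Note that a default of 0 is a modeling choice: it treats a missing balance as an empty wallet, which may or may not be true for a given customer.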
Continue to Next Mission
Don't exit the Data Transformer dialog yet. Keep it open and continue directly to Mission 3: Export Pipelines to learn how to export your pipeline as a reusable Flow.
Visual Guide
Step 1: Open Data Transformer
Step 2: Replace Unknown Values
Step 3: Remove Text from Category
Step 4: Convert Yes/No to Binary
Step 5: Fill Missing Values
Next Step
Keep the Data Transformer dialog open and proceed to Mission 3 to export this pipeline as a Flow.
Common Data Cleansing Tasks
| Task | Example Command |
|---|---|
| Replace values | Replace "N/A" with "Unknown" in "Status" |
| Remove text | Remove the word "ID" from "Customer_ID" |
| Convert format | Turn Yes and No into 1 and 0 in "Active" |
| Fill missing | Fill empty fields with 0 in "Quantity" |
| Standardize text | Convert all text to uppercase in "Country" |
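The "Standardize text" task is the only one in the table not used in this mission's pipeline. A minimal pandas sketch of the same idea, using a hypothetical "Country" column with made-up values, could look like this:

```python
import pandas as pd

# Hypothetical values with inconsistent casing.
df = pd.DataFrame({"Country": ["france", "Germany", "spain"]})

# Convert every entry to uppercase so equal names compare as equal.
df["Country"] = df["Country"].str.upper()

print(df["Country"].tolist())
```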
Summary
You've successfully learned how to:
✓ Import datasets for data cleansing
✓ Build a multi-step data cleansing pipeline
✓ Replace placeholder values with meaningful data
✓ Remove unwanted text from columns
✓ Convert categorical values to binary format
✓ Fill missing values with defaults
✓ Prepare a cleansing pipeline for export and reuse in Mission 3




