Know-How
This section gives you the information you need to understand and complete the "Data Workflows and Automation" section, which revolves around cleansing and tailoring datasets for further use with LOGIBLOX tools only. These tools are designed to make your tasks as easy and comprehensible as possible.
Guide on data automation
Introduction
Data automation is the process of updating data on your open data portal programmatically, rather than manually. Automating the process of data uploading is important for the long-term sustainability of your open data program. Any data that is updated manually risks being delayed because it is one more task an individual has to do as part of the rest of their workload.
Elements and Procedure
There are three common elements to data automation: Extract, Transform, and Load.
Extract: the process of extracting your data from one or more source systems.
Transform: the process of transforming your data into the necessary structure, such as a flat file format like a CSV. This could also include things like changing all state abbreviations to the full state name.
Load: the process of loading the data into the final system, in this case the open data portal.
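The three elements above can be sketched as a small pipeline. This is only an illustration in Python, not a LOGIBLOX workflow: the function names, the `state` column, and the `STATE_NAMES` lookup are assumptions made for the example, and the load step writes to a file-like object as a stand-in for a real upload to the open data portal.

```python
import csv
import io

# Hypothetical lookup table (assumption); extend it with the states in your data.
STATE_NAMES = {"CA": "California", "NY": "New York", "TX": "Texas"}


def extract(source_text):
    """Extract: read rows from a CSV export of the source system."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: replace state abbreviations with full state names.

    Abbreviations missing from STATE_NAMES are left unchanged.
    """
    return [
        {**row, "state": STATE_NAMES.get(row["state"], row["state"])}
        for row in rows
    ]


def load(rows, destination):
    """Load: write rows as CSV to a file-like destination.

    The destination stands in for the open data portal (assumption);
    a real upload would replace this function. Returns the row count.
    """
    if not rows:
        return 0
    writer = csv.DictWriter(
        destination, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


def run_pipeline(source_text, destination):
    """Run Extract, Transform, and Load in order."""
    return load(transform(extract(source_text)), destination)


if __name__ == "__main__":
    source = "city,state\nAustin,TX\nAlbany,NY\n"
    out = io.StringIO()
    run_pipeline(source, out)
    print(out.getvalue())
```

Keeping each step in its own function means the source system or the destination can be swapped without touching the transformation logic.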
Which data should you automate? As much as possible! The more you adopt an “automate by default” approach to uploading data, the fewer resources you will need in the long term to maintain high data quality. Here are some questions to help you find candidate datasets for automatic uploads:
Is the dataset updated quarterly or more frequently?
Are there transformations or any form of manipulation that need to be done to the dataset prior to uploading?
Is the dataset large (greater than 250MB)?
Can you only get the changed rows for each subsequent update (rather than the full file)?
Is it possible to get data from the source system, rather than from an individual?
Datasets that prompt a “yes” to any of the questions above are great candidates for automating updates, because automation removes the risk of running short of time and resources later on.
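The questions above can also be written down as a simple check. The following Python sketch is hypothetical: the `DatasetProfile` fields and function names are invented for illustration, and the 250MB threshold is interpreted as 250,000,000 bytes, which is an assumption.

```python
from dataclasses import dataclass

# Assumption: "250MB" is read as decimal megabytes.
LARGE_DATASET_BYTES = 250_000_000


@dataclass
class DatasetProfile:
    """Hypothetical description of a dataset, one field per question."""
    updates_per_year: int
    needs_transformation: bool
    size_bytes: int
    changed_rows_only: bool
    source_system_available: bool


def automation_reasons(profile):
    """Return the list of questions that were answered with "yes"."""
    reasons = []
    if profile.updates_per_year >= 4:
        reasons.append("updated quarterly or more frequently")
    if profile.needs_transformation:
        reasons.append("needs transformation before uploading")
    if profile.size_bytes > LARGE_DATASET_BYTES:
        reasons.append("larger than 250MB")
    if profile.changed_rows_only:
        reasons.append("only changed rows are available per update")
    if profile.source_system_available:
        reasons.append("data can come from the source system")
    return reasons


def is_automation_candidate(profile):
    """A single "yes" is enough to make the dataset a candidate."""
    return bool(automation_reasons(profile))
```

Returning the reasons rather than only a yes/no answer makes it easier to explain why a dataset was picked for automation.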
Taken from: https://support.socrata.com/hc/en-us/articles/212871018-Data-Automation-Overview#:~:text=What%20is%20data%20automation%3F,of%20your%20open%20data%20program.
Well done! Now let's move on to the first mission of this section!