Despite all the advancements we see with generative AI technologies, cleaning and preparing datasets is still a time-consuming challenge for data professionals. Anaconda’s 2022 State of Data Science report indicates that 38% of a data scientist’s time is spent on data prep and cleansing.
Alteryx Designer made ‘drag & drop’ data prep accessible to everyone: it reduces the time to deliver clean datasets, makes data quality issues easier to spot, and packs more advanced data science techniques into a simple toolkit.
In May 2021, Alteryx released Designer in the Cloud, a move intended to let more users easily clean and prepare their data. However, at the time of writing, there isn’t a complete match between what is available on the Desktop and what can be achieved on the Cloud.
Differences In Functionality
Designer Cloud is a lightweight version of the Desktop application, currently surfacing the core data prep tools. Designer Cloud currently offers 27 tools, compared to over 270 available on Desktop. This is not to say that Cloud has only 10% of the ability of the Desktop app; rather, some functionality works differently or lives elsewhere.
For example, data import and database connections are handled differently in the Cloud. These functions are not available as Designer tools; instead, workflows rely on data you’ve uploaded to the Cloud. Other functions have moved: Reporting is now the Report Builder, which is separate from Designer Cloud.
On the Cloud you can see Designer as meeting the core data prep needs, with more specialised functionality available elsewhere on the Cloud or in the Desktop version of Designer.
Differences in Look and Feel
Section | Desktop | Cloud | Colour |
Tool Palette | Top bar | Left panel | Blue |
Config Menu | Left panel | Right panel (hides when not used) | Green |
Results Grid | Bottom | Bottom | Purple |
Optimising Designer for a web view is a bit of a shake-up. If you’re more used to the desktop version, it may take a little time to re-orientate and unlearn some of those instinctive movements.
Other changes you may notice on the Cloud are:
Undo & Redo are now on the canvas (top left)
Comments and containers are no longer tools in the palette; they are also on the canvas (top left)
The tool icons have had a slight redesign
Differences in Practice
To test the typical differences, I’ve taken a recent PreppinData Challenge, Week 41 from 2023. In this task we are given two datasets, asked to fix some typos and aggregate our data for an output.
Here are the two workflows
Overall, aside from the icon changes, they look pretty similar. However, there are a few differences to be aware of:
1. Adding Data
On Desktop, I could drag and drop my CSV files onto the canvas and Alteryx would automatically create Input Data tools with file paths to where my files are stored locally. On Cloud, I first have to upload my files to the Alteryx Cloud. However, there is a 1GB limit; for larger datasets you would have to try an alternative method, such as the S3 Private Data Storage or setting up a connector to your database/data store.
On Alteryx Cloud I can then manage the file, and utilise some collaborative features like sharing this data with coworkers or even transferring the ownership to a team member.
2. Viewing Data
If I want to view my data on Desktop, I would attach a Browse tool, run my workflow and view the data. On Cloud there’s no Browse tool and the run button is disabled, so what’s going on? Cloud takes a 10MB sample of your data and runs it through your workflow continuously, so the workflow is essentially always running as you add and update tools on the canvas. The absence of a Browse tool is by design: clicking a tool’s anchors shows you the sampled view of the data.
It’s important to note that this is a sample and not necessarily the correct answer: my result was showing Egypt and China, rather than Brazil and USA, despite working with much less than 10MB of data.
To see the data unsampled, we need to run the workflow. Adding an Output tool and setting a destination enables the button that lets you run a workflow. Running the workflow creates a job; clicking the output shows whether the job is running or completed, and once it has completed you can download your output from the Cloud.
3. Adjusting your Toolkit
There will be multiple ways to solve a problem, some more effective, or more efficient than others. Moving down to 27 tools on Cloud means we have to know all the use cases for the tools available to us.
For the task “Rank of each Nationality by classroom”, on Desktop I went straight to the Multi-Row Formula tool to create a rank on Nationality and restart it at each classroom. That tool isn’t available on Cloud, so what do I do? After some research, I found the Tile tool, which can create a numeric column based on Unique Value and group by a column, producing the same multi-group ranking output.
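To make the “rank that restarts at each classroom” idea concrete outside of Alteryx, here is a minimal Python sketch. It is not how the Multi-Row Formula or Tile tool is implemented; it only mirrors the logic of a running counter that resets whenever the grouping column changes. The column names, the student counts and the sort order (classroom, then students descending) are assumptions made up for illustration.

```python
def rank_within_groups(rows, group_key):
    """Number rows 1, 2, 3... in their current order, restarting at each new group.

    Assumes `rows` is already sorted so that each group is contiguous.
    """
    ranked = []
    previous_group = object()  # sentinel that never equals a real group value
    rank = 0
    for row in rows:
        if row[group_key] != previous_group:
            rank = 1  # new classroom: restart the rank
            previous_group = row[group_key]
        else:
            rank += 1  # same classroom as the previous row: keep counting
        ranked.append({**row, "rank": rank})
    return ranked


# Hypothetical aggregated data, not taken from the challenge itself.
rows = [
    {"classroom": "B", "nationality": "China", "students": 2},
    {"classroom": "A", "nationality": "USA", "students": 3},
    {"classroom": "B", "nationality": "Egypt", "students": 4},
    {"classroom": "A", "nationality": "Brazil", "students": 5},
    {"classroom": "B", "nationality": "France", "students": 1},
]

# Sort by classroom, then by student count (largest first), before ranking.
rows.sort(key=lambda r: (r["classroom"], -r["students"]))
ranked = rank_within_groups(rows, "classroom")

for row in ranked:
    print(row["classroom"], row["rank"], row["nationality"])
```

The sort step matters as much as the counter: the rank simply follows row order, so the data has to be ordered the way you want it ranked before the counter runs.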
Other tasks, like “get rid of spelling mistakes in the Nationality field”, are a bit more advanced. We need to correct our country names, changing Brasil to Brazil, Franc to France, and so on. On Desktop this would be solved with the Fuzzy Match tool, but that tool isn’t in Cloud currently, so what do I do? Given the small size of this dataset, I’ve renamed the typos using a Formula tool. For a larger dataset, you could import a list of correct spellings and join it to the typos by taking the first few characters, e.g. “Bra” to join “Brasil” to “Brazil”, then work through any mismatches or duplicated records. It’s not ideal, but it’s not game over either; more challenging scenarios may be better handled on Desktop.
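As a rough illustration of that prefix-join idea, here is a small Python sketch. It is only a sketch of the approach, not a replacement for Fuzzy Match. The function names, the three-character prefix length and the sample values other than Brasil and Franc are my own assumptions.

```python
def build_prefix_lookup(correct_names, prefix_len=3):
    """Map the first few characters of each correct spelling to that spelling.

    Prefixes shared by more than one correct name are returned separately,
    because a join on them would produce duplicated records.
    """
    lookup = {}
    ambiguous = set()
    for name in correct_names:
        prefix = name[:prefix_len].lower()
        if prefix in lookup and lookup[prefix] != name:
            ambiguous.add(prefix)
        lookup.setdefault(prefix, name)
    for prefix in ambiguous:
        del lookup[prefix]  # leave ambiguous prefixes for manual review
    return lookup, ambiguous


def fix_spelling(values, lookup, prefix_len=3):
    """Replace each value with the correct spelling that shares its prefix.

    Values with no matching prefix are kept as-is and reported as unmatched.
    """
    fixed, unmatched = [], []
    for value in values:
        match = lookup.get(value.strip()[:prefix_len].lower())
        if match is None:
            unmatched.append(value)
            fixed.append(value)
        else:
            fixed.append(match)
    return fixed, unmatched


# Hypothetical reference list and typos ("Egipt" is invented to show a mismatch).
correct = ["Brazil", "France", "USA", "Egypt", "China"]
lookup, ambiguous = build_prefix_lookup(correct)
fixed, unmatched = fix_spelling(["Brasil", "Franc", "USA", "Egipt"], lookup)

print(fixed)
print(unmatched)
```

Anything that lands in the unmatched list, or under an ambiguous prefix, is exactly the set of mismatches and duplicates you would still need to work through by hand.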
Who’s Cloud Designer For?
Many companies are trying to move users from desktop applications to the cloud, but that is not the case here. Desktop Designer and Cloud Designer are intended for different audiences. For a new user, opening up Alteryx and seeing hundreds of tools can be overwhelming; opening Cloud and seeing a collection of the most-used tools feels more manageable and encourages users to get started solving data problems. Eventually, these users will run into situations that require the desktop application, and that will be their point to start with the more advanced tool.
New to Alteryx? Start with the Cloud Designer, learn the platform and test your use cases.
Interested in learning data prep with Alteryx?
Start a free trial with Alteryx: https://www.alteryx.com/products/designer-cloud
TTT Alteryx Course: https://www.youtube.com/@theinformationlab/featured
PreppinData Challenges: https://preppindata.blogspot.com/