A quick guide to In-DB tools Part 1

9 April 2019

In Alteryx we have In-Database tools. This allows users to connect directly to your database. The aim of this is that it lets you do ETL and calculations in the source system without moving the data out of it.

The In-DB tools let users visually build their SQL query, which is processed in the database. Therefore, when writing your formula or filter the data in the In-DB tools only the SQL syntax will work and not Alteryx-specific functions.

How do In-DB toolswork?

The first thing a user does is to manage their connections to the database in question. Manage In-DB Connection lets the user edit, build and view existing / new connections:

Or you can doit this way:

Following this, you need to input the credentials of your connection and select the connection type if you are unsure about your credentials then talk to IT.

The Connect In-DB tool will always be the first tool you drag in your canvas. This is where you connect to the database and the table that you want to connect to and build your query. When you write out your SQL query it’s ideal to avoid doing:

‘SELECT * FROM…’

I wouldn’t recommend bringing in the entire table unless you need to. This is because you'd want to reduce the size of the data, by only selecting the columns that you need. This will reduce the run time and increase performance.

The Browse Data In-DB tool: views the data and what’s going on in the workflow. This tool also caches the data. The biggest disadvantage of In-DB tool is that it doesn’t allow you to view what’s going on unless you have a browse tool. This is the only way to view the data in the workflow when utilizing In-DB tools.

The Write data In-DB tools: creates new tables, temporary tables and/or updates existing tables within the database, this is useful for when you Alteryx workflow goes on the scheduler.

Temporary Tables: whatdoes this mean?

The Write data In-DB tools: you can create temporary tables to be written up in the database. Temporary tables are deleted at the end of the Alteryx run. A key thing to remember when creating temporary tables in a database is that these are not managed by Alteryx. This is set up by your database administrator / IT department. Alteryx creates temporary tables by leveraging what has already been set up in the database itself. For this reason, there isn’t a way to change where the temporary tables are being created from Alteryx.

In-DB vs Standard Tools

In-DB tools are faster than the standard Alteryx tools since In-DB tools can blend and analyse large datasets without the need for the data to move outside of the database, unlike the standard tools.

The standard Alteryx tools will bring the data over the network to your local computer. The In-DB Browse and Write Data In-DB will stream out your database and into your local machine, which is why the In-DB Browse tools can significantly slow down your workflow.

Quick Tips:

  • Filter your data in the Connect In-DB tools or as early as possible
  • Reduce the use of In-DB Browse tool this increases run time exponentially and decrease performance, ideally if you’re going to put your Alteryx workflow on scheduler I would recommend removing any unnecessary In-DB Browse tool (especially the In-DB browse tool).
  • Avoid In-DB Stream out tool if possible

If you’reusing IN-DB tools you’ll have moments when you’re wondering: “why is this soslow?” This is because IN-DB tools converts what is happening into SQL code theis being executed, you can view this SQL statement using the Dynamic Output In-DBtools and select the query option. 

Benefits to usingIN-DB tools:

  • Faster
  • Youcan create hybrid workflow by streaming out this also gives you flexibility

Predictive In-DB tools

You canconnect to predictive tools without streaming out however, this depends on theDB that you are connecting to:

Watch out for Part 2 where I'll be covering how to split to rows using SQL and other In-DB quick tips.

Author:
Soha Elghany
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Information Lab and data industry
Subscribe now
© 2024 The Information Lab