Data Flow 1.0

So you set up this more than elegant data model on your Postgres database. It consists of several layers of views which are all sensible units in and of themselves, which are reusable in various contexts and which in sum prepare your data in exactly the way you want it for your Tableau dashboard. Perfect. You deploy your Tableau dashboard and everything is working as intended.

Then you jump into another project and only return when changed requirements demand for changes to your beautiful data model. You open up your database tool and all it tells you is the following:

Erm… what? (Let´s ignore the awful view names for the sake of the argument…) Where are all those clever processing stages and dependencies you created so painstakingly? So you remember some of the calculations and joins which caused you a lot of headaches but it´s all several months in the past…

You still know that the views which are relevant to your Tableau dashboard are “data_modeller_test_3”, “order_statistics” and “init_order_statistics_by_category”. You also remember that there should be data from the tables “orders” and “people” in there. But do you really have to go through all that code to find out about all dependencies and all the code sections you need to change due to the new requirements?

If you´re anything like us, you experienced situations like this or similar ones (think of handing over the project to another colleague, for example) a lot of times. This is why we came up with the idea to create a tool which supports you in developing and maintaining your data model on your Postgres database. We call it “Data Flow” and want to share it with you so that it may help you managing your own data models.

In today´s first blog entry on Data Flow, we´re presenting two user defined functions (UDFs) as well as the Data Flow Dashboard to you. The UDFs were developed to provide the data source for the workbook while the workbook automatically generates a network visualization of your data model and provides further useful information on the model. All files may be found at the bottom of this article.

This is what the Data Flow Dashboard looks like:

Both UDFs make use of system tables like “pg_depend” which Postgres handily provides. While return_data_model() accepts one or several “end_nodes”, i.e. views that you´re reporting on (see “data_modeller_test_3”, “order_statistics” and “init_order_statistics_by_category” above), and iterates through all upstream dependencies to sort out the data model you created, return_data_model_tableau_prep() calls upon return_data_model() and structures the function´s output in a way that may be used to create the network visualization of the data model depicted above.

  

Both UDFs are ready to be created and used on Postgres servers using a version later than 9.5.10. They should function on older versions as well but this has not yet been tested. Just create them on the database in question and direct your copy of the Data Flow workbook towards that database.

Aside from filtering on relevant end nodes, the Tableau workbook provides further features such as highlighting of dependencies and displaying the create code of selected elements. You may also decide whether you want to color code data model elements with respect to their node_role (i.e. root, intermediary or end node) or whether you want to color the nodes between data model elements depending on their I/O type (i.e. downstream or upstream dependency).

In upcoming further development stages of this project we will add further UDFs which allow to track the creation of as well as changes and refreshments of data model elements. This will provide workbook users with important information on all data model elements. Another extension will see UDFs which will make the process of changing a data model easier by taking care of necessary drop / create operations for you. Stay tuned for more, Data Flow will only get more powerful over time.

If you found this article to provide useful insights to you, please consider sharing it with others who might benefit from it. Also, please don´t hesitate to reach out to us if you´re interested in more details!

Download the sql and twb files