ETLMicrosoft Business IntelligenceSSIS

SSIS Dependency Diagram with DiagrammeR and Sprockit

By 5 June 2019June 23rd, 2020No Comments

We can use DiagrammeR & Sprockit to discover Server Integration Services (SSIS) Dependencies. SSIS is a mature data integration tool bundled with SQL Server.

SSIS like many other tools can lead to ‘dependency hell’. This is where packages or stored procedures are developed that carry out many tasks. With many of these packages it can be very hard for a new developer to understand the dependencies. This makes for expensive maintenance, testing, error recovery.

There are frameworks around that can limit this.

One of the execution frameworks I’ve had lots of success with is Sprockit.

Sprockit is a powerful but lightweight ETL process controller for managing SSIS packages and T-SQL stored procedure execution. It’s free, open-source and written purely in T-SQL. Richard Swinbank developed it.

In this post I will show how you can create a diagram of dependencies with R stats.

Taking up from the end of the tutorial on Richard Swinbank’s site.

We first need to create a stored procedure that we can query from R. This procedure creates Digraph text.

I have to give a lot of credit to Richard as he developed the initial SQL before I hacked it to work with DiagrammeR

USE SprocketTutorial;
GO

SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO

CREATE PROCEDURE sprockit.Diagrammer
AS

DECLARE @gv NVARCHAR(MAX) = N'';

-- Nodes (processes)
SELECT @gv += N'n' + CAST(ProcessId AS VARCHAR) + N' [label=''' + ProcessName + N''']' FROM sprockit.Process;

-- Edges (process dependencies)
SET @gv += N'';

SELECT @gv += N'n' + CAST(PredecessorId AS VARCHAR) + N' -> n' + CAST(SuccessorId AS VARCHAR)
FROM sprockit.uvw_ProcessDependency;

SELECT 'digraph G {

graph [layout = dot]

node [shape = circle,
style = filled,
color = lightgrey,
fontname = Helvetica,
fixedsize = true,
width = 2.5]' + @gv + '}' AS res;

RETURN;
GO

I’ve included some format code in the code above.

graph [layout = dot]
node [shape = circle,
      style = filled,
      color = lightgrey,
      fontname = Helvetica,
      fixedsize = true, 
      width = 2.5]

This code is used by DiagrammeR. You can adjust this code for your own taste. With this stored procedure done we can look at the R code. I’m using RODBC to connect to SQL Server.
The database is on my machine and I am an admin on the database called SprocketTutorial. When the data gets into R it comes with \r\n – these need to be removed.

require("RODBC")
require(DiagrammeR)

#open the ODBC connection
dbhandle <- odbcDriverConnect('driver={SQL Server};server=DESKTOP-31S60FC;database=SprocketTutorial;trusted_connection=true')

# construct a query string
query <- "EXEC [sprockit].[Diagrammer]";
# execute the query
df<-sqlQuery(dbhandle, query)

#Remove \R\N
clean <- gsub("\r\n"," ", df$res)

grViz(clean)

Based on the tutorial on Richards site we have the following diagram :

You can have a look at the DiagrammeR site as there is a lot more power available for creating these graph diagrams.

This diagram is great for helping developers understand the dependencies that are at the heart of the Sprockit execution tool.

You can run this code as part of your CR pipeline to ensure your documentation is kept up to date.