SSIS Series: How to use Conditional Split

From Microsoft, the Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language. The transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.

You can configure the Conditional Split transformation in the following ways:

  • Provide an expression that evaluates to a Boolean for each condition you want the transformation to test.
  • Specify the order in which the conditions are evaluated. Order is significant, because a row is sent to the output corresponding to the first condition that evaluates to true.
  • Specify the default output for the transformation. The transformation requires that a default output be specified.

Let’s take a look at how this transformation might be used in the real world.

Open Visual Studio and drag a Data Flow task into the design pane. Open the Data Flow task and drag in an OLE DB Source task. For this post, I’m going to use the AdventureWorks2019 database and the HumanResources.vEmployeeDepartment view.

This view has some good data to play around with, but we’re going to focus on the Department and Start Date columns. Let’s pretend the bossman needs to see the employees in the Quality Assurance (QA), Production and Sales departments, each in its own database table. As a bonus, he needs the Production employees split into two tables based on whether they started before or after Jan 1, 2010. All other employees can go into their own table. Got it? Great! That’s 5 total tables: QA=1, Production=2, Sales=1, Leftovers=1. Let’s go.

Back in Visual Studio, drag in a Conditional Split task and connect it to our OLE DB Source.

Open the Conditional Split task editor and you’ll see a few options (from left to right, top to bottom):

  1. We can use columns and/or variables and parameters in our expressions that define how to split the data flow.
  2. We can use functions such as Date/Time, NULL and String in our expressions that define how to split the data flow.
  3. These are the conditions that define how to split the data flow. These need to be set in priority order; any rows that evaluate to true for one condition will not be available to the condition that follows.


Let’s start adding some conditions for our data. First, we’ll add a condition for all of our QA Department employees. I’ll name the output “QA” and my condition is pretty simple: Department == "Quality Assurance".

I’ll do the same for Production, Sales and Leftovers (everything else that doesn’t satisfy a condition). Since Leftovers is everything else we’ll just change the name of the Default Output name to identify it.
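To make the routing rule concrete, here is a minimal Python sketch of the same idea: conditions are checked in order, the first one that is true wins, and anything unmatched lands in the default output. This is only a model of the behavior described above, not how SSIS implements it; the function name, the sample rows and the Name column are made up for illustration.

```python
from collections import defaultdict


def conditional_split(rows, conditions, default_output):
    """Send each row to the output of the first condition that is true."""
    outputs = defaultdict(list)
    for row in rows:
        for name, predicate in conditions:  # evaluated in the listed order
            if predicate(row):
                outputs[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default output.
            outputs[default_output].append(row)
    return dict(outputs)


# Hypothetical sample rows; the real view has many more columns.
employees = [
    {"Name": "A", "Department": "Quality Assurance"},
    {"Name": "B", "Department": "Production"},
    {"Name": "C", "Department": "Sales"},
    {"Name": "D", "Department": "Engineering"},
]

conditions = [
    ("QA", lambda r: r["Department"] == "Quality Assurance"),
    ("Production", lambda r: r["Department"] == "Production"),
    ("Sales", lambda r: r["Department"] == "Sales"),
]

result = conditional_split(employees, conditions, default_output="Leftovers")
for name, rows in result.items():
    print(name, [r["Name"] for r in rows])
```

Because the three department conditions can never be true for the same row, their order does not matter here. It starts to matter as soon as two conditions can overlap.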

Let’s go ahead and add our destination tasks (except for Production since we need another condition) and link them to the appropriate condition. See below for QA as an example. When we drag our connector to our destination task we get prompted with an Input Output Selection box. Here is where we choose our Condition that will match up with our table. For the screenshot below, we’ll choose QA output for our QA destination.

Now that we have QA mapped, go ahead and map Sales and Leftovers.

Looks great!  QA, Sales and Leftovers are mapped successfully. Let’s take a look at adding another Conditional Split task for Production. Drag a Conditional Split task into the design pane and connect it to the current Conditional Split. It automatically maps to Production since it’s the only output left.

From our new Conditional Split task, let’s open the editor and configure the date conditions for Production. We’ll leave the Default output name box as is since we shouldn’t have any leftover data from this split.
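One thing worth deciding up front is where a start date of exactly Jan 1, 2010 should go, since "before and after" leaves that day out. The Python sketch below models the second split under the assumption that the boundary day belongs with the later group. The output names (Before2010, From2010), the StartDate key and the sample rows are placeholders for illustration, not the names used in the package.

```python
from datetime import date

CUTOFF = date(2010, 1, 1)  # boundary taken from the requirement above


def split_by_start_date(rows):
    """Route Production rows by start date; unmatched rows go to a default list."""
    outputs = {"Before2010": [], "From2010": [], "Default": []}
    for row in rows:
        start = row["StartDate"]
        if start < CUTOFF:
            outputs["Before2010"].append(row)
        elif start >= CUTOFF:  # assumption: Jan 1, 2010 itself counts as "after"
            outputs["From2010"].append(row)
        else:
            # Unreachable for valid dates; mirrors the default output,
            # which should stay empty for this split.
            outputs["Default"].append(row)
    return outputs


# Hypothetical Production employees.
production = [
    {"Name": "E1", "StartDate": date(2009, 12, 31)},
    {"Name": "E2", "StartDate": date(2010, 1, 1)},
    {"Name": "E3", "StartDate": date(2011, 6, 15)},
]

split = split_by_start_date(production)
for name, rows in split.items():
    print(name, [r["Name"] for r in rows])
```

Using `<` for one condition and `>=` for the other covers every date exactly once, which is why nothing should reach the default output.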

Now we can map the two new conditions to their appropriate destinations.

Cross fingers and hit Execute.

Yay, no red X! Let’s take a look at our SQL tables to make sure everything exported correctly.

Boom. Let’s go grab a bourbon!

SSIS Series: How to use SSIS Balanced Data Distributor

From Microsoft, the Balanced Data Distributor (BDD) transformation takes advantage of concurrent processing capability of modern CPUs. It distributes buffers of incoming rows uniformly across outputs on separate threads. By using separate threads for each output path, the BDD component improves the performance of an SSIS package on multi-core or multi-processor machines.

The Balanced Data Distributor transformation helps improve performance of a package in a scenario that satisfies the following conditions:

  1. There is a large amount of data coming into the BDD transformation. If the data size is small and only one buffer can hold the data, there is no point in using the BDD transformation. If the data size is large and several buffers are required to hold the data, BDD can efficiently process buffers of data in parallel by using separate threads.
  2. The data can be read faster than the rest of the data flow can process it. In this scenario, the transformations that are performed on the data run slowly compared to the rate at which data is coming. If the bottleneck is at the destination, the destination must be parallelizable though.
  3. The data does not need to be ordered. For example, if the data needs to stay sorted, you should not split the data using the BDD transformation.
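To picture what "distributing buffers across outputs" means, here is a small Python sketch. It assumes a plain round-robin hand-off of whole buffers, which is a simplification for illustration; the real component's scheduling and buffer sizes are not modeled, and the function names are hypothetical.

```python
from itertools import cycle


def make_buffers(rows, rows_per_buffer):
    """Chop the incoming rows into fixed-size buffers (the last may be short)."""
    return [rows[i:i + rows_per_buffer] for i in range(0, len(rows), rows_per_buffer)]


def distribute_buffers(buffers, output_count):
    """Hand whole buffers to the outputs in turn (assumed round-robin)."""
    outputs = [[] for _ in range(output_count)]
    targets = cycle(range(output_count))
    for buffer in buffers:
        outputs[next(targets)].extend(buffer)
    return outputs


rows = list(range(1, 21))                        # 20 row ids, already sorted
buffers = make_buffers(rows, rows_per_buffer=3)  # 7 buffers, the last holds 2 rows
outputs = distribute_buffers(buffers, output_count=4)

for number, output in enumerate(outputs, start=1):
    print(f"output {number}: {output}")
```

Two things show up even in this toy version: the outputs are not guaranteed to hold the same number of rows, and no single output holds a contiguous, fully ordered slice of the source, which is why condition 3 matters.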

Let’s dive in and take a quick look at how the Balanced Data Distributor works.

Go ahead and open Visual Studio, it might take a minute to load.

We’re going to use the AdventureWorks2019 database for this example. The SalesOrderDetail table has over 121k records so that’s a good candidate.

SELECT * FROM Sales.SalesOrderDetail

Now that we have data let’s go back over to Visual Studio and see if it’s still spinning.

Drag in a Data Flow task, double click to open and then let’s drag in an OLE DB Source and a Flat File Destination task.

I’m going to configure the source to point to my AdventureWorks2019 database and the destination to point to a .csv file on my local laptop.

Easy enough. Let’s go ahead and run this and see what happens.

We can see that our 121k rows were written to our CSV file and the CSV file ended up being 12.3 MB. That’s pretty large for an email file attachment and might crash your laptop even trying to open this file. BDD not only offers performance benefits by using multiple threads, but it can also break up large files into smaller ones. IMO, this is what makes this task great.

We need to get each file under 4 MB, so we’ll split the output across 4 files. With that said, let’s add the BDD task between the source and destination tasks. There is nothing configurable *in* this task, but there are some properties that may need to be tweaked. After adding the BDD task, we’ll need to add 3 more flat file destination tasks and 3 more flat file connection managers. The end result should look something like this:

That’s really about it. Let’s fire it off and see the results of our flat files.

It worked! 121k rows were written across 4 flat files. Each flat file is less than 4MB, so we can send them easily via email. This is a very quick and easy way to split data between files. However, the rows are distributed in no particular order, so an ORDER BY in the source query will not carry over to the files. If the goal is to separate data by a category or condition, then you’ll need to use a Conditional Split task, which I wrote about in this post.

Creating a Dynamic Date Range Title in Power BI

Creating a dynamic title in Power BI helps present the data and lets the viewers know what the data is filtered on. In this post I’ll go over how to do this…

I have a sales report to which I’d like to add a title based on the Order Date slicer. Currently, the title is the static text “Sales Report”.

Sales Report

To create my dynamic title, I’ll first need to create a measure in the table that has my Order Date data. In this case, that table is FactInternetSales and the column is OrderDate.

To create a measure, click New Measure in the Power BI Desktop ribbon

Power BI Desktop Ribbon

Next, you’ll see a window where you can type code. In this example, I’ll use the following DAX

Order Date Title = "Sales For " &
MIN ( FactInternetSales[OrderDate] ) & " to "
& MAX ( FactInternetSales[OrderDate] )

Let’s walk through this real quick.

The first line (Order Date Title = "Sales For " &) names the measure and adds the beginning text for the title.
The second line (MIN ( FactInternetSales[OrderDate] ) & " to ") finds the minimum order date in FactInternetSales[OrderDate] and then adds the " to " text.
The last line (& MAX ( FactInternetSales[OrderDate] )) finds the maximum order date in FactInternetSales[OrderDate].
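If it helps to see the same logic outside of DAX, here is a rough Python equivalent. It assumes the list passed in holds only the order dates that survive the current slicer selection, which is what the filter context does for the measure. The function name is made up, and the dates print in ISO format here, whereas Power BI formats them according to its own settings.

```python
from datetime import date


def order_date_title(order_dates):
    """Build the title from the earliest and latest date still in the filter."""
    return f"Sales For {min(order_dates)} to {max(order_dates)}"


# Hypothetical dates left after the slicer is applied (assumed non-empty).
filtered_dates = [date(2013, 3, 1), date(2013, 1, 5), date(2013, 2, 10)]
print(order_date_title(filtered_dates))
```

Changing the slicer changes which dates reach the measure, so the minimum and maximum (and therefore the title) are recalculated each time.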

This one was pretty easy. Once I’ve typed my DAX, I hit the checkmark to make sure there are no errors and then click off screen.

DAX

Our measure has been created! Let’s go back and find it under the FactInternetSales fields pane.

Power BI Fields

Next, let’s click on the Card Visualization and move and size it appropriately to fit in our title space.

Card Visualization

While the card is highlighted, click on the new measure from the Fields pane and it will populate the card with the measure we created.

The only thing left to do is format the title and we’re all set! If we change the Order Date Slicer, you’ll notice the title changes with the date. See live example at the beginning of this post.

March Madness Power BI Dashboard

At the beginning of the year I set a goal to learn something new. I’ve always loved business intelligence and bringing data to life in the form of dashboards and charts so for the 1st half of the year I wanted to focus on Microsoft’s Power BI. I’m not going to explain what Power BI is, but if you want to read up on it go here: https://powerbi.microsoft.com/en-us/

This post is just going to show off my dashboard. See live example above.

I’m a huge sports fan and the best time of the year happens to fall in March. Besides my birthday being in March, it’s also March Madness. Hours and hours of basketball. I could have used AdventureWorks for my dataset, but I wanted to use something I’m interested in. I found some data containing every NCAA tournament game result since 1985 (when the tournament was expanded to the 64 team bracket). The dataset contains the year, round (1-6), seed of the teams (1-16), region (1-4) and the scores. Perfect. Let’s use this to create a dashboard.

There’s not a ton of data, but I used what I could and tried to answer some questions around wins and upsets. Here’s a screenshot of the final product:

March Madness Power BI Dashboard

You can see Wins By Team (Duke with 93, North Carolina with 78, etc), Wins by Seed, National Championships, and Upsets vs Wins by Year. You can also see that a total of 2142 games have been played with 199 different teams in the tournament.

This was really fun and answered a lot of the questions I had in mind while designing. The top left corner also has slicers which help filter the data. For example, if I wanted to see only the data for 2015 I could change the Year slicer to 2015 and it would update all my visualizations:

March Madness Power BI Dashboard filtered by year 2015

You can see that Duke won the National Championship from the National Championships visualization. If you hover over the Wins and Upsets visualization, you’ll see there were 30 upsets out of 63 games.

Let’s say I want to view data for a certain Team. Let’s choose Alabama Crimson Tide. If I change the Team slicer to Alabama I can see some data based around this team.

March Madness Power BI Dashboard filtered by team Alabama Crimson Tide

Alabama has won 19 NCAA tournament games, 0 national championships, has been a 5 or 7 seed 21% of the time and they’ve had a few upsets along the way. Not bad for a football school.

What about data for the National Championship game? I can change the Round slicer to 6, which is the National Championship round and view the data this way.

March Madness Power BI Dashboard filtered by Championship game

I can see that in 34 games, only 16 different teams have made the National Championship. Duke leads the way with 6, followed by North Carolina and Connecticut with 4. The 1 seed has played in this game 59% of the time, and there were upsets in 1988, 1990, 1997, 2003, 2006, and 2016.

We can also click on the visualizations themselves to view data. For example, if we reset our slicers to show all data and click on the #1 seed in the Wins By Seed Donut Chart we see the following:

March Madness Power BI Dashboard filtered by #1 seed

We can see that the #1 seed has played in 419 games with a total of 41 different teams. Duke has won 51 games as the #1 seed while North Carolina has won 46. Duke has also won the National Championship 4 times as the #1 seed and in 1999 the #1 seed won 17 games which is the highest.

Really cool stuff. I loved working on this project and working with this data.

Collect SQL Server Performance Counters and Build Reports with SSRS

Third party tools are awesome for capturing performance metrics, but some small shops (and even large shops) don’t budget for this type of software leaving it up to the DBA to create their own monitoring solution.

There are a few different ways to capture performance metrics, but in this post I’ll focus on the sys.dm_os_performance_counters DMV and how to view its data in a more readable form using SQL Server Reporting Services graphs. The DMV doesn’t include every counter that Performance Monitor does, but it does expose the SQL Server related counters. (Note that some of the counters in this DMV are cumulative values since SQL Server was last restarted.)
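The note about cumulative counters matters once you start storing snapshots: for those counters, the useful number is the difference between two consecutive readings, not the raw value. Page Life Expectancy, used later in this post, is a point-in-time value and does not need this step. The Python sketch below shows the delta calculation on made-up readings; the function name and sample data are hypothetical.

```python
def interval_deltas(samples):
    """Turn ordered (timestamp, cumulative_value) readings into per-interval deltas."""
    deltas = []
    for (_, previous), (timestamp, current) in zip(samples, samples[1:]):
        if current < previous:
            # The counter went backwards, which suggests a restart reset it;
            # skip this interval rather than report a negative delta.
            continue
        deltas.append((timestamp, current - previous))
    return deltas


# Hypothetical readings of a cumulative counter taken every 5 minutes.
readings = [
    ("09:00", 1000),
    ("09:05", 1600),
    ("09:10", 1750),
    ("09:15", 200),   # value dropped: counter was reset
    ("09:20", 500),
]

for timestamp, delta in interval_deltas(readings):
    print(timestamp, delta)
```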

You can query this DMV using the following query:

SELECT * FROM sys.dm_os_performance_counters


As you can see from the screenshot above, this can be hard to read. Also, it only shows the current values at the time the query is executed so you don’t know anything about the past and it makes it hard to see how the data fluctuates during the business day.

Collecting the SQL Server Monitoring Report Data

For the purpose of simplicity, this tip will focus on one counter, Page Life Expectancy (PLE). I’ll show you how to capture data and create a graph for analysis.

First, we’ll need to create a table that will store our metrics:

CREATE TABLE [dbo].[CounterCollections](
[ID] [int] IDENTITY(1,1) NOT NULL,
[object_name] [varchar](128) NOT NULL,
[counter_name] [varchar](128) NOT NULL,
[cntr_value] [bigint] NOT NULL,
[collection_datetime] [datetime] NOT NULL )

Next, we’ll need to create a script that will insert our data into the table we created above:

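-- Snapshot the current Page Life Expectancy value with a collection timestamp.
-- On a named instance, object_name starts with MSSQL$<instance name> instead of SQLServer.
INSERT INTO dbo.CounterCollections (object_name, counter_name, cntr_value, collection_datetime)
SELECT object_name, counter_name, cntr_value, GETDATE() AS collection_datetime
FROM sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Buffer Manager'
AND counter_name = 'Page life expectancy'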

Finally, we’ll need to create a SQL Server Agent Job that will run the script above on a specified schedule:

I’ll run this job every 5 minutes:


Creating the SQL Server Monitoring Report

While the table gathers data we can switch over to Business Intelligence Development Studio (BIDS) or SQL Server Data Tools (SSDT) and create an SSRS report that will display our data.

In this example, I’m using SSDT. Choose File, New Project. On the New Project dialog choose Report Server Project and name the Project:

Click here to view the rest of this post.

Power Query for Excel

In this post, I’ll discuss the prerequisites and how to install and enable Power Query (Data Explorer) and I’ll show you how to use this new feature.

Prerequisites:

  • Requires Microsoft Office 2010 SP1 or Microsoft Excel 2013 32-bit or 64-bit
  • Requires Windows Vista (with .Net 3.5 SP1 or greater), Windows Server 2008 (with .Net 3.5 SP1 or greater), Windows Server 2008 R2, Windows 7 or Windows 8.

Installation:

Download and install the preview from Microsoft Download Center

Once the feature is installed open Excel and go to File, Options, Add-Ins. Select COM Add-Ins and click Go.

Check Microsoft “Data Explorer” Preview for Excel and click OK.
Once Data Explorer has been enabled, the tab will appear above the Office ribbon.


Now that we have Data Explorer installed and enabled we can get to the fun stuff.

To see a detailed list of each element you can visit Microsoft Data Explorer Help

If you click on the Data Explorer tab you will notice Get External Data. Just like Excel and PowerPivot, this is where our data source will come from. There are multiple data sources to choose from including websites, files, databases, Active Directory, and even Facebook. Yes….Facebook.


In this tip, I’ll show you examples of two data sources. We’ll pull data from a website and create a map report using PowerView and we’ll pull data from Facebook and make a report using PowerPivot.

Web Data Source

In the first example, I’ll pull table data from the web using Wikipedia. Choose “From Web” from the Get External Data section and use the following URL: http://en.wikipedia.org/wiki/List_of_countries_by_population


Click OK

In the Query Editor under Navigator, select Countries. This will display rank, country, population, date, % of world population and source. We can filter columns just like in Excel by selecting the header arrow and choosing the appropriate filter. In this example, we’ll filter Source to include only records that are an official estimate.

Click here to view the rest of this post.

Microsoft’s new 3D data visualization BI product – Power Map

Microsoft has introduced a new BI product that will provide 3D data visualization using Bing Maps. This product is still in beta and is codenamed “GeoFlow” but should be out late 2013 or early 2014. This preview will allow you to plot geographic and temporal data visually, analyze that data in 3D, and create visual tours. It is a really cool product and I can’t wait to use it in a business atmosphere.

In this post, I’ll discuss the prerequisites and how to install “GeoFlow” and I’ll give an introduction on how to use this feature.

Prerequisites:

  • Requires Microsoft Office Professional Plus 2013 or Office 365 ProPlus
  • Supports 32-bit or 64-bit machines
  • Requires Windows 7, Windows 8, or Windows Server 2008R2 (requires .NET Framework 4.0)
  • Requires internet access

Installation

Download and install the preview from Microsoft Download Center (make sure Excel is closed while installing)

Open Excel and “Maps” will appear as a new item in the “Insert” tab of the Excel ribbon:

Now that “GeoFlow” is installed, we can begin using the new feature.

For this tip, I’m using the AdventureWorks2012 database and the result set from the following query imported into Excel:

  SELECT  
  OrderDate, 
  ShipDate, 
  AddressLine1, 
  City, 
  PostalCode, 
  StateProvinceCode
  FROM [AdventureWorks2012].[Sales].[SalesOrderHeader] soh
  JOIN [AdventureWorks2012].[Person].[Address] ps ON soh.ShipToAddressID = ps.AddressID
  JOIN [AdventureWorks2012].[Person].[StateProvince] psp ON psp.StateProvinceID = ps.StateProvinceID
  WHERE StateProvinceCode = 'CA'

Once you have the data in Excel, click on the “Insert” tab on the Office ribbon and click Map. You should now see the “GeoFlow” globe:

There are a few tabs you can use right away in the ribbon bar. Under the “Map” item you can change Themes and add Map Labels. I like to enable the Map labels so I can see a more detailed version of the map. You can also zoom in and out and move the map using the arrows and +/- buttons in the bottom right corner of the map.

On the right side you see the “Layer” pane. This is where you will choose your geography visualizations. For this example I’ll select “PostalCode” under the range section and since we know that this column stores zip codes we’ll map it to “Zip” under the Geography section:


Once the fields are selected, click Map It.

You might notice that 86 records were updated meaning that there are 86 unique zip codes in California and it placed a marker on each zip code:

Click here to view the rest of this post.

Building Key Performance Indicators (KPIs) with PowerPivot

A KPI (Key Performance Indicator) is a graphical representation that displays progress against a predefined measure or business goal. KPIs make it easier for end users to evaluate the amount of progress without reading a bunch of data.

In this post, I’ll use AdventureWorksDW2012 sample data so you can follow along with me. The database can be downloaded here.

Let’s get started.

Enabling PowerPivot in Excel 2013

To enable PowerPivot, open Excel, go to File, Options, Add-Ins, select COM Add-ins and click Go. This will open up the COM Add-Ins dialog box. Click “Microsoft Office PowerPivot for Excel 2013” and hit OK. After successfully enabling PowerPivot, the tab should appear at the top of the Excel spreadsheet:


Importing Data

Open Excel, click the PowerPivot tab, Manage:

Upon clicking Manage, a new window should appear. From this window, you will import data. Click From Database and select From SQL Server:

Type in the Server Name, Authentication mode, and browse to the AdventureWorksDW2012 database:
Click Next, choose “Select from a list of tables and views to choose the data to import” and click Next. The next screen is where we will select our data to import. For this example, choose FactInternetSales and click “Select Related Tables”. The Select Related Tables button enables you to automatically select every table that is related to the source table selected:

After clicking Finish, the import will begin. Once the import finishes successfully you should be able to view all the tables separated into sheets:


Creating PivotTable

Before creating a KPI we will need to slice and dice our data into a PivotTable. To do this, click PivotTable on the ribbon bar and choose New Worksheet:

Click here to view the rest of this post.

PowerPivot with Excel 2013

Excel 2013 changes things up a bit when it comes to installing PowerPivot. In previous versions you had to download the component and install, but with Excel 2013 it comes installed as an add-in, but disabled by default. To enable PowerPivot, open Excel, go to File, Options, Add-Ins, select COM Add-ins and click Go.

This will open up the COM Add-Ins dialog box. Click “Microsoft Office PowerPivot for Excel 2013” and hit OK.

After successfully enabling PowerPivot, the tab should appear at the top of the Excel spreadsheet.


Creating a dashboard

There are a few different ways in which to import data into Excel to use with PowerPivot. Some of these ways include:

  • From database
  • From Data Service
  • From other sources such as Oracle, Excel, flat files, etc.

For this example, and for simplicity’s sake, I will just run a query and copy and paste my results into the Excel spreadsheet. The query results look like this:

Once the results are copied and pasted into Excel, click the PowerPivot tab and click Add to Data Model:

On the create table dialog box, make sure you select the range for your data and click “My table has headers”

After clicking OK, the PowerPivot window should appear. To start creating the dashboard, click PivotTable, PivotChart, then select New Worksheet:

Click here to view the rest of this post.

Creating a SharePoint Key Performance Indicator (KPI)

A KPI (Key Performance Indicator) is a graphical representation that displays progress against a predefined measure or business goal. KPIs make it easier for end users to evaluate the amount of progress without reading a bunch of data.

We will use the sample data (see below) that I used in the previous post to create our KPI. This data is static since we imported it into SharePoint as a separate list, so it will not report real-time data.

Let’s say, for example, our DBA Manager wants to upgrade all the SQL Servers to at least SQL Server 2008 R2 and wants to know how many SQL Servers are not compliant. This is a good scenario for using a KPI to display this data.
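Before building anything in SharePoint, it can help to spell out what the KPI is actually measuring. The Python sketch below counts the servers that fall below SQL Server 2008 R2 (version 10.50) and turns that count into a green/yellow/red status. The server names, the versions in the inventory and the thresholds are all invented for illustration; the real KPI uses whatever values you configure in the indicator.

```python
# SQL Server 2008 R2 reports itself as version 10.50.
MINIMUM_VERSION = (10, 50)

# Hypothetical inventory: server name -> (major, minor) version.
servers = {
    "SQL01": (9, 0),    # SQL Server 2005
    "SQL02": (10, 0),   # SQL Server 2008
    "SQL03": (10, 50),  # SQL Server 2008 R2
    "SQL04": (11, 0),   # SQL Server 2012
}


def non_compliant(inventory, minimum):
    """Return the names of servers running a version below the minimum."""
    return sorted(name for name, version in inventory.items() if version < minimum)


def kpi_status(value, goal, warning):
    """Lower is better: green at or under the goal, yellow up to the warning level."""
    if value <= goal:
        return "green"
    if value <= warning:
        return "yellow"
    return "red"


behind = non_compliant(servers, MINIMUM_VERSION)
print(behind, kpi_status(len(behind), goal=0, warning=2))
```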



Creating the KPI

In SharePoint, browse to the page where you want the KPI displayed and go to Site Actions > Create:

Custom Lists > KPI List:

For this example, I’ll create a list called SQL Versions KPI:
Once the KPI list is created, you should see a blank list. To add a KPI, click New > Indicator using data in SharePoint list (you can also use other data sources for KPIs):

First, give the KPI a name:

Click here to view the rest of this post.