Tuesday, July 28, 2015

SSIS Tutorial

Business Intelligence Development Studio (BIDS)
Day 1:
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, New, Project on the top level menu to display the New Project dialog.  Select Business Intelligence Projects as the project type, then Integration Services Project as the template; fill in the dialog as shown below:
               


After creating this new project and solution, I can navigate to the samples folder in Windows Explorer and see the following:


The samples folder holds my solution file (samples.sln) and contains the Tutorial-Sample-1 folder which holds my project.  The samples folder was created as a result of clicking Create directory for solution.
Returning to BIDS you will now see the following in the Solution Explorer which is located in the top right corner of the window:
By default a new SSIS package is added when you create an Integration Services Project; you can right click on it and rename it.
Now that we have successfully created a new project and a solution, let's add another new project to this solution.  Click the top level File menu in BIDS then Add, New Project and fill in the Add New Project dialog as shown below:

I have created this additional project to hold the SSIS packages created by executing the Import and Export wizard in SSMS.  This will create an SSMS-Samples folder to hold our new project; this folder will be underneath our samples folder.  I would like to suggest that you pay attention to what you're doing while creating new projects.  Make sure that the location and name are what you want.  The default values that are automatically filled in are rarely useful.  You can easily create a project and not know it's name or location.

Now when we look at the Solution Explorer in BIDS we see the following:

http://www.mssqltips.com/tutorialimages/205_solution_explorer_2.JPG

Our solution now contains two projects.  The SSMS Import and Export wizards provide the option to save your work as an SSIS package.  If you choose to save your work, you may want to add the package to an existing project.  Right click on the SSIS Packages node in the Solution Explorer of the SSMS-Samples, select Add Existing Package, and fill in the dialog as follows:

      
As stated in the title of the dialog, you are adding a copy of the SIS package to the project.  The SSIS package is copied from the package path you specify into your project's folder.

Creating projects and solutions with SSIS
Day 2:
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:

              

Expand the SSIS Packages node under the SSMS-Samples project in Solution Explorer and you will see the following:

                                   

Double click on the SSIS package SSMS-Export-to-Excel.dtsx to open the package.  Let's walk through the following aspects of the SSIS package as displayed in BIDS:
  • Designer
  • Connection Managers
  • Toolbox
  • Properties Window
The designer is the large area in the middle of the window and is shown below:

The following tabs are available in the designer:
  • The Control Flow tab (shown above) contains the tasks that the SSIS package performs and the flow from one task to another.
  • The Data Flow tab (shown below) is another designer that contains the details for a given data flow task; e.g. retrieve data from some data source, optionally perform some transformations on the data, then write it to some other data source.
  • The Event Handlers tab is yet another designer where we can specify tasks to be performed when a particular event is raised.
  • The Package Explorer tab represents the entire package in a tree-view.
The following is the Data Flow designer for the Data Flow task in the Control Flow tab as shown above:

The Connection Managers window contains the various data sources and destinations that the package uses:
                     

The Connection Managers are defined once then referenced in the various tasks such as the Execute SQL Task, an OLEDB Data Source, or a OLEDB Destination.
The Toolbox contains the tasks that are available to the Control Flow, Data Flow or Event Handlers designers.  To build a package you simply drag tasks from the Toolbox onto the designer and connect them in the order you want to execute.  The following Toolbox tasks are available in the Control Flow designer tab:


The following Toolbox tasks are available in the Data Flow designer:


The Properties Window is available for us to edit and update the properties of a task in the designer, or a connection in the Connection Managers area.  Click on the Preparation SQL Task in the Control Flow shown above and you will see the following in the Properties Window:


The Preparation SQL Task is an Execute SQL Task and is used to execute a SQL statement.  In this case we execute a CREATE TABLE statement but it could be any SQL statement or stored procedure.  When you right click on an object you get a popup dialog that you can use to set the various properties; e.g. right click on the Preparation SQL Task and select Edit from the menu and the following multi-page dialog will be displayed:


Overview of projects and solutions in SSIS 
Day 3:
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:


You should see the following in the Solution Explorer:


To add a new SSIS package right click on the SSIS Packages node under the Tutorial-Sample-1 project and select New SSIS Package from the popup menu.  A new package will be created under the SSIS Packages node and will be named Package1.dtsx (or something similar).  Right click on Package1.dtsx, select Rename from the popup menu, and enter CreateSalesForecastInput as the new name for the package.
The designer will now display an empty Control Flow.  In a future section we will drag and drop tasks onto this design surface to generate our Excel spreadsheet.  For now click anywhere in the Control Flow designer to bring up the package properties window.


The package properties window contains many properties that can be set for the package.  The following is an example:


I collapsed the property groups and left expanded the few that we will discuss here:
  • ProtectionLevel
  • IsolationLevel
  • TransactionOption
The ProtectionLevel provides for various options for encrypting the package and/or portions of the package.  I point this one out because the default value of EncryptSensitiveWithUserKey will often cause problems.  An example of sensitive information is a database connection string that contains a password in it.  The default setting will encrypt this information using the user key of the person who created the package.  When another user executes the package it will fail because that user's key will not be able to decrypt the connection string.  A good way around this is to change the ProtectionLevel to DontSaveSensitive, meaning you aren't going to put sensitive information in the package so there's no need to worry about encryption.

The IsolationLevel property is setting the Transaction Isolation Level.  Note that the default value is Serializable which may not be what you really need.  Serializable is the level where read locks are held until a transaction commits or rolls back which provides that no data read can be updated.  In addition range locks are held so that no data can be inserted such that rerunning any query in the transaction would return the rows added which weren't there at the beginning of the transaction.  Surely there are times when this transaction isolation level is warranted but certainly not always.   The Serializable level generally results in additional locking and decreased concurrency so you should consider whether you really need this and choose one of the other levels such as ReadCommitted.

The TransactionOption allows you to choose how the package behaves with respect to database transactions.  The default value of Supported is probably a good one.  If a package is executed within an existing transaction (i.e. called from another package) it will enlist or join the transaction which certainly makes sense as a default.  The other options are Required and NotSupported.  Required means that the package will always execute within a transaction, either joining the transaction of the caller or creating its own transaction.  NotSupported means the package does not join an existing transaction or create its own.

Creating a Simple SSIS Package
Day 4:
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:

Expand the SSIS Packages node under the Tutorial-Sample-1 project in Solution Explorer and you will see the following:


Double click on the SSIS package CreateSalesForecastInput.dtsx to open the package.  You should see the Connection Managers area of the designer in the middle of the screen near the bottom as shown below:


We are going to add a connection manager for the AdventureWorksDW database and another for the Excel spreadsheet that we will create and use as the sales forecast input.  To add the AdventureWorksDW connection manager simply right click inside the Connection Managers area then choose New OLEDB Connection from the popup menu.  The Configure OLEDB Connection Manager will be displayed; click the New button to display the Connection Manager dialog and fill it in as follows:


In my case the AdventureWorksDW database is on my local machine; change the Server name property as necessary for your environment.  If possible choose Use Windows Authentication to avoid having to specify a user id and password; this would be sensitive information that should be encrypted.  Click the Test Connection button to make sure you can connect to the database.  Click OK to complete this step.

To add a connection manager for our Excel spreadsheet, right click inside the Connection Managers area then choose New Connection from the popup menu, then select EXCEL from the Add SSIS Connection Manager dialog.  The Excel Connection Manager dialog will be displayed; enter a file name as shown below:


We are now finished adding the necessary Connection Managers to our package.

Day 5:
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:


Expand the SSIS Packages node under the Tutorial-Sample-1 project in Solution Explorer and you will see the following:


Double click on the SSIS package CreateSalesForecastInput.dtsx to open the package.  You should see the Toolbox on the left hand side of the screen as shown below:


As you can see there are quite a few built-in tasks available.  For our current package we only need a Data Flow task.  Drag and drop the Data Flow Task from the Toolbox onto the Control Flow designer.  After doing so the Control Flow designer should look like this:


Right click inside the Data Flow Task rectangle then select Rename from the popup menu.  Enter Create Sales Forecast Input Spreadsheet as the new name for the task. 
The designer includes a Data Flow tab which presents another design surface where you drag and drop the tasks that you want the data flow to perform.

SQL Server Integration Services (SSIS) Data Flow
Explanation
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:
                

Expand the SSIS Packages node under the Tutorial-Sample-1 project in Solution Explorer and you will see the following:

   

Double click on the SSIS package CreateSalesForecastInput.dtsx to open the package.  Click on the Data Flow tab in the designer and you should see the empty Data Flow designer as shown below:

The Toolbox will now display the tasks that are available in the Data Flow as shown below:


As you can see there are quite a few built-in tasks available.  For our current package we want to retrieve some data from the AdventureWorksDW database and output it to an Excel spreadsheet.  To do this we need an OLE DB Source and an Excel Destination.  In addition we need to add a column to the Data Flow; this column will be the numeric column in the spreadsheet where the user enters the forecast amount.  Drag and drop an OLE DB Source, Derived Column, and Excel Destination from the Toolbox onto the Data Flow designer.  After doing so the Data Flow designer should look like this:


The red icons inside of the tasks are an indicator that there is some configuration required.  Right click on the OLE DB Source then select Edit from the popup menu.  This will open the OLE DB Source Editor on the Connection Manager page.  Fill in the dialog as shown below:


We setup the OLD DB connection manager in the Connection Managers step earlier.  The SQL command text contains a simple query to retrieve the list of cities and states in the United States.
Click on Columns to display the column mappings as shown below:


By default all columns returned from the query are selected and the Output Column names are the same.  The Output Column names can be edited; we will leave them as is.  Click OK to complete the OLE DB Source Editor.  You will now see a green and a red line dangling from the OLE DB Source task on the Data Flow designer.  The green line is the normal data flow; the red line is the error flow.  Drag the green line and connect it to the Derived Column task as shown below:


Notice that the red icon is now gone from the OLE DB Source task and it is now properly configured.

The Derived Column task allows us to add a column to the Data Flow.  Right click on it, select Edit, then fill in the Derived Column transformation Editor as shown below:


We have now added a new currency column to the data flow with a name of Forecast, and a value of 0.  Click on the Derived Column task and drag the green arrow to connect to the Excel Destination task.

Right click the Excel Destination task and select Edit from the popup menu.  The Excel Destination Editor dialog will be displayed.  Click the New button next to the Name of the Excel Sheet to display the Create Table dialog as shown below:


The Create Table dialog allows us to create a new table in the Excel spreadsheet.  The columns and their types are determined by the data flow.  We configured an OLE DB Source task that executes a query and a Derived Column task that added the Forecast column to the data flow.  You can edit the CREATE TABLE script if you like.  Click OK on the Create Table dialog and the Excel Destination Editor Connection Manager page will look like this:


We configured the Excel Connection Manager in step two of this tutorial.  Click on Mappings to display the Mappings page as shown below:


The above dialog sets the Destination Column names to the Input Column names by default.
At this point we have completed the Data Flow. Note that there are no longer any red icons in the task rectangles. We will proceed to the final step in this section of the tutorial and execute the package.

Executing a Package in BIDS 


Explanation
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:

Expand the SSIS Packages node under the Tutorial-Sample-1 project in Solution Explorer and you will see the following:


Right click on the SSIS package CreateSalesForecastInput.dtsx and select Execute Package from the popup menu.  While the package is running you will see each task turn green if it completes successfully or red if it fails.  Here is the Control Flow after the package has run successfully:


Here is the Data Flow after the package has run successfully:


After the package completes click Debug on the top level menu then select Stop Debugging.
We can open the Excel spreadsheet that the package created and see the following (only a portion of the spreadsheet is shown):


The above spreadsheet shows the columns from our query plus the Forecast column which we added with the Derived Column task.  This completes our tutorial section on creating a simple SSIS package.

Deploying SSIS Packages   

Overview
When we are ready to deploy our packages we have the following options available:
  • Deploy to the file system
  • Deploy to the package store
  • Deploy to SQL Server
The simplest approach to deployment is probably to deploy to the file system.  As SSIS package is actually just an XML file and it can simply be copied from its project location to a folder on the deployment target.  You can use the DOS COPY command, Windows Explorer, etc. to perform the copy operation.  The package store is a particular folder on the file system; the default for SQL Server 2005 is C:\Program Files\Microsoft SQL Server\90\DTS\Packages.  SSIS packages deployed to SQL Server are stored in the msdb database.
There are three ways to deploy our packages:
  • Create a deployment utility from our project
  • Use the DTUTIL command line tool
  • Use SQL Server Management Studio (SSMS)
Explanation
To begin launch BIDS by selecting SQL Server Business Intelligence Development Studio from the Microsoft SQL Server program group.  Click File, Open, Project / Solution on the top level menu to display the Open Project dialog.  Navigate to the location of the solution as shown below then click Open:
·        


        Navigate to the Tutorial-Sample-1 project in Solution Explorer as shown below:
·        

      Right click on the Tutorial-Sample-1 project and select Properties from the popup menu.  Click Deployment Utility in the Configuration Properties list and fill in the dialog as follows:
·        


      Note that the CreateDeploymentUtility property is set to True; the default is False.  The DeploymentOutputPath specifies the location where the deployment files will be written.  The default is shown above and is relative to the project folder.  Click OK to save the settings.
      
            Right click on the Tutorial-Sample-1 project in the Solution Explorer and select Build from the popup menu.  This will build the project and invoke the deployment utility.  If all of the SSIS packages are in a valid state, you will see the message Build Succeeded in the bottom left of the window.  Navigate to the bin\Deployment folder underneath the project folder to view the deployment files.  You will see the following files:


·         The above files represent the deployment.  You can copy them to the deployment target then double click on the Tutorial-Sample-1.SSISDeploymentManifest file to perform the deployment.

Command line deployment tool for SSIS packages 

Explanation
To begin open a Command Prompt and navigate to the Tutorial-Sample-1 project folder as shown below:


In the examples that follow, I will show how to deploy the CreateSalesForecastInput.dtsx package to the file system, package store, and SQL Server.
To deploy to the file system, you could use the DOS COPY command, Windows Explorer, etc. or the following DTUTIL command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx
       /COPY
       FILE;C:\temp\CreateSalesForecastInput.dtsx

Replace the path C:\temp as appropriate.
To deploy to the package store, type the following command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx
       /COPY
       DTS;CreateSalesForecastInput

 To deploy to SQL Server, type the following command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx
       /COPY
       SQL;CreateSalesForecastInput
The above command deploys to the default SQL Server instance on the local machine.  To deploy to a different SQL Server add the command line parameter /DESTSERVER "SERVERNAME\INSTANCENAME".

Deploying SSIS packages with SQL Server Management Studio  

Overview
SQL Server Management Studio (SSMS) can be used to deploy SSIS packages to SQL Server or to the Package Store.
Explanation
To begin launch SSMS and connect to Integration Services.  Note that the SQL Server Integration Services service must be running in order to do this.  You will see the following in the Object Explorer:

As you can see there are two nodes under Stored Packages: File System and MSDB.  File System is actually the package store with a default location in SQL Server 2005 of C:\Program Files\Microsoft SQL Server\90\DTS\Packages.  MSDB is of course the MSDB database.

In the examples that follow we will deploy the CreateSalesForecastInput.dtsx package from its location in the project folder to the package store and the MSDB database.

To deploy to the package store, right click on the File System node and select Import package from the popup menu.  Fill in the Import Package dialog as shown below:

Click OK to import the package.
To deploy to the MSDB database, right click on the MSDB node and select Import package from the popup menu.  Fill in the Import Package dialog as shown below:

Executing SSIS Packages   
Overview
We have the following three options available to execute an SSIS package:
  • DTEXEC command line utility
  • DTEXECUI windows application
  • SQL Server Agent
Command line tool to execute SSIS packages  

Overview
SQL Server includes the command line tool DTEXEC.EXE which can be used to execute an SSIS package.  DTEXEC can be run from a Command Prompt or from a batch (.BAT) file.

Explanation
To begin open a Command Prompt and navigate to the Tutorial-Sample-1 project folder as shown below:


It is not necessary to run DTEXEC from the folder where the SSIS package is located; it's just easier to change to the directory for demonstration purposes.  Type the following command to execute the CreateSalesForecastInput.dtsx package:

DTEXEC /FILE CreateSalesForecastInput.dtsx

To see the complete list of command line options for DTEXEC type:
DTEXEC /?

Windows application to execute SSIS packages   

Overview

SQL Server includes the Windows application DTEXECUI.EXE which can be used to execute an SSIS package.  DTEXECUI  provides a graphical user interface that can be used to specify the various options to be set when executing an SSIS package.  You can launch DTEXECUI by double-clicking on an SSIS package file (.dtsx).  You can also launch DTEXECUI from a Command Prompt then specify the package to execute.
Explanation

To begin open Windows Explorer (or My Computer) and navigate to the Tutorial-Sample-1 project folder.  Double-click on the CreateSalesForecastInput.dtsx SSIS package and you will see the following multi-page dialog displayed:


As you can see there are many settings available when you use this utility.  As a general rule you can simply click the Execute button to run your package.  You can also fine tune your execution by clicking through the various screens and entering your own settings.  After changing the settings click on Command Line which will show you the DTEXEC command line based on the settings you have chosen.

Scheduling SSIS packages with SQL Server Agent   


Overview
SQL Server Agent includes the SQL Server Integration Services Package job step type which allows you to execute an SSIS package in a SQL Server Agent job step. This can be especially handy as it allows you to schedule the execution of an SSIS package so that it runs without any user interaction.

Explanation
To begin open SSMS, connect to the Database Engine, and drill down to the SQL Server Agent node in the Object Explorer.  Right click on the Jobs node and select New Job from the popup menu.  Go to the Steps page, click New, and fill in the dialog as shown below:

In the example above the SSIS package to be executed is deployed to SQL Server (i.e. the MSDB database).  You can also execute packages deployed to the file system or the SSIS package store.

Note that the Run as setting is the SQL Agent Service Account.  This is the default setting although from a security standpoint it may not be what you want.  You can setup a Proxy that allows you to give a particular credential permission to execute an SSIS package from a SQL Server Agent job step.

The first step to setting up the proxy is to create a credential (alternatively you could use an existing credential).  Navigate to Security then Credentials in SSMS Object Explorer and right click to create a new credential as shown below:



Navigate to SQL Server Agent then Proxies in SSMS Object Explorer and right click to create a new proxy as shown below:


You must specify the credential and check SQL Server Integration Services Package.  Now when you create or edit a SQL Server Agent job step, you can specify the proxy for the Run as setting as shown below:


 Execute SQL Task:-
Script Component:-


Data Conversion and Derived Columns
Right click Connection Managers in Solution Explorer and choose New Connection Manager:


Choose your Connection Manager type. In this example, we'll use OLEDB. Next, configure the Connection Manager to point to your dataset. In this example, I'll use localhost and the AdventureWorks2008R2 database:


Test the connection and click OK. Next, drag a Data Flow task from the SSIS toolbox onto the design screen:


Right click the Data Flow task and choose Edit. You are now inside the data flow task. This is where all the action happens. Drag an OLEDB source task from the SSIS toolbox to the design screen:

Right click the OLEDB task and choose Edit. This screen is where we will define the Connection Manager we created earlier. Under OLEDB connection manager choose the connection you created. Leave data access mode as Table or view. Change the name of the table or the view to the table that contains the data types to change. In this example, I'll use a table named Sales.CurrencyRate:


To preview the data click Preview. In my example, I want to change the CurrenyRateDate and the ModifiedDate columns from a datetime data type to a date data type:


Click OK to close the OLEDB Source task. Drag the Data Conversion Transformation task onto the design screen. Connect the OLEDB Source task to the Data Conversion task:


Right click the Data Conversion task and choose Edit. Here is where we will convert our data types. Since I am converting CurrencyRateDate and ModifiedDate I will click on each of them in the Available Input Columns list:


The Input Column selection is the source column while the Output Alias is the name of the new column after the data type conversion. The Data Type column is where you will change the data type. If you select a string value in the Data Type column specify the length in the Length column and if you are changing numeric data specify the Precision and Scale in the appropriate columns.

In this example, I'm changing CurrencyRateDate and ModifiedDate from "database timestamp" to "database date" since I don't want to show the time in the destination:

Click OK. Drag the Derived Column task from the SSIS toolbox onto the design screen. Connect the Data Conversion task to the Derived Column task:


Right click on the precedence constraint between Data Conversion and Derived column and click Enable Date Viewer. This will allow us to view the data as it passes through the constraint:


Let's view our data. Click the Start Debug button on the toolbar to debug:


Tada! You can see the data has changed from a datetime data type to a date data type:


Now that I have the data converted the way I want it I can remove the Derived column task and put an Excel destination task in its place:


I already have an Excel spreadsheet setup on my local hard drive with the column names as headers. Point the Excel destination task to this spreadsheet and execute the task to move the converted data into Excel:



Using the T-SQL CAST Function in SQL Server Integration Services to Convert Data Types

Now that we've went over how to use the Data Conversion Transformation task I'll show you a quick example of using the CAST function in your T-SQL code. Remove the Data Conversion task, right click the OLEDB Source Task and choose Edit:

Change the Data access mode to SQL Command and use the following SQL. You will notice that I'm using CAST to change the data type to date:

SELECT CurrencyRateID
 ,CurrencyRateDate
 ,FromCurrencyCode
 ,ToCurrencyCode
 ,AverageRate
 ,EndOfDayRate
 ,ModifiedDate
 ,CAST(CurrencyRateDate AS DATE) CopyofCurrencyRateDate
 ,CAST(ModifiedDate AS DATE) CopyofModifiedDate
FROM AdventureWorks2008R2.Sales.CurrencyRate


Connect the OLEDB Source back to the Derived Column, Enable the Data Viewer and click the Start Debugging button:



No comments:

Post a Comment