Posts Tagged ‘SSIS’

Task: to create a SSIS package to incremental data loading by means of Change Tracking



The process of creating a SSIS solution consists of the following steps:
1. creating of a SSDT (SQL Server Data Tools) SSIS project
2. enabling of the ‘CHANGE TRACKING’ feature on a database and table level
3. creating of the ‘MySettings’ table and inserting a default value into the ‘LastVersion’ column
4. creating of the ‘Version_Get’ stored procedure
5. getting and setting of the ‘@LastVersion’ and ‘@CurrentVersion’ variables
6. creating of the FAKE DWH ‘BusinessEntity_DWH’ table
7. FAKE inserting into the ‘BusinessEntity’ table
8. incremental data loading into the fake DWH ‘BusinessEntity_DWH’ table (the core of ‘CHANGE TRACKING’)
9. saving of the ‘@CurrentVersion’ variable into the ‘LastVersion’ column
10. disabling of the ‘CHANGE TRACKING’ feature on a table and database level; deleting of created objects

Change Tracking – Control Flow

Figure 1: Change Tracking – Control Flow


Change Tracking – Data Flow

Figure 2: Change Tracking – Data Flow


Source code: SSIS Package (SQL Server 2012 – SSDT)
Note: do not forget to customize the ConnectionString in the Project.params setting

Additional references:
Task: to create a SSIS package to delete duplicates by means of Fuzzy logic



The process of creating a SSIS solution consists of the following steps:
1. creating of a SSDT (SQL Server Data Tools) SSIS project
2. defining of connection to data sources
3. defining of connection to saving duplicate records
4. defining of connection to saving error messages
5. defining of connection to a database
6. creating of a table in a target database
7. creating of a Data Flow diagram
8. removing of exact duplicates
9. removing of “similar” duplicates by means of Fuzzy logic
10. saving of retrieved duplicates

Deleting duplicates by means of Fuzzy logic

Figure 1: Deleting duplicates by means of Fuzzy logic


Source code: SSIS Package (SQL Server 2012 – SSDT)
Note: do not forget to customize the Connection Managers in solution setting

Additional references:
Task: to give a basic look at the import of a CSV file to a SQL Server table by means of SSDT by Microsoft Visual Studio



Sales per region,Africa,Asia,Europe
January,34,67,56
February,36,87,78
March,31,56,88
April,29,67,92
May,54,71,68
June,68,71,54
July,54,71,68
August,68,71,54
September,92,67,29
October,88,56,31
November,78,87,36
December,56,67,34

Source code 1: CSV data



USE [Test]
GO

CREATE TABLE [dbo].[SSIS_CSV_Import]
(
    [Month]      [varchar](9)    NOT NULL,
    [SaleAfrica] [numeric](2, 0) NOT NULL,
    [SaleAsia]   [numeric](2, 0) NOT NULL,
    [SaleEurope] [numeric](2, 0) NOT NULL
)
GO

Source code 2: Table structure


Flat File Connection Manager Editor – General

Figure 1: Flat File Connection Manager Editor – General


Flat File Connection Manager Editor – Columns

Figure 2: Flat File Connection Manager Editor – Columns


Flat File Connection Manager Editor – Advanced

Figure 3: Flat File Connection Manager Editor – Advanced


SQL Destination Editor – Connection Manager

Figure 4: SQL Destination Editor – Connection Manager


SQL Destination Editor – Mappings

Figure 5: SQL Destination Editor – Mappings


SSIS Package – Data Flow

Figure 6: SSIS Package - Data Flow



SELECT [Month], [SaleAfrica], [SaleAsia], [SaleEurope]
FROM   [dbo].[SSIS_CSV_Import]
GO
/* Result:
   Month       | Africa | Asia | Europe
   January       34       67	 56
   February      36       87     78
   March         31       56     88
   April         29       67     92
   May	         54       71     68
   June	         68       71     54
   July	         54       71     68
   August        68       71     54
   September     92       67     29
   October       88       56     31
   November      78       87     36
   December      56       67     34
*/

Source code 3: SQL query for result verification

Additional references:
Task: to create row values concatenation, i.e. CSV (Comma Separated Values), by means of the FOR XML PATH construction, and to export database table column values to the MS Excel file using the Microsoft.Jet.OLEDB ad hoc method of connecting and accessing remote data by using OLE DB
Solution: see Source code 1 – 4



EXEC sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXEC sp_configure 'ad hoc distributed queries', 1;
GO
RECONFIGURE;
GO

Source code 1: system configuration



USE [AdventureWorks]
GO

DECLARE @InputData        AS TABLE
(
        [ManagerCounter]  INT,
        [ManagerLoginID]  NVARCHAR(256),
        [EmployeeLoginID] NVARCHAR(256)
)
;WITH [Employees]
(
      [ManagerCounter],
      [ManagerID],
      [LoginID]
)
AS
(
    SELECT ROW_NUMBER() OVER(PARTITION BY [ManagerID] ORDER BY [ManagerID]),
           [ManagerID],
           [LoginID]
    FROM   [HumanResources].[Employee]
)
INSERT INTO @InputData
(
            [ManagerCounter],
            [ManagerLoginID],
            [EmployeeLoginID]
)
SELECT   [ManagerCounter],
         [Managers].[LoginID],
         [Employees].[LoginID]
FROM     [HumanResources].[Employee] AS [Managers] INNER JOIN [Employees] ON
         [Managers].[EmployeeID] = [Employees].[ManagerID]
ORDER BY [Managers].[LoginID]

/* temporary test */
SELECT    [ManagerCounter]  AS 'ManagersCounter',
          [ManagerLoginID]  AS 'Managers',
          [EmployeeLoginID] AS 'Employees'
FROM      @InputData
ORDER BY  [ManagerLoginID]

Source code 2: input data preparation (ManagersCounter – number of employees per manager)


MS SSMS result of input data preparation (ManagersCounter - number of employees per manager)

Figure 1: MS SSMS result of input data preparation (ManagersCounter - number of employees per manager)



SELECT       [Managers].[ManagerLoginID] AS 'Manager',
(
    SELECT
        CASE
	     WHEN [Employees].[ManagerCounter] = 1
                  THEN [Employees].[EmployeeLoginID]
	     ELSE ',' + [Employees].[EmployeeLoginID]
	END
    FROM     @InputData AS [Employees]
    WHERE    [Employees].[ManagerLoginID] = [Managers].[ManagerLoginID]
    ORDER BY [Employees].[ManagerLoginID]
    FOR XML  PATH('')
)            AS 'Employees'
FROM         @InputData AS [Managers]
GROUP BY     [Managers].[ManagerLoginID]

Source code 3: row values concatenation (converting to the CSV format) by means of FOR XML PATH


MS SSMS result of row values concatenation (converting to the CSV format) by means of FOR XML PATH

Figure 2: MS SSMS result of row values concatenation (converting to the CSV format) by means of FOR XML PATH



INSERT INTO OPENROWSET
(
            'Microsoft.Jet.OLEDB.4.0',
            'Excel 8.0;Database=C:\Managers.xls;',
                -- NOTE: the MS Excel file must already exist on the disk 
                --       together with the specified path and name!
            'SELECT [Managers] FROM [Managers$]'
                -- NOTE: the above-mentioned MS Excel file must have one 
                --       sheet named Managers, and the value in cell [0, 0], 
                --       i.e. the first row and column, must be 
                --       set to Managers (see Figure 5)!
)
SELECT	    [Managers].[ManagerLoginID]
FROM	    @InputData AS [Managers]
GROUP BY    [Managers].[ManagerLoginID]

GO

Source code 4: exporting data (the column Managers) to the MS Excel file on the disk


MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk - error

Figure 3: MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk - error


MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk - solution for the above error

Figure 4: MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk - solution for the above error


MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk

Figure 5: MS SSMS result of exported data (the column Managers) to the MS Excel file on the disk


Additional references: