Understanding SSIS internals and performance tuning

1 Understanding SSIS internals and performance tuning. David Peter Hansen, MCM, MVP. davidpeterhansen.com/ssis-internals

2 Instructor: David Peter Hansen. Consultant / Speaker / Trainer. Consultant Manager at Advectas. Blog: davidpeterhansen.com. LinkedIn. Microsoft Certified Master (MCM) of SQL Server. Microsoft MVP. 12 years of experience with database architecture, development, and administration. Focus: large-scale Data Warehouse and BI solutions, RDBMS, and SSIS. SQL Server since SQL Server 2000; SSIS since SQL Server 2005 (DTS since SQL Server 2000). Speaker at PASS Summit, SQLBits, PASS SQLRally, SQLSaturday, Microsoft Campus Days, SQL Server Days, and user groups.

3 Slides disclaimer: Yes, these slides have a lot of text, and bullet points too. This is not how I usually design slides for my talks. However, it is on purpose for this full-day training. So… when you have to troubleshoot a package in 3 months, you can go back to these slides and remember what you've learned (hopefully).

4 Demos. Demos are available at: davidpeterhansen.com/ssis-internals. Password for zip file: GoSharks!

5 Modules. M1: SSIS package internals. M2: SSIS Server. M3: Performance tuning.

6 Why understand SSIS internals? Not only fun to geek out about. It is key to understanding what is going on "under the hood", to better performance, and to better design. For example: Why are some designs bad? What actually happened? Why is the package running slow or taking up a lot of resources? How should packages be designed to run fast and use fewer resources?

7 SSIS package internals. Module 1. © Copyright 2014 David Peter Hansen

8 Overview – SSIS package internalsControl Flow Control flow engine Data Flow Data flow engine Connections Data types Buffers Transformations Execution trees Back pressure

9 Package: Connections, Control Flow, Data Flow(s), Event handlers, Variables, Parameters, Configuration.
Connections: handle connections to sources and destinations. ADO, ADO.Net, Cache, DQS, Excel, File, Flat files, FTP, HTTP, Message Queue, Analysis Services, ODBC, OLE DB, SMOServer, SMTP, WMI.
Control Flow: orchestrates the execution of the tasks; a collection of Executables (ForLoop, ForEachLoop, Sequence, TaskHost), each with a collection of EventHandlers.
Data Flow(s): in-memory transformation of data.
Event handlers: do something when an event happens.
Variables: store values used within the package.
Parameters: send parameters to the package when executing it. New in 2012/2014.

10 SSIS architecture (diagram): SSIS Designer (SSDT-BI), SSIS Wizards, custom applications, and command line utilities sit on top of the Object Model; the Integration Services Runtime hosts tasks, containers, event handlers, and log providers; the Data Flow Task hosts sources, transformations, destinations, and custom data flow components; packages (.dtsx files) can be stored in the SSIS catalog on the SSIS Server. Source: technet.microsoft.com/en-us/library/bb522498(v=sql.105).aspx (SSIS Server and SSIS catalog added)

11 Control Flow Containers Tasks Precedence constraints

12

13 Control flow engine. Also known as the runtime engine. Responsible for: loading the package XML file and initialising tasks; applying parameter and configuration values; validating the package hierarchy; maintaining and retrieving connection managers; orchestrating task execution; coordinating execution of event handlers; dispatching log entries to the corresponding log providers; evaluating variables and expression values.

14 Control flow engine DLL. The majority is contained within a single native COM DLL: DTS.dll. Functionality is also exposed through a managed .NET layer: Microsoft.SqlServer.ManagedDTS.dll.

15 Execution. Four main phases during execution. Each phase is triggered by calling control flow engine methods exposed through the SSIS object model: Load, Apply parameters, Validate, Execute.

16 Load. The first execution step is to load the package. Five phases before the package is loaded into the control flow engine: Read XML; Decrypt and check signatures; Check package format version; Load objects; Apply configurations and expressions.

17 Read XML. Reads the entire package XML file into an XML DOM format in memory. The DOM is an in-memory representation of an XML document, represented as a tree structure. The DOM allows you to programmatically read, manipulate, and modify an XML document. Malformed XML will throw an error.

18 Decrypt and Check Signature (1/2). Sensitive values can be protected by encryption, e.g. passwords for connections. ProtectionLevel property: DontSaveSensitive: sensitive values are stripped from the package when saved as XML. EncryptAllWithPassword: a password is used to encrypt the entire package; the package content is placed into a single encrypted block of text within the file. EncryptAllWithUserKey: a key from the current user profile is used to encrypt the entire package; the package can only be read by the user who originally created it.

19 Decrypt and Check Signature (2/2). ProtectionLevel property (cont.): EncryptSensitiveWithPassword: a password is used to encrypt sensitive values only; the package will still open if an incorrect password is provided, but the sensitive values will be stripped from it. EncryptSensitiveWithUserKey: a key from the current user profile is used to encrypt the sensitive values; the package will still open if opened by a different user; this is the default protection level for packages. ServerStorage: only used when the package is saved in SSISDB or msdb; relies on server storage encryption; the package XML contains unencrypted sensitive values. Cryptographic signatures can be used to confirm the validity of a package. packages-using-digital-certificates.html

20 Check package format version. Which version of SSIS created the package (a property in the XML file). If the package was created on an older version of SSIS than the one it is being executed on, the package XML DOM is automatically upgraded at runtime to the latest format; this does not overwrite the .dtsx file (XML) with the new version. If the package was created on a newer version of SSIS, an error is thrown: the control flow engine is tied to a specific version and is not forward compatible.

21 Load objects. Loads and instantiates objects: variables, tasks, event handlers, containers, connection managers. If a task fails to load, a replacement task will substitute it: the missing task is displayed as an empty box, and the package can still be loaded and viewed in the SSIS designer.

22 Apply Configurations and Expressions. All types of configurations are applied, except parent package variable configurations. Only if the Project Deployment Model is not used. Computes all expressions and sets values: variables, task properties.

23 Apply parameters. Sets the package parameter values. Project Deployment Model only. Parameters cannot be changed during execution, once they have been set.

24 Validate. Tree-like approach for validation: the engine tells all top-level containers to validate themselves; each container tells its children to validate themselves, and so on. Each task controls its own validation logic and returns a validation status; if it fails to validate, it will not execute. Sibling nodes are validated sequentially. (Diagram: a package with TaskHosts for "SQL Initialise Data", "Truncate foobar", a "Silly Data Flow", a "Move data from foobar back to foo and bar" task, and a For Loop Container with Send Mail, FTP, and XML tasks.)

25 Delay Validation. Validation is always required. DelayValidation property: delays validation of a task / container until just before it will execute. Default: False. Very useful when a connection string is set dynamically using expressions.

26 Connection Manager Validation. Improvement in SQL Server 2012. Prior to validation, each connection manager is validated separately. If establishing a connection fails, the connection manager is put in an offline state. In the offline state, subsequent calls to establish a connection will fail immediately, eliminating the wait for a timeout.

27 Execute. Orchestrates the executables in a package. Executables: tasks; containers (Sequence, For Loop, Foreach Loop, TaskHost); event handlers. The package and event handlers are internally containers: they contain tasks/containers within them and use the same execution logic as a sequence container.

28 DTSExecResult. An executable returns a DTSExecResult: 0 Success, 1 Failure, 2 Completion, 3 Canceled. The value is used by the control flow engine to orchestrate the executables. It is possible to force the DTSExecResult value via the ForceExecutionResult property.
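The result codes above can be mirrored in a small sketch. The numeric values are from the slide and match the SSIS object model's DTSExecResult enumeration, but this Python rendering (and the helper function) is purely illustrative, not an SSIS API:

```python
from enum import IntEnum

class DTSExecResult(IntEnum):
    # Numeric codes as listed on the slide
    SUCCESS = 0
    FAILURE = 1
    COMPLETION = 2
    CANCELED = 3

def follows_success_constraint(result):
    """Illustration: a 'Success' precedence constraint fires only on SUCCESS."""
    return result is DTSExecResult.SUCCESS
```

The control flow engine compares each executable's result against the precedence constraints on outgoing paths in exactly this spirit.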

29 Scheduling. The scheduling algorithm used by the control flow engine: executables are run based on precedence constraints and the DTSExecResult of previous executables.

30 Executable queues. Three queues. Pending queue: executables not ready to be executed yet. Ready queue: executables that can be started at any time; they start when there is an available thread. Completed queue: executables that have finished execution. All top-level executables are placed in the pending queue. Execution starts with the initial tasks: those that do not have a precedence constraint from any prior task.

31 Executable queues (diagram).

32 Executable queues, example (animated over slides 32-36): the tasks "Start log", "Bulk load data", "Process some data", "Process other data", "Send Mail Task", and "End log" move from the pending queue to the ready queue and finally to the completed queue as their precedence constraints are satisfied.

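The pending/ready/completed mechanics can be sketched as follows. The task names and their constraint graph are the slide's example; the scheduling code itself is a simplified, single-threaded illustration, not the actual engine:

```python
from collections import deque

# Precedence constraints from the slide's example:
# each task lists the tasks that must complete first.
constraints = {
    "Start log": [],
    "Bulk load data": ["Start log"],
    "Process some data": ["Bulk load data"],
    "Process other data": ["Bulk load data"],
    "Send Mail Task": ["Process some data", "Process other data"],
    "End log": ["Send Mail Task"],
}

def run(constraints):
    pending = set(constraints)     # executables not ready yet
    completed = []                 # finished, in execution order
    while pending:
        # Move tasks whose predecessors are all done into the ready queue
        ready = deque(t for t in sorted(pending)
                      if all(p in completed for p in constraints[t]))
        for task in ready:
            pending.remove(task)
            completed.append(task)  # "execute" the task
    return completed

order = run(constraints)
```

In the real engine the ready queue is drained by the control flow thread pool, so "Process some data" and "Process other data" could run in parallel; this sketch executes them sequentially.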

37 Executable types (1/2). Tasks: perform the operational logic and return a result value. Each task is contained within a TaskHost container, which provides all common functionality: event handlers, transaction handling logic. The control flow engine interacts with the TaskHost container, not directly with the task. Sequence container: shares the scheduling algorithm with the package object; can contain TaskHosts and other sub-containers; sub-containers can run in parallel; maintains its own executable queues and calculates its own precedence constraints.

38 Executable types (2/2). Loop containers (For loop, Foreach loop): like a sequence container, but executed multiple times. Event handlers: a stand-alone control flow with an execution mechanism similar to the package. When an event is raised by a container in the control flow, the engine checks if there is an event handler; if so, the event handler is added to the ready queue.

39 Threads. Thread pool: maintained by the control flow engine and used to execute ready executables. Note: threads used by the data flow engine are not included in this thread pool. Parallelism: multiple executables in the ready queue require multiple threads. MaxConcurrentExecutables property: the number of tasks that can run simultaneously. Default = -1, meaning the number of physical cores + 2.

40 Events. Executables raise events, with two main side effects: attached loggers get notified, and event handlers are triggered, if there are any. Events (and event handlers and logging) occur synchronously: they have to complete before the package continues running. Event handler examples: clean up temporary data storage when a package or task finishes running; retrieve system information to assess resource availability before a package runs; send a message when an error or a warning occurs or when a task fails.

41 No event handler defined. The event is raised to the next container in the hierarchy. If that container has an event handler, it responds to the event; if not, the event is raised to the next container, up to the package. With the Execute Package task, events propagate between packages. (Diagram: an Execute SQL Task raises an event; its TaskHost has no event handler, so the event bubbles up to the For Loop Container, whose event handler runs, and then to the package, whose event handler also runs.)

42 Standard events (hidden reference slide).
OnError: raised by an executable when an error occurs.
OnExecStatusChanged: raised by an executable when its execution status changes.
OnInformation: raised during the validation and execution of an executable to report information; conveys information only, no errors or warnings.
OnPostExecute: raised by an executable immediately after it has finished running.
OnPostValidate: raised by an executable when its validation is finished.
OnPreExecute: raised by an executable immediately before it runs.
OnPreValidate: raised by an executable when its validation starts.
OnProgress: raised by an executable when measurable progress is made.
OnQueryCancel: raised by an executable to determine whether it should stop running.
OnTaskFailed: raised by a task when it fails.
OnVariableValueChanged: raised by an executable when the value of a variable changes; the event is raised by the executable on which the variable is defined; not raised if the variable's RaiseChangeEvent property is set to False.
OnWarning: raised by an executable when a warning occurs.

43 Data Flow. Handled by the data flow engine. Contains: sources, transformations, destinations.

44 (Diagram: a Source reads a file or database through external columns and exposes output columns; a Transformation consumes them as input columns and produces its own output; a Destination consumes the final input; an error output is also available.)

45 Data types. The SSIS data flow has its own data type system: one single data type system, because data can come from a variety of different sources, each with its own data types. No variant data types. Example: NVARCHAR(200) in SQL Server → DT_WSTR(200) in SSIS. In-row and BLOB data types.
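A few of these mappings can be sketched as a lookup table. The NVARCHAR → DT_WSTR pair is from this slide and the (max)/xml pairs follow the BLOB types listed later in this deck; any real mapping should be checked against the SSIS data type documentation, and the helper function is invented for illustration:

```python
# Illustrative subset of SQL Server -> SSIS data flow type mappings.
SQL_TO_SSIS = {
    "nvarchar": "DT_WSTR",
    "varchar(max)": "DT_TEXT",
    "nvarchar(max)": "DT_NTEXT",
    "varbinary(max)": "DT_IMAGE",
    "xml": "DT_NTEXT",
}

def ssis_type(sql_type, length=None):
    """Return the SSIS pipeline type for a SQL Server type name."""
    base = SQL_TO_SSIS[sql_type.lower()]
    return "{}({})".format(base, length) if length is not None else base
```

A length is only meaningful for the in-row string types; the BLOB types (DT_TEXT, DT_NTEXT, DT_IMAGE) vary in size per row.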

46 Data flow engine. The control flow engine sees the data flow task as a task like any other. The data flow engine is activated when the data flow task is started. Contained within a single native COM DLL: DTSPipeline.dll. Functionality is also exposed through a managed .NET wrapper layer: Microsoft.SqlServer.DTSPipelineWrap.dll.

47 Primary subsystems of the data flow engine. Layout management: maintains data flow layout information. Buffer manager: maintains buffer information. Execution control: analyses the data flow layout, creates the execution plan, and maintains the thread pool.

48 Layout management. The internal structure / relationships between data flow components. A passive repository: no threads or active services. All data flow layout information is saved in the XML subnode under the data flow task (all control flow tasks have a subnode where additional information can be stored): components, inputs, outputs, paths.

49 Demo Package XML structure

50 Buffer Manager. Maintains all buffer-related information and serves buffer-related requests from other subsystems: create a new physical buffer profile using given data types; create a new virtual buffer profile on a given physical buffer profile; allocate a buffer instance according to a given buffer profile; release a buffer instance; spool buffer memory when memory pressure is high.

51 Physical buffers. The physical memory structure allocated for rows and columns: a physical memory location holding a collection of data. Columns are always fixed width. (Diagram: a grid of rows 1..n by columns 1..m.)

52 Buffer profiles. Different data type shapes, each with a unique ID. The buffer manager allocates physical buffers during execution based on the buffer profile ID.

53 Binary Large Objects (BLOB). SQL Server BLOB data types: xml, varchar(max) / nvarchar(max), varbinary(max). SSIS BLOB data types: DT_TEXT, DT_NTEXT, DT_IMAGE. Data sizes vary.

54 BLOB in the pipeline. Half of the buffer is for in-row data, half of the buffer is for BLOB data. (Diagram: columns 1..k held in-row in one half of the buffer, columns k+1..m as BLOBs in the other half.)

55 Spooling BLOB data. Beware of spooling to disk. Set the storage paths: BLOBTempStoragePath and BufferTempStoragePath. The default for both is the TMP/TEMP environment variable, usually on the C: drive.

56 Virtual buffers. One or more virtual buffers are defined on a physical buffer, based on column views or row views: a component often only needs to see part of the physical buffer, a row subset or a column subset. (Diagram: column views and row views over the physical buffer grid.)

57 Example: Column view. Add a new column; the column is hidden from upstream components.

58 Example: Row view Error row is virtually redirected

59 Example: Row view. Hide columns from downstream components. Columns are added on separate paths, but use the same buffer. (Diagram: orange and green paths.)

60 Private buffers. Owned by a component to store data used for its job; same structure as physical buffers. Example: the Sort component stores data for sorting before the data can be passed back to new physical buffers.

61 Flat buffers. A chunk of memory a component can use as it pleases: byte-by-byte, no defined structure. Example: the cached or semi-cached lookup component stores its lookup hash tables here.

62 Size of physical buffers. Dynamically sized; calculated by the data flow engine.

63 Calculating buffer size. size_row = the estimated size of a single row of data. size_buffer = size_row * DefaultBufferMaxRows. If size_buffer > DefaultBufferSize, the engine decreases the number of rows. If size_buffer < MinBufferSize (64 KB), the engine increases the number of rows. If MinBufferSize < size_buffer < DefaultBufferSize, the engine sizes the buffer as close as possible to size_buffer.
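The sizing rules above can be sketched as a small function. This is a simplified model of the slide's formula, assuming the engine simply clamps the row count; the real engine's rounding and alignment details may differ:

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024   # 10 MB default (see next slide)
MIN_BUFFER_SIZE = 64 * 1024              # 64 KB
DEFAULT_BUFFER_MAX_ROWS = 10_000

def rows_per_buffer(row_size,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS,
                    default_buffer_size=DEFAULT_BUFFER_SIZE,
                    min_buffer_size=MIN_BUFFER_SIZE):
    """Estimate how many rows the engine would put in one buffer."""
    buffer_size = row_size * max_rows
    if buffer_size > default_buffer_size:
        # Too big: decrease the number of rows to fit DefaultBufferSize
        return default_buffer_size // row_size
    if buffer_size < min_buffer_size:
        # Too small: increase the number of rows up to MinBufferSize
        return min_buffer_size // row_size
    return max_rows

# A 5,000-byte row cannot fit 10,000 rows in 10 MB:
wide = rows_per_buffer(row_size=5_000)
# A 4-byte row easily fits 10,000 rows, but 40,000 bytes is below 64 KB:
narrow = rows_per_buffer(row_size=4)
```

This is why wide rows silently shrink the rows-per-buffer count and reduce throughput: raising DefaultBufferSize (or narrowing the rows) gets more rows back into each buffer.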

64 Buffer Properties. DefaultBufferSize: default 10 MB, max 100 MB. DefaultBufferMaxRows: default 10,000.

65 Goal. Make sure you have enough memory. Aim for a small number of large buffers: fit as many rows into a buffer as possible.

66 Sizing buffer properties. Begin with the default values. Make sure you have enough memory. Turn on logging: BufferSizeTuning (how many rows are in each buffer). Change DefaultBufferMaxRows / DefaultBufferSize. Watch out for spooling. Use performance counters.

67 RunInOptimizedMode. The data flow engine will not allocate memory for unused columns: outputs not referenced by downstream components, where the corresponding input has hasSideEffect = False. If hasSideEffect = True, extra work besides data reading/writing could be done (example: updating a variable). The data flow component determines whether hasSideEffect is true or false; it is true by default. Only applies under dtexec.exe / ISServerExec.exe (not SSDT). Not an excuse for SELECT *.
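The pruning rule can be sketched as a filter over output columns. The two conditions (referenced downstream, hasSideEffect) are from the slide; the data structures and column names are invented for illustration:

```python
def prune_unused_columns(columns):
    """Keep a column only if a downstream component references it,
    or its input does extra work (hasSideEffect = True)."""
    return [c for c in columns
            if c["referenced_downstream"] or c["has_side_effect"]]

cols = [
    {"name": "OrderID",  "referenced_downstream": True,  "has_side_effect": False},
    {"name": "Comments", "referenced_downstream": False, "has_side_effect": False},
    {"name": "BatchVar", "referenced_downstream": False, "has_side_effect": True},
]
kept = [c["name"] for c in prune_unused_columns(cols)]
```

Even so, pruned columns were still read from the source, which is why the slide stresses that this is not an excuse for SELECT *.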

68 Demo Buffers

69 Transformations' blocking nature: non-blocking, semi-blocking, blocking.

70 Non-blocking transformations. Streaming: the transformation is done on the entire buffer; very fast. Row-based: the transformation runs for each row in the buffer, e.g. a call to a stored procedure in an OLE DB Command transformation; try to avoid these.

71 Semi-blocking transformations. Hold up data for a period of time: they don't need all the data before they proceed, but they need some. They use private buffers to store the data before it can be passed on to the data flow.

72 Blocking transformations. All rows are held in private buffers before the transformation can be executed. Can take up a lot of memory; beware of spooling to disk. Try to avoid these.

73 Output types. Two types of component outputs: synchronous and asynchronous. A data flow component can have zero or more outputs; a component is asynchronous if one or more of its outputs are asynchronous, otherwise it is synchronous. The type describes the relationship between input and output.

74 Asynchronous component. The physical buffers used for input are different from the physical buffers used for output: an asynchronous output allocates its own physical buffers from the buffer manager. All semi-blocking and blocking transformations have asynchronous outputs by definition: data is held up for processing in private buffers, and new physical buffers are needed for the processed data. Source components: all outputs are asynchronous.

75

76 Example: Asynchronous output. A new buffer is created.

77 Synchronous component. The physical buffers used for input are the same as those used for output: buffers are immediately handed off to the next downstream transformation at the completion of the transformation logic. Preserves the sort order of the incoming data. All non-blocking transformations have synchronous outputs, except for Union All.

78

79 Example: Synchronous output. Expressions on an existing column.

80 Multiple synchronous outputs. A component can have more than one output, and they can all be synchronous: virtual row or column buffers are used to expose only the relevant rows/columns to the following components. Examples: Multicast, Conditional Split.

81 Example: Multiple outputs. The same buffer is virtually redirected to multiple outputs.

82 Demo Blocking nature and buffers

83 Component interfaces. The data flow engine interacts with components through the IDTSRuntimeComponent100 interface (Microsoft.SqlServer.Dts.Pipeline.Wrapper namespace, Microsoft.SqlServer.DTSPipelineWrap.dll assembly). All data flow components implement this interface. Important methods: ProcessInput(), PrimeOutput().

84 ProcessInput(). Called at runtime when a PipelineBuffer from an upstream component is available; tells the component that it can process the incoming rows. Columns contained in the buffer include the columns defined in the component's IDTSInputColumnCollection100. If the component has synchronous outputs, the buffer also includes all the columns in the output column collections of the components upstream from the component, plus the columns added to the output column collection by the component itself. Columns are located in a buffer row using the FindColumnByLineageID method of the BufferManager. Called repeatedly as the data flow task receives full buffers from the upstream components, until the EndOfRowset property is true.

85 ProcessInput()
while (buffer.NextRow())
{
    // Do something with each row
}
if (buffer.EndOfRowset)
{
    // Do something after all rows have been processed (optional)
}

86 PrimeOutput(). Called at run time for source components, and for transformation components with asynchronous outputs, to let them add rows to the output buffers. Called for components whose SynchronousInputID property is set to None.

87 Execution tree. A logical grouping of data flow components, aka an execution path: a sequence of outputs and inputs, starting from an output, which is always an asynchronous output. All outputs in the tree are synchronous to the previous input, except the first output, which is asynchronous; otherwise they would belong to the previous execution tree. An instance of a data flow has one or more execution trees, and the data flow engine executes the same steps on each execution tree.

88 Find execution trees. The root of an execution tree is an asynchronous output, so the number of execution trees equals the number of asynchronous outputs. The data flow engine breaks the data flow down into execution trees: enumerate all outputs and find the asynchronous ones; build an execution tree from each asynchronous output (the root); follow the associated path and add the associated input; if the component of the associated input is synchronous, add its outputs to the execution tree; if the component has multiple outputs, the execution tree is split into multiple sub-paths (this point in the execution tree is called a fork point); and so on, until an asynchronous output is found.
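The walk described above can be sketched as a graph traversal. Only the rule "one tree per asynchronous output" comes from the slide; the component-graph representation and the simplified example pipeline are invented for illustration:

```python
def execution_trees(components, paths):
    """components: name -> {"async_outputs": [...], "sync_outputs": [...]}
    paths: output name -> downstream component name.
    Returns one component list per execution tree."""
    trees = []
    for comp, meta in components.items():
        for out in meta["async_outputs"]:       # each async output roots a tree
            tree = [comp]
            frontier = [out]
            while frontier:
                nxt = paths.get(frontier.pop())
                if nxt is None:
                    continue
                tree.append(nxt)
                # Synchronous outputs stay in this tree;
                # asynchronous ones will root trees of their own.
                frontier.extend(components[nxt]["sync_outputs"])
            trees.append(tree)
    return trees

# Simplified pipeline: Source -> Derived Column -> Sort -> Destination.
# Sort is blocking, so its output is asynchronous and starts a new tree.
components = {
    "OLE DB Source":      {"async_outputs": ["src_out"],  "sync_outputs": []},
    "Derived Column":     {"async_outputs": [],           "sync_outputs": ["dc_out"]},
    "Sort":               {"async_outputs": ["sort_out"], "sync_outputs": []},
    "OLE DB Destination": {"async_outputs": [],           "sync_outputs": []},
}
paths = {"src_out": "Derived Column", "dc_out": "Sort",
         "sort_out": "OLE DB Destination"}

trees = execution_trees(components, paths)
```

Two asynchronous outputs (the source's and the Sort's) give exactly two execution trees, matching the counting rule on the slide.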

89

90 Execution paths from the log (paths 0-1):
Begin Path 0
  Merge Join.Outputs[Merge Join Output]; Merge Join
  Union All.Inputs[Union All Input 2]; Union All
End Path 0
Begin Path 1
  OLE DB Source.Outputs[OLE DB Source Output]; OLE DB Source
  Derived Column.Inputs[Derived Column Input]; Derived Column
  Derived Column.Outputs[Derived Column Output]; Derived Column
  Multicast.Inputs[Multicast Input 1]; Multicast
  Begin Subpath 0
    Multicast.Outputs[Multicast Output 1]; Multicast
    Sort.Inputs[Sort Input]; Sort
  End Subpath 0
  Begin Subpath 1
    Multicast.Outputs[Multicast Output 2]; Multicast
    Union All.Inputs[Union All Input 1]; Union All
  End Subpath 1
End Path 1

91 Execution paths from the log (paths 2-4):
Begin Path 2
  OLE DB Source 1.Outputs[OLE DB Source Output]; OLE DB Source 1
  Merge Join.Inputs[Merge Join Right Input]; Merge Join
End Path 2
Begin Path 3
  Sort.Outputs[Sort Output]; Sort
  Merge Join.Inputs[Merge Join Left Input]; Merge Join
End Path 3
Begin Path 4
  Union All.Outputs[Union All Output 1]; Union All
  OLE DB Destination.Inputs[OLE DB Destination Input]; OLE DB Destination
End Path 4

92 (Diagram: the four execution trees numbered 1-4.)

93 Buffers in execution trees. Physical buffer profile: different for each execution tree; includes all columns ever needed in the execution tree. All the physical buffers used for a particular execution tree are identical. Virtual buffer profile: each virtual (column/row) view has a virtual buffer profile. One physical buffer profile per execution tree, but possibly multiple virtual buffer profiles. The data flow engine registers the physical and virtual buffer profiles with each execution tree.

94 Execution plans (1/2). The data flow engine analyses the data flow layout and creates execution plans. The steps in an execution tree are called an execution plan: one execution plan per execution tree, bound to the tree's physical buffer profile. Execution steps in an execution plan (in that order): PrimeOutput(): call PrimeOutput() on source components and/or asynchronous transformation components. ProcessInput(): call ProcessInput() on transformation and/or destination components.

95 Execution plans (2/2). Execution steps in an execution plan (in that order, cont.): NewRowView(): build a new virtual buffer for an output based on a row view. NewExecutionItem(): if there are two or more subpaths, create a new execution item at the fork point; an execution item is a single execution step with an associated buffer, and each execution item will execute a different subpath plan in parallel. Use PipelineExecutionPlan logging to get the full execution plan.

96 Execution steps (1/2). Start all source components and produce physical buffers by calling the PrimeOutput() method on all source components. Each source component occupies its own thread and produces data until the last row, then reports end of rowset to the data flow engine. It uses a physical buffer pointer to add rows; when the physical buffer is full, the data flow engine allocates a new physical buffer and gives the new pointer to the source component. It looks like one infinite buffer to the source component, but it is multiple physical buffers.

97 Execution steps (2/2). For each physical buffer: find the execution tree; find the execution plan (based on the corresponding execution tree). Uses two data structures: the work item queue and the thread pool. Release the physical buffer when all steps have been executed. Continue until all physical buffers have been released and all source components have reported an end of rowset.

98 Queues and thread pool. Work item queue: contains steps ready to be executed. Pending queue: work items waiting for a component to be ready. Thread pool: handles the available threads in the data flow. When there is a free thread available, it tries to get a ready work item from the work item queue; if the required component is occupied by another buffer, the work item is placed in the pending queue until the component is available.

99 Thread scheduler. Can assign more than one thread to a single execution tree if threads are available and the execution tree requires intense processor utilization. In SSIS 2005 this was limited to one thread per execution tree. Each transformation can receive one thread; each source adapter receives a separate thread.

100 EngineThreads property. Modify it to ensure that execution trees are not sharing threads and that extra threads are available for complex and large execution trees. Does not include the number of threads allocated to source adapters: source adapters get one thread each. The value is a suggestion to the data flow engine and is not used if it is too small, preventing thread starvation (downstream components not getting a thread), which could result in a deadlock. Default is 10 threads.

101 Demo Execution trees and plans

102 Backpressure. Reduces excessive memory usage in the data flow: don't load more data into the data flow if there are still enough buffers waiting to be processed. Basic backpressure; extended backpressure (introduced with SQL Server 2012).

103 Basic backpressure. A downstream component may not be as fast as the previous components: upstream would keep allocating physical buffers, and those buffers would wait in the execution tree before the slow component could process them. The data flow engine therefore checks the number of buffers in an execution tree before a physical buffer is allocated to the asynchronous output: maximum 5 buffers within an execution tree. If the maximum is reached, the component cannot add more rows to the output and ProcessInput() is blocked. This cascades up to previous execution trees, until a source component is suspended. Works in most scenarios.
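A toy model of this throttling: the limit of 5 in-flight buffers per execution tree is from the slide, while the producer/consumer tick loop is purely an illustration of the cap's effect:

```python
MAX_BUFFERS_PER_TREE = 5   # limit stated on the slide

def simulate(buffers_total, producer_rate, consumer_rate):
    """Fast producer vs slow consumer, with the 5-buffer cap.
    Returns (ticks to drain everything, peak in-flight buffers)."""
    in_flight = produced = consumed = peak = ticks = 0
    while consumed < buffers_total:
        ticks += 1
        # Fast source: tries to add producer_rate buffers per tick,
        # but is suspended at the cap (basic backpressure)
        for _ in range(producer_rate):
            if produced < buffers_total and in_flight < MAX_BUFFERS_PER_TREE:
                produced += 1
                in_flight += 1
        peak = max(peak, in_flight)
        # Slow downstream component drains consumer_rate buffers per tick
        drained = min(consumer_rate, in_flight)
        in_flight -= drained
        consumed += drained
    return ticks, peak

ticks, peak = simulate(buffers_total=20, producer_rate=4, consumer_rate=1)
```

Even though the source could produce four buffers per tick, memory use never exceeds the cap; the cost is that the source spends most ticks suspended, which is exactly the trade backpressure makes.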

104 Enhanced backpressure. Addresses a data mismatch issue with the merge join and merge components: two independent data inputs with different incoming data rates, where the difference can be very large (one input receives data very fast, the other very slowly). Merge / merge join could end up caching a large amount of the incoming rows from the fast input in private buffers before a matching row arrives from the slow input.

105 Enhanced backpressure. Allows a component to temporarily suspend an input if it already has enough rows on that input: the other input is the bottleneck, so this will not decrease performance. IDTSMultiInputComponent100 interface: CanProcess(): indicates whether the component can process the specified input; called before ProcessInput(), and if CanProcess() is false, ProcessInput() will not be called. GetDependencies(): gets the dependencies for the specified input; used to prevent deadlocks.
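The suspend-the-fast-input idea can be sketched like this. The method name can_process mirrors CanProcess() from the interface on the slide, but the class, the row-count threshold, and everything else here are invented for illustration:

```python
class MergeJoinModel:
    """Toy two-input merge: suspend whichever input is too far ahead."""

    def __init__(self, max_lead_rows=10_000):
        self.cached = {"left": 0, "right": 0}   # rows buffered per input
        self.max_lead_rows = max_lead_rows

    def can_process(self, input_name):
        # Refuse more rows on an input that already leads the other
        # by more than max_lead_rows; the engine will offer the buffer again later.
        other = "right" if input_name == "left" else "left"
        return self.cached[input_name] - self.cached[other] <= self.max_lead_rows

    def process_input(self, input_name, rows):
        assert self.can_process(input_name)     # engine checks CanProcess first
        self.cached[input_name] += rows

mj = MergeJoinModel()
mj.process_input("left", 10_000)      # fine: lead is within the threshold
fast_ok = mj.can_process("left")      # leads by exactly 10,000: still allowed
mj.process_input("left", 1)
blocked = not mj.can_process("left")  # now leads by 10,001: input suspended
```

The slow input remains processable the whole time, which is why suspending the fast input costs nothing: the join cannot make progress without the slow side anyway.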

106 Demo Backpressure

107 SSIS Server Module 2

108 Overview, SSIS Server: SSIS Catalog (SSISDB); security; ISServerExec.exe (host process); deployment; monitoring.

109 SSIS Server. The SSISDB user database, aka the SSIS Catalog. An execution host process: ISServerExec.exe. Tools and APIs to control and manage the SSIS Server. With the Project Deployment Model, the SSIS Server is the target of deployment. New in SQL Server 2012. An optional part of SSIS, but highly recommended.

110 SSIS Server architecture (diagram): inside a SQL Server instance, the SSIS Catalog (SSISDB) holds IS objects, operation logs, and state, and exposes Deploy / Manage / Security / Validate / Execute functionality; the execution process ISServerExec.exe hosts the execution control components; clients (PowerShell, SSMS, the Deployment Wizard, applications) go through the management and project object models. Source: Deep Inside the Microsoft SQL Server Integration Services Server, Matt Masson & Matthew Roche, TechEd 2013.

111 SSIS Catalog (SSISDB) Inside an instance of SQL Server; the center of the SSIS Server; both an operational store and a runtime store. 5 categories of functionality: Deploy, Manage, Security, Validate, Execute. Functionality can be accessed through an API: T-SQL, PowerShell, SSMS, the Deployment Wizard, or the application Managed Object Model (OM) — a .Net wrapper on top of the underlying T-SQL. (Diagram: folders contain projects, which contain packages)

112 SSIS Catalog (SSISDB) stored procs and views Actions on SSIS packages deployed to SSISDB: deploy, manage, configure, execute, monitor. The catalog schema is the supported API: views, stored procedures, functions. The internal schema is not supported — do not use it; it contains the base tables and lower-level objects and will change over time
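As a sketch of the supported catalog API, the following query joins documented catalog.* views to list every deployed package (run against a server that has an SSIS Catalog; folder/project contents are whatever you have deployed):

```sql
-- Supported catalog.* API: list deployed folders, projects, and packages
SELECT f.name AS folder_name,
       p.name AS project_name,
       pkg.name AS package_name
FROM   SSISDB.catalog.folders  AS f
JOIN   SSISDB.catalog.projects AS p   ON p.folder_id    = f.folder_id
JOIN   SSISDB.catalog.packages AS pkg ON pkg.project_id = p.project_id
ORDER BY f.name, p.name, pkg.name;
```

The same information could be dug out of the internal.* base tables, but only the catalog schema is a supported, stable surface.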

113 SQLCLR in SSISDB Interprocess Communication (IPC) to/from ISServerExec.exe IPC = mechanisms for facilitating communications and data sharing between applications Encryption within SSISDB SQL Server Host Policy Level Permission Set = UNSAFE Allows assemblies unrestricted access to resources Both within and outside SQL Server Code executing from within an UNSAFE assembly can also call unmanaged code

114 UNSAFE assembly

CREATE ASYMMETRIC KEY MS_SQLEnableSystemAssemblyLoadingKey
FROM EXECUTABLE FILE = 'C:\Program Files\Microsoft SQL Server\110\DTS\Binn\Microsoft.SqlServer.IntegrationServices.Server.dll'

CREATE LOGIN ##MS_SQLEnableSystemAssemblyLoadingUser##
FROM ASYMMETRIC KEY MS_SQLEnableSystemAssemblyLoadingKey

GRANT UNSAFE ASSEMBLY TO ##MS_SQLEnableSystemAssemblyLoadingUser##
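One way to verify the result is to check the assembly's permission set and the server-level grant; a sketch using standard catalog views (run the first query in the SSISDB database):

```sql
-- Confirm the SSIS server assembly is registered with PERMISSION_SET = UNSAFE
SELECT name, permission_set_desc
FROM   SSISDB.sys.assemblies
WHERE  name LIKE 'Microsoft.SqlServer.IntegrationServices%';

-- Confirm UNSAFE ASSEMBLY was granted to the asymmetric-key login
SELECT pr.name, pe.permission_name, pe.state_desc
FROM   sys.server_principals  AS pr
JOIN   sys.server_permissions AS pe
       ON pe.grantee_principal_id = pr.principal_id
WHERE  pr.name = '##MS_SQLEnableSystemAssemblyLoadingUser##';
```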

115 Security in SSISDB (1/2) SSISDB protects sensitive data: it is never stored as plain text in SSISDB, and row-level security is applied. It uses the built-in SQL Server encryption key hierarchy: Service Master Key, Database Master Key, certificates, and symmetric keys. A Project Certificate and Project Symmetric Key protect project parameter values; an Environment Certificate and Environment Symmetric Key protect environment values; an Execution Certificate and Execution Symmetric Key protect execution parameter values. The symmetric keys are opened and closed automatically inside the catalog.* views and stored procedures

116 Security in SSISDB (2/2) Encryption algorithms: AES_256 (default), AES_192, AES_128, DESX, TRIPLE_DES_3KEY, TRIPLE_DES, DES. SSISDB must be in single-user mode to change the encryption algorithm. Sensitive data must be < 8000 bytes

117 User access in SSISDB Users can have access to:Folder Project Environment Operations/Executions Granting user access to a project = granting user access to certificate for that project User can then decrypt values related to the project ssis_admin and sysadmin Can access all securables catalog.explicit_object_permissions Displays the permissions that have been explicitly assigned to the user Does not show if the principal is a member of roles and groups that have permissions assigned catalog.effective_object_permissions Displays the effective permissions for the current principal for all objects in SSISDB Users can see effective permissions only for themselves and for roles of which they are members
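Granting and inspecting access can be sketched with the documented catalog procedures; the folder, project, and principal names below are examples:

```sql
-- Grant READ (permission_type 1) on a project (object_type 2) to a database principal
DECLARE @project_id BIGINT =
    (SELECT project_id FROM catalog.projects WHERE name = N'MyProject');

EXEC catalog.grant_permission
     @object_type     = 2,                                   -- 2 = project
     @object_id       = @project_id,
     @principal_id    = DATABASE_PRINCIPAL_ID(N'ETLOperators'),
     @permission_type = 1;                                   -- 1 = READ

-- What can the current principal actually do, roles and groups included?
SELECT * FROM catalog.effective_object_permissions;

-- What was granted explicitly (does not reflect role membership)?
SELECT * FROM catalog.explicit_object_permissions;
```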

118 Permissions and Securables

Permission                  Folder  Project  Environment  Operation/Execution
Read                          X       X         X              X
Modify                        X       X         X              X
Execute                               X
Manage Permissions            X       X         X              X
Create Objects                X
Read Objects                  X
Modify Objects                X
Execute Objects               X
Manage Object Permissions     X

Source: Deep Inside the Microsoft SQL Server Integration Services Server, Matt Masson & Matthew Roche, TechEd 2013. Hidden reference slide

119 Row level security The catalog.* views present a filtered rowset from the internal.* tables: a user can only see rows it has been granted READ permission to, while ssis_admin / sysadmin can see all rows. Implemented based on patterns from the whitepaper "Implementing Row- and Cell-Level Security in Classified Databases Using SQL Server 2005"

120 Demo Security

121 Outside SSISDB Logins: ##MS_SQLEnableSystemAssemblyLoadingUser## and ##MS_SSISServerCleanupJobLogin##. Asymmetric key: MS_SQLEnableSystemAssemblyLoadingKey. dbo.sp_ssis_startup: a stored procedure in master that redirects to [SSISDB].[catalog].[startup], which cleans up orphaned operation status after an unexpected shutdown. SSIS Server Maintenance Job: a SQL Server Agent job that cleans up execution log data outside the retention window and project versions beyond the configured limit

122 ISServerExec.exe External host process for SSIS package operations: deploy, validate, execute. Functionality is implemented in Microsoft.SqlServer.IntegrationServices.Server.Shared.dll, the UNSAFE assembly granted to ##MS_SQLEnableSystemAssemblyLoadingUser##. Everything that touches the SSIS object model runs in the host process, which runs the control flow engine and the data flow engine — lots of native COM code. The reason it lives in its own process: it was originally thought to be part of SqlServr.exe, but that was a bad idea — what happens if SSIS crashes? SQL Server would crash too! Each execution gets its own instance; multiple packages can run within one instance (child packages called from a master package)

123 ISServerExec.exe and SSISDB communicationSqlConnection ADO.Net SQL Server sees ISServerExec.exe as just another client Used for writing events back to SSISDB during package execution (logging) Asynchronous logging Events are guaranteed to be written in the right order Events are queued Package execution is not slowed down by logging SqlConnection SSISDB ISServerExec

124 ISServerExec.exe and SSISDB communicationNamed pipes IPC between SSISDB and ISServerExec.exe Communicate back and forth using named pipes ISServerExec_{ExecutionGuid} Gets set up by SQLCLR when catalog.start_execution is called Calls internal.start_execution_internal Examples: Get performance counters from ISServerExec.exe Tell ISServerExec.exe to terminate package Tell ISServerExec.exe to create debug dump SqlConnection Named Pipes SSISDB ISServerExec Named Pipe Client Named Pipe Server

125 SQLCLR and ISServerExec.exeSQLCLR used for: Impersonation when starting ISServerExec.exe Required for deploy, validate, and execute IPC communication with ISServerExec.exe

126 Examples of IPC between SSISDB and ISServerExecSSISDB tells ISServerExec to stop execution of package SSISDB tells ISServerExec to create memory dump ISServerExec then tells SSISDB what the file name is SSISDB asks ISServerExec for performance counters ISServerExec responds with performance counters

127 Demo ISServerExec and SSISDB communication

128 Project Deployment Project Deployment Model in 2012/2014. An .ispac file is created when the project is built: an OpenXML (zip) file containing packages, parameters, and connections. The .ispac is deployed to SSISDB, where it is stored as binary; some metadata is parsed out, but the packages stay inside the .ispac file. Reverting to an earlier version of an .ispac is easy — a pointer is moved. Maximum number of versions saved, default: 10
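Reverting by moving the pointer can be sketched with the documented catalog procedures; the folder and project names are examples, and the LSN comes from the first query:

```sql
-- Inspect the stored versions of a project (default retention: 10 versions)
SELECT object_version_lsn, object_name, created_time
FROM   catalog.object_versions
WHERE  object_name = N'MyProject'
ORDER BY created_time DESC;

-- Move the current-version pointer back to an earlier version
EXEC catalog.restore_project
     @folder_name        = N'MyFolder',
     @project_name       = N'MyProject',
     @object_version_lsn = 12345;      -- an LSN taken from the query above
```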

129 Encryption of .ispac file.ispac is encrypted by server storage encryption During deployment it is changed from whatever encryption was being used (e.g. encryption with user key) to server storage encryption Encrypts entire package T-SQL encryption only supports data up to 8000 bytes Projects are usually larger than that Encryption of .ispac file is done with SQLCLR and System.Security.Cryptography

130 Deployment of .ispac EXECUTE AS CALLER: you need the right permissions on the folder you want to deploy to. OPEN SYMMETRIC KEY: encryption of the .ispac. WAITFOR DELAY: to make sure the .ispac is deployed before validation is performed. Data is inserted into internal.projects, internal.object_versions, and internal.packages. Stored procedures used: internal.encrypt_binarydata, internal.deploy_project_internal, internal.append_packages

131 Package execution Execution is a multi-step process: EXEC [catalog].[create_execution], optionally EXEC [catalog].[set_execution_parameter_value], then EXEC [catalog].[start_execution]. An instance of ISServerExec.exe is started, and ISServerExec.exe sends events back to SSISDB via a SqlConnection — e.g. OnPostExecute events into internal.executables and internal.executable_statistics
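The multi-step API can be sketched end to end; the folder, project, and package names are examples:

```sql
DECLARE @execution_id BIGINT;

-- Step 1: create the execution
EXEC catalog.create_execution
     @folder_name  = N'MyFolder',
     @project_name = N'MyProject',
     @package_name = N'LoadDimProduct.dtsx',
     @execution_id = @execution_id OUTPUT;

-- Step 2 (optional): set parameters, e.g. the logging level
-- (object_type 50 = system parameter)
EXEC catalog.set_execution_parameter_value
     @execution_id,
     @object_type     = 50,
     @parameter_name  = N'LOGGING_LEVEL',
     @parameter_value = 1;           -- 1 = Basic

-- Step 3: start it; an ISServerExec.exe instance is launched for the execution
EXEC catalog.start_execution @execution_id;
```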

132 Package execution lifecycle Created (1) via catalog.create_execution → Pending (5) via catalog.start_execution → Running (2). From Running: Success (7), Failed (4), or Completed (9); Unexpected Termination (6) if ISServerExec.exe crashed. catalog.stop_operation moves a running execution to Stopping (8) and then Cancelled (3)

133 Package execution The package is executed on the SSIS Server host, not locally. Beware of double-hop problems if the SSIS Server is not on the same machine as the source or destination SQL Server: EXECUTE AS CALLER impersonates the user, and the impersonated user cannot establish connections to other servers — "Login failed for user 'NT AUTHORITY\ANONYMOUS'". Use Kerberos delegation (see delegation-working-with-ssis-package.aspx), or use SQL Server Agent to execute packages — it executes from the same server, so there is no double hop

134 Demo Deployment and execution

135 Monitoring Logging is now done automatically by the SSIS Server; there is no longer any need for your own logging framework. Logging levels: None — only package execution status. Basic — all events except diagnostic and custom events (default). Performance — only performance statistics, plus OnWarning and OnError. Verbose — all events (big performance overhead!), including DiagnosticEx — diagnostic information whenever an Execute Package Task executes a child package

136 Monitoring Built-in reports; T-SQL queries against the catalog.* views and functions; custom reports. DiagnosticEx events (only in the Verbose logging level) capture diagnostic information whenever an Execute Package Task executes a child package. Child packages can run multiple times in parallel with different parameters, and DiagnosticEx lets you see which parameters are passed to each child package. XML output — can be parsed with T-SQL
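Custom monitoring queries over the documented catalog views can be sketched as follows; the execution id used in the second query is an example:

```sql
-- Recent executions with status and duration
SELECT execution_id, folder_name, project_name, package_name,
       status,                                  -- e.g. 7 = succeeded, 4 = failed
       start_time, end_time,
       DATEDIFF(SECOND, start_time, end_time) AS duration_sec
FROM   catalog.executions
ORDER BY execution_id DESC;

-- Errors and warnings logged for one execution
SELECT message_time, message_source_name, message
FROM   catalog.event_messages
WHERE  operation_id = 12345                     -- example execution_id
  AND  event_name IN (N'OnError', N'OnWarning')
ORDER BY message_time;
```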

137 Performance counters ISServerExec.exe reports normal logging via SqlConnection (ADO.Net); SSISDB asks ISServerExec.exe for performance counters via named pipes, and ISServerExec.exe returns them via named pipes. @execution_id is the unique identifier of the execution, which contains one or more packages — packages executed with the Execute Package task run in the same execution as the parent package. If @execution_id is NULL, performance counters for multiple executions are returned: if ssis_admin/sysadmin, counters for all running executions; if not, counters for all running executions you have access to

138 Performance counters (Diagram: a SELECT against SSISDB asks ISServerExec for performance counters over the named pipe — Named Pipe Client in SSISDB, Named Pipe Server in ISServerExec — and ISServerExec returns them, while normal logging continues over the SqlConnection)
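The counters returned over the named pipe are surfaced through the documented table-valued function catalog.dm_execution_performance_counters; a sketch (the execution id is an example):

```sql
-- Counters for one running execution
SELECT counter_name, counter_value
FROM   catalog.dm_execution_performance_counters(12345);

-- NULL = counters for all running executions you have access to
-- (all running executions if you are ssis_admin/sysadmin)
SELECT execution_id, counter_name, counter_value
FROM   catalog.dm_execution_performance_counters(NULL);
```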

139 Performance counters (1/2) BLOB bytes read Number of bytes of binary large object (BLOB) data that the data flow engine reads from all sources BLOB bytes written Number of bytes of BLOB data that the data flow engine writes to all destinations BLOB files in use Number of BLOB files that the data flow engine is using for spooling Buffer memory Amount of memory that is used by the Integration Services buffers, including physical and virtual memory Buffers in use Number of buffer objects, of all types, that all data flow components and the data flow engine are using Buffers spooled Number of buffers written to the disk Hidden reference slide

140 Performance counters (2/2) Flat buffer memory Amount of memory, in bytes, that is used by all flat buffers Flat buffers are blocks of memory that a component uses to store data Flat buffers in use Number of flat buffers that the data flow engine uses All flat buffers are private buffers Private buffer memory Amount of memory in use by all private buffers. A private buffer is a buffer that a transformation uses for temporary work A buffer is not private if the data flow engine creates the buffer to support the data flow Private buffers in use Number of buffers that the transformations use for temporary work Rows read Total number of rows read by the execution Rows written Total number of rows written by the execution Hidden reference slide

141 Demo Monitoring

142 Execution dumps (1/2) Create debug dump files that provide information about the execution of a package, written to :\Program Files\Microsoft SQL Server\110\Shared\ErrorDumps. Keep the folder secure (admin permissions are needed to access it), as dump files might contain sensitive information. Similar to dtutil.exe /Dump. The .mdmp debug dump file is a binary file — probably not that useful, unless you're from Microsoft CSS

143 Execution dumps (2/2) .tmp debug dump file Text formatted fileMore useful – but still really detailed Examples of type of information: Environment information Operating system version Memory usage data Process ID Process image name Dynamic-link library (DLL) path and version For each DLL that the system loads during the processing of a package Recent messages issued by the system Time Type Description Thread ID

144 Execution dump created (Diagram: SSISDB tells ISServerExec over the named pipe to create an execution dump; ISServerExec pauses the running package, writes the dump file, resumes the package, and logs "Execution dump created" back to SSISDB)

145 Data taps A server-side data viewer: lets DBAs or DevOps see data during execution, and it is logged if someone adds a data tap. Created using one of two stored procedures: [catalog].[add_data_tap] (only for parent packages) and [catalog].[add_data_tap_by_guid] (for both parent and child packages). Data taps create CSV outputs (tap files) under the %DTS%\DataDumps folder, containing all data that passes through the specified data flow path. Records are created in [catalog].[operation_messages] when a data tap file is created, if the specified data flow path is invalid, if the specified data flow task is never executed, or if the data tap file creation fails

146 Data taps Data tap file:

DECLARE @execution_id BIGINT
EXEC [catalog].[create_execution] ..., @execution_id = @execution_id OUTPUT

EXEC [catalog].[add_data_tap]
    @execution_id = @execution_id,
    @task_package_path = '\Package\Data Flow Task',
    @dataflow_path_id_string = 'Paths[Data Clean.Derived Column Output]',
    @data_filename = 'DimProduct.csv'

EXEC [catalog].[start_execution] @execution_id

147 Demo Troubleshooting

148 Performance tuning Module 3

149 Package design It is much easier to design for good performance up front than tuning existing packages However, that is not how the real world works… This module focuses on tuning existing packages

150 Overview – Performance tuningTuning approach Is it SSIS? Environment Extract Transformations Load Buffer tuning Parallelism

151 Tuning approach Measure performance and create a baseline. Come up with a hypothesis. Tune your package. Measure performance again: did your hypothesis hold? Did performance increase? Repeat
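With packages running on the SSIS Server, creating a baseline can be sketched as a simple duration query over the catalog's execution log; the package name is an example:

```sql
-- Baseline: duration of every run of one package over time
SELECT execution_id, start_time,
       DATEDIFF(SECOND, start_time, end_time) AS duration_sec
FROM   catalog.executions
WHERE  package_name = N'LoadDimProduct.dtsx'   -- example package
ORDER BY start_time;
```

Re-run the query after each tuning change to see whether the hypothesis held.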

152 Is it SSIS? More important: is it NOT SSIS? Are you using the SSIS Data Flow task, or an Execute SQL task? SQL Server RDBMS: a slow query; wait stats — what is SQL Server waiting for? (e.g. ASYNC_NETWORK_IO). Environment: IO subsystem, memory, CPU, VM vs physical hardware

153 SSIS on same server as SQL Server RDBMS? (1/2) SSIS and SQL Server RDBMS on the same server: no network bottleneck, but SSIS does not play nice and will take the resources it needs, which could harm your SQL Server RDBMS performance; SQL Server usually wins on memory, causing SSIS to spill to disk. SSIS and SQL Server on separate servers: potential network bottleneck — use a gigabit network and multiple NICs. The IO subsystem is usually your primary bottleneck

154 SSIS on same server as SQL Server RDBMS? (2/2)SSIS and SQL Server on separate servers (cont.) SSIS will not harm SQL Server RDBMS performance And SQL Server RDBMS will not harm SSIS performance Cost considerations Hardware Licensing Which solution is the best? It depends…

155 SQL Server environmentIO subsystem Memory CPU Network

156 IO subsystem performance (1/3)IO subsystem performance of source and destination SSIS data flow handles data in memory IO subsystem performance of SSISDB and SSIS Server environment is really not that important Unless you do excessive logging or spooling IOPS and bandwidth Source: Flat files Reading all files from disk File shares – network might be the bottleneck SQL Server Typically a lot of table scans Data typically not already in buffer pool Large scans might result in low Page Life Expectancy (PLE) on source system Load delta data set if possible Performance of reads from data files

157 IO subsystem performance (2/3)Destination SQL Server Performance of writes to transaction log If minimal logged inserts are obtained Performance of writes of dirty data pages caused by eager writer Flat files Write performance to destination drive File shares – network might be the bottleneck

158 IO subsystem performance (3/3) Benchmark IO subsystem performance: SQLIO, Iometer, CrystalDiskMark. For CrystalDiskMark, get the portable edition! The installer edition is bundled with nasty adware (I learned that the hard way :/ )

159 Memory Have enough memory to handle all buffers, and avoid spooling to disk. Limit the amount of memory used by avoiding: blocking transformations (look for private buffers); large data sets for lookups (look for flat buffers); BLOBs in the pipeline (can really hurt performance); running a lot of packages / data flow tasks in parallel

160 Memory performance counters (1/2)Process \ Working Set (ISServerExec) Total physical memory used by ISServerExec Include memory-mapped files (e.g. shared DLLs) Process \ Private Bytes (ISServerExec) Amount of memory that ISServerExec has asked for - not necessarily the amount it is actually using Usually exclude memory-mapped files (e.g. shared DLLs) Memory \ Page Reads / sec Total memory pressure on the system Constantly >500, system is under memory pressure SQLServer:SSIS Pipeline 12.0 \ Buffer memory Amount of memory that is in use May include both physical and virtual memory When larger than amount of physical memory, memory swapping is happening

161 Memory performance counters (2/2)SQLServer:SSIS Pipeline 12.0 \ Flat buffer memory Total amount of memory, in bytes, that all flat buffers use Blocks of memory that a component uses to store data Large block of bytes that is accessed byte by byte SQLServer:SSIS Pipeline 12.0 \ Private buffer memory Total amount of memory in use by all private buffers Transformation uses private buffers for temporary work only SQLServer:SSIS Pipeline 12.0 \ Buffers spooled Number of buffers currently written to the disk If the data flow engine runs low on physical memory, buffers not currently used are written to disk and then reloaded when needed

162 CPU SSIS is quite good at parallelising its workload across multiple cores: the Control Flow can run multiple executables in parallel, and the Data Flow can assign multiple threads to each execution tree — but this has to be done by design. Too few threads can cause thread starvation, where not all cores are used; too many threads can lead to context switching, and too much context switching can cause a performance hit. Use xperf to look for context switching

163 CPU CPU-Z: a tool for getting information about CPUs. Performance counter: Process / % Processor Time (Total) for ISServerExec.exe. Windows Power Plan: always choose High performance (the default is Balanced)

164 If SSIS is not using close to 100% CPU Application contention: other applications (e.g. sqlservr.exe) are taking resources. Hardware contention: insufficient I/O or memory to handle the data load. Design limitations of the SSIS package: not making use of parallelism; too many single-threaded executables

165 Network Data is transferred over networkFrom source system to SSIS Server From SSIS Server to SQL Server From SQL Server to SAN iSCSI Ensure that the network can deliver the best possible performance Data sources SSIS server DW server SAN

166 Network performance countersNetwork Interface / Current Bandwidth: Estimate of current bandwidth Network Interface / Bytes Total / sec: Rate at which bytes are sent and received over each network adapter Network Interface / Transfers/sec: How many network transfers per second are occurring If it is approaching 40,000 IOPs, get another NIC and use teaming between the NICs

167 Network tuning Ethernet frame: defines how much data can be transmitted over the network at once; processing a frame consumes hardware and software resources. Increasing the frame size (it has to be supported by the network adapter) sends more bytes with less overhead on the CPU and increases throughput by reducing the number of frames that need to be processed. Jumbo frame: an Ethernet frame that can carry up to 9000 bytes; change the setting on the NIC

168 Network tuning Jumbo Packet property: set to 9014 bytes to enable the use of jumbo frames; make sure the network infrastructure can support this type of frame. Packet Size: SQL Server can accept up to 32767 bytes in one SSIS network packet; set the Packet Size to 32767 bytes on the connection manager

169 Demo Is it SSIS?

170 Source component performanceFlat file SQL Server Oracle Teradata Excel Etc. Let’s look at flat files and SQL Server Most common sources with large data

171 Baseline source component How fast can we get data into the pipeline? The SSIS Data Flow cannot be faster than it receives data, so always test the speed of the source component: Source Component → Row Count; Rows/sec = Row Count / Time. In the Data Flow, the source component is a single-threaded operation, but the T-SQL query might be multi-threaded on the SQL Server side. Compression of the source table could be a benefit

172 Increase rows/sec from source component Improve the driver and driver configuration — e.g. Oracle Attunity vs Oracle OLE DB vs Oracle ADO.Net. Start multiple connections, as long as the source can handle many concurrent connections; concurrency could cause locking or blocking issues, so consider partitioning. Use multiple NICs if the network is your bottleneck — but first ensure you're using a gigabit NIC

173 Extract 1:1 Typical in Enterprise Data Warehouse architecturesExtract data from source and insert it into a table within a SQL Server database Exact same structure and format No transformations take place Avoid putting unnecessary pressure on source system Easier to query tables in SQL Server (EDW) than in source systems Especially if source system is not SQL Server

174 Flat file source Consider BULK INSERT in T-SQL: an in-process method for bringing data from a text file into SQL Server; very fast, because it runs in-process with Sqlservr.exe. SSIS is also very fast — and the most flexible choice for bulk loading data into SQL Server. Fast parse: a locale-neutral parsing routine; it forfeits the ability to interpret date, time, and numeric data in locale-specific formats and many frequently used ISO 8601 basic and extended formats, in exchange for enhanced performance. Available in the Flat File source and Data Conversion components; specified at column level in the Advanced Editor
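The BULK INSERT alternative can be sketched as follows; the file path, table, and delimiters are examples:

```sql
-- In-process bulk load of a text file; runs inside Sqlservr.exe
BULK INSERT dbo.StageProduct
FROM 'D:\Import\product.txt'          -- example path, local to the SQL Server
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR   = '\n',
    TABLOCK,                          -- table lock; enables minimal logging on a heap
    BATCHSIZE       = 100000          -- commit in batches of 100k rows
);
```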

175 Flat file source Performance of the IO subsystem where the flat files are placed. Network share: the network might be the bottleneck; limit the copying of files around over the network (file share → SSIS server → archive) — place the archive on the same machine as the file share. Fewer, larger files are better than lots of small files; however, splitting files and reading them in parallel might give a performance gain

176 SQL Server source Different providers: OLE DB, ODBC, ADO.Net. For now, use OLE DB. The SQL Native Client OLE DB provider is (probably) being deprecated in the next version of SQL Server; this doesn't affect other OLE DB providers or the OLE DB API, and the SSIS OLE DB components are not being deprecated. But let's see what happens in the future

177 SQL Server source optimization SELECT only the columns you need: unused columns put pressure on SQL Server and are loaded over the network to SSIS even if they are never used in the pipeline. Convert data types to smaller data types: narrower columns in the buffer → more rows per buffer. Example: INT → TINYINT, if the values fit within TINYINT. Optimize the query: look at the execution plan, the index strategy, and cardinality estimation problems
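Pushing the column selection and conversions into the source query can be sketched as follows; the table and column names are examples:

```sql
-- Narrower buffer rows: only the needed columns, converted at the source
SELECT CAST(ProductKey  AS INT)          AS ProductKey,
       CAST(StatusCode  AS TINYINT)      AS StatusCode,   -- values fit in 0-255
       CAST(ProductName AS NVARCHAR(50)) AS ProductName
FROM   dbo.SourceProduct;
```

Doing the CAST here, instead of in a Data Conversion or Derived Column transformation, avoids the extra output columns in the physical buffers.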

178 WITH (NOLOCK) hint (1/2) Used to minimize locking contention — but this is usually not where your performance problem is. Same as the READ UNCOMMITTED isolation level: no shared locks are issued, so it does not prevent other transactions from modifying data read by the current transaction, and it is not blocked by exclusive locks that would otherwise prevent the current transaction from reading rows that have been modified but not committed by other transactions. Risks: dirty reads, missing already committed rows, and reading rows twice in case of a page split

179 WITH (NOLOCK) hint (2/2) Use it if and only if you are 100% sure nobody is writing to the table while you read — which is quite possible in a Data Warehouse batch flow. Don't use it during extracts from source systems: dirty reads. It is NOT best practice to always use WITH (NOLOCK); it is the exception to the rule. Just DON'T do it, unless you know this is the place where you are having problems (it typically isn't)

180 Blocking iterators in SQL ServerExample: sort All data is held up and sorted, before any rows are returned to SSIS

181 OPTION (FAST n) Sometimes we would like the first rows of data into the data flow faster, despite a longer overall query time — for example, with a slow transformation downstream. OPTION (FAST n) optimizes the plan for the first n rows and can help remove blocking iterators: data gets into SSIS faster, at a cost to overall query performance. Set n = DefaultMaxBufferRows. Source: The SSIS tuning tip that everyone misses, http://sqlblog
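With the data flow's default DefaultMaxBufferRows of 10,000, the source query sketch becomes (table and columns are examples):

```sql
-- Optimize the plan for the first buffer's worth of rows,
-- so the first buffer reaches the data flow sooner
SELECT ProductKey, ProductName, ListPrice
FROM   dbo.SourceProduct
ORDER BY ProductKey
OPTION (FAST 10000);   -- n = DefaultMaxBufferRows
```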

182

183 Demo Extract optimisation

184 Blocking transformations Typically most used: Sort, Aggregate. Others: Fuzzy Grouping, Fuzzy Lookup, Row Sampling, Term Extraction. Try to avoid these at (almost) all cost, especially for large data loads — they can potentially allocate a lot of private buffers

185 Most used blocking transformations Sort: push it down to the source if possible (ORDER BY) and set the IsSorted property on the source component output columns; MaximumThreads property, default = -1 → no restriction on threads. Aggregate: GROUP BY; IsBig property — set to 1 if column values exceed 4 billion or a precision beyond a float data type is required, in which case the DT_R8 data type is used instead of DT_R4. Count results are stored as the DT_UI8 data type, distinct count results as DT_UI4

186 Data type conversion Transformations: Data Conversion, Derived Column. These can be quite CPU-intensive and add extra columns to the physical buffers: the Data Conversion transformation converts the data in an input column to a different data type and then copies it to a new output column, and Derived Column adds a new column with the converted data. Push the conversion down to the source with CONVERT() or CAST()

187 Lookup transformation optimisation Select only the columns needed: with "Use a table or a view", the default behavior is to load the entire table into flat buffers (SELECT *); choose "Use results of an SQL query" instead (SELECT a, b FROM dbo.MyTable). Select the right cache mode: full cache, partial cache, no cache — or combine full cache and partial cache. Use a Cache Connection Manager if multiple data flows need the same lookup — for example, multiple fact load packages that look up the same dimension surrogate key

188 Lookup transformation optimisation Enable the memory limit to cap the amount of memory the Lookup can use. New rows are added to the cache individually: the database is queried whenever data for a row cannot be located in the cache — singleton (single-row) queries, in contrast to one large full-table-scan query — so index the lookup columns accordingly. Benefit: large lookups can be performed that might not be possible otherwise; there is a performance cost to the singleton queries, but not as bad as one might think. Useful if the lookup only matches a few rows of a very large lookup table

189 OLE DB Command A non-blocking transformation, but it operates row by row and can be very, very slow. Alternative: insert the data into a staging table and perform set-based T-SQL after the data flow execution has finished

190 Slowly changing dimensionsAlex Whittles wrote an MSc dissertation on “Performance comparison of techniques to load type 2 slow changing dimension in a Kimball data warehouse” Alex tested four different methods Slowly changing dimension wizard Lookup Merge join T-SQL MERGE “T-SQL Merge and SSIS Merge Join methods offered significantly higher performance than the other methods in most tests” “Merge Join […] should be preferred for higher volume scenarios, where the number of new or changed rows reached and exceeds 500k.”

191 Demo Transformation optimisation

192 SQL Server Destinations SQL Server Destination: shared memory. OLE DB Destination: TCP/IP or named pipes. Table lock, WITH (TABLOCK): acquire a table lock on the destination table instead of acquiring multiple row locks — fewer locks, and it avoids lock escalation

193 Check constraints in OLE DB Destination WITH (CHECK_CONSTRAINTS) can be un-checked if the data does not violate any constraints: any CHECK and FOREIGN KEY constraints are then ignored during the load, and the constraints on the table are marked as not trusted. SQL Server won't use untrusted constraints to build better execution plans, so you need to recheck the constraints afterwards: ALTER TABLE dbo.MyTable WITH CHECK CHECK CONSTRAINT MyConstraint
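Finding the constraints that a bulk load left untrusted, and rechecking them, can be sketched as (dbo.MyTable / MyConstraint are example names):

```sql
-- Which constraints did the bulk load leave untrusted?
SELECT OBJECT_NAME(parent_object_id) AS table_name, name AS constraint_name
FROM   sys.check_constraints
WHERE  is_not_trusted = 1
UNION ALL
SELECT OBJECT_NAME(parent_object_id), name
FROM   sys.foreign_keys
WHERE  is_not_trusted = 1;

-- Recheck so the optimizer can trust the constraint again
ALTER TABLE dbo.MyTable WITH CHECK CHECK CONSTRAINT MyConstraint;
```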

194 Fast load in OLE DB Destination Data Access Mode: "Table or view name - fast load" or "Table or view name variable - fast load". What happens with fast load? The equivalent of BULK INSERT in T-SQL. If not used: the equivalent of INSERT INTO … SELECT — row-by-row inserts. BULK INSERT can use minimally logged operations; it can also be fully logged, but still has less overhead than row-by-row inserts

195 Fully logged vs minimally logged inserts Fully logged inserts: every row insert is tracked in the transaction log. Minimally logged inserts: only extent allocation and metadata changes are tracked — only the information that is required to recover the transaction. Smaller footprint on the transaction log, but does not support point-in-time recovery

196 Methods for minimal logged insertsFast load in OLE DB Destination BULK INSERT BCP INSERT INTO … SELECT SELECT INTO

197 Fully logged inserts (1/3) SQL Server allocates the needed extents for the data-file pages. If the table is not empty, SQL Server might have to insert rows into already existing data-file pages; if a page does not exist in the buffer pool, a physical read occurs. Rows are inserted into the allocated pages in the buffer pool, and the pages are marked as dirty — they will eventually be flushed to disk when a checkpoint occurs, or, if the buffer pool is under pressure, the lazy writer will flush them to disk

198 Fully logged inserts (2/3) Log records for the inserted rows are created and inserted into log blocks; when log blocks fill up, they are asynchronously flushed to the transaction log on disk. At commit, the remaining log blocks are flushed to the transaction log on disk — but data pages are not flushed to disk at commit time

199 Fully logged inserts (3/3)

BEGIN TRAN
INSERT INTO dbo.MyTable
SELECT a, b, c FROM dbo.MyOtherTable
COMMIT TRAN

(Diagram: rows land in the buffer pool as dirty data pages (D) and log blocks (L); the log blocks are flushed to the log file, the data pages later to the data file)

200 Minimally logged inserts (1/3) SQL Server allocates the needed extents for the data-file pages. If the table is not empty, SQL Server might have to insert rows into already existing data-file pages; if a page does not exist in the buffer pool, a physical read occurs. Rows are inserted into the allocated pages in the buffer pool, and the pages are marked as dirty; the eager writer process asynchronously flushes dirty pages to the data files on disk

201 Minimally logged inserts (2/3) Log records are created for extent allocation and metadata only and inserted into log blocks; when log blocks fill up, they are asynchronously flushed to the transaction log on disk — a much smaller footprint than fully logged inserts. Extent allocation uses a special type of page called ML map pages, used to keep track of the extents that have been modified by a bulk insert; when taking a transaction log backup under the BULK_LOGGED recovery model, SQL Server can include these extents from the data file in the backup. At commit, the remaining log blocks are flushed to the transaction log on disk, and the remaining dirty pages are flushed to the data files

202 Minimally logged inserts (3/3)

BEGIN TRAN
BULK INSERT dbo.MyTable FROM ...
COMMIT TRAN

(Diagram: dirty data pages (D) in the buffer pool are flushed by the eager writer to the data file; only extent allocation and metadata log records (L) go to the log file)

203 Minimally logged operations — prerequisites SIMPLE or BULK_LOGGED recovery model (this will affect your backup strategy). The target table must not be replicated. The target table must not be a memory-optimised table

204 Inserting into a heap A target table lock is required: WITH (TABLOCK). Works for both empty and non-empty heaps
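A minimally logged heap insert in T-SQL can be sketched as follows (example table names; assumes the prerequisites above, i.e. SIMPLE or BULK_LOGGED recovery and a non-replicated target):

```sql
-- TABLOCK on the target heap is what qualifies this insert for minimal logging
INSERT INTO dbo.TargetHeap WITH (TABLOCK) (a, b, c)
SELECT a, b, c
FROM   dbo.StageTable;
```

The OLE DB Destination's fast load with Table lock checked issues the equivalent of this.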

205 Trace flag 610 Controls minimally logged inserts into indexes. Can be turned on: as a SQL Server startup parameter; for a specific session with DBCC TRACEON (610); or globally for all sessions with DBCC TRACEON (610, -1). Not every row inserted into a clustered index is minimally logged: a new page allocated during the bulk insert → minimally logged; rows inserted into existing pages → fully logged (the page is marked as dirty in the buffer pool and later written to disk during a checkpoint); rows moved as the result of a page split → fully logged

206 Trace flag 610 and bulk insert: there might be scenarios where inserting into b-trees (indexes) is slower under trace flag 610 (minimal-logging-for-sql-server-2008.aspx). Use the largest BATCHSIZE possible. When a batch is committed, all dirty pages not caught by an earlier checkpoint or the eager writer must be flushed to disk and written to the data file, which can generate a lot of random I/O. With fully logged operations, records for the inserts are written to the transaction log, which causes sequential I/O, and dirty pages are not flushed to disk at commit time but during checkpoint. With a slow I/O subsystem and a small inserted data set, minimally logged inserts could therefore be slower than fully logged inserts.
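A sketch of a bulk insert with an explicit batch size (the file path, table name, and batch size are placeholders):

```sql
-- BATCHSIZE controls how many rows are committed per batch; a larger
-- batch means fewer commit-time flushes of dirty pages to the data file
BULK INSERT dbo.MyTable
FROM 'C:\loads\myfile.dat'
WITH (TABLOCK, BATCHSIZE = 1000000);
```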

207 Minimally logged conditions (Source: The Data Loading Performance Guide)

Table Indexes    | Rows in table | Hints          | Without TF 610 | With TF 610 | Concurrent possible
Heap             | Any           | TABLOCK        | Minimal        | Minimal     | Yes
Heap             | Any           | None           | Full           | Full        | Yes
Heap + Index     | Any           | TABLOCK        | Full           | Depends (2) | No
Cluster          | Empty         | TABLOCK, ORDER | Minimal        | Minimal     | Yes (1)
Cluster + Index  | Any           | None           | Full           | Depends (2) | Yes (1)

(1) Concurrent loads only possible under certain conditions. Only rows written to newly allocated pages are minimally logged. (2) Depending on the plan chosen by the optimizer, the non-clustered index on the table may be either fully or minimally logged.

208 Inserting into a clustered index: source data ordered by the cluster key; specify the ORDER hint in FastLoadOptions. Minimally logged if the table is empty or trace flag 610 is enabled. A user needs to be sysadmin to set trace flags, so the SSIS service account needs sysadmin if DBCC TRACEON is used within the control flow. Use WITH EXECUTE AS 'user' in a stored procedure to call DBCC TRACEON, where 'user' has sysadmin permissions. Alternatively, set trace flag 610 at startup of SQL Server, but be sure that it does not create unexpected behaviour in other parts of the database server. Use the same session for the two control flow tasks; trace flag 610 needs to be enabled for the data flow task.
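A sketch of the stored-procedure approach suggested on the slide ('TraceFlagUser' is a placeholder; note that for the impersonated database user's server-level permissions to apply, the user must map to a sysadmin login and the database typically needs to be marked TRUSTWORTHY):

```sql
-- Wrap DBCC TRACEON in a procedure so the SSIS service account
-- does not itself need sysadmin to enable the trace flag
CREATE PROCEDURE dbo.EnableTF610
WITH EXECUTE AS 'TraceFlagUser'
AS
BEGIN
    DBCC TRACEON (610);
END;
```

Call dbo.EnableTF610 from a control flow task, and make sure the data flow task runs on the same session (or enable the flag globally instead).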

209 Maximum Insert Commit Size (MICS), also known as Batch Size. MICS > buffer size: one commit for every buffer. MICS = 0: the entire batch is committed in one big commit; this might cause the running package to stop responding if the OLE DB destination and another data flow component are updating the same table. MICS < buffer size: commit after every MICS rows and after every buffer. Any constraint failure at the destination causes the entire batch of rows defined by Maximum insert commit size to fail.

210 Inserting into a table with nonclustered indexes: a sort must happen for every index. Set a smaller commit size; a smaller commit size makes the sort fit in memory, but beware of fragmentation. Another approach: disable/drop the nonclustered indexes, insert the data, then enable/create the nonclustered indexes again.
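A sketch of the disable/rebuild approach (index and table names are placeholders; disabling only the nonclustered index keeps the table itself accessible):

```sql
-- Disable the nonclustered index before the load
ALTER INDEX IX_MyTable_Col1 ON dbo.MyTable DISABLE;

-- ... bulk insert the data here ...

-- Rebuild it afterwards; the rebuild repopulates the index in one pass
ALTER INDEX IX_MyTable_Col1 ON dbo.MyTable REBUILD;
```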

211 Delta load: change existing rows in the target table instead of reloading the table. Change Data Capture (SQL Server 2008 and above, Oracle). Delta detection by comparing the source input with the target table; it is often not possible to only load new rows from the source, as old rows can be updated "back in time". This can be a very costly operation, requiring maintenance of special indexes and checksums just for this purpose. Often it is fastest to just reload the target table, e.g. if the target table has changed by >10%.
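A sketch of enabling Change Data Capture on a table (requires SQL Server Agent to be running; schema/table names are placeholders):

```sql
-- Enable CDC at the database level first
EXEC sys.sp_cdc_enable_db;

-- Then enable CDC for the table to track
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'MyTable',
    @role_name     = NULL;  -- NULL = no gating role for reading change data
```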

212 Buffer tuning. Goal: have enough memory! Use a small number of large buffers and fit as many rows into a buffer as possible: remove unused columns and narrow the data types. DefaultBufferSize: default 10 MB, max 100 MB. DefaultBufferMaxRows: default 10,000. DefaultBufferMaxRows = DefaultBufferSize / row size.
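As a worked example of the formula above, assuming a made-up estimated row size of 1,200 bytes and the default 10 MB buffer:

```sql
-- Rows per buffer = DefaultBufferSize / row size (integer division)
SELECT (10 * 1024 * 1024) / 1200 AS RowsPerBuffer;
-- returns 8738
```

So with 1,200-byte rows, DefaultBufferMaxRows could be raised from the default 10,000 only if DefaultBufferSize is also increased.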

213 Buffer tuning approach: start with the default values. Enable logging on the data flow task and select the BufferSizeTuning event to see how many rows are contained in each buffer. Increase DefaultBufferSize and DefaultBufferMaxRows, check the BufferSizeTuning logging, measure performance, and make sure there is no spooling to disk. Repeat.

214 EngineThreads: a property on each data flow, default 10 (with multiple data flows, each has its own EngineThreads). Increase it to avoid thread starvation. The data flow engine will not use more threads than are needed, even if the value is higher, and might use more threads than the value to avoid concurrency issues. Starting point: 1 engine thread per execution tree, minimum 10; increase for complex execution trees. A general rule is to not have more threads than the number of available cores, to avoid context switching.

215 BLOBs: try to avoid BLOBs if possible. Half the buffer is used for in-row data, half for BLOB data; BLOB data is spooled to disk if it does not fit in memory. Set the BufferTempStoragePath and BLOBTempStoragePath properties (default: the TMP/TEMP environment variable, usually on C:). Minimise spooling by sizing DefaultBufferSize & DefaultBufferMaxRows.

216 Minimise spooling of BLOBs by sizing DefaultBufferSize & DefaultBufferMaxRows. Approach: find the max BLOB buffer, e.g. DefaultBufferSize = 100 MB gives MaxBufferSize(blob) = 100 MB / 2 = 50 MB. Estimate the size of the BLOB data (SSIS Operational and Tuning Guide): the average length + 2 standard deviations of the length of all your BLOB data will cover approximately 98% of it; since it is likely that more than one row will be contained in a single SSIS buffer, this ensures that almost no spooling will take place. SELECT CAST(AVG(DATALENGTH(MyBlobCol)) + (2 * STDEV(DATALENGTH(MyBlobCol))) AS INT) AS Length FROM dbo.MyTable. Then set DefaultBufferMaxRows < MaxBufferSize(blob) / estimated size of the BLOB data.

217 BLOB horror story from the wild: a new source system with not much data to begin with, but after a while it took a long time to extract the data. All columns were either NVARCHAR(MAX) or DATETIME. Solution: CONVERT all NVARCHAR(MAX) columns to proper data types.

218 Demo Load optimisation

219 Parallelism (1/2): designing for parallelism is key to great performance in SSIS. Divide and conquer: split complex packages up into multiple packages. My approach, one package for each: extract table (extract a source table/file in a 1:1 format to an extract table in the data warehouse), dimension load (load data from extract tables into one dimension table), fact load (load data from extract tables into one fact table). Orchestrate with a master package: first execute all extract packages, second execute all dimension load packages, finally execute all fact load packages.

220 Parallelism (2/2): execute multiple packages in parallel. Extract packages are not dependent on each other. Dimension load packages are usually not dependent on each other, except in the case of snowflaking. Be careful of locking and blocking when SELECTing from extract tables; if you use the WITH (NOLOCK) hint, make sure nothing is writing to the table, to avoid dirty reads. Fact load packages might or might not be dependent on each other.
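A sketch of the hinted read from an extract table (dbo.ExtractMyTable is a placeholder name):

```sql
-- NOLOCK avoids blocking parallel loads, but only use it when
-- nothing is writing to the table, to avoid dirty reads
SELECT col1, col2
FROM dbo.ExtractMyTable WITH (NOLOCK);
```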

221 Parallelism in the data flow. Remember: each execution tree can have one or more threads depending on the work, and each data flow component can receive at most one thread. Consider splitting transformation work into multiple transformations; this gives the data flow engine the possibility of assigning multiple threads to the work. Example: a single derived column transformation or script component that does a lot of work can be split into several transformations/components, each doing less work. See also the Balanced Data Distributor.

222 Derived column division of work (example from: SSIS Performance Design Patterns, Matt Masson & John Welch, SQLBits X). A single Derived Column does a lot of work, with expressions on multiple columns, and only one thread is allocated to the transformation. Split it into multiple transformations so the data flow engine can allocate multiple threads to the work.

223 Derived column division of work (example from: SSIS Performance Design Patterns, Matt Masson & John Welch, SQLBits X). A single derived column resolves the expressions for each column in serial; dividing the work allows the expressions for each column to be executed in parallel, if the data flow engine assigns multiple threads to the execution tree. [Diagram: buffers flowing through an execution tree of derived columns into an async component.]

224 davidpeterhansen.com/ssis-internals

225 References / Further reading: Getting Optimal Performance with Integration Services Lookups; Data Warehouse Junkie – SSIS posts; SSIS Performance Design Patterns; Previously committed rows might be missed if NOLOCK hint is used; SQL Server NOLOCK Hint & other poor ideas; Collecting information about your SSIS packages; Integration Services: Performance Tuning Techniques; We Loaded 1TB in 30 Minutes with SSIS, and So Can You

226 References / Further reading: SQL Server 2005 Integration Services: A Strategy for Performance; SSIS Process Control Performance counters; SSIS: Some PerfMon information; Top 10 SQL Server Integration Services Best Practices; 8 Ways to Optimize and Improve Performance of your SSIS Package; SQL Server Integration Services SSIS Best Practices; SSIS Nugget: Engine Threads

227 References / Further reading: Data Flow Performance Features; Scaling Heavy Network Traffic with Windows; Data Flow Optimization Checklist; SSIS Catalog - Backup and Restore; Package Format Changes in SQL Server Denali; Designing SSIS Packages for High Performance; A Look at the SSIS Object Model; Developer's Guide (Integration Services)

228 References / Further reading: Integration Services Programming Overview; Understanding Synchronous and Asynchronous Transformations; Semi-blocking Transformations in SQL Server Integration Services; SSIS Operational and Tuning Guide; Getting Optimal Performance with Integration Services Lookups; SQL Server 2012 SSIS Project Versions