Monday, March 21, 2011

CONTROL FLOW TASK


DTS (Data Transformation Services) is the predecessor of SSIS (SQL Server Integration Services).
Control Flow Tasks and Data Flow Tasks:
1.) Control Flow Tasks:
Tasks that work with data, the file system and network locations. These are the workflow engine of the package and define the structure of the control flow.
These tasks can also interact with .NET technologies, ActiveX components and the command line.
Data Flow task:
The Data Flow task is commonly used in SSIS packages that load data warehouses. Separate Data Flow tasks may be created to populate staging tables. For example, one Data Flow task may populate the dimension tables and another Data Flow task may populate the fact tables.
Bulk Insert task:
The Bulk Insert task is particularly useful in situations where data needs to be retrieved from a system that cannot be connected to SSIS. Text files can act as a great intermediary data source and the Bulk Insert task will minimise the time required to import the data into a SQL Server table.
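Under the covers, the Bulk Insert task issues a T-SQL BULK INSERT statement. A minimal sketch of the equivalent statement (the table name, file path and delimiters here are illustrative assumptions):

```sql
-- Load a pipe-delimited text file into a staging table
-- (dbo.StagingCustomers and the file path are hypothetical)
BULK INSERT dbo.StagingCustomers
FROM 'C:\ImportFiles\Customers.txt'
WITH (
    FIELDTERMINATOR = '|',   -- column delimiter in the text file
    ROWTERMINATOR   = '\n',  -- row delimiter
    FIRSTROW        = 2      -- skip the header row
);
```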
Execute SQL task:
You can use the Execute SQL task for the following purposes:
Truncating a table or view in preparation for inserting data
Creating, altering and dropping database objects such as tables and views
Recreating fact and dimension tables before loading data into them
Running stored procedures
Saving the rowset returned from a query into a variable
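Each of these purposes corresponds to a simple T-SQL statement that the Execute SQL task can run; a sketch using hypothetical object names:

```sql
-- Truncate a table in preparation for inserting data
TRUNCATE TABLE dbo.FactSales;

-- Drop and recreate a dimension table before loading it
IF OBJECT_ID('dbo.DimCustomer', 'U') IS NOT NULL
    DROP TABLE dbo.DimCustomer;
CREATE TABLE dbo.DimCustomer (
    CustomerKey  INT IDENTITY(1, 1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL
);

-- Run a stored procedure
EXEC dbo.usp_LoadDimCustomer;
```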
File System task:
A File System task may be used to create a folder on the file system so that it is ready to receive files that are retrieved from elsewhere. For example, an FTP task may use the folder created by the File System task as a destination to move files.
FTP task:
An FTP task is a useful medium to retrieve files from a third-party source on the Internet. Using the FTP task, you can connect to an FTP site, retrieve files and load the files into a local folder that may be used as data within the SSIS package.
XML task:
Note that the XML task is not used to connect to an XML document that will be used as source data to be loaded into a separate destination. The XML task allows you to manipulate the XML data itself. An example could include using the Merge function to merge the contents of many XML documents into a single XML file for consolidation purposes.
Send Mail task:
The Send Mail task is commonly used within SSIS packages to send an e-mail message to SSIS administrators about failures that occur within the package itself. This provides a useful notification mechanism if a package fails and you need to be informed about the failure.
Web Service task:
Suppose you have an SSIS package that is used to populate a data warehouse and process an Analysis Services cube.
A Web Service task could be used at the end of a package to call the ReportingService2005 Web service. You can use the SetExecutionOptions method within the class to create a snapshot of the report that uses the data being populated.
This way, the report snapshots are created after the data warehouse is loaded and the cube processed, rather than relying on the snapshot being created on a schedule.
Message Queue task:
The Message Queue task is useful in scenarios where your package is dependent on receiving data from other sources. For example, you may have a central data warehouse that relies on updates to the fact table from branch offices. There may be a number of Message Queue tasks that wait to receive the data from each branch office. On arrival, the Message Queue task will pass the data into your SSIS package to add the data to the fact table. When all of the branch office data has arrived it will then allow the SSIS package to continue and complete the load of the data warehouse.
Script task:
The Script task is a powerful feature that helps you to overcome a situation where a built-in task or transformation cannot provide the required functionality. Examples can include connecting to custom data sources.
ActiveX Script task:
Prior to SSIS, the ActiveX Script task was used in Data Transformation Services (DTS), the predecessor to SSIS, to perform loops within DTS packages.
Within SSIS, the For Loop Container and Foreach Loop Container can now provide the same functionality. Consider replacing ActiveX Script tasks that perform this logic with the For Loop Containers and Foreach Loop Containers.
Execute Process task:
Consider using the Execute Process task if your SSIS package can only interact with a third-party application through a batch file.
Create the batch file to interact with the third-party application first, and then use the Execute Process task to call the batch file when needed within the package.
Transfer Database task:
The Transfer Database task is a useful task if you want to move a database from one instance of SQL Server that is on old hardware to a new instance of SQL Server that is held on a server with new hardware.
Organizations may also use this task to create a copy of the database on a development SQL Server.
Transfer SQL Server Objects task:
Unlike the Transfer Database task, the Transfer SQL Server Objects task allows you to be more selective about the specific objects to move between databases in SQL Server.
Use this task when you want to incorporate objects from a SQL Server database into your own database.
Transfer Error Messages task:
Suppose you create user-defined error messages for use within a SQL Server instance. SSIS provides a task exclusively for the purpose of transferring these user-defined messages from one SQL Server instance to another, negating the need to manually recreate the messages.
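The user-defined messages that this task moves are the ones registered on the source instance with sp_addmessage; for example (the message number and text are illustrative):

```sql
-- User-defined message numbers must be 50001 or higher
EXEC sp_addmessage
    @msgnum   = 50001,
    @severity = 16,
    @msgtext  = N'Customer %s failed validation during the load.';
```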
Transfer Jobs task:
Similar to the Transfer Error Message task, you can use the Transfer Job task to move all or specific jobs from one instance of the SQL Server Agent to another without the need to manually recreate the job.
Transfer Logins task:
You can use the Transfer Logins task to move logins from one instance of SQL Server to another without having to manually recreate them.
Transfer Master Stored Procedures task:
The Transfer Master Stored Procedures task can only be used if you save your own user-defined stored procedures within the master database.
Additional categories of Control Flow tasks are available in SSIS that provide additional functionality, which can be used to interact with components outside of SSIS.
a.)Package Execution Tasks:
Execute Package task:
The Execute Package task is commonly used to bring together multiple SSIS packages and add constraints to control the order in which the packages should run.
This can be useful when loading a data warehouse as well. You can encapsulate separate packages to load the different parts of the data warehouse and then integrate those separate packages within a single package to control the workflow by using the Execute Package task.
Execute DTS 2000 Package task:
Although DTS support is enabled within SSIS, consider upgrading DTS packages to SSIS packages, because this is a deprecated feature available for backward compatibility only.
b.) Analysis Services Tasks:
Analysis Services Processing task:
The Analysis Services Processing task is commonly used in SSIS packages that load data warehouses.
Typically, this is the last task added to the package and is used to process objects within Analysis Services after the data has been loaded into the data warehouse.
Data Mining Query task:
The Data Mining Query task helps you to issue DMX prediction queries against a pre-existing data mining structure.
If your data load requirements involve automated prediction on existing data, you can use the Data Mining Query task to fulfill this requirement.
Analysis Services Execute DDL task:
You can use the Analysis Services Execute DDL task if you want to automate the creation of Analysis Services objects such as cubes or dimensions.
c.) Maintenance Tasks:
These include the Back Up Database task, Check Database Integrity task, Execute SQL Server Agent Job task, Notify Operator task, Execute T-SQL Statement task, History Cleanup task, etc.
d.) Windows Management Instrumentation Tasks:
WMI Event Watcher task:
A WMI Event Watcher task can be used to monitor an event in the Windows system that triggers a package to execute.
For example, a WMI Event Watcher task could be used to monitor files being added to a folder. On the event occurring, the WMI Event Watcher receives a notification that can then allow the package to run using the files that have been loaded. The WQL to monitor files added to a folder is:
SELECT * FROM __InstanceCreationEvent WITHIN 10
WHERE TargetInstance ISA "CIM_DirectoryContainsFile"
AND TargetInstance.GroupComponent = "Win32_Directory.Name=\"c:\\\\WMIFileWatcher\""
WMI Data Reader task:
The WMI Data Reader task runs a WQL query against WMI and returns the results, which can be stored in a variable and passed on to other tasks within the SSIS package.

WORKING WITH PRECEDENCE CONSTRAINTS AND CONTAINERS:
PRECEDENCE CONSTRAINTS:
They help define the workflow by prioritising different tasks.
This lets you control what should happen when a task succeeds, fails or completes. Expressions can also be used to further refine the behaviour of the workflow logic.
A control flow container lets you group different tasks together and control whether they run in parallel or in a specific order. You can add tasks to the additional containers that are available in the Toolbox in Business Intelligence Development Studio to provide greater flexibility.

PRECEDENCE CONSTRAINTS:
You can use outcome-based precedence constraints that dictate the flow of the control flow based on success, failure or completion of a task.
You can also use expression-based precedence constraints that control the flow of the Control Flow task based on a predefined expression being met.
You can create precedence constraints based on a combination of outcome and expression-based precedence constraints.

When a Control Flow task is added to the Control Flow Designer, a green arrow appears underneath the Control Flow task.
You can click and drag the arrow to connect the precedence executable to another constrained executable.
This will indicate that on successful execution of the precedence executable, the control flow can move on to the constrained executable.
If you click on the task again, a green arrow will appear under the original Control Flow task. You can click and drag this to another constrained executable.
You can also change the precedence constraint so that it indicates on completion.
This means that the control flow will move on to the next Control Flow task regardless of whether the preceding task succeeds or fails, and the arrow will appear in blue.
You can also change the precedence constraint so that it indicates on failure.
This will indicate that on failed execution of the precedence executable, the control flow can move on to the constrained executable that is connected to it, and the arrow will appear in red.
If expressions are used to control the tasks in the control flow, a blue line will appear with an fx icon appearing next to the control flow.

Multiple Constraints:
Any Control Flow task can have multiple precedence constraints connecting to it.
When this situation occurs, you need to configure when the constrained executable should execute.
If the constrained executable should execute only when all precedence constraints are satisfied, configure the precedence constraint properties so that the Multiple Constraints property is set to Logical AND.
If only one of the precedence constraints must complete before the constrained task will execute, set the Multiple Constraints property to Logical OR.
These settings allow you to refine the control for complex SQL Server Integration Services packages.

CONTAINERS:
Control flow containers enable you to group a set of Control Flow tasks or containers within a single container so that they can be organized and managed as a single entity.
Containers can also have precedence constraints connecting to and from them.
SQL Server Integration Services provides three types of containers:
Sequence containers
For Loop containers
Foreach Loop containers


A Sequence container is typically used to contain a number of tasks so that they can act as one logical unit within the control flow.
If the container has a precedence constraint, all the tasks within the container must complete first before the control flow path moves onto the next Control Flow task.
Furthermore, properties that are defined at the container level are inherited by the tasks within it, such as the DisableEventHandlers property.
Sequence containers can also act as a scope for variables.

For Loop containers allow you to define a repeating workflow that uses an expression and loops until the expression evaluates to false.
Within the For Loop container, you must define an initialisation value as the basis to start the loop.
The EvalExpression property specifies the condition for exiting the loop, and the AssignExpression property is used to increment the loop counter.
The For Loop container is useful when you know how many times the container should run through a loop.
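As a sketch, a For Loop container that runs exactly ten times could be configured with the following property values, assuming a hypothetical Int32 package variable named User::Counter:

```
InitExpression:   @Counter = 0
EvalExpression:   @Counter < 10
AssignExpression: @Counter = @Counter + 1
```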

When the number of loops required is not known, a Foreach Loop container can loop through a collection of objects until no objects remain.
Objects can include files, ADO recordsets or XML nodes.
A task can be performed on the objects as they are enumerated by the Foreach Loop container.

WORKING WITH VARIABLES:
To work with variables in a package, choose Variables from the SSIS menu.
Variables add dynamic capabilities to a package: they can be used to pass information between two different components, and they add flexibility to the package logic. There are two types of variables:
1.) System Defined Variables
2.) User Defined Variables
Static values can be assigned to a variable within the variable definition itself, by setting a default value. You can also dynamically assign values to a variable from Control Flow tasks. For example, an Execute SQL task can be used to perform a count of the rows within a result set. You can then use the Result Set page of the Execute SQL task to pass the result to a variable. The variable can then be used as an input to other Control Flow tasks within the package.
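The query behind such an Execute SQL task can be as simple as the following; the single-row result would then be mapped to a hypothetical User::RowCount variable in the task editor:

```sql
-- Count the rows in a (hypothetical) staging table
-- and map the StagingRowCount column to User::RowCount
SELECT COUNT(*) AS StagingRowCount
FROM dbo.StagingCustomers;
```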

SYSTEM DEFINED VARIABLES:
1.) System Variables in a Package:
Package-based variables are system variables that hold information about the package. Some of the common variables include the following:
System::PackageName. The name of the package.
System::PackageID. The unique ID of the package.
System::MachineName. The machine on which the package was executed.
System::UserName. The name of the user that executed the package.
System::StartTime. The time at which the package started to execute.
2.)Task-based variables
Task-based variables can be specific to the task that you add to the SSIS package including the following:
System::TaskName. The name of the task.
System::TaskID. The ID for the task within the package.
System::TaskTransactionOption. This variable holds information about the transaction setting that the task is using.
3.)Container-based variables
The container-based system variables can be used with Sequence, For Loop and Foreach Loop containers. Not all of the system variables are applicable to all container types. Some of the common variables include the following:
System::ParentContainerGUID. Holds the globally unique ID of the parent container that holds the container.
System::ContainerStartTime. Holds the time when the container started to execute.
System::LocaleID. Holds the value of the locale that the container uses.

USER DEFINED VARIABLES:
At the top of the Variables window are buttons that let you create or delete variables as well as view other variables within a package. The scope defines the level within a package at which the variable is created. For example, if you select a Foreach Loop container and then click the Add Variable button on the toolbar of the Variables window, the variable is scoped at that level. If no tasks or containers are selected, the variable is scoped at the package level.

USING SCRIPT TASK AND DATA PROFILING TASK:
SCRIPT TASK:
It is used within an SSIS package to execute VB.NET or C# code. If variables are to be used, you need to list them in the ReadOnlyVariables or ReadWriteVariables property in the Script Task Editor, depending on whether you will only read the variable or update it.
DATA PROFILING TASK:
It is used to review data source entities to check for data cleanliness and completeness and to understand how the data is organized structurally.
It has two parts:
1.) The Data Profiling task in the control flow, which performs the analysis
2.) The Data Profile Viewer, which is used to review the results
Data Profiling task features:
Column Null Ratio Profile: returns the percentage of NULLs in a column relative to the total number of rows
Column Statistics Profile: returns statistics such as minimum, maximum, average and standard deviation values
Column Value Distribution Profile: reports the distinct values in a column and the percentage of rows that each value represents
Column Length Distribution Profile: reports the distinct lengths of string values in a column and the percentage of rows for each length
Column Pattern Profile: displays any patterns found in the column data
Candidate Key Profile: identifies one or more columns that are unique across all rows
Functional Dependency Profile: reports the extent to which the values in one column depend on the values in another column or set of columns
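To illustrate what the Column Null Ratio Profile computes, the equivalent hand-written T-SQL would look something like this (the table and column names are hypothetical):

```sql
-- Percentage of NULLs in a column relative to the total number of rows
SELECT 100.0 * SUM(CASE WHEN Email IS NULL THEN 1 ELSE 0 END) / COUNT(*)
       AS NullRatioPercent
FROM dbo.StagingCustomers;
```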
