Monday, March 28, 2011

Configuring and Deploying an SSIS Package

PACKAGE CONFIGURATION:
When you work in BIDS, all the values that you set up during development are stored as XML in the package's .dtsx file, which is the package itself. Whenever a package is executed, the first action it takes is to look at its configurations and overwrite the package's current settings with the settings from those configurations.
Package configurations are optional but they provide certain benefits like:
1.)Letting you update package settings without opening the package in BIDS.
2.)Providing a central location for settings that can be shared between packages.
The common elements that are configured using the config file are:
Connection properties
Package Properties
Task and Container Properties
Package Variable Properties
Note: Data Flow properties cannot be configured by using package configurations; you can only configure the Data Flow task container itself.

ENABLING SSIS PACKAGE CONFIGURATION:

Open the package-->SSIS menu-->Package Configurations
Configuration Type:
The most commonly used configuration types are the XML configuration file and SQL Server configurations.
Creating an XML File Configuration:
There are 2 ways of specifying the XML configuration file:
1.)Hard-coding the configuration file location by specifying a name in the box provided.
2.)Using an environment variable that stores the location of the file, by creating a system environment variable in the computer's system properties.
This is called an indirect configuration and is useful when the XML file location or file name might change.
Control Panel-->System and Security-->System-->Advanced System Settings-->Advanced-->Environment Variables.
Creating a SQL Server Configuration:
Packages can share SQL Server configurations even if they execute on different servers, because the SQL Server configuration does not have to be stored on the local server.
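When you choose a SQL Server configuration, the Package Configuration Wizard creates (or reuses) a configuration table in the database you point it at. As a rough sketch, the default table it generates looks like the following; the table name [SSIS Configurations] and the column list are the wizard defaults, so verify them in your own environment:
-- Default configuration table generated by the Package Configuration Wizard (verify names locally).
CREATE TABLE [dbo].[SSIS Configurations]
(
    ConfigurationFilter NVARCHAR(255) NOT NULL, -- groups related configuration entries
    ConfiguredValue     NVARCHAR(255) NULL,     -- the value applied to the property
    PackagePath         NVARCHAR(255) NOT NULL, -- e.g. \Package.Variables[User::ServerName].Properties[Value]
    ConfiguredValueType NVARCHAR(20)  NOT NULL  -- data type of the configured value
);
-- Reviewing or changing the stored settings is simply a query or update against this table:
SELECT ConfigurationFilter, PackagePath, ConfiguredValue
FROM [dbo].[SSIS Configurations];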
Note: Which configuration type to use:
1.)Use a SQL Server configuration if multiple packages need to share a configuration and the packages execute on different computers.
2.)If packages are executed together (that is, one package executes another, which executes another), use the parent package variable configuration.
3.)If you plan to move packages from a development server to a test server and then to production, and each environment needs to use a different location, use an XML configuration file (ideally with an indirect, environment-variable-based path).
4.)If more than one configuration entry needs to be put into a single SSIS configuration, either an XML configuration file or a SQL Server configuration can be used.
Adding Properties to Your Configurations:
If entries already exist in the configuration you selected, SSIS prompts you either to reuse the existing settings or to overwrite them (which clears the existing configuration entries).
If configuration entries do not exist, or if you click Overwrite, you see the Select Properties To Export page.
The Select Properties To Export page lets you choose the properties to include in the SSIS configuration. Properties are grouped into the following folders:
Variables
Connection Managers
Log Providers
Properties
Executables
Sharing and Ordering Configurations:
If there are several configurations, they are applied in the order in which they are listed in the Package Configurations Organizer.

LIST OF PLACES THAT CAN USE EXPRESSIONS IN THE SSIS PACKAGE:
Derived Column Transformation
Conditional Split
For Loop
Variable Expression
Property expressions, which include package properties at the control flow level as well as container and task properties accessed through editors and the Properties window.
PROPERTY EXPRESSIONS:
Property expressions let you update control flow properties during package execution.
This is done by first selecting the property in the Expressions collection of the task, container or package through the Properties window, and then setting the expression for that property.

DEPLOYING AN SSIS PACKAGE:
After the configurations and expressions are set in a package, it is ready to be moved to a new environment. The deployment process requires 2 steps:
1.)Creating a package installer set by using the package deployment utility
2.)Deploying the installer set to a new destination.
PACKAGE DEPLOYMENT:
When packages are moved from one environment to another there are 2 options for package storage.
SSIS Deployment to the File System:
Packages can be placed on file shares or in file system folders, from which they can be executed.

SSIS Deployment to SQL Server:
Packages can be imported into SQL Server, from which they can be executed.
When you deploy a package to SQL Server, the package is stored in the msdb database, in a table named sysssispackages.
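For example, on SQL Server 2008 you can list the packages that have been deployed to msdb with a query along these lines (a quick sketch; the exact column set varies between SQL Server versions):
SELECT name, description, createdate, vermajor, verminor
FROM msdb.dbo.sysssispackages;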
Deployment can be manual or driven through a utility.
1.)Manual Package Deployment:
Connect to the SSIS service in SQL Server Management Studio, expand the Stored Packages node, and use Import Package to import the package into either the MSDB database or the package store file system.
2.)Deployment Utility:
The deployment utility is a built-in SSIS tool that deploys an entire project at once to either the file system or SQL Server. It first generates a set of files that are packaged and ready to be deployed; you then run the Package Installation Wizard to perform the deployment.

Creating an Installer Kit by Using the Deployment Utility:
Each SSIS Project has a property for enabling the deployment utility. When this property is enabled each time the project is built or debugged, the following items are copied to a single file location:
Packages
Misc files
Project Deployment Manifest(the installer kit config file)
To turn on the property follow the following steps:
1.)Right click SSIS project and click properties
2.)Set the CreateDeploymentUtility property to True.
3.)Set the location in which the deployment files will be stored.
4.)The AllowConfigurationChanges property is set to True by default, but it can be set to False if you need to prevent configuration changes during deployment.
5.)click OK
The next step is to build the SSIS project so that a deployment set is generated: right-click the SSIS project and click Build.
Note: Turn off CreateDeploymentUtility when you are finished, because any time any package in the project is debugged, a new deployment set will be created. This overwrites the existing deployment set, so set the property to False after creating a deployment set.

Deploying Packages:
After the installer kit is created, the next task is to take the set of files and deploy them to a destination by using the Package Installation Wizard. The following steps show how it is done:
1.)Move the installer kit to the destination computer. (If the destination is a network share, the deployment files can remain in place.)
2.)Double-click the file [ProjectName].SSISDeploymentManifest to open it. When you run the wizard, you have the option to deploy the package to either a SQL Server database or the file system.
3.)On the Deploy SSIS Packages page, select either a file system or a SQL Server deployment as the destination.
4.)Based on what is selected, you can then define the specific storage location.
5.)If the project includes configuration files, the wizard now lets you update and change the values in these configuration files.

Note: When deploying, the Package Installation Wizard copies the dependent files to the folder you specify. This includes XML configuration files used by any package in the project (but only where the XML configuration path is hard-coded and does not use an environment variable). The wizard also updates the XML configuration references in the packages so that they point to the new location.

Using the SSIS DTUtil Command-Line Utility:
SSIS ships with DTUtil, which lets you perform package management tasks such as deployment from the command line or in batch-file processes.
Package storage locations are referenced with the /SQL, /FILE and /DTS parameters for sources and destinations.
Commands:
/COPY
/MOVE
/DELETE
/EXISTS
/ENCRYPT
/DECRYPT
/SIGN: Digitally signs a package
/FCREATE: Creates and manages folders in the SSIS package store and in the SQL Server msdb database.

eg1: The example below copies a package file in a local folder to the same folder with a new file name:
dtutil.exe /FILE c:\MyPackage.dtsx /COPY FILE;c:\MyCopiedPackage.dtsx
eg2: To delete the package MyPackage stored in msdb:
dtutil.exe /SQL MyPackage /DELETE
eg3: To check for the existence of the package in msdb:
dtutil.exe /SQL MyPackage /EXISTS
eg4: To deploy a package from the file system to SQL Server:
dtutil.exe /FILE c:\MyCopiedPackage.dtsx /COPY SQL;MyCopiedPackage
When DTUtil is executed through an automated command line, use the following exit codes to interpret the results:
0 Success
1 Failure
4 Package not found
5 Package cannot load
6 Bad Syntax

Friday, March 25, 2011

Handling data flow errors and Debugging

I.)CONTROL FLOW/CONTROL FLOW ERRORS:
1.)Checkpoints and Transactions:
Within SSIS packages, you can implement checkpoints to reduce the time it takes to rerun a failed package, by restarting it from the point of failure. In other words, you might not want successful tasks to run again when you restart the package.
Here is what happens when you run a package that has checkpoints enabled:
1.)The package checks whether the checkpoint file exists.
If the file does not exist (for example, it has been deleted), the package begins at the first task.
If the file exists, the package reads the file to find out where to start (including restoring the values of variables and connections as they were at the time of the last failure).
After you enable checkpoints in the package (by setting the package-level SaveCheckpoints property to True, specifying a CheckpointFileName and setting CheckpointUsage to IfExists), the final step is to set checkpoints at the various tasks within your package. To do this, set the FailPackageOnFailure property of each task or container to True.
If a series of tasks must be completed as a single unit, in which either all the tasks succeed and are committed or an error occurs and none of the tasks are committed, place the tasks within a Sequence container and set the container's TransactionOption property to Required.
To ensure transactional consistency of the data that is affected by the SSIS package, you can make use of transaction settings on control flow executable.
This will ensure that if the executable succeeds then the data is committed, if it should fail it will roll back the work performed by the executable to ensure data integrity.
The IsolationLevel property defines how strongly the transaction is isolated from other concurrent work; Chaos is the lowest isolation level (most concurrency) and Serializable is the highest (least concurrency).

2.)Identifying Package Status:
The MaximumErrorCount control flow property lets a package continue to run to completion even after errors have occurred. The value specifies the number of errors that can occur before the package stops executing and reports failure.
When package execution finishes in BIDS, the Execution Results tab (the Progress tab while the package is running) lists the error and warning descriptions, execution times, final destination row counts and other execution information.

3.)Configuring Execution Logging:
Logging is used to understand what aspect of the package caused the package failure so that you can identify problems quickly.
There are 2 types of logging:
Standard logging (the built-in SSIS logging feature, which logs information about standard events such as OnError, OnQueryCancel and OnPreExecute) and custom logging (where you set up an event handler and query system and user-defined variables to log specific information to the destination of your choice).

Implementing Standard Logging:
There are 3 steps involved in enabling logging in a package.
1.)Choosing the container to log
2.)Setting the log provider type
The most common log providers are the text file and SQL Server providers, but you can also log to the Windows event log, SQL Server Profiler or an XML file.
3.)Selecting the log events:
The most common are OnError, OnPostExecute, OnProgress and OnTaskFailed.
If you select SQL Server as the log provider, the log details are stored in a system table named dbo.sysssislog in the database that the log provider's connection manager points to.
To view the logged events, execute:
select *
from dbo.sysssislog
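In practice it is more useful to filter the log, for example to see only the error events, most recent first (a sketch; the columns below exist in the standard sysssislog table, but adjust to your own logging choices):
SELECT source, starttime, message
FROM dbo.sysssislog
WHERE event = 'OnError'
ORDER BY starttime DESC;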

Custom Logging:
Event Handlers For Control Flow :
Note: In the data flow, data viewers give you the ability to debug problems while data is being processed. The control flow is different: taking advantage of the visual capabilities of Visual Studio, the control flow supports visual debugging and breakpoint features.
This section looks first at event handlers and then explores the debugging capabilities of the control flow.

If an error occurs, an event handler can send an alert or fix a data problem. You can provide zero to many event handlers for a package. Event handlers are a component of SQL Server Integration Services that execute event handler executables only when a defined event is encountered on a given control flow container or task within an SSIS package.
You can turn off the event handlers for any task or container by setting its DisableEventHandlers property to True.

The most common event handler used is an OnError event.
The MaximumErrorCount property on a control flow executable determines the number of allowable errors to occur.
If the threshold of the MaximumErrorCount property is met, the control flow executable throws an error.
If an OnError event handler is configured for the executable that threw the error, the event handler will execute event handler executables that are defined against the control flow executable.
Using the OnError event handler is the equivalent of using an outcome-based precedence constraint set to failure.

The OnQueryCancel event would be raised if you were to stop a package while the package was executing.

The event handler executables that are available for you to use include the same containers and tasks that are available within a Control Flow component of SSIS.

Each package contains a set of system variables that are updated at the various levels of package execution. With event handlers, you can capture these variables and values, which provide information such as ErrorCode, ErrorDescription and SourceName.
You can create a package template with the desired error-handling settings and place it in:
C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems. The template can then be reused by opening the Project menu on the Visual Studio menu bar and clicking Add New Item.

4.)Connecting Control Flow Objects with Precedence:
Precedence constraints are used only in the control flow, not in the data flow. The connectors between objects in the control flow are precedence constraints; the connectors in the data flow are data paths.
Precedence Constraints:
These connectors are green (indicating execution success), red (the task failed) or blue (the task completed).
If a task fails, the red (failure) or blue (completion) constraints are followed. Even a failed task is considered completed, so blue completion constraints are still followed to the next task or container.
Constraints Features:
1.)Solid Lines represent AND precedence
2.)Dashed lines are logical OR statements
3.)Function symbols (fx) show that an SSIS expression has been embedded in the constraint.
Evaluation Operation Property:
1.)Constraint: the execution outcome of the preceding task (success, failure or completion).
2.)Expression: a Boolean expression that must evaluate to True or False.
The Evaluation Operation drop-down list includes the following options:
1.)Constraint
2.)Expression
3.)Expression and Constraint
4.)Expression or Constraint


II.)DATA FLOW ERRORS:
1.)Data Paths and Error Paths:
Data paths are the green connectors that go from one component to another. These paths carry the rows that have successfully passed through a component, as well as the error rows when the error output is set to Ignore Failure.
Error paths are the red connectors between data flow components. They carry the data rows that failed in a component when the error output is set to Redirect Row.
Not all data flow components have error paths; the Multicast component, for example, does not, because it performs no operations on the data itself.
There are 3 options for configuring error output:
1.)Fail Transformation:
Causes the data flow to fail if an error is encountered
2.)Ignore Failure Option:
Lets the row continue down the normal green data path, but the value that caused the error is changed to NULL.
3.)Redirect Rows
Sends the error rows out the red error path. This is the way to handle errors with separate components.
Uses of Error Path output:
a.) When text file source data does not match the data type specified in the data flow.
b.)When the lookup transformation does not find a match.

2.)Data Viewers:
They help to identify data flow issues.
There are 4 types of data viewers: grid, histogram, scatter plot and column chart.
Data viewers do not work when package is run from command line.

III.)DEBUGGING
The Progress window shows the steps that the SSIS package runs through and colour-codes the tasks as they execute, to provide a visual indication of which tasks are in progress, which have succeeded and which have failed.

You can also use the built-in row counts within the Data Flow component to indicate the number of data rows passing through the data flow.

You can make use of data viewers to stop the execution of a package and view the state of data as it passes through the data flow.

Furthermore, you can use the Data Profiling task for more in-depth analysis of the data to ensure that the data is being transformed and loaded as expected.

Breakpoints can be applied to both SSIS package components and within the Script task to pause the execution of a package under specific conditions so you can debug the SSIS package at the point the breakpoint causes the package to suspend.
When you debug a package that contains multiple Script tasks, the debugger hits breakpoints in only one Script task and will ignore breakpoints in the other Script tasks. If a Script task is part of a Foreach Loop or For Loop container, the debugger ignores breakpoints in the Script task after the first iteration of the loop.

Debugging the Control Flow with Breakpoints:
Breakpoints function in the control flow and not in the data flow. For scripting, that means you can set breakpoints only in the control flow Script task and not in the data flow Script component.
In the Debug menu, open Windows-->Locals; this window shows the values of variables when the package is paused at a breakpoint.

Tuesday, March 22, 2011

DATA FLOW TASKS

Data Flow Tasks
One of the most valuable control flow tasks is the Data Flow task. It contains 3 types of objects.
1.)Data Flow Source Adapters: ADO.NET source (provides connections to tables or queries through an ADO.NET provider), Excel source, Flat File source, OLE DB source (connects to installed OLE DB providers, such as SQL Server, Access, SSAS and Oracle), Raw File source and XML source.
You can also define more than one data source.
2.)Data Flow Transformations:Transformations are used to modify the data so that it can be standardized.
SQL Server Integration Services provides a wide variety of transformations to meet an organization’s requirements.
Each transformation contains different properties to control how the data is changed
3.)Data Flow Destination Adapters: You then define the data destinations into which the transformed data is loaded.
Most data flow sources have matching destination adapters. In addition, there are destination-only adapters such as Data Mining Model Training, DataReader Destination, Dimension Processing, Partition Processing, Recordset Destination, SQL Server Compact Destination and SQL Server Destination.
To use the destination optimization technique that bulk loads rows instead of inserting them one row at a time, set the data access mode to Table or View - Fast Load.

DATA FLOW PATHS:
Data Flow paths can be simply used to connect a data source directly to a data destination. Typically, you use a Data Flow path to determine the order in which a transformation takes place; specifying the path that is taken should a transformation succeed or fail.
2 Paths:
1.)Success data flow path: green arrow.
2.)Error data flow path: red arrow. You are not required to configure error output data flow paths for every data flow component.

DATA VIEWERS:
A data viewer is a useful debugging tool that enables you to view the data as it passes through the data flow between two data flow components.
The different types of viewers are Grid, Histogram, Scatter Plots and column Charts.
Data viewers can be added by right clicking the data flow path.
You can use a data viewer only when running a package in the debug environment. Any data viewers configured in the package are ignored when you run the package programmatically or from the command line.

WORKING WITH DATA TRANSFORMATIONS:
Transformations give the ability to manipulate the data in the data flow. It allows you to change the data as the data is being moved from a source connection to a destination connection such as a text file to a table within a database.
1.)ROW-LEVEL TRANSFORMATIONS:
These transformations operate on each row without needing other rows from the source.
Uses of this category include performing mathematical calculations, converting a text value to a numerical value, and replacing NULL values.
The transformations in this category include Character Map, Copy Column, Data Conversion, Derived Column and Row Count. (Transformations such as Sort and Aggregate, which need the whole rowset, operate at the rowset level rather than the row level.)

Note:The Cache transformation can improve the performance of the Lookup transformation.
2.)MULTI-INPUT OR MULTI-OUTPUT TRANSFORMATIONS:
These transformations can work with more than one data input or generate more than one data output.
They include:
Conditional Split:
The Conditional Split transformation takes a single data flow input and creates multiple data flow outputs based on conditional expressions defined within the transformation. The order of the conditional expressions is important: if a record satisfies the first condition, the data is routed based on that condition even if it also meets the second expression, so the record is no longer available to be evaluated against the second condition. An expression can be a combination of functions and operators that defines a condition.
Lookup:
Allows matching between pipeline column values and values in an external reference table.
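Conceptually, a Lookup joins the pipeline rows to a reference table; fetching a surrogate key, for example, is roughly equivalent to the following T-SQL (the table and column names are hypothetical):
-- stg.Sales stands in for the pipeline input, dbo.DimCustomer for the reference table.
SELECT s.*, c.CustomerKey
FROM stg.Sales AS s
JOIN dbo.DimCustomer AS c
    ON c.CustomerAltKey = s.CustomerCode;
-- In the Lookup transformation, rows with no match go to the error output
-- (or are ignored or redirected, depending on how the error output is configured).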
Merge Join:
The Merge Join transformation is similar to the Merge transformation, but you can choose how the data is joined by using a FULL, LEFT or INNER join, as in Transact-SQL. Like the Merge transformation, the input columns must be sorted and the columns that are joined must have compatible data types.
Merge:
The Merge transformation takes multiple inputs into the transformation and merges the data together from the separate inputs. A prerequisite to the merge input working successfully is that the input columns are sorted. Furthermore, the columns that are sorted must also be of compatible data types. For example, you cannot merge the input that has a character data type with a second input that has a numeric data type.
Multicast:
The Multicast transformation allows you to output multiple copies of the same data flow input to different data flow outputs.
Union All:
The Union All transformation is very similar to the Merge transformation. The key difference is that the Union All transformation does not require the input columns to be sorted. However, the columns that are mapped must still have compatible data types.
Note: Merge and Merge Join require data to be sorted before.
The Merge transformation is similar to the Union All transformations. Use the Union All transformation instead of the Merge transformation in the following situations:
The transformation inputs are not sorted.
The combined output does not need to be sorted.
The transformation has more than two inputs.
3.)ADVANCED DATA PREPARATION TRANSFORMATION:
These let you perform advanced operations on the rows in the data flow pipeline.
OLE DB Command transformation:
Performs database operations such as updates and deletes, one row at a time, based on parameters mapped from input columns.
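The statement configured in an OLE DB Command transformation is a parameterized statement in which ? placeholders are mapped to input columns, for example (hypothetical table and columns):
-- Each input row supplies the parameter values for one execution of the statement.
UPDATE dbo.Customer
SET EmailAddress = ?   -- mapped to an input column
WHERE CustomerID = ?;  -- mapped to an input column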
Slowly Changing Dimension:
Processes dimension changes, including tracking dimension history and updating dimension values. It handles common dimension change types such as Historical Attributes (a new row is added so that 2 rows exist for the record and history is preserved), Fixed Attributes (changes are not permitted) and Changing Attributes (the existing row data is replaced).
Data Mining Query
Applies input rows against a data mining model for prediction.
Fuzzy Grouping
The Fuzzy Grouping transformation allows you to standardise and cleanse data by identifying likely duplicate rows and comparing them to a canonical (reference) row of data that is used to standardise the input data. A connection to SQL Server is required, because the Fuzzy Grouping transformation needs temporary tables to perform its work.
Fuzzy Lookup
The Fuzzy Lookup transformation performs data cleansing tasks such as standardizing data, correcting data and providing missing values.
Using the fuzziness capability that is available to the Fuzzy Grouping transformation, this logic can be applied to Lookup operations so that it can return data from a dataset that may closely match the Lookup value required. This is what separates the Fuzzy Lookup transformation from the Lookup transformation, which requires an exact match. Note that the connection to SQL Server must resolve to a user who has permission to create tables in the database.
Script Component
Provides VB.NET or C# scripting capabilities against rows, columns, inputs and outputs.
Term Extraction
Analyzes text input for English nouns and noun phrases.
Term Lookup
Analyzes the text input column against a user-defined set of words for association.

Monday, March 21, 2011

CONTROL FLOW TASK

CONTROL FLOW TASK:

DTS (Data Transformation Services) is the predecessor of SSIS (SQL Server Integration Services).
Control Flow Tasks and Data Flow Tasks:
1.) Control Flow Tasks:
Control flow tasks include tasks for working with data, the file system and network locations. They are the workflow engine of the package and define the structure of the control flow.
These tasks can also interact with .NET technologies, ActiveX components and command-line processes.
Data Flow task:
The Data Flow task is commonly used in SSIS packages that load data warehouses. Separate Data Flow tasks may be created, for example, one to populate the staging tables, another to populate the dimension tables and another to populate the fact tables.
Bulk Insert task:
The Bulk Insert task is particularly useful in situations where data needs to be retrieved from a system that cannot be connected to SSIS. Text files can act as a great intermediary data source and the Bulk Insert task will minimise the time required to import the data into a SQL Server table.
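The task drives the same bulk-load mechanism you can invoke yourself with the T-SQL BULK INSERT statement; a minimal sketch with hypothetical file and table names:
-- Roughly what the Bulk Insert task performs against the destination table.
BULK INSERT dbo.StagingSales
FROM 'C:\Imports\sales.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);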
Execute SQL task:
You can use the Execute SQL task for the following purposes (a few illustrative statements follow the list):
Truncating a table or view in preparation for inserting data
Creating, altering and dropping database objects such as tables and views
Recreating fact and dimension tables before loading data into them
Running stored procedures
Saving the rowset returned from a query into a variable
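Illustrative T-SQL for the purposes above (the object names are hypothetical):
-- Truncate a staging table in preparation for inserting data.
TRUNCATE TABLE dbo.StagingSales;
-- Run a stored procedure.
EXEC dbo.usp_LoadDimCustomer;
-- Return a single-row result whose value can be mapped to an SSIS variable
-- on the Result Set page of the Execute SQL Task Editor.
SELECT GETDATE() AS LoadStartTime;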
File System task:
A File System task may be used to create a folder on the file system so that it is ready to receive files that are retrieved from elsewhere. For example, an FTP task may use the folder created by the File System task as a destination to move files.
FTP task:
An FTP task is a useful medium to retrieve files from a third-party source on the Internet. Using the FTP task, you can connect to an FTP site, retrieve files and load the files into a local folder that may be used as data within the SSIS package.
XML task:
Note that the XML task is not used to connect to an XML document that will be used as source data to be loaded into a separate destination. The XML task allows you to manipulate the XML data itself. An example could include using the Merge function to merge the contents of many XML documents into a single XML file for consolidation purposes.
Send Mail task:
The Send Mail task is commonly used within SSIS packages to send an e-mail message to SSIS administrators about failures that occur within the package itself. This provides a useful notification mechanism if a package fails and you need to be informed about the failure.
Web Service task:
Suppose you have an SSIS package that is used to populate a data warehouse and process an Analysis Services cube.
A Web Service task could be used at the end of a package to call the ReportService2005 Web service. You can use the SetExecutionOptions method within the class to create a snapshot of the report that uses the data which is being populated.
This way, the report snapshots are created after the data warehouse is loaded and the cube processed, rather than relying on the snapshot being created on a schedule.
Message Queue task:
The Message Queue task is useful in scenarios where your package is dependent on receiving data from other sources. For example, you may have a central data warehouse that relies on updates to the fact table from branch offices. There may be a number of Message Queue tasks that wait to receive the data from each branch office. On arrival, the Message Queue task will pass the data into your SSIS package to add the data to the fact table. When all of the branch office data has arrived it will then allow the SSIS package to continue and complete the load of the data warehouse.
Script task:
The Script task is a powerful feature that helps you to overcome a situation where a built-in task or transformation cannot provide the required functionality. Examples can include connecting to custom data sources.
ActiveX Script task:
Prior to SSIS, the ActiveX Script task was used in Data Transformation Services (DTS); the predecessor to SSIS, to perform loops within DTS packages.
Within SSIS, the For Loop Container and Foreach Loop Container can now provide the same functionality. Consider replacing ActiveX Script tasks that perform this logic with the For Loop Containers and Foreach Loop Containers.
Execute Process task:
Consider using the Execute Process task if your SSIS package can only interact with a third-party application through a batch file.
Create the batch file to interact with the third-party application first, and then use the Execute Process task to call the batch file when needed within the package.
Transfer Database task:
The Transfer Database task is a useful task if you want to move a database from one instance of SQL Server that is on old hardware to a new instance of SQL Server that is held on a server with new hardware.
Organizations may also use this task to create a copy of the database on a development SQL Server.
Transfer SQL Server Objects task:
The Transfer SQL Server Objects task allows you to be more selective about the specific objects to move between databases in SQL Server, unlike the Transfer Database task.
Use this task when you want to incorporate objects from a SQL Server database into your own database.
Transfer Error Messages task:
Suppose you create user-defined error messages for use within a SQL Server instance. SSIS provides a task exclusively for transferring these user-defined messages from one SQL Server instance to another, removing the need to manually recreate the messages.
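The user-defined messages in question are the ones registered with sp_addmessage; for example (the message number and text are illustrative only):
-- A user-defined message that the Transfer Error Messages task could copy to another instance.
EXEC sp_addmessage @msgnum = 50001, @severity = 16,
     @msgtext = N'Customer %s failed validation.';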
Transfer Jobs task:
Similar to the Transfer Error Message task, you can use the Transfer Job task to move all or specific jobs from one instance of the SQL Server Agent to another without the need to manually recreate the job.
Transfer Logins task:
You can use the Transfer Logins task to move logins from one instance of SQL Server to another without having to manually recreate them.
Transfer Master Stored Procedures task:
The Transfer Master Stored Procedures task can only be used if you save your own user-defined stored procedures in the master database.
Additional categories of Control Flow tasks are available in SSIS that provide additional functionality, which can be used to interact with components outside of SSIS.
a.)Package Execution Tasks:
Execute Package task:
The Execute Package task is commonly used to bring together multiple SSIS packages and add constraints to control the order in which the packages should run.
This can be useful when loading a data warehouse as well. You can encapsulate separate packages to load the different parts of the data warehouse and then integrate those separate packages within a single package to control the workflow by using the Execute Package task.
Execute DTS 2000 Package task:
Although DTS support is enabled within SSIS, consider upgrading DTS packages to SSIS packages, because this is a deprecated feature available for backward compatibility only.
b.)Analysis Services Tasks:
Analysis Services Processing task:
The Analysis Services Processing task is commonly used in SSIS packages that are used to load data warehouses.
Typically, this is the last task that is added to SSIS and is used to process objects within Analysis Services after the data has been loaded into the data warehouse.
Data Mining Query task:
The Data Mining Query task helps you to issue DMX prediction queries against a pre-existing data mining structure.
If your data load requirements involve automated prediction on existing data, you can use the Data Mining Query task to fulfill this requirement.
Analysis Services Execute DDL task:
You can use the Analysis Services Execute DDL task if you want to automate the creation of Analysis Services objects such as cubes or dimensions.
c.)Maintenance Tasks:
Backup Database task, Check Database Integrity task, Execute SQL Server Agent Job task, Notify Operator task, Execute T-SQL Statement task, History Cleanup task, etc.
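As a point of reference, the Execute SQL Server Agent Job task is comparable to starting a job yourself in T-SQL (the job name here is hypothetical):
-- Start an existing SQL Server Agent job.
EXEC msdb.dbo.sp_start_job @job_name = N'Nightly DW Load';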
d.) Windows Management Instrumentation Tasks:
WMI Event Watcher task:
A WMI Event Watcher task can be used to monitor events in the Windows system that trigger a package to execute.
For example, a WMI Event Watcher task could be used to monitor files being added to a folder. On the event occurring, the WMI Event Watcher receives a notification that can then allow the package to run using the files that have been loaded. The WQL to monitor files added to a folder is:
SELECT * FROM __InstanceCreationEvent WITHIN 10 WHERE TargetInstance ISA "CIM_DirectoryContainsFile" and
TargetInstance.GroupComponent= "Win32_Directory.Name=\"c:\\\\WMIFileWatcher\""
WMI Data Reader task:
The results returned by the WMI Data Reader task can be stored in a variable and passed on to other tasks within the SSIS package.

WORKING WITH PRECEDENCE CONSTRAINTS AND CONTAINERS:
PRECEDENCE CONSTRAINTS:
They help define the workflow by controlling the order in which tasks run and the conditions under which they run.
This helps you to control what should happen should a task succeed, fail or complete. Expressions can also be used to further refine the behavior of the workflow logic.
A control flow container lets you group different tasks together to know how tasks are run in parallel or in a specific order. You can add tasks to additional containers that are available in the Toolbox in Business Intelligence Development Studio to provide greater flexibility.

PRECEDENCE CONSTRAINTS:
You can use outcome-based precedence constraints that dictate the flow of the control flow based on success, failure or completion of a task.
You can also use expression-based precedence constraints that control the flow of the Control Flow task based on a predefined expression being met.
You can create precedence constraints based on a combination of outcome and expression-based precedence constraints.

When a Control Flow task is added to the Control Flow Designer, a green arrow appears underneath the Control Flow task.
You can click and drag the arrow to connect the precedence executable to another constrained executable.
This will indicate that on successful execution of the precedence executable, the control flow can move on to the constrained executable.
If you click on the task again, a green arrow will appear under the original Control Flow task. You can click and drag this to another constrained executable.
You can also change the control flow so that the arrow indicates on completion.
This means that it will move onto the next Control Flow task regardless of the Control Flow task succeeding or failing and will appear in blue.
You can also change the control flow so that the arrow indicates on failure.
This will indicate that on failed execution of the precedence executable, the control flow can move onto the constrained executable that is connected to it and will appear in red.
If expressions are used to control the tasks in the control flow, a blue line will appear with an fx icon appearing next to the control flow.

Multiple Constraints:
Any Control Flow task can have multiple precedence constraints connecting to it.
When this situation occurs, you need to configure when the constrained executable should execute.
If the constrained executable can only execute if all precedence constraints are completed configure the precedence constraint properties so that the Multiple Constraints property is set to a Logical AND.
If only one of the precedence constraints must complete before the constrained task will execute, configure the precedence constraint properties so that the Multiple Constraints property is set to a Logical OR.
These settings allow you to refine the control for complex SQL Server Integration Services packages.

CONTAINERS:
Control flow containers enable you to group a set of Control Flow tasks or containers within a single container so that they can be organized and managed as a single entity.
Containers can also have precedence constraints connected to them and from them.
SQL Server Integration Services provides three types of containers:
Sequence containers
For Loop containers
Foreach Loop containers


A Sequence container is typically used to contain a number of tasks within it, so that the container can act as one logical unit within the Sequence container.
If the container has a precedence constraint, all the tasks within the container must complete first before the control flow path moves onto the next Control Flow task.
Furthermore, properties that are defined at the container level are inherited by the tasks within it, such as the DisableEventHandlers property.
Sequence containers can also act as a scope for variables.

For Loop containers allow you to define a repeating workflow that uses an expression and loops until the expression evaluates to false.
Within the For Loop container, you define an initialization expression (InitExpression) as the basis to start the loop.
The EvalExpression property specifies the condition for exiting the loop, and the AssignExpression property is used to increment the loop counter.
The For Loop container is useful when you know how many times the container should run through a loop.

When the number of loops required is not known, a Foreach Loop container can loop through a collection of objects until the object no longer exists.
Objects can include files, ADO recordsets or XML nodes.
A task can be performed on the objects as they are enumerated by the Foreach Loop container.

WORKING WITH VARIABLES:
To work with variables in a package, choose Variables from the SSIS menu.
Variables can be used to add dynamic capabilities and to pass information between two different components in a package. They add flexibility to the package logic.
1.) System Defined Variables
2.) User Defined Variables
Static values can be assigned to a variable within a variable definition itself, by setting a default value. You can also dynamically assign values to a variable from control flow tasks. For example, an Execute SQL task can be used to perform a count of the rows within a result set. You can then use the result mapping page of the Execute SQL task to pass the result to a variable. The variable can then use this as an input to other control flow tasks within the package.
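As a sketch, the Execute SQL task in that example would run a statement like the following (the table name is hypothetical), with the single-row result mapped on the Result Set page to a variable such as User::StagingRowCount:
SELECT COUNT(*) AS StagingRowCount
FROM dbo.StagingSales;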

SYSTEM DEFINED VARIABLES:
1.)Package-based variables:
Package-based variables are system variables that hold information about the package. Some of the common variables include the following:
System::PackageName. The name of the package.
System::PackageID. The unique ID of the package.
System::MachineName. The machine on which the package was executed.
System::UserName. The name of the user that executed the package.
System::StartTime. The time at which the package started to execute.
2.)Task-based variables
Task-based variables can be specific to the task that you add to the SSIS package including the following:
System::TaskName. The name of the task.
System::TaskID. The ID for the task within the package.
System::TaskTransactionOption. This variable holds information about the transaction setting that the task is using.
3.)Container-based variables
The container-based system variables can be used with Sequence, For Loop and Foreach Loop containers. Not all of the system variables are applicable to all container types. Some of the common variables include the following:
System::ParentContainerGUID. Holds the globally unique ID of the parent container that holds the container.
System::ContainerStartTime. Holds the time when the container started to execute.
System::LocaleID. Holds the value of the locale that the container uses.

USER DEFINED VARIABLES:
At the top of the Variables window are buttons that let you create or delete variables as well as view other variables within the package. The scope defines at what level within a package the variable is created. For example, if you select a Foreach Loop container and then click the Add Variable button on the Variables window toolbar, the variable is scoped to that container. If no tasks or containers are selected, the variable is scoped to the entire package.

USING SCRIPT TASK AND DATA PROFILING TASK:
SCRIPT TASK:
The Script task is used within an SSIS package to execute VB.NET or C# code. If variables are needed in the script, you must list them in the ReadOnlyVariables or ReadWriteVariables property of the Script Task Editor, depending on whether the script only reads the variable or also updates it.
DATA PROFILING TASKS:
It is used to review data source entities to check for data cleanliness and completeness and to understand how the data is organized structurally.
It has 2 parts:
1.)The Data Profiling task in the control flow, which performs the analysis.
2.)The Data Profile Viewer, which is used to review the results.
Data Profiling task features (an illustrative query follows the list):
Column Null Ratio Profile: Returns the percentage of NULLs in a column relative to the total number of rows.
Column Statistics Profile: Reports statistics such as minimum, maximum, average and standard deviation for numeric and date columns.
Column Value Distribution Profile: Reports the distinct values in a column and the percentage of rows each value represents.
Column Length Distribution Profile: Reports the distinct lengths of string values in a column and the percentage of rows for each length.
Column Pattern Profile: Displays any patterns (regular expressions) found in the column data.
Candidate Key Profile: Identifies one or more columns that are unique across all rows.
Functional Dependency Profile: Reports the extent to which the values in one column depend on the values in another column or set of columns.
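To make a couple of the profiles concrete, here is roughly what they compute, expressed as T-SQL against a hypothetical table and columns:
-- Column Null Ratio Profile: percentage of NULLs in a column.
SELECT 100.0 * SUM(CASE WHEN EmailAddress IS NULL THEN 1 ELSE 0 END) / COUNT(*) AS NullRatioPct
FROM dbo.Customer;
-- Candidate Key Profile: a column is a candidate key when its distinct count equals the row count.
SELECT COUNT(DISTINCT CustomerCode) AS DistinctValues, COUNT(*) AS TotalRows
FROM dbo.Customer;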