Friday, March 25, 2011

Handling Data Flow Errors and Debugging

I.) CONTROL FLOW ERRORS:
1.) Checkpoints and Transactions:
Within SSIS packages, you can implement checkpoints to reduce the time it takes to rerun a failed package from its point of failure. In other words, you might not want successful tasks to run again when you restart the package.
Here is what happens when you run a package that has checkpoints enabled:
1.) The package checks whether the checkpoint file exists.
If the file does not exist (for example, it was deleted), the package begins at the first task.
If the file exists, the package reads it to find out where to start (including restoring the values of variables and connections as they were at the time of the last failure).
After you enable checkpoints in a package, the final step is to set checkpoints at the various tasks within the package. To do this, set the FailPackageOnFailure property of each such task or container to True.
If a series of tasks must be completed as a single unit (either all the tasks succeed and are committed, or an error occurs and none of the tasks are committed), place the tasks within a Sequence Container and set the container's TransactionOption property to Required.
To ensure transactional consistency of the data affected by an SSIS package, you can make use of the transaction settings on control flow executables.
This ensures that if the executable succeeds, the data is committed; if it fails, the work performed by the executable is rolled back to preserve data integrity.
The IsolationLevel property defines the level of concurrent access allowed while the task runs. Chaos allows the most concurrency and Serializable the least.
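Putting these settings together, a typical configuration might look like the following sketch (the file path and the choice of tasks are hypothetical; the property names are the standard SSIS ones):

```
-- Package-level properties (enable checkpoint restart)
SaveCheckpoints      = True
CheckpointFileName   = C:\Temp\LoadPackage.chk
CheckpointUsage      = IfExists

-- On each task or container that should be a restart point
FailPackageOnFailure = True

-- On a Sequence Container whose tasks must commit or roll back together
TransactionOption    = Required
IsolationLevel       = Serializable
```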

2.) Identifying Package Status:
The MaximumErrorCount control flow property lets a package continue running to completion even if errors occur. It specifies the number of errors that can occur before the package stops executing and reports failure.
Whenever a package is executed outside the debug environment, the Execution Results tab lists the error and warning descriptions, execution times, final destination row counts, and other execution information.

3.) Configuring Execution Logging:
Logging is used to understand which aspect of the package caused a failure so that you can identify problems quickly.
There are 2 types of logging:
Standard (the built-in logging feature provided by SSIS, which logs information about standard events used by event handlers, such as the OnError, OnQueryCancel, and OnPreExecute events) and custom logging (you set up an event handler and query system and user-defined variables to log specific information to the destination of your choice).

Implementing Standard Logging:
There are 3 steps involved in enabling logging in a package.
1.) Choosing the container to log
2.) Setting the log provider type:
The most common providers are text files and SQL Server, but you can also log to the Windows Event log, SQL Server Profiler, or an XML file.
3.) Selecting the log events:
The most common are OnError, OnPostExecute, OnProgress, and OnTaskFailed.
If you select SQL Server as the log provider, the log details are stored in a system table named dbo.sysssislog.
To retrieve the log entries, execute:
select id, event, source, starttime, message
from dbo.sysssislog
order by starttime desc

Custom Logging:
Event Handlers for Control Flow:
Note: In the data flow, data viewers give you the ability to debug problems while data is being processed. The control flow is different: taking advantage of the visual properties of Visual Studio, the control flow supports visual debugging and breakpoint features.
We will first look at event handlers and then explore the debugging capabilities of the control flow.

If an error occurs, an event handler can send an alert or fix a data problem. You can provide zero to many event handlers for a package. Event handlers are a component of SQL Server Integration Services that execute event handler executables only when a defined event occurs on a given control flow container or task within an SSIS package.
You can turn off the event handlers for any task or container by setting its DisableEventHandlers property to True.

The most common event handler used is an OnError event.
The MaximumErrorCount property on a control flow executable determines the number of allowable errors to occur.
If the threshold of the MaximumErrorCount property is met, the control flow executable throws an error.
If an OnError event handler is configured for the executable that threw the error, the event handler will execute event handler executables that are defined against the control flow executable.
Using the OnError event handler is the equivalent of using an outcome-based precedence constraint set to failure.

The OnQueryCancel event would be raised if you were to stop a package while the package was executing.

The event handler executables that are available for you to use include the same containers and tasks that are available within a Control Flow component of SSIS.

Each package contains a set of system variables that are updated at the various levels of package execution. With event handlers, you can capture these variables and their values, which provide information such as ErrorCode, ErrorDescription, and SourceName.
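As a sketch of custom logging, the OnError event handler could contain an Execute SQL Task that writes these system variables to a user table. Everything below is an assumption for illustration: the table dbo.PackageErrorLog is hypothetical, and the ? placeholders assume an OLE DB connection with System::PackageName, System::SourceName, System::ErrorCode, and System::ErrorDescription mapped as parameters 0 through 3 on the task's Parameter Mapping page:

```sql
-- Hypothetical log table; create it once in the logging database
CREATE TABLE dbo.PackageErrorLog (
    LogID            int IDENTITY(1,1) PRIMARY KEY,
    PackageName      nvarchar(200),
    SourceName       nvarchar(200),
    ErrorCode        int,
    ErrorDescription nvarchar(4000),
    LoggedAt         datetime DEFAULT GETDATE()
);

-- SQLStatement of the Execute SQL Task inside the OnError event handler
INSERT INTO dbo.PackageErrorLog
    (PackageName, SourceName, ErrorCode, ErrorDescription)
VALUES (?, ?, ?, ?);
```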
You can create a template package with the desired error-handling settings and place it in:
C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems. You can then use it as a template by opening the Project menu on the Visual Studio menu bar and choosing Add New Item.

4.) Connecting Control Flow Objects with Precedence:
Precedence constraints are used only in the control flow, not in the data flow. The connectors between objects in the control flow are precedence constraints; the connectors in the data flow are data paths.
Precedence Constraints:
These connectors are green (indicating execution success), red (the task failed), or blue (the task completed).
Even if a task fails, it is considered completed, and therefore blue (completion) constraints are still followed to the next task or container.
Constraints Features:
1.) Solid lines represent logical AND precedence.
2.) Dashed lines represent logical OR statements.
3.) Function symbols show that an SSIS expression has been embedded in the constraint.
Evaluation Operation Property:
1.) Constraint: the execution status of the preceding task (completion, failure, or success).
2.) Expression: an advanced Boolean evaluator (meaning it must evaluate to true or false).
The Evaluation operation drop-down list includes the following options:
1.) Constraint
2.) Expression
3.) Expression and Constraint
4.) Expression or Constraint
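As a sketch, a constraint whose Evaluation operation is set to Expression and Constraint might combine a Success outcome with an SSIS expression such as the following (the variable User::RowCount and the package name are hypothetical):

```
@[User::RowCount] > 0 && @[System::PackageName] == "LoadPackage"
```

The connected task then runs only when the preceding task succeeded and the expression evaluates to true.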


II.) DATA FLOW ERRORS:
1.) Data Paths and Error Paths:
Data paths are the green connectors that go from one component to another. These paths carry the rows that have successfully gone through a component; they also carry the error rows when the error output is set to Ignore Failure.
Error paths: the red connectors between data flow components. They carry the data rows that failed in a component when the error output is set to Redirect Row.
Not all components in the data flow use error paths; the Multicast component, for example, does not, since it performs no operations on the data itself.
There are 3 options for configuring error output:
1.) Fail Transformation:
Causes the data flow to fail if an error is encountered.
2.) Ignore Failure:
Lets the row continue down the normal green data path, but the value that caused the error is changed to NULL.
3.) Redirect Row:
Sends the error row down the red error path. This is the way to handle errors with separate components.
Uses of Error Path output:
a.) When text file source data does not match the data type specified in the data flow.
b.) When a Lookup transformation does not find a match.
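A common pattern on the error path is to translate the numeric ErrorCode column into readable text with a Script Component transformation. A minimal sketch, assuming you have added an output column named ErrorDescription to the component:

```
// Per-row method of the Script Component (transformation).
// Input0Buffer is the designer-generated buffer class; ErrorDescription
// is an output column added by hand in the component's editor.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    Row.ErrorDescription =
        this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);
}
```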

2.) Data Viewers:
They help to identify data flow issues.
There are 4 types of data viewers: grid, histogram, scatter plot, and column chart.
Data viewers do not work when the package is run from the command line.
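Because of this, command-line runs rely on logging and reporting instead. A sketch of a dtexec call (the package path is hypothetical) that enables checkpoint restart and reports errors and warnings to the console:

```
dtexec /F "C:\Packages\LoadPackage.dtsx" /CheckPointing on /Rep EW
```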

III.) DEBUGGING
The Progress/Execution Results window shows the steps that the SSIS package runs through and colour-codes the tasks as they execute, providing a visual indication of which tasks are in progress, which have succeeded, and which have failed.

You can also use the built-in row counts within the Data Flow component to indicate the number of data rows passing through the data flow.

You can make use of data viewers to pause the execution of a package and view the state of the data as it passes through the data flow.

Furthermore, you can use the Data Profiling task for more in-depth analysis of the data to ensure that the data is being transformed and loaded as expected.

Breakpoints can be applied to both SSIS package components and within the Script task to pause the execution of a package under specific conditions so you can debug the SSIS package at the point the breakpoint causes the package to suspend.
When you debug a package that contains multiple Script tasks, the debugger hits breakpoints in only one Script task and will ignore breakpoints in the other Script tasks. If a Script task is part of a Foreach Loop or For Loop container, the debugger ignores breakpoints in the Script task after the first iteration of the loop.

Debugging the Control Flow with Breakpoints:
Breakpoints function in the control flow, not in the data flow. For scripting, that means you can set breakpoints only in the control flow Script Task, not in the data flow Script Component transformation.
From the Debug menu, open Windows --> Locals; this window shows the values of variables when the package is paused at a breakpoint.
