Thursday, April 7, 2011

Developing SSAS Cubes: Creating Data Sources and Data Source Views

CREATING DATA SOURCES AND DATA SOURCE VIEWS:
SSAS works best with database schemas that are designed to support data analytics and reporting. The methodology you use to architect such schemas is called dimensional modeling.
A fact table and its dimension tables form the star pattern typical of OLAP. In a star schema, each dimension is contained in a single table.
After the schema is in place, you can build the UDM (Unified Dimensional Model), or cube, on top of it.
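As a minimal sketch of such a star schema (all table and column names here are illustrative, not taken from any particular warehouse), a single-table dimension and the fact table that references it might be defined as:

-- A dimension contained in a single table, as is typical of a star schema
CREATE TABLE DimProduct (
    ProductKey   INT PRIMARY KEY,
    ProductName  NVARCHAR(50),
    CategoryName NVARCHAR(50)
);

-- The fact table stores measures plus a surrogate key pointing at each dimension
CREATE TABLE FactSales (
    ProductKey  INT,
    OrderDate   DATETIME,
    SalesAmount MONEY
);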

Options for developing Analysis Services solutions:
1.)Project mode:
You can select either the Analysis Services Project template or the Import Analysis Services 9.0 Database Wizard under the Business Intelligence Projects type in the Add New Project dialog box.
Working in the project mode helps you to develop Analysis Services solutions without the need for an active connection to the database. Furthermore, projects created in Business Intelligence Development Studio can work with source control technologies such as Microsoft Visual SourceSafe, allowing multiple developers to work with the same Analysis Services objects without overwriting each other’s work.
The project mode also provides the flexibility to deploy the solution to a test environment before deploying the solution to a production environment. It is best practice to develop Analysis Services solutions in the project mode.
2.)Online mode:
In Business Intelligence Development Studio, you can connect directly to cubes in an Analysis Services database and make modifications to a live cube. The changes that are made occur immediately, and no Analysis Services project is created.

To connect to Analysis Services 2008 in online mode, perform the following steps:
Open Business Intelligence Development Studio.
On the File menu, point to Open, and then click Analysis Services Database.
Select Connect to existing database.
Specify the server name and the database name.
Click OK.

The UDM consists of several components:
1.)Data Source: Represents a connection to the database where the data is stored.
2.)Data Source Views: Abstracts the underlying database schema. You can add calculated columns to a DSV when security restrictions prevent you from changing the underlying schema.
3.)Dimensional Model: After the DSV is created, the next step is to build the cube's dimensional model from the dimension and fact tables.
4.)Calculations: Only in rare cases will the dimensional model alone meet all needs. Custom logic, such as MDX expressions for quarter-to-date (QTD) and year-to-date (YTD) values, is built as calculations (see the MDX sketch after this list).
5.)End User Model: Defines the additional features you build on top of the dimensional layer to provide richer data semantics. These include KPIs, actions, perspectives, and translations.
6.)Management Settings: Finally, a savvy administrator configures the cube to meet operational requirements, including availability, latency, and security.
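As a hedged illustration of such a calculation (the [Date].[Calendar] hierarchy and [Measures].[Sales Amount] measure are hypothetical names that depend on your own cube), a year-to-date calculated member in MDX might look like this:

CREATE MEMBER CURRENTCUBE.[Measures].[Sales Amount YTD] AS
    SUM(
        YTD([Date].[Calendar].CurrentMember),  -- periods from the start of the year through the current member
        [Measures].[Sales Amount]
    );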

1.)Defining a new data source:
Setting the Connection Information:
When setting the connection information, you need to decide which authentication option the server will use to connect to the data source and read data.
1.)Windows Integrated Security: This option connects to the data source under the identity of a Windows account. Data source operations initiated in BIDS are performed under the identity of the interactive user, i.e., you.
2.)Standard Authentication: User name and password. You choose it by selecting the SQL Server Authentication option. The same credentials are used by both BIDS and SSAS connections. This authentication is less secure than Windows authentication.
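For illustration only (the server, database, and login names are hypothetical, and the provider assumes the SQL Server 2008 native client), the two options correspond roughly to connection strings like these:

Windows integrated security, with no credentials stored in the string:
Provider=SQLNCLI10.1;Data Source=SQLProd;Initial Catalog=AdventureWorksDW;Integrated Security=SSPI;

Standard authentication, with the credentials embedded in the string:
Provider=SQLNCLI10.1;Data Source=SQLProd;Initial Catalog=AdventureWorksDW;User ID=ssas_reader;Password=...;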

Impersonation Information:
1.)Specify a Windows user name and password:
Lets you connect to the data source through a predefined Windows account.
2.)Use the service account:
Performs the data operations under the identity of the SSAS Windows service. This option is appropriate if the SSAS service account is a domain or local account that has read access to the underlying data source.
3.)Use the credentials of the current user:
Impersonates the identity of the interactive user. This option is rarely used, mainly in certain data retrieval scenarios involving data mining models.
4.)Inherit:
Lets the server decide which impersonation mode is appropriate. For cube processing the server uses the service account, and for querying data mining models it uses the credentials of the current user.
Note:
In general, use the service account option.

2.)DATA SOURCE VIEWS (DSV):
Data Source Views provide a logical data model of related tables, views and queries from one or more data sources. As a result, you can define a subset of data in the data source view. This is typically done when you want to expose a data mart from a data warehouse that will be used as a basis for creating a cube. The data source view will hold metadata about the underlying tables and relationships from the source data.

As the data source view is a logical model, changes can be made to the data source view without changing the underlying schema in the data source.

DSV Keys and Table Relationships:
One purpose of the DSV is to define the primary key for each dimension table and the relationships between dimension and fact tables. If primary keys and relationships are defined in the underlying tables, the DSV detects them automatically and adds them to its schema. If they are not defined, e.g., when views are used instead of tables, you must add them manually as logical keys and relationships in the DSV schema.
This can be done by:
1.)Within the DSV designer, select the column or columns that make up the key for the table by holding down the Ctrl key and clicking the relevant columns.
2.)Right-click the selected columns and select "Set Logical Primary Key".
To define a logical relationship, drag the foreign key from a fact table to its related primary key within a dimension table. The resulting relationship appears as an arrow pointing from the foreign key table to the primary key table.
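By contrast, when the relationship is defined in the underlying database, the DSV picks it up automatically. Reusing the illustrative FactSales and DimProduct tables from the earlier sketch, such a relationship would be created like this:

ALTER TABLE FactSales
    ADD CONSTRAINT FK_FactSales_DimProduct
    FOREIGN KEY (ProductKey) REFERENCES DimProduct (ProductKey);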

Note: Make sure you understand the underlying relational structure and the data before you create logical primary keys and table relationships. The DSV designer does not validate any logical primary keys or table relationships that are created manually. If data issues in the underlying tables break any logical definitions, you will discover this only when the cubes are designed and processed using the DSV.


Named calculations and Named Queries:
Named Calculation: A column based on an expression. It is a computed column that is part of the DSV rather than the underlying table.
For example, a named calculation can reference another table. Suppose there are two tables, Product and ProductCategory, and both have a ProductCategoryID column. The ProductCategory table has a CategoryName column. You need a named calculation in the Product table that shows the product category by referencing the CategoryName column from the ProductCategory table. The following SQL expression is used:
(SELECT CategoryName FROM ProductCategory
 WHERE ProductCategoryID = Product.ProductCategoryID)
Named Query:
A named query is created with a full SELECT statement returning one or more columns, and it behaves much like a view within the DSV (see the sketch below).
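For example, a named query joining the two tables from the named calculation example might look like the following (the ProductID and ProductName columns are assumed for illustration):

SELECT p.ProductID, p.ProductName, pc.CategoryName
FROM Product AS p
JOIN ProductCategory AS pc
    ON pc.ProductCategoryID = p.ProductCategoryID;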

3.)CREATING AND MODIFYING DATA CUBES:
After creating a data source and a DSV that model the underlying data structures, the next task is to create the multidimensional layer of the UDM.
The Cube Wizard offers three methods for generating the cube definition:
1.)Use existing tables:
Generates the cube bottom-up from the tables in the DSV you created.
2.)Create an empty cube:
You design the dimensional model from scratch.
3.)Generate tables in the data source:
This is a top-down approach. Select this option when you prefer to design the UDM from the top down in the absence of data. After the UDM is in place, you can auto-generate the supporting relational schema.
Note: The first option is the most commonly used.

Modifying the Cube and the Cube Designer:
BIDS opens the resulting cube in the Cube Designer. The Cube Designer interface has 10 design tabs across the top. You can use these tabs to modify the various aspects of the cube.
Cube Designer Tabs:
The tab names and their functions are as below:
1.)Cube Structure:
Used to edit the basic structure of a cube.
2.)Dimension Usage:
Defines relationships between dimension and measure groups in a cube.
3.)Calculations:
Contains MDX formulas and scripts to define calculated members.
4.)KPI's:
Defines KPI's within a cube including formulas for KPI status and trends.
5.)Actions:
Used to create actions, such as running reports or launching URLs, that will be available to end users as they browse the cube.
6.)Partitions:
Defines the storage structures used for each measure group within a cube.
7.)Aggregations:
Used to create custom Aggregation designs.
8.)Perspectives:
Used to create subsets of the cube content for easier end user browsing.
9.) Translations:
Defines translations of cube metadata, e.g., measure and dimension names, into other languages.
10.)Browser:
Used to browse the cube data.

Note: Even when you modify certain properties of a dimension to change its behavior within a cube, the physical design of that dimension is still inherited from the dimension defined at the database level.

CREATING AND MODIFYING DIMENSIONS:
Note: Creating dimensions first is preferable. In this way, you control the attributes that exist in each dimension and define any hierarchies that you deem appropriate for your business users. By taking this approach, you can run the Cube Wizard to create the cube and then add the existing dimensions within the wizard, so that after the wizard completes, the dimensions are already defined without the need for further editing.
When the Cube Wizard generates a set of dimensions, it adds only primary key columns as dimension attributes. As the requirements evolve, you might need to add new dimensions to the UDM dimensional model or modify existing ones.

Creating a new Dimension:
Attributes can be organized into user-defined hierarchies that provide navigational paths to assist users when browsing the data in a cube. Instead of browsing a single flat hierarchy of attributes, you can improve the user's experience of reading the dimension data by creating user-defined hierarchies. For example, a Time dimension consists of a user-defined hierarchy named Calendar Year that contains the Year, Quarter and Month levels within the hierarchy for users to navigate through the time dimension data.
Furthermore, you can improve the performance of browsing dimension data through user-defined hierarchies by defining attribute relationships between attributes within the hierarchy. Although attribute relationships are not visible to the users, they can have an impact on performance by creating more effective aggregates within the cube that users may query.
Dimensions, attributes, user-defined hierarchies and attribute relationships are created using the Dimension Wizard.

Monday, April 4, 2011

Executing And Scheduling packages

Using DTExecUI to configure Package Execution:
The command-line executable DTExec is a utility with broad parameterization capabilities. You can build the command line manually by applying the various parameters, but to save time and avoid errors you can use the command-line builder utility named DTExecUI.
DTExecUI is a visual tool that naturally groups the command parameter options.

Using DTExec for Package Execution:
To execute an SSIS package saved to SQL Server with Windows authentication: dtexec.exe /SQL MyPackage /Server SQLProd
To execute an SSIS package saved to the file system: dtexec.exe /f "c:\MyPackage.dtsx"
To execute an SSIS package saved to the file system, passing a configuration file: dtexec.exe /f "c:\MyPackage.dtsx" /conf "c:\pkgOneConfig.cfg"
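DTExec supports many more switches. As one further hedged example, the documented /SET option overrides a package variable at execution time (User::MyVar and its value are placeholder names):

dtexec.exe /f "c:\MyPackage.dtsx" /SET \Package.Variables[User::MyVar].Properties[Value];NewValue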

Executing Packages in SSMS with SSIS Service:
When you right-click a package in SSMS and run it, SSMS opens the DTExecUI utility, preconfigured with the package location information; you can then configure the other property pages.

Creating SQL Server Agent Jobs to Execute SSIS Package:
SSMS-->SQL Server instance-->SQL Server Agent-->Jobs
The following list defines the scheduling options for a SQL Server Agent job:
1.)Recurring
2.)One time
3.)Start Automatically when SQL Server Agent Starts
4.)Start whenever the CPU becomes Idle
You can set up notifications, e.g., sending an email whenever a package fails, succeeds, or completes. A T-SQL sketch of creating such a job follows.
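As a minimal sketch of creating and scheduling such a job through the msdb stored procedures instead of the SSMS dialogs (the job, schedule, package, and server names are all illustrative), consider:

USE msdb;
-- Create the job and give it an SSIS execution step
EXEC dbo.sp_add_job @job_name = N'Run MyPackage Nightly';
EXEC dbo.sp_add_jobstep
    @job_name  = N'Run MyPackage Nightly',
    @step_name = N'Execute SSIS package',
    @subsystem = N'SSIS',
    @command   = N'/SQL "\MyPackage" /SERVER SQLProd';
-- Recurring schedule: daily at 2:00 AM (freq_type 4 = daily)
EXEC dbo.sp_add_schedule
    @schedule_name     = N'Nightly 2 AM',
    @freq_type         = 4,
    @freq_interval     = 1,
    @active_start_time = 020000;
EXEC dbo.sp_attach_schedule
    @job_name      = N'Run MyPackage Nightly',
    @schedule_name = N'Nightly 2 AM';
-- Target the local server so the job actually runs
EXEC dbo.sp_add_jobserver @job_name = N'Run MyPackage Nightly';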

Sunday, April 3, 2011

Managing SSIS Service and Configuring Package Security

MANAGING SSIS SERVICE:
The SSIS service is the Windows service named 'SQL Server Integration Services'. You can configure its startup behavior and service account by using SQL Server Configuration Manager.
After you connect to the SSIS service, the Object Explorer window in SSMS lets you manage, secure, and execute packages deployed to the server.
The Running Packages Folder:
You can connect to the SSIS service in SSMS and refresh the Running Packages folder to see which packages are currently executing. You can stop package execution by right-clicking the package; it terminates after the currently running task completes.
The Stored Packages Folder:
It has two subfolders: File System and MSDB. The package store is located by default in the %Program Files%\Microsoft SQL Server\100\DTS\Packages folder.
The service uses a configuration file at the following location:
C:\Program Files\Microsoft SQL Server\100\DTS\Bin\MsDtsSrvr.ini.xml
You can edit this XML file and change the StorePath element value to a full path.
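For reference, the default contents of MsDtsSrvr.ini.xml look roughly like the following (this is a sketch of the SQL Server 2008 default; verify against your installation), with StorePath being the value to point at a full path:

<?xml version="1.0" encoding="utf-8"?>
<DtsServiceConfiguration xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <StopExecutingPackagesOnShutdown>true</StopExecutingPackagesOnShutdown>
  <TopLevelFolders>
    <Folder xsi:type="SqlServerFolder">
      <Name>MSDB</Name>
      <ServerName>.</ServerName>
    </Folder>
    <Folder xsi:type="FileSystemFolder">
      <Name>File System</Name>
      <StorePath>..\Packages</StorePath>
    </Folder>
  </TopLevelFolders>
</DtsServiceConfiguration>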

The MSDB folder displays packages that have been deployed to SQL Server. These packages are stored in the msdb database, in a table named [dbo].[sysssispackages]. You can create virtual folders under the MSDB folder and import packages directly into those subfolders.
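For example, you can list the deployed packages straight from that table (a sketch against the SQL Server 2008 msdb schema; column names may differ in other versions):

SELECT name, createdate, isencrypted
FROM msdb.dbo.sysssispackages;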

Adding SSIS Package Security:
Securing SSIS packages is important for two main reasons:
1.)Connection Information:
Although packages do not contain any data from sources or destinations, they do contain connection information.
2.)Schema information:
A package does not provide direct access to the sources or destinations, but an unencrypted package reveals what the source and destination schemas look like.
There are 3 ways to limit the access to packages:
1.)You can apply SQL Server security roles and define which SQL logins, Windows groups, or users have read and write access to a package.
2.)You can employ security encryption by using the package protection level in which either a part(sensitive data) or all of the package is encrypted.
3.)You can define file level security in which you can apply access rights to the file or folder.

Assigning Roles and Securing Packages stored in msdb:
Package access is divided into read and write actions:
1.)Read actions focus on viewing and executing packages.
2.)Write actions apply to moving packages in and out of msdb without executing them.
How to use Security Roles:
1.)Expand the Stored Packages folder and then the MSDB folder.
2.)Right-click the name of the package to which you need to apply security roles, and then select Package Roles to open the Package Roles dialog box.
Types of roles:
1.)db_ssisadmin: read access (view, execute, and export) and write access (delete, change, and import).
2.)db_ssisltduser: read and write access to the user's own packages.
3.)db_ssisoperator: read-only access.
4.)Windows administrators: can view execution details of currently running packages and stop currently running packages.
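Because these map to ordinary msdb database roles, granting a login limited SSIS access is a standard role-membership operation, e.g. (the login name is a placeholder):

USE msdb;
EXEC sp_addrolemember N'db_ssisltduser', N'DOMAIN\SsisDeveloper';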
Package Protection Level:
The ProtectionLevel property encrypts package definition information that resides in the underlying XML file.
The PackagePassword property is used when the protection level requires encrypting package metadata with a password.
By default, the ProtectionLevel is set to EncryptSensitiveWithUserKey. This means that if you create a package with the default settings and the package contains sensitive data (such as a connection with a password), the connection will be encrypted and can be viewed only by you, the package author, when the package is opened in BIDS.
Options for Package Level Security:
1.)DontSaveSensitive: Sensitive information is not saved, and every time the package is opened the information must be reentered.
This is helpful if Windows authentication is used exclusively for connections.
2.)EncryptAllWithPassword: The entire package is encrypted using the password entered in the PackagePassword property. The package cannot be opened in BIDS or executed without the correct password.
3.)EncryptSensitiveWithPassword: The sensitive data is encrypted based on the password entered in the PackagePassword property. If a password is not provided, the package can still be opened, but the sensitive data is replaced with blanks. If a user attempts to execute the package without the password, execution fails.
4.)EncryptSensitiveWithUserKey: The sensitive data is encrypted based on the current user and computer. If the package is opened by a different user, it opens, but the sensitive data is replaced with blanks.
5.)EncryptAllWithUserKey: The entire package is encrypted based on the current user and computer. The package can be opened and executed only by the user who created it in BIDS.
6.)ServerStorage: This setting can be used only when the package is stored in a SQL Server database. Nothing in the package definition is encrypted; instead, the entire package is protected by the msdb database roles.
Note:
If you deploy a package with the EncryptSensitiveWithUserKey ProtectionLevel setting to a new server, the connection passwords cannot be decrypted and package execution fails.
Key Points to consider while deciding on the package ProtectionLevel Security:
1.)If all connections use Windows authentication, they do not require passwords (users are authenticated by Windows), so there is no sensitive information in your package. In this case, use DontSaveSensitive.
2.)If there is sensitive information, such as a connection password, you must enable decryption either by setting a package password or by running the package on the same computer as the same user.

Managing Package Security with DTUtil:
The following example encrypts MyPackage.dtsx with the password EncPwd. The /encrypt parameter requires the destination package location, the encryption level, and a package password. (0 = DontSaveSensitive, 1 = EncryptSensitiveWithUserKey, 2 = EncryptSensitiveWithPassword, 3 = EncryptAllWithPassword, 4 = EncryptAllWithUserKey, 5 = SQLServerStorage)
dtutil.exe /file MyPackage.dtsx /encrypt file;MyPackage.dtsx;3;EncPwd