Difference between control flow and data flowControl flow deals with orderly processing of individual, isolated tasks, these tasks are linked through precedence constraints in random order. Also the output for task has finite outcome i.e., Success, Failure, or Completion. A subsequent task does not initiate unless its predecessor has completed. Data flow, on the other hand, streams the data in pipeline manner from its source to a destination and modifying it in between by applying transformations. Another distinction between them is the absence of a mechanism that would allow direct transfer of data between individual control flow tasks. On the other hand, data flow lacks nesting capabilities provided by containers. | Control Flow | Data Flow | | Process Oriented | Data Oriented | Made up of | Tasks and Container | Source, Transformation and Destination | Connected through | Precedence constraint | Paths | Smallest unit | Task | Component | Outcome | Finite- Success, Failure, Completion | Not fixed |
|
If you want to send some data from Access database to SQL server database. What are different component of SSIS will you use?In the data flow, we will use one OLE DB source, data conversion transformation and one OLE DB destination or SQL server destination. OLE DB source is data source is useful for reading data from Oracle, SQL Server and Access databases. Data Conversion transformation would be needed to remove datatype abnormality since there is difference in datatype between the two databases (Access and SQL Server) mentioned. If our database server is stored on and package is run from same machine, we can use SQL Server destination otherwise we need to use OLE DB destination. The SQL Server destination is the destination that optimizes the SQL Server.
|
Difference and similarity between merge and merge join transformation | Merge Transofrmations | Merge Join Transformation | | The data from 2 input paths are merged into one | The data from 2 inputs are merged based on some common key. | Works as | UNION ALL | JOIN (LEFT, RIGHT OR FULL) | Supports | 2 Datasets | 1 Dataset | Columns | Metadata for all columns needs to be same | Key columns metadata needs to be same. | Pre-requisites | Data must be sorted. Merged columns should have same datatype i.e. if merged column is EmployeeName with string of 25 character in Input 1, it can be of less than or equal to 25 characters for merging to happen. | Data must be sorted. Merged columns should have same datatype i.e. if merged column is EmployeeName with string of 25 character in Input 1, it can be of less than or equal to 25 characters for merging to happen. | Limitations | Only 2 input paths can be merged. Does not support error handling. | Does not support error handling. | Use | Merging of data from 2 data source Can create complex datasets using nesting merge transformation, | When data from 2 tables having foreign key relationship needs to present based on common key. |
|
What is precedence constraint?A precedence constraint is a link between 2 control flow tasks and lays down the condition on which the second task is run. They are used to control the workflow of the package. There are 3 kinds of precedence constraint – success (green arrow), failure (red arrow) or Completion script task (blue arrow). By default, when we add 2 tasks, it links by green arrow. The way the precedence constraint is evaluated can be based on outcome of the initial task. Also, we can add expression to evaluate such outcome. Any expression that can be judged as true or false can be used for such purpose. The precedence constraint is very useful in error handling in SSIS package.
Your browser may not support display of this image.
|
Explain why variables called the most powerful component of SSIS.Variable allows us to dynamically control the package at runtime. Example: You have some custom code or script that determines the query parameter’s value. Now, we cannot have fixed value for query parameter. In such scenarios, we can use variables and refer the variable to query parameter. We can use variables for like: 1. updating the properties at runtime, 2. populating the query parameter value at runtime, 3. used in script task, 4. Error handling logic and 5. With various looping logic.
|
Can we add our custom code in SSIS?We can customize SSIS through code by using Script Task. The main purpose of this task is to control the flow of the package. This is very useful in the scenario where the functionality you want to implement is not available in existing control flow item.
To add your own code:- 1. In control flow tab, drag and drop Script Task from toolbox. 2. Double click on script task to open and select edit to open Script task editor. 3. In script task editor, there are 3 main properties i.) General – Here you can specify name and description ii.) Script – through this we can add our code by clicking on Design Script button. The scripting language present is VB.Net only. iii.) Expression
|
What is conditional split?As the name suggest, this transformation splits the data based on condition and route them to different path. The logic for this transformation is based on CASE statement. The condition for this transformation is an expression. This transformation also provides us with default output, where rows matching no condition are routed. Conditional split is useful in scenarios like Telecom industry data you want to divide the customer data on gender, condition would be: GENDER == ‘F’
|
Explain the use of containers in SSIS and also their types.Containers can be defined as objects that stores one or more tasks. The primary purpose of container is grouping logically related tasks. Once the task is placed into the containers, we can perform various operations such as looping on container level until the desired criterion is met. Nesting of container is allowed. Container is placed inside the control flow.
There are 4 types of Container:- 1. Task Host container- Only one task is placed inside the container. This is default container. 2. Sequence Container – This container can be defined as subset of package control flow. 3. For loop container – Allows looping based on condition. Runs a control flow till condition is met. 4. For each loop container - Loop through container based on enumerator.
|
Why is the need for data conversion transformations?This transformation converts the datatype of input columns to different datatype and then route the data to output columns.
This transformation can be used to: 1. Change the datatype 2. If datatype is string then for setting the column length 3. If datatype is numeric then for setting decimal precision.
This data conversion transformation is very useful where you want to merge the data from different source into one. This transformation can remove the abnormality of the data. Example à The Company’s offices are located at different part of world. Each office has separate attendance tracking system in place. Some offices stores data in Access database, some in Oracle and some in SQL Server. Now you want to take data from all the offices and merged into one system. Since the datatypes in all these databases vary, it would be difficult to perform merge directly. Using this transformation, we can normalize them into single datatype and perform merge.
|
Error Handling in SSIS?An error handler allows us to create flows to handle errors in the package in quite an easy way. Through event handler tab, we can name the event on which we want to handle errors and the task that needs to be performed when such an error arises. We can also add sending mail functionality in event of any error through SMTP Task in Event handler. This is quite useful in event of any failure in office non-working hours. In Data flow, we can handle errors for each connection through following failure path or red arrow.
|