If I have the option of using .NET and can do data transformations just fine in .NET, when would I need SSIS? Is there a certain task
As the name suggests, SSIS is an integration system. In .NET it can be very difficult to handle connectors to disparate data sources such as Excel, Teradata, and Oracle, and also to take responsibility for gracefully closing those connections, garbage collection, and handling memory issues.
So SSIS is an out-of-the-box product well suited to scenarios where data not only needs to be pulled from, say, two different sources, but a series of lookups, transformations, merges, derivations, and calculations must then be performed before writing it to a target location (be it SQL Server, a flat file, or another database system).
SSIS also has checkpoints: if a package fails for any reason, it can be restarted from the point of failure rather than from the beginning (this has to be configured, as it is not the default behavior).
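To make that scenario concrete, here is a rough sketch of the same shape written by hand: extract from one source, look up against a second, derive a column, and write to a target. It is in Python only for brevity and is not how SSIS works internally. The sample data and the names `ORDERS_CSV`, `CUSTOMERS`, and `run_pipeline` are all made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for two different sources.
ORDERS_CSV = (
    "order_id,customer_id,quantity,unit_price\n"
    "1,10,2,9.50\n"
    "2,11,1,20.00\n"
    "3,99,4,1.25\n"
)
CUSTOMERS = {10: "Acme", 11: "Globex"}  # reference data from a second source


def extract(text):
    """Read the flat-file source into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def lookup_customer(rows, customers):
    """Split rows into those that match the reference data and those that do not."""
    matched, unmatched = [], []
    for row in rows:
        name = customers.get(int(row["customer_id"]))
        if name is None:
            unmatched.append(row)
        else:
            matched.append({**row, "customer_name": name})
    return matched, unmatched


def derive_total(rows):
    """Add a calculated column, much like a derived column step."""
    return [
        {**row, "total": round(int(row["quantity"]) * float(row["unit_price"]), 2)}
        for row in rows
    ]


def load(rows):
    """Write the selected columns to a CSV target (an in-memory string here)."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["order_id", "customer_name", "total"],
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def run_pipeline(orders_text, customers):
    """Extract, look up, derive, and load; also return rows that failed the lookup."""
    matched, unmatched = lookup_customer(extract(orders_text), customers)
    return load(derive_total(matched)), unmatched
```

Even in this toy version you have to decide yourself what happens to rows that fail the lookup. That kind of plumbing is what the built-in components take off your hands.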
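If you wanted that restart behavior in your own code, you would have to build something like the following. This is only a minimal sketch of the idea, not the SSIS checkpoint file format. The function `run_with_checkpoint` and the JSON checkpoint layout are assumptions made for the example.

```python
import json
import os


def run_with_checkpoint(steps, checkpoint_path):
    """Run (name, func) steps in order, skipping steps already recorded as done."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    for name, func in steps:
        if name in done:
            continue  # finished on a previous run, so do not repeat it
        func()  # if this raises, the checkpoint file stays on disk
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)

    # A clean finish clears the checkpoint so the next run starts from the top.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
```

Calling it a second time with the same `checkpoint_path` after a failure resumes at the step that failed. The steps before it are not run again.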
In addition, SSIS will save you a lot of time because its tasks are reusable and its deployment process is fairly easy to implement and schedule, supported by great event handling.
SSIS has many built-in ways of doing transformations on data from different sources, and you can string them together in a way that makes it very customizable. The built-in transformations are optimized to make them fast.
You can also use .NET to make your own custom transformations to take advantage of the speed and repeatability of an SSIS job.
I think the main advantage is defining the entire programming construct visually. Anyone who looks at an SSIS package will find it pretty much self-explanatory. The tight integration of SSIS with SQL Server lets packages be part of SQL Server scheduling, for example alongside backups, which is a huge plus.
As everyone has explained, if you are doing a lot of data manipulation it is a good tool. It is free if you have SQL Server, so you are all set to go, and it is very easy to learn with VS 2008 BIDS.
SSIS is great for BI applications: you can manipulate the data in staging tables and then make it available in data warehouse tables to be used for BI.
I can connect to SAP or Oracle to get employee information and make it available in Power BI, QlikView, etc.
It's a nice tool if you know where and why to use it. Use it just because it's cool and you will have trouble.
My arguments for not using SSIS are:
Design greenfield products so that they have RESTful data feeds for reporting and extraction built into the project plan and budget, preferably to a standard like OData so that other tools can plug right in.
Data feeds should pull and transform from upstream systems and feeds on demand, so that scheduled tasks, their configuration, task-runner VMs, and the staff to run all this unreliable scheduling are no longer needed.
RESTful data feeds leverage HTTP caching.
Feeds/services/APIs can be moved to elastic-scale cloud easily.
SSIS requires finding people with SSIS skills who enjoy doing that stuff for weeks. In my experience, finding and retaining SSIS developers is hard and expensive, and the people found tend to be sub-par.
SSIS doesn't work well with source control and collaborative work.
SSIS doesn't lend itself well to code reuse, unlike microservices and traditional code libraries.
SSIS doesn't version easily, unlike a REST service.
SSIS doesn't lend itself to modular designs and continuous deployment of many small changes; it tends to be large-batch with scary releases.
SSIS promotes the use of stored procedures, which places a lot of demand on SQL Server, which is already the hot spot. Favour designs that place demands on a scalable, stateless middle tier.
The tooling is clunky and unreliable.
You're at the mercy of Microsoft's roadmap for SSIS.
Consider writing to tables/services that support analysis, reporting and views as soon as the data comes into the application; see Event Sourcing and other application architecture patterns.
Never use Excel as a data source; train employees.
Code is king.
Ultimately, I see SSIS as a relic of Enterprise IT. I like to ask, "Would Google use SSIS?" How else can the problem be solved? Think outside the box.
SSIS is generally used for ETL (Extract, Transform, Load). Specific use cases are the pre-processing of SSAS (SQL Server Analysis Services) cubes and enhanced extraction using Change Data Capture.
It can do typical automation, including FTP and email. There is also a programming aspect through script tasks (C# or Visual Basic), so SSIS has functionality beyond its included controls.
Packages can be programmed to follow conditional control-flow paths. For example, do a certain task Monday through Friday and a different task on Saturday and Sunday. Or refuse to perform ETL if certain conditions are not met.
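In SSIS you would normally set this up with expressions on precedence constraints rather than code, but the decision itself amounts to something like the sketch below. The function `choose_task` and the task names it returns are hypothetical, purely to show the branching.

```python
from datetime import date


def choose_task(run_date, source_ready):
    """Pick which branch of the control flow to run for a given date."""
    if not source_ready:
        return "skip"  # refuse to perform ETL when a precondition is not met
    if run_date.weekday() < 5:  # Monday is 0, Friday is 4
        return "weekday_load"
    return "weekend_load"
```

For example, `choose_task(date(2024, 1, 6), True)` takes the weekend branch because that date is a Saturday.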
SSIS packages can call other SSIS packages. That keeps the code modular, allowing re-use.
It can work with various data sources and perform simple transformations using the Derived Column transformation. This is as opposed to doing the transformation on the source server (which could be Oracle or Hadoop, for example: something you don't have control of from your local SQL Server).