Data profiling with SQL Server 2008

One of the many improvements brought about SQL Server 2008 on the ETL with Integration Services is its ability to perform data profiling with its new Data Profile Task.

The profiling data is one of the first tasks that typically addressed in data quality processes, and involves an initial analysis of the source data, usually on tables, with the aim of getting to know its structure, format and level quality. Inquiries are made at the table, column, columns, relationships, and even relationships between tables.

The Data Profile Task SSIS works by selecting a table in a SQLServer 2000 database or above (no use other databases), the profiling options you want to perform on the data table and an XML file can be stored the results when running the same.It's really simple.

You can select up to 8 types of profiling, 5 to level 3 to level column and multiple columns.

Profiles at the column level:

  • Length distribution of values
  • Percentage of null values
  • Patterns expressed by regular expressions
  • Column statistics: minimum, maximum, mean and standard deviation
  • Distribution of values, different values ​​and percentage of occurrence of each of the total rows

Multicolumn outlined at:

  • Candidate keys, which could be primary key columns of the table
  • Functional dependence, the values ​​of a column may depend on other
  • Inclusion of values ​​that could be foreign key columns other

SQL Server SSIS task profile

After running the task XML file generated in your chosen location where all information is stored test results.To examine these results SQL Server provides the Data Profile Viewer application in a normal installation sober C unit should be in this directory:

C: \ Program Files \ Microsoft SQL Server \ 100 \ DTS \ Binn \ DataProfileViewer.exe

Simply select the XML generated by the task of SSIS and begin exploring the results:

SSIS Data Profile Viewer

For more detailed information see the section Task profiling data from the online documentation from Microsoft Technet.

Also very well commented this task items Perf SQLServerormance SSIS New Features in SQL Server 2008 - Part 3 and Using The Data Profiler and Task FTP Task in SQL Server 2008 Integration Services

Talk About It forum