Data Quality Services SQL Server 2012 Ash Tewari @ashtewari ashtewari.com
Data Quality Services // Ash Tewari // @ashtewari
What is Data Quality? The degree to which the data is fit for its intended use.
Why is Data Quality Important? Bad data is expensive: it undermines management decision-making and regulatory compliance. Good data is good for business.
Data Quality Issues: Completeness, Conformity, Consistency, Accuracy, Validity, Duplication. Source: Data Quality Services FAQ
Completeness: Is all the required information available? Example: if an email field has values in only 50,000 of 75,000 records, the email field is about 66.7% complete. Source: Data Quality Services FAQ
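The completeness figure is simply populated values divided by total records. A minimal sketch in plain Python (not the DQS API), assuming that `None` and blank strings count as missing; the `completeness` helper and the sample email address are hypothetical:

```python
def completeness(values):
    """Return the fraction of values that are present (not None or blank)."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None and str(v).strip() != "")
    return present / len(values)

# Mirror the slide's numbers: 50,000 populated emails out of 75,000 records.
emails = ["user@example.com"] * 50_000 + [None] * 25_000
ratio = completeness(emails)
print(f"{ratio:.1%}")  # prints 66.7%
```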
Conformity: Are there expectations that data values conform to specified formats? Example: the gender codes in two different systems are represented differently; in one system the codes are defined as ‘M’, ‘F’ and ‘U’, whereas in the second system they appear as 0, 1, and 2. Source: Data Quality Services FAQ
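One way to bring the two systems into the same format is a lookup table that translates the numeric codes into the letter codes. The sketch below is plain Python, not the DQS API. The slide does not say which number corresponds to which letter, so the mapping 0 to M, 1 to F, 2 to U is an assumption made only for illustration, as is the `conform_gender` name:

```python
# Hypothetical mapping: the slide does not state which number means which code.
NUMERIC_TO_LETTER = {"0": "M", "1": "F", "2": "U"}
ALLOWED = {"M", "F", "U"}

def conform_gender(raw):
    """Map a raw gender code to 'M', 'F', or 'U'; return None if unrecognized."""
    code = str(raw).strip().upper()
    code = NUMERIC_TO_LETTER.get(code, code)  # translate numeric codes, keep letters
    return code if code in ALLOWED else None

print([conform_gender(v) for v in ["M", 1, " f ", 2, "x"]])
```

Returning `None` for unrecognized input keeps non-conforming values visible for review instead of silently guessing.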
Consistency: Do values represent the same meaning? Example: is a city name used consistently? New York, NY, NYC, and The Big Apple all refer to the same city. Source: Data Quality Services FAQ
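A common fix for this kind of inconsistency is to map every known variant to one canonical value. A small sketch in plain Python (not the DQS API); the `CITY_SYNONYMS` table and `canonical_city` function are hypothetical names, and choosing "New York" as the canonical form is an assumption:

```python
# Variants from the slide, keyed in lowercase, mapped to one canonical name.
CITY_SYNONYMS = {
    "new york": "New York",
    "ny": "New York",
    "nyc": "New York",
    "the big apple": "New York",
}

def canonical_city(name):
    """Return the canonical city name, or the trimmed input if no synonym is known."""
    key = " ".join(name.lower().split())  # lowercase and collapse whitespace
    return CITY_SYNONYMS.get(key, name.strip())

print({canonical_city(c) for c in ["New York", "NY", "NYC", "The Big Apple"]})
```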
Accuracy: Do data objects accurately represent the “real-world” values they are expected to model? Example: a customer’s address is a valid USPS address, but the ZIP code is incorrect and the customer name contains a spelling mistake. Source: Data Quality Services FAQ
Validity: Do data values fall within acceptable ranges? Example: salary values should be between 60,000 and 120,000 for position levels 51 and 52. Source: Data Quality Services FAQ
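A range rule like this can be expressed as a lookup of allowed bounds per position level. The sketch below is plain Python, not the DQS API. It assumes the bounds are inclusive and that levels without a rule are not checked; neither detail is stated on the slide, and `SALARY_RANGES` and `is_valid_salary` are hypothetical names:

```python
# Allowed salary range per position level, taken from the slide's example.
SALARY_RANGES = {51: (60_000, 120_000), 52: (60_000, 120_000)}

def is_valid_salary(level, salary):
    """Check a salary against the range rule for its position level."""
    bounds = SALARY_RANGES.get(level)
    if bounds is None:
        return True  # assumption: levels without a rule are not checked
    low, high = bounds
    return low <= salary <= high  # assumption: bounds are inclusive

print(is_valid_salary(51, 95_000), is_valid_salary(52, 130_000))
```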
Duplication: Are there multiple, unnecessary representations of the same data objects within your data set?
Name | Address | Postal Code | City | State
Mag. Smith | 545 S Valley View D. # 136 | 34563 | <Anytown> | New York
Margaret smith | 545 Valley View ave unit 136 | 34563-2341 | <Anytown> | New-York
Maggie Smith | 545 S Valley View Dr | | <Anytown> | NY.
Source: Data Quality Services FAQ
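To show why these three rows look like duplicates even though no two fields match exactly, here is a rough token-overlap sketch in plain Python. It is not how DQS matching works; it swaps in Jaccard similarity on address tokens plus a last-name check. The 0.3 threshold is arbitrary, the function names are hypothetical, and the "John Doe" row is an invented non-duplicate added for contrast:

```python
import re
from itertools import combinations

def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def likely_duplicates(records, threshold=0.3):
    """Return index pairs whose last names match and whose addresses overlap."""
    pairs = []
    for (i, r1), (j, r2) in combinations(enumerate(records), 2):
        same_last = r1["name"].lower().split()[-1] == r2["name"].lower().split()[-1]
        score = jaccard(tokens(r1["address"]), tokens(r2["address"]))
        if same_last and score >= threshold:
            pairs.append((i, j))
    return pairs

records = [
    {"name": "Mag. Smith", "address": "545 S Valley View D. # 136"},
    {"name": "Margaret smith", "address": "545 Valley View ave unit 136"},
    {"name": "Maggie Smith", "address": "545 S Valley View Dr"},
    {"name": "John Doe", "address": "12 Elm St"},  # invented row, not from the slide
]
print(likely_duplicates(records))  # prints [(0, 1), (0, 2), (1, 2)]
```

All three Smith rows pair up with each other, while the invented row matches none of them.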
DQS Mechanisms: Cleaning, Matching, Profiling, Monitoring
DQS – Why? Knowledge-driven, Semantics, Knowledge feedback loop, Extensible
DQS Installation: New in SQL Server 2012. Available in SQL Server 2012 Enterprise or BI Edition. Installed from the SQL Server 2012 installer.
DQS Installer Bug
SQL Server 2012 Installer
DQSInstaller.exe
DQS Demo: DQS Client, Knowledge Base Management, Cleansing Project, Matching Project
DQS + SSIS: DQS Cleansing Component
DQS + MDS: MDS Excel Add-In
Resources
Data Quality Services (MSDN): http://msdn.microsoft.com/en-us/library/ff877925.aspx
Data Quality Services Blog: http://blogs.msdn.com/b/dqs/
Data Quality Services Resources: http://technet.microsoft.com/en-us/sqlserver/hh780961
Data Quality Services FAQ: http://social.technet.microsoft.com/wiki/contents/articles/3919.data-quality-services-dqs-faq.aspx
Thanks! Ash Tewari @ashtewari ashtewari.com