Data Quality Services SQL Server 2012 Ash Tewari @ashtewari



SLIDE 1

Data Quality Services

SQL Server 2012 Ash Tewari @ashtewari ashtewari.com

SLIDE 2

Data Quality Services // Ash Tewari // @ashtewari

SLIDE 5

What is Data Quality?

The degree to which the data is fit for its intended use.

SLIDE 6

Why is Data Quality Important?

  • Bad data is expensive.
  • Management decision-making.
  • Regulatory compliance.
  • Good data is good for business.

SLIDE 7

Data Quality Issues

Completeness Conformity Consistency Accuracy Validity Duplication

Source : Data Quality Services FAQ

SLIDE 8

Completeness

Is all the required information available? Example: if an email field holds values in only 50,000 of 75,000 records, then the email field is about 66.7% complete.

Source : Data Quality Services FAQ
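
The completeness figure above is just present values divided by total records. A minimal Python sketch of that arithmetic follows; the function name `completeness` and the rule that `None` or blank strings count as missing are assumptions for illustration, not how DQS profiling is implemented.

```python
def completeness(values):
    """Return the percentage of entries that are present (not None, not blank)."""
    total = len(values)
    if total == 0:
        return 0.0
    present = sum(1 for v in values if v is not None and str(v).strip() != "")
    return 100.0 * present / total


# Synthetic column matching the slide's numbers: 50,000 filled, 25,000 missing.
emails = ["user@example.com"] * 50_000 + [None] * 25_000
print(f"{completeness(emails):.1f}% complete")  # prints "66.7% complete"
```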

SLIDE 9

Conformity

Are there expectations that data values conform to specified formats? Example: The Gender codes in two different systems are represented differently; in one system the codes are defined as ‘M’, ‘F’ and ‘U’ whereas in the second system they appear as 0, 1, and 2.

Source : Data Quality Services FAQ
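
One way to picture a conformity fix is a translation step that brings both systems onto a single code set. The sketch below is illustrative only: the slide does not say which number corresponds to which letter, so the pairing in `NUMERIC_TO_LETTER` is an assumption, as are the names used.

```python
# Hypothetical pairing of the second system's numeric codes with the first
# system's letter codes. The source does not state this mapping; it is assumed.
NUMERIC_TO_LETTER = {0: "M", 1: "F", 2: "U"}
ALLOWED = {"M", "F", "U"}


def to_standard_gender(value):
    """Convert a code from either system to the letter form, or raise ValueError."""
    if isinstance(value, str):
        code = value.strip().upper()
        if code in ALLOWED:
            return code
        if code.isdigit():
            value = int(code)  # numeric code stored as text
    if isinstance(value, int) and not isinstance(value, bool):
        if value in NUMERIC_TO_LETTER:
            return NUMERIC_TO_LETTER[value]
    raise ValueError(f"nonconforming gender code: {value!r}")


print([to_standard_gender(v) for v in ["m", "F", 2, "0"]])
```

Values outside both code sets raise an error rather than being silently passed through, so nonconforming records surface during the load.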

SLIDE 10

Consistency

Do values represent the same meaning? Example: Is a city name used consistently? New York, NY, NYC, and The Big Apple all refer to the same city.

Source : Data Quality Services FAQ
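
A consistency rule of this kind can be sketched as a synonym table that maps every known variant to one leading value. The table and the function name `normalize_city` below are assumptions for illustration; a DQS knowledge base stores such synonyms in its own way.

```python
# Variants from the slide, keyed in lowercase, all pointing at one leading value.
CITY_SYNONYMS = {
    "new york": "New York",
    "ny": "New York",
    "nyc": "New York",
    "the big apple": "New York",
}


def normalize_city(name):
    """Return the leading value for a known variant, else the input unchanged."""
    key = " ".join(name.lower().split())  # ignore case and extra whitespace
    return CITY_SYNONYMS.get(key, name)


print([normalize_city(c) for c in ["New York", "NY", "nyc", "The Big  Apple"]])
```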

SLIDE 11

Accuracy

Do data objects accurately represent the “real-world” values they are expected to model? Example: A customer’s address is a valid USPS address. However, the ZIP code is incorrect and the customer name contains a spelling mistake.

Source : Data Quality Services FAQ

SLIDE 12

Validity

Do data values fall within acceptable ranges? Example: Salary values should be between 60,000 and 120,000 for position levels 51 and 52.

Source : Data Quality Services FAQ
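
The salary example is a range rule keyed by position level. A minimal sketch, assuming the bounds are inclusive and that levels without a rule are left unchecked (neither detail is stated on the slide); `SALARY_RULES` and `is_valid_salary` are hypothetical names.

```python
# Position level -> (minimum, maximum) salary, taken from the slide's example.
SALARY_RULES = {51: (60_000, 120_000), 52: (60_000, 120_000)}


def is_valid_salary(level, salary):
    """Check a salary against the range rule for its position level."""
    bounds = SALARY_RULES.get(level)
    if bounds is None:
        return True  # assumption: no rule defined means nothing to violate
    low, high = bounds
    return low <= salary <= high  # assumption: both bounds are inclusive


print(is_valid_salary(51, 75_000), is_valid_salary(52, 130_000))  # True False
```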

SLIDE 13

Duplication

Are there multiple, unnecessary representations of the same data objects within your data set?

Name           | Address                      | Postal Code | City      | State
Mag. Smith     | 545 S Valley View D. # 136   | 34563       | <Anytown> | New York
Margaret smith | 545 Valley View ave unit 136 | 34563-2341  | <Anytown> | New-York
Maggie Smith   | 545 S Valley View Dr         |             | <Anytown> | NY.

Source : Data Quality Services FAQ
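
To see why these three rows are candidates for the same person, the sketch below builds a crude comparison key from the last name, house number, and street name. It is a naive blocking key written for this sample only, not the matching algorithm DQS uses; the `NOISE` token list and the function name `blocking_key` are assumptions.

```python
import re
from collections import defaultdict

# Tokens ignored when comparing addresses. This list is an assumption tuned to
# the three sample rows, not a general address standard.
NOISE = {"n", "s", "e", "w", "d", "dr", "ave", "unit"}


def blocking_key(name, address):
    """Build a rough key: (last name, house number, street-name words)."""
    last_name = name.split()[-1].lower()
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    house_number = tokens[0] if tokens else ""
    # Drop directionals, suffixes, unit markers, and unit numbers.
    street = tuple(t for t in tokens[1:] if t not in NOISE and not t.isdigit())
    return (last_name, house_number, street)


records = [
    ("Mag. Smith", "545 S Valley View D. # 136"),
    ("Margaret smith", "545 Valley View ave unit 136"),
    ("Maggie Smith", "545 S Valley View Dr"),
]

groups = defaultdict(list)
for name, address in records:
    groups[blocking_key(name, address)].append(name)

duplicates = [names for names in groups.values() if len(names) > 1]
print(duplicates)  # [['Mag. Smith', 'Margaret smith', 'Maggie Smith']]
```

All three rows reduce to the key `('smith', '545', ('valley', 'view'))`, so they land in one group. A real matching policy would weigh several fields and tolerate spelling differences instead of relying on an exact key.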

SLIDE 14

DQS Mechanisms

Cleaning Matching Profiling Monitoring

SLIDE 15

DQS – Why?

Knowledge-driven Semantics Knowledge feedback loop Extensible

SLIDE 16

DQS Installation

New in SQL Server 2012 SQL Server 2012 – Enterprise or BI Edition Installed from SQL Server 2012 Installer

SLIDE 17

DQS Installer Bug

SLIDE 18

SQL Server 2012 Installer

SLIDE 19

DQSInstaller.exe

SLIDE 20

DQS Demo

DQS Client Knowledge Base Management Cleansing Project Matching Project

SLIDE 21

DQS + SSIS

DQS Cleansing Component

SLIDE 22

DQS + MDS

MDS Excel Add-In

SLIDE 23

Resources

  • Data Quality Services (MSDN): http://msdn.microsoft.com/en-us/library/ff877925.aspx
  • Data Quality Services Blog: http://blogs.msdn.com/b/dqs/
  • Data Quality Services Resources: http://technet.microsoft.com/en-us/sqlserver/hh780961
  • Data Quality Services FAQ: http://social.technet.microsoft.com/wiki/contents/articles/3919.data-quality-services-dqs-faq.aspx

SLIDE 24

Thanks!

Ash Tewari @ashtewari ashtewari.com