This video is a detailed technical tutorial on Microsoft SQL Server Data Quality Services (DQS), focusing on how to manage and improve the quality of data within an organization. Illustrative T-SQL sketches for several of the steps follow the summary.

Core Concepts of Data Quality

- Definition & Importance: Data quality issues often arise in complex systems where large volumes of data come from multiple sources. Poor data quality can lead to service failures, such as sending packages to obsolete addresses [00:41].
- Beyond Normalization: While database integrity (3rd Normal Form, foreign keys) is essential, it is not enough to ensure data usability. Issues like inconsistent name formats (e.g., "Dr." vs. no title) or hidden logical conflicts (e.g., a male record with pregnancy data) still occur [01:48], [03:26].
- Key Dimensions: Data quality is measured along three "hard" dimensions (see the sketches below):
  - Completeness: no missing or null values where a value is required [04:18].
  - Accuracy: the data correctly represents reality (e.g., correct city/street names) [04:47].
  - Consistency: no logical contradictions within the information system as a whole [05:22].

DQS Architecture & Setup

- System Databases: Setting up DQS creates three databases: DQS_MAIN (logic), DQS_STAGING_DATA (temporary storage), and DQS_PROJECTS (project administration) [07:13]; a verification query is sketched below.
- Knowledge Bases (KB): DQS relies on a Knowledge Base as its reference point. It contains domains (sets of valid values for specific fields) and synonym dictionaries [12:32], [13:52].
- Similarity Thresholds: The system uses data-science similarity metrics to suggest corrections. If a value's similarity to a correct one is high (around 80%), it can be auto-corrected; if it is moderate (around 70%), it is flagged for manual review [17:12], [19:00] (sketch below).

Practical Demonstrations

- Standard Cleansing: A project that cleans a customer last-name field using a pre-installed knowledge base, identifying typos and suggesting corrections based on similarity scores [24:41] (sketch below).
- Knowledge Discovery: Building a custom Knowledge Base by analyzing existing data (e.g., city lists) to find unique values and patterns [34:43] (sketch below).
- Custom Rules & Validation (sketches below):
  - Typo Correction: a rule mapping "Mulheim" (misspelled without the umlaut) to the correct "Mülheim" [39:14].
  - Logical Rules: a domain rule that flags birth dates implying a customer older than 100 years as likely data-entry errors [41:20].

Conclusion

The presenter emphasizes that data cleaning is rarely 100% automated. It is an interactive process that requires human intelligence to ensure that corrections are actually valid and do not introduce new errors [50:00]. While data generated by machines is usually clean, data entered by humans needs periodic "washing" (e.g., once a year) to maintain service standards [52:17].

Video Link: dqs 2021
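Illustrative T-SQL Sketches

The video's workflow runs in the Data Quality Client UI; the sketches below restate several of its points as plain T-SQL against a hypothetical dbo.Customers schema and are assumptions, not material from the video. First, after installation, the three DQS system databases can be confirmed from the server catalog:

```sql
-- Minimal check, assuming a default DQS installation: list the
-- system databases created by the installer (DQS_MAIN,
-- DQS_STAGING_DATA, DQS_PROJECTS).
SELECT name, create_date
FROM sys.databases
WHERE name LIKE N'DQS\_%' ESCAPE '\';
```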
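Each of the three hard dimensions can be expressed as a plain validation query. The table and column names (dbo.Customers, dbo.RefCities, LastName, Gender, PregnancyWeek, City) are hypothetical:

```sql
-- Completeness: required fields that are missing or empty.
SELECT COUNT(*) AS MissingLastNames
FROM dbo.Customers
WHERE LastName IS NULL OR LastName = N'';

-- Accuracy: values that do not match a trusted reference list.
SELECT c.CustomerID, c.City
FROM dbo.Customers AS c
LEFT JOIN dbo.RefCities AS r
    ON r.CityName = c.City
WHERE r.CityName IS NULL;

-- Consistency: logical contradictions across fields, such as the
-- video's example of a male record carrying pregnancy data.
SELECT CustomerID
FROM dbo.Customers
WHERE Gender = N'M' AND PregnancyWeek IS NOT NULL;
```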
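DQS computes its similarity scores internally; the threshold logic from the video (auto-correct at roughly 80% similarity, manual review at roughly 70%) can be sketched as a disposition over a hypothetical staging table that already holds a 0-to-1 similarity score:

```sql
-- Hypothetical staging table with a precomputed similarity score
-- between each raw value and its closest domain value.
SELECT RawValue,
       SuggestedValue,
       Similarity,
       CASE
           WHEN Similarity >= 0.80 THEN N'Auto-correct'
           WHEN Similarity >= 0.70 THEN N'Manual review'
           ELSE N'Leave for the domain expert'
       END AS Disposition
FROM dbo.CleansingCandidates;
```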
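A cleansing project's results can be exported to a SQL Server table. On a typical export that includes cleansing info, DQS appends per-field columns such as _Source, _Output, _Status, and _Confidence; verify the exact names against your own export before relying on them:

```sql
-- Review corrections applied to the last-name field in an exported
-- cleansing result table (table and column names assume a typical
-- DQS export; check your export for the exact naming).
SELECT LastName_Source, LastName_Output, LastName_Confidence
FROM dbo.CleansedCustomers
WHERE LastName_Status = N'Corrected';
```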
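Before running knowledge discovery, it helps to profile the source column the same way the KB analysis does: distinct values with frequencies surface rare spellings that are likely typos.

```sql
-- Distinct city values, rarest first: low-frequency spellings are
-- the usual suspects when building a domain from existing data.
SELECT City, COUNT(*) AS Occurrences
FROM dbo.Customers
GROUP BY City
ORDER BY Occurrences ASC, City;
```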
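In DQS, the Mulheim/Mülheim fix is configured as a domain value correction in the client UI; outside DQS, the same mapping can be sketched as a synonym table applied with UPDATE, and the 100-year domain rule as a plain date check:

```sql
-- Hypothetical synonym table mapping known misspellings to the
-- canonical value, mirroring the video's Mulheim -> Mülheim rule.
CREATE TABLE #Synonyms (Wrong NVARCHAR(100), Correct NVARCHAR(100));
INSERT INTO #Synonyms VALUES (N'Mulheim', N'Mülheim');

UPDATE c
SET c.City = s.Correct
FROM dbo.Customers AS c
JOIN #Synonyms AS s ON s.Wrong = c.City;

-- The video's logical rule: flag birth dates implying an age over
-- 100 years as likely data-entry errors.
SELECT CustomerID, BirthDate
FROM dbo.Customers
WHERE BirthDate < DATEADD(YEAR, -100, CAST(GETDATE() AS date));
```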