DSDSD - THE DUTCH SEMINAR ON DATA SYSTEMS DESIGN: We hold bi-weekly talks on Fridays from 3:30 PM to 5 PM CET for and by researchers and practitioners designing (and implementing) data systems. The objective is to establish a new forum for the Dutch Data Systems community to come together, foster collaborations between its members, and bring in high-quality international speakers. We invite all researchers working on related topics, especially Ph.D. students, to join the events. It is an excellent opportunity to receive early feedback from researchers in your field.

Website: https://dsdsd.da.cwi.nl/
Twitter: @dsdsdnl

Title: Database Schemas in the Wild: What Can We Learn from a Large Corpus of Relational Database Schemas?

Abstract: Tabular data collections, such as GitTables, are important sources of real-world tabular data. They provide training data for table representation learning approaches that advance the state of the art for problems like semantic annotation, data imputation, and automated error detection. However, such datasets are limited to individual tables and do not contain schema information about database constraints (uniqueness, not-null, etc.) or relationships to other tables. Real-world database schemas are hard to come by: the largest public repository of databases contains about 150 relational databases, so the community needs a new dataset. We therefore created GitSchemas, a large corpus of database schema information extracted from SQL scripts in public code repositories. It contains highly accurate schema information for more than 150k schemas, 1M tables (including column names, data types, and database constraints), and almost 600k foreign key relationships.
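To make the kind of schema information described above concrete, here is a minimal sketch of pulling column names, data types, constraints, and foreign keys out of a SQL script. It is not the GitSchemas extraction pipeline, which the abstract does not describe; it assumes the script is valid SQLite DDL and lets Python's built-in sqlite3 module do the parsing. The sample script and the `extract_schema` helper are hypothetical.

```python
import sqlite3

# Hypothetical example script; not taken from the GitSchemas corpus.
DDL = """
CREATE TABLE author (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author_id INTEGER REFERENCES author(id)
);
"""


def extract_schema(ddl):
    """Run a DDL script in an in-memory SQLite database and read back its schema.

    Assumption: the script is valid SQLite DDL. Scripts written for other
    SQL dialects would need a dialect-aware parser instead.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the identifier so unusual table names do not break the PRAGMA.
            quoted = '"' + table.replace('"', '""') + '"'
            # table_info rows: (cid, name, type, notnull, dflt_value, pk)
            columns = [
                {
                    "name": name,
                    "type": col_type,
                    "not_null": bool(notnull),
                    "primary_key": bool(pk),
                }
                for _cid, name, col_type, notnull, _default, pk in conn.execute(
                    f"PRAGMA table_info({quoted})"
                )
            ]
            # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
            foreign_keys = [
                {"column": row[3], "ref_table": row[2], "ref_column": row[4]}
                for row in conn.execute(f"PRAGMA foreign_key_list({quoted})")
            ]
            schema[table] = {"columns": columns, "foreign_keys": foreign_keys}
        return schema
    finally:
        conn.close()


if __name__ == "__main__":
    for table, info in extract_schema(DDL).items():
        print(table, info)
```

Executing the script in a throwaway database keeps the sketch short, but it only handles one SQL dialect; a corpus drawn from public repositories would contain scripts for many database systems.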
We believe that schema information alone (without data) at this scale will be suitable for benchmarking and improving existing approaches to a variety of relevant data management problems, such as foreign key detection and constraint prediction, while also presenting an opportunity to learn more about how database systems are used in practice.

Bio: Till Döhmen is a PhD student at RWTH Aachen University, a guest researcher at the UvA Intelligent Data Engineering Lab (INDE Lab), and an engineer at Hopsworks. His research interests lie at the intersection of data management and machine learning systems.