Conference paper

A Field Analysis of Relational Database Schemas in Open-source Software (Extended)

Abstract: The relational schemas of 512 open-source projects that store their data in MySQL or PostgreSQL databases are investigated by querying the standard information schema to look for various issues. These SQL queries are released as free software under the name Salix. Because it is fully relational and relies on standards, it may be installed in any compliant database to help improve schemas. The overall quality of the surveyed schemas is poor: a majority of projects have at least one table without any primary key or unique constraint to identify a tuple; data security features such as referential integrity or transactional back-ends are hardly used; and projects that advertise support for both databases often have missing tables or attributes. PostgreSQL projects are of better quality than MySQL projects, and quality is better still for projects that support only PostgreSQL. However, the difference between the two databases is mostly due to MySQL-specific issues. An overall predictor of bad database quality is that a project chooses MySQL or PHP, while good design is found with PostgreSQL and Java. The few declared constraints make it possible to detect latent bugs that are worth fixing: more declarations would certainly help unveil more bugs. Our survey also suggests that some features of MySQL and PostgreSQL are particularly error-prone. This first survey on the quality of relational schemas in open-source software provides a unique insight into the data engineering practice of these projects.
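As an illustration of the kind of check the abstract describes (finding tables that have neither a primary key nor a unique constraint), here is a minimal Python sketch. It is not the Salix code, which the abstract says is written as SQL queries. The function name `tables_without_key` and the sample rows are hypothetical. The input tuples are assumed to have been fetched beforehand from the standard `information_schema.tables` and `information_schema.table_constraints` views.

```python
def tables_without_key(tables, constraints):
    """Return base tables that have neither a PRIMARY KEY nor a UNIQUE constraint.

    tables:      rows of (table_schema, table_name, table_type),
                 shaped like a projection of information_schema.tables
    constraints: rows of (table_schema, table_name, constraint_type),
                 shaped like a projection of information_schema.table_constraints
    """
    # Tables that declare at least one identifying constraint.
    keyed = {
        (schema, name)
        for schema, name, constraint_type in constraints
        if constraint_type in ("PRIMARY KEY", "UNIQUE")
    }
    # Keep only real tables (not views) that are missing from the keyed set.
    return sorted(
        (schema, name)
        for schema, name, table_type in tables
        if table_type == "BASE TABLE" and (schema, name) not in keyed
    )


# Hypothetical sample rows, for illustration only.
sample_tables = [
    ("app", "users", "BASE TABLE"),
    ("app", "logs", "BASE TABLE"),
    ("app", "tags", "BASE TABLE"),
    ("app", "v_users", "VIEW"),
]
sample_constraints = [
    ("app", "users", "PRIMARY KEY"),
    ("app", "tags", "UNIQUE"),
    ("app", "logs", "FOREIGN KEY"),
]

flagged = tables_without_key(sample_tables, sample_constraints)
```

With these sample rows, only `("app", "logs")` is flagged: it is a base table whose sole declared constraint is a foreign key. Views are skipped because they cannot carry key constraints.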

Cited literature: 63 references

https://hal-mines-paristech.archives-ouvertes.fr/hal-00903676
Contributor: Claire Medrala
Submitted on: Tuesday, 12 November 2013 - 17:10:31
Last modified on: Thursday, 24 September 2020 - 16:36:01
Long-term archiving on: Thursday, 13 February 2014 - 08:25:09

File

A-478.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00903676, version 1

Citation

Fabien Coelho, Alexandre Aillos, Samuel Pilot, Shamil Valeev. A Field Analysis of Relational Database Schemas in Open-source Software (Extended). The Third International Conference on Advances in Databases, Knowledge, and Data Applications, Jan 2011, St Maarten, Netherlands Antilles. pp. 9-15. ⟨hal-00903676⟩


Metrics

Record views

1792

File downloads

7184