Journal article

On the Quality of Relational Database Schemas in Open-source Software

Abstract: The relational schemas of 512 open-source projects storing their data in MySQL or PostgreSQL databases are investigated by querying the standard information schema, looking for overall design issues. The set of SQL queries used in our research is released as the Salix free software. As it is fully relational and relies on standards, it may be installed in any compliant database to help improve schemas. Our research shows that the overall quality of the surveyed schemas is poor: a majority of projects have at least one table without any primary key or unique constraint to identify a tuple; data security features such as referential integrity or transactional back-ends are hardly used; projects that advertise support for both databases often have missing tables or attributes. PostgreSQL projects appear to be of higher quality than MySQL projects, and have been updated more recently, suggesting more active maintenance. Quality is even higher for projects with PostgreSQL-only support. However, the quality difference between the two database management systems is mostly due to MySQL-specific issues. An overall predictor of bad database quality is that a project chooses MySQL or PHP, while good design is found with PostgreSQL and Java. The few declared constraints make it possible to detect latent bugs that are worth fixing: more declarations would certainly help unveil more bugs. Our survey also suggests that some features of MySQL and PostgreSQL are particularly error-prone. This first survey on the quality of relational schemas in open-source software provides a unique insight into the data engineering practice of these projects.
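
To illustrate the kind of check the abstract describes (finding tables that have neither a primary key nor a unique constraint by querying the information schema), here is a minimal sketch. It is not taken from Salix. The SQL uses only the standard `information_schema.tables` and `information_schema.table_constraints` views, and to keep the example runnable without a database server it mimics those two views as plain tables in an in-memory SQLite database. The function names and the sample schema (`app.users`, `app.log`, and so on) are hypothetical.

```python
import sqlite3

# Standard-SQL check: base tables with no PRIMARY KEY or UNIQUE constraint.
# On a real server one would typically also exclude system schemas.
KEYLESS_TABLES_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name
"""


def make_mock_catalog():
    """Build an in-memory stand-in for two information_schema views.

    Assumption: this is illustrative data only, not a real catalog.
    """
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database under the name information_schema
    # so that the query text is the same as on a compliant server.
    conn.execute("ATTACH DATABASE ':memory:' AS information_schema")
    conn.execute(
        "CREATE TABLE information_schema.tables "
        "(table_schema TEXT, table_name TEXT, table_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE information_schema.table_constraints "
        "(table_schema TEXT, table_name TEXT, "
        "constraint_name TEXT, constraint_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO information_schema.tables VALUES (?, ?, ?)",
        [
            ("app", "users", "BASE TABLE"),
            ("app", "sessions", "BASE TABLE"),
            ("app", "log", "BASE TABLE"),
            ("app", "user_view", "VIEW"),
        ],
    )
    conn.executemany(
        "INSERT INTO information_schema.table_constraints VALUES (?, ?, ?, ?)",
        [
            ("app", "users", "users_pkey", "PRIMARY KEY"),
            ("app", "sessions", "sessions_token_key", "UNIQUE"),
            # A foreign key alone does not identify a tuple.
            ("app", "log", "log_user_fk", "FOREIGN KEY"),
        ],
    )
    return conn


def tables_without_key(conn):
    """Return (schema, table) pairs lacking any primary key or unique constraint."""
    return conn.execute(KEYLESS_TABLES_SQL).fetchall()


if __name__ == "__main__":
    for schema, table in tables_without_key(make_mock_catalog()):
        print(f"{schema}.{table} has no primary key or unique constraint")
```

With the sample data, only `app.log` is reported: it carries a foreign key but nothing that identifies a row, and the view is skipped because the check is restricted to base tables. Against a real MySQL or PostgreSQL server, the same query text could be sent through the corresponding database driver instead of the SQLite stand-in.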

Contributor: Claire Medrala
Submitted on: Tuesday, October 16, 2012 - 16:38:34
Last modified on: Wednesday, October 14, 2020 - 03:52:19
Long-term archiving on: Thursday, January 17, 2013 - 11:40:16


Files produced by the author(s)


  • HAL Id: hal-00742605, version 1


Fabien Coelho, Alexandre Aillos, Samuel Pilot, Shamil Valeev. On the Quality of Relational Database Schemas in Open-source Software. Journal on Advances in Software, 2012, Vol 4 (N°3 & 4), 11 p. ⟨hal-00742605⟩


