{"id":29013,"date":"2024-04-29T19:15:54","date_gmt":"2024-04-29T18:15:54","guid":{"rendered":"https:\/\/nelkodev.com\/blog\/eliminando-filas-duplicadas-en-mysql-metodos-practicos-y-efectivos\/"},"modified":"2024-06-03T17:43:53","modified_gmt":"2024-06-03T16:43:53","slug":"eliminando-filas-duplicadas-en-mysql-metodos-practicos-y-efectivos","status":"publish","type":"post","link":"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/","title":{"rendered":"Removing Duplicate Rows in MySQL: Practical and Effective Methods"},"content":{"rendered":"<p>In the database world, one of the most common but critical tasks is managing and cleaning duplicate data. MySQL, being one of the most popular database management systems, offers several ways to handle this task. Over time, you may find yourself needing to remove duplicate rows to avoid inconsistencies and errors in data analysis. Today we will explore two efficient methods to perform this activity in MySQL: using DELETE JOIN and an intermediate table.<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_80 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #ffffff;color:#ffffff\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #ffffff;color:#ffffff\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#%C2%BFPor_Que_es_Importante_Eliminar_Filas_Duplicadas\" >Why is it important to remove duplicate rows?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#Eliminacion_de_Duplicados_Usando_DELETE_JOIN\" >Removing Duplicates Using DELETE JOIN<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#1_Identificacion_de_Filas_Duplicadas\" >1. Identification of Duplicate Rows<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#2_Eliminacion_Usando_DELETE_JOIN\" >2. Deletion Using DELETE JOIN<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#Utilizando_una_Tabla_Intermedia_para_Eliminar_Duplicados\" >Using an Intermediate Table to Remove Duplicates<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#1_Creacion_de_la_Tabla_Intermedia\" >1. Creation of the Intermediate Table<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#2_Insercion_de_Filas_Unicas\" >2. Insertion of Single Rows<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#3_Reemplazo_de_la_Tabla_Original\" >3. Replacement of the Original Board<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#4_Limpieza\" >4. Cleaning<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/nelkodev.com\/en\/blog\/eliminating-duplicate-rows-in-mysql-practical-and-effective-methods\/#Conclusiones\" >Conclusions<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"%C2%BFPor_Que_es_Importante_Eliminar_Filas_Duplicadas\"><\/span>Why is it important to remove duplicate rows?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Before we dive into the methods, let&#039;s understand the importance of removing duplicates. Duplicate rows can arise due to errors in data insertion, failures in migration processes, or simply due to the design of data collection. These duplicates can lead to:<\/p>\n<ul>\n<li>Data integrity problems.<\/li>\n<li>Erroneous results in queries and reports.<\/li>\n<li>Reduced database performance.<\/li>\n<li>Challenges in implementing business restrictions and rules.<\/li>\n<\/ul>\n<p>With these issues in mind, it is essential to learn effective techniques to clean duplicate data and maintain the quality of our database.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Eliminacion_de_Duplicados_Usando_DELETE_JOIN\"><\/span>Removing Duplicates Using DELETE JOIN<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The DELETE JOIN method is recommended for situations where you need to keep an original row and delete the other copies. Let&#039;s see a step by step of how to implement this:<\/p>\n<h3><span class=\"ez-toc-section\" id=\"1_Identificacion_de_Filas_Duplicadas\"><\/span>1. Identification of Duplicate Rows<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>First, we need to identify the duplicate rows in the table. Suppose we have a table called <code>customers<\/code> with duplicate fields in column <code>e-mail<\/code>. We would use:<\/p>\n<pre><code class=\"&quot;language-sql&quot;\">SELECT email, COUNT(*) as num FROM customers GROUP BY email HAVING num &gt; 1;<\/code><\/pre>\n<h3><span class=\"ez-toc-section\" id=\"2_Eliminacion_Usando_DELETE_JOIN\"><\/span>2. Deletion Using DELETE JOIN<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Once identified, we can proceed to eliminate duplications, excepting the original row (for example, the row with the smallest <code>id<\/code>):<\/p>\n<pre><code class=\"&quot;language-sql&quot;\">DELETE c1 FROM clients c1 JOIN clients c2 WHERE c1.email = c2.email AND c1.id &gt; c2.id;<\/code><\/pre>\n<p>In this query, <code>JOIN<\/code> used to compare the table <code>customers<\/code> with itself, eliminating rows that have a <code>id<\/code> largest, which implies that the row with the smallest <code>id<\/code> will be preserved.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Utilizando_una_Tabla_Intermedia_para_Eliminar_Duplicados\"><\/span>Using an Intermediate Table to Remove Duplicates<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Another effective method, especially useful on very large boards or in situations where performance is a concern, is the use of an intermediate board. This method involves creating a new table that will store only one instance of each duplicate row.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"1_Creacion_de_la_Tabla_Intermedia\"><\/span>1. Creation of the Intermediate Table<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<pre><code class=\"&quot;language-sql&quot;\">CREATE TABLE clients_temp LIKE clients;<\/code><\/pre>\n<h3><span class=\"ez-toc-section\" id=\"2_Insercion_de_Filas_Unicas\"><\/span>2. Insertion of Single Rows<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>We insert a single instance of each duplicate row into the new table. This can be done by selecting rows based on specific criteria (such as the <code>id<\/code> smallest or most recent based on date):<\/p>\n<pre><code class=\"&quot;language-sql&quot;\">INSERT INTO temp_clients(id, name, email) SELECT MIN(id), name, email FROM clients GROUP BY email;<\/code><\/pre>\n<h3><span class=\"ez-toc-section\" id=\"3_Reemplazo_de_la_Tabla_Original\"><\/span>3. Replacement of the Original Board<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Finally, we can rename the original table and replace it with our clean table of duplicates:<\/p>\n<pre><code class=\"&quot;language-sql&quot;\">RENAME TABLE customers TO old_customers, customers_temp TO customers;<\/code><\/pre>\n<h3><span class=\"ez-toc-section\" id=\"4_Limpieza\"><\/span>4. Cleaning<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Don&#039;t forget to delete the old table:<\/p>\n<pre><code class=\"&quot;language-sql&quot;\">DROP TABLE old_customers;<\/code><\/pre>\n<h2><span class=\"ez-toc-section\" id=\"Conclusiones\"><\/span>Conclusions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Removing duplicate rows is crucial to maintaining the health and accuracy of any database management system. Whether using DELETE JOIN or through an intermediate table, MySQL offers robust tools to ensure that your data remains clean and reliable.<\/p>\n<p>For questions or more details on how you can optimize your database in MySQL, feel free to visit <a href=\"https:\/\/nelkodev.com\/en\/\">my blog NelkoDev<\/a> o <a href=\"https:\/\/nelkodev.com\/en\/contact\/\">contact me directly<\/a>. I&#039;m here to help you navigate the complex world of databases. Until next time!<\/p>","protected":false},"excerpt":{"rendered":"<p>In the world of databases, one of the most common yet critical tasks is managing and cleaning duplicate data. MySQL, being one of the most popular database management systems, offers a variety of ways to handle this task. Over time, you may encounter the need to clean up duplicate data and keep it clean.<\/p>","protected":false},"author":1,"featured_media":29014,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[420,2010,2185],"tags":[2193,205,2189,2098,1500,2097,2026,840,352,1303,2192],"class_list":["post-29013","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","category-advanced-techniques","category-curso-mysql","tag-advanced","tag-blog","tag-curso","tag-duplicadas","tag-efectivos","tag-eliminando","tag-filas","tag-metodos","tag-mysql","tag-practicos","tag-techniques"],"_links":{"self":[{"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/posts\/29013","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/comments?post=29013"}],"version-history":[{"count":0,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/posts\/29013\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/media\/29014"}],"wp:attachment":[{"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/media?parent=29013"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/categories?post=29013"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nelkodev.com\/en\/wp-json\/wp\/v2\/tags?post=29013"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}