Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21

Problem description & analysis:

A certain database table has multiple fields, each storing an email address, which may be null or duplicated.

Task: Now we need to merge multiple fields into one, remove null and duplicate values, and m…


This content originally appeared on DEV Community and was authored by Judith-Data-Processing-Hacks

Problem description & analysis:

A certain database table has multiple fields, each storing an email address, which may be null or duplicated.

source table

Task: Now we need to merge multiple fields into one, remove null and duplicate values, and merge them with commas.

expected results

Code comparisons:

SQL: In a database/computing platform that supports functions such as array, for example, in azure-databricks, SQL can be written as follows:

SELECT concat_ws(',', array_distinct(array(Mail_1, Mail_2, Mail_3, Mail_4, Mail_5)), NULL) AS Mail
FROM my_data

SQL Server and other databases do not have an array function, and generally require indirect implementation using statements such as cross apply:

SELECT 
    STRING_AGG(DistinctEmails.Mail, ',') AS Mail
FROM my_data
CROSS APPLY (
    SELECT DISTINCT Mail
    FROM (VALUES (Mail_1), (Mail_2), (Mail_3), (Mail_4), (Mail_5)) AS EmailList(Mail)
    WHERE Mail IS NOT NULL
) AS DistinctEmails
GROUP BY my_data.Mail_1, my_data.Mail_2, my_data.Mail_3, my_data.Mail_4, my_data.Mail_5

SQL must write column names, which is inflexible. To deduplicate between an uncertain number of columns, it is necessary to dynamically generate column names using stored procedures and then execute SQL, which will make the architecture more complex.

SPL: SPL does not need to write column names, and the code is the same for different data sources. 👉🏻Try.DEMO

SPL code

A1: Load data.

A2: Take the current record, convert it into a set, take unique values and remove null, and merge it with commas. ~ indicates the current record, function ID is used for deduplication, @0 means null is removed.

Get Started with esProc SPL — esProc SPL FREE Download!


This content originally appeared on DEV Community and was authored by Judith-Data-Processing-Hacks


Print Share Comment Cite Upload Translate Updates
APA

Judith-Data-Processing-Hacks | Sciencx (2025-04-22T02:39:59+00:00) Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21. Retrieved from https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/

MLA
" » Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21." Judith-Data-Processing-Hacks | Sciencx - Tuesday April 22, 2025, https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/
HARVARD
Judith-Data-Processing-Hacks | Sciencx Tuesday April 22, 2025 » Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21., viewed ,<https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/>
VANCOUVER
Judith-Data-Processing-Hacks | Sciencx - » Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/
CHICAGO
" » Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21." Judith-Data-Processing-Hacks | Sciencx - Accessed . https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/
IEEE
" » Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21." Judith-Data-Processing-Hacks | Sciencx [Online]. Available: https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/. [Accessed: ]
rf:citation
» Deduplicate between an Uncertain Number of Columns — From SQL to SPL #21 | Judith-Data-Processing-Hacks | Sciencx | https://www.scien.cx/2025/04/22/deduplicate-between-an-uncertain-number-of-columns-from-sql-to-spl-21/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.