Skip to main content

Identify and Delete duplicate Records in SQL Server

     
Scenario: I have accumulated data into SQL Server database from three different servers (oracle, O2 and oracle) for a financial reporting. After data accumulated , we found more duplicate rows; business requirement was to remove all duplicate rows because of final data will be import to another system.


For this purpose, I want to write script to find out duplicate rows from my table. Finally, I want to delete only duplicate rows not original row.

Original row means if i have total 2 rows those same or duplicate, so I want to delete only one row that is duplicate.

Business Web Hosting


Let’s go............................

Step 1: Create a student Table below Script:

CREATE TABLE [Students](
    [ID] [int] NULL,
    [Name] [varchar](50) NULL,
    [address] [varchar](50) NULL,
    [contact] [varchar](50) NULL,
    [email] [varchar](50) NULL
)

Step 2: insert Record that shown below:
 
Insert Duplicate Records
  
Step 3: Selecting Duplicate Rows

WITH CTE AS
(
    SELECT ROW_NUMBER()
          OVER
          (
                  PARTITION BY [ID] , [Name],[address],[contact],[email]
                        ORDER BY ( [ID])
          ) AS RowID,
          * FROM Students
)

 SELECT * FROM CTE
        WHERE [ID] IN ( SELECT [ID] FROM CTE WHERE RowID >1 ) 


Identify Duplicate Records

Step 4: Deleting Duplicate Rows

WITH CTE AS
(
    SELECT    ROW_NUMBER()
    OVER
    (
            PARTITION BY [ID] , [Name],[address],[contact],[email]
            ORDER BY ( [ID])
    ) AS RowID,
    * FROM Students
)

Delete From CTE where RowID Not In
(
    SELECT MIN(RowID)
        FROM CTE
        WHERE [ID] IN ( SELECT [ID] FROM CTE where RowID >1)  
)

Result: 2 rows Deleted. 

 Again select duplicate rows by Step-3. Display No records.


Note: I will discuss ASAP about major keywords are WITH , ROW_NUMBER() and PARTITION BY in another post.

Comments

Unknown said…
Why not simply this? Only one scan of the data...

WITH CTE AS
(
SELECT ROW_NUMBER()
OVER(PARTITION BY [ID] , [Name],[address],[contact],[email] ORDER BY [ID]) AS RowID
FROM Students
)
DELETE
FROM CTE
WHERE [RowID] >= 2;

Popular posts from this blog

What are the difference between DDL, DML , DCL and TCL commands?

Database Command Types DDL Data Definition Language  (DDL) statements are used to define the database structure or schema.  -           CREATE - to create objects in the database -           ALTER - alters the structure of the database -           DROP - delete objects from the database -           TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed -           COMMENT - add comments to the data dictionary -           RENAME - rename an object DML Data Manipulation Language  (DML) statements are used for managing data in database. DML commands are not auto-committed. It means changes made by DML command are not permanent to d...

Ubuntu/Unix Console Command -Cheat Sheet

M ost popular Linux/Unix command those are required to know during working on Linux/Unix environment. Primarily, when I started the work on ubuntu, I faced a common problem that was not familiar to know about the basic ubuntu command. After that, I aggregate basic commands and work very well. Besides that, most of the time forgot the command and this helps me to cut-down googling time. Privileges SL Command Use / Description 1 sudo command Run command as root 2 sudo -s Open a root shell 3 sudo -s -u user Open a shell as user 4 sudo -k Forget sudo passwords 5 gksudo command Visual sudo dialog (GNOME) 6 kdesudo command visual sudo dialog (KDE) 7 sudo visudo edit /etc/sudoers 8 gksudo nautilus root file man...

How to check dotnet SDK and runtime version installed

I need to install dot net SDK 5 but first, I need to know which versions of SDK installed on my system. So, use   dotnet --info .  This command will display the SDKs and also runtimes installed on your system with path together where they are installed. For example on my Windows 10 development machine,  dotnet --info   Microsoft Windows [Version 10.0.18363.1082] (c) 2019 Microsoft Corporation. All rights reserved. C:\WINDOWS\system32>dotnet --info .NET Core SDK (reflecting any global.json):  Version:   3.1.101  Commit:    b377529961 Runtime Environment:  OS Name:     Windows  OS Version:  10.0.18363  OS Platform: Windows  RID:         win10-x64  Base Path:   C:\Program Files\dotnet\sdk\3.1.101\ Host (useful for support):   Version: 3.1.8   Commit:  9c1330dedd .NET Core SDKs installed:   1.0.0-preview2-003131 [C:\Program Files\dotnet\sdk...