Identify and Delete duplicate Records in SQL Server

Scenario: I have accumulated data into SQL Server database from three different servers (oracle, O2 and oracle) for a financial reporting. After data accumulated , we found more duplicate rows; business requirement was to remove all duplicate rows because of final data will be import to another system.

For this purpose, I want to write script to find out duplicate rows from my table. Finally, I want to delete only duplicate rows not original row.

Original row means if i have total 2 rows those same or duplicate, so I want to delete only one row that is duplicate.

Let’s go............................

Step 1: Create a student Table below Script:

CREATE TABLE [Students](

[ID] [int] NULL,

[Name] [varchar](50) NULL,

[address] [varchar](50) NULL,

[contact] [varchar](50) NULL,

[email] [varchar](50) NULL

)

Step 2: insert Record that shown below:

Insert Duplicate Records

Step 3: Selecting Duplicate Rows

WITH CTE AS

(

SELECT ROW_NUMBER()

OVER

(

PARTITION BY [ID] , [Name],[address],[contact],[email]

ORDER BY ( [ID])

) AS RowID,

* FROM Students

)

SELECT * FROM CTE

WHERE [ID] IN ( SELECT [ID] FROM CTE WHERE RowID >1 )

Identify Duplicate Records

Step 4: Deleting Duplicate Rows

WITH CTE AS

(

SELECT ROW_NUMBER()

OVER

(

PARTITION BY [ID] , [Name],[address],[contact],[email]

ORDER BY ( [ID])

) AS RowID,

* FROM Students

)

Delete From CTE where RowID Not In

(

SELECT MIN(RowID)

FROM CTE

WHERE [ID] IN ( SELECT [ID] FROM CTE where RowID >1)

)

Result: 2 rows Deleted.

Again select duplicate rows by Step-3. Display No records.

Note: I will discuss ASAP about major keywords are WITH , ROW_NUMBER() and PARTITION BY in another post.

Comments

Unknown said…

Why not simply this? Only one scan of the data...

WITH CTE AS
(
SELECT ROW_NUMBER()
OVER(PARTITION BY [ID] , [Name],[address],[contact],[email] ORDER BY [ID]) AS RowID
FROM Students
)
DELETE
FROM CTE
WHERE [RowID] >= 2;

25 September 2016 at 20:30

Code By Experience

Search This Blog

Identify and Delete duplicate Records in SQL Server

Labels

Comments

Post a Comment

Popular posts from this blog

What are the difference between DDL, DML , DCL and TCL commands?

Ubuntu/Unix Console Command -Cheat Sheet

SQL Server blocked access to procedure ‘sys.sp_OACreate’ of component ‘Ole Automation Procedures’