Scenario: I have accumulated data into SQL
Server database from three different servers (oracle, O2 and oracle) for a
financial reporting. After data accumulated , we found more duplicate rows;
business requirement was to remove all duplicate rows because of final data will
be import to another system.
For this purpose, I want to write script to
find out duplicate rows from my table. Finally, I want to delete only duplicate
rows not original row.
Original row means if i have total 2
rows those same or duplicate, so I want to delete only one row that is
duplicate.
Let’s go............................
Step 1: Create a student Table below Script:
CREATE TABLE [Students](
[ID] [int] NULL,
[Name] [varchar](50) NULL,
[address] [varchar](50) NULL,
[contact] [varchar](50) NULL,
[email] [varchar](50) NULL
)
Step 2: insert
Record that shown below:
Step 3: Selecting Duplicate Rows
WITH CTE AS
(
SELECT ROW_NUMBER()
OVER
(
PARTITION BY [ID] , [Name],[address],[contact],[email]
ORDER BY ( [ID])
) AS RowID,
* FROM Students
)
SELECT * FROM CTE
WHERE [ID] IN ( SELECT [ID] FROM CTE WHERE RowID >1 )
Identify Duplicate Records |
Step 4:
Deleting Duplicate Rows
WITH CTE AS
(
SELECT ROW_NUMBER()
OVER
(
PARTITION BY [ID] , [Name],[address],[contact],[email]
ORDER BY ( [ID])
) AS RowID,
* FROM Students
)
Delete From CTE where RowID Not In
(
SELECT MIN(RowID)
FROM
CTE
WHERE
[ID] IN ( SELECT [ID] FROM CTE where RowID >1)
)
Result: 2 rows Deleted.
Again select duplicate rows by Step-3. Display
No records.
Note: I will discuss ASAP about major keywords
are WITH , ROW_NUMBER()
and PARTITION BY in another post.
Comments
WITH CTE AS
(
SELECT ROW_NUMBER()
OVER(PARTITION BY [ID] , [Name],[address],[contact],[email] ORDER BY [ID]) AS RowID
FROM Students
)
DELETE
FROM CTE
WHERE [RowID] >= 2;
Post a Comment