Skip to main content

Identify and Delete duplicate Records in SQL Server

     
Scenario: I have accumulated data into SQL Server database from three different servers (oracle, O2 and oracle) for a financial reporting. After data accumulated , we found more duplicate rows; business requirement was to remove all duplicate rows because of final data will be import to another system.


For this purpose, I want to write script to find out duplicate rows from my table. Finally, I want to delete only duplicate rows not original row.

Original row means if i have total 2 rows those same or duplicate, so I want to delete only one row that is duplicate.

Business Web Hosting


Let’s go............................

Step 1: Create a student Table below Script:

CREATE TABLE [Students](
    [ID] [int] NULL,
    [Name] [varchar](50) NULL,
    [address] [varchar](50) NULL,
    [contact] [varchar](50) NULL,
    [email] [varchar](50) NULL
)

Step 2: insert Record that shown below:
 
Insert Duplicate Records
  
Step 3: Selecting Duplicate Rows

WITH CTE AS
(
    SELECT ROW_NUMBER()
          OVER
          (
                  PARTITION BY [ID] , [Name],[address],[contact],[email]
                        ORDER BY ( [ID])
          ) AS RowID,
          * FROM Students
)

 SELECT * FROM CTE
        WHERE [ID] IN ( SELECT [ID] FROM CTE WHERE RowID >1 ) 


Identify Duplicate Records

Step 4: Deleting Duplicate Rows

WITH CTE AS
(
    SELECT    ROW_NUMBER()
    OVER
    (
            PARTITION BY [ID] , [Name],[address],[contact],[email]
            ORDER BY ( [ID])
    ) AS RowID,
    * FROM Students
)

Delete From CTE where RowID Not In
(
    SELECT MIN(RowID)
        FROM CTE
        WHERE [ID] IN ( SELECT [ID] FROM CTE where RowID >1)  
)

Result: 2 rows Deleted. 

 Again select duplicate rows by Step-3. Display No records.


Note: I will discuss ASAP about major keywords are WITH , ROW_NUMBER() and PARTITION BY in another post.

Comments

Unknown said…
Why not simply this? Only one scan of the data...

WITH CTE AS
(
SELECT ROW_NUMBER()
OVER(PARTITION BY [ID] , [Name],[address],[contact],[email] ORDER BY [ID]) AS RowID
FROM Students
)
DELETE
FROM CTE
WHERE [RowID] >= 2;

Popular posts from this blog

What are the difference between DDL, DML , DCL and TCL commands?

Database Command Types DDL Data Definition Language  (DDL) statements are used to define the database structure or schema.  -           CREATE - to create objects in the database -           ALTER - alters the structure of the database -           DROP - delete objects from the database -           TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed -           COMMENT - add comments to the data dictionary -           RENAME - rename an object DML Data Manipulation Language  (DML) statements are used for managing data in database. DML commands are not auto-committed. It means changes made by DML command are not permanent to d...

Display Image Immediately from fileUpload Control

Client Side Programming is more efficient for user to very first interection Like Jquery . Now i provide some jquery script to display Image from FileUpload Control directly.. For this purpose we use reference . <script src="../../../JS/JQueryAutocomplete/jquery-1.7.2.js" type="text/javascript"></script>. You just download from jquery.com. Now code below... <script language="javascript" type="text/javascript"> /* Write FileUpload Change event .. */ $('#FileUpload1').bind('change', function() {                                 displayImage(this);                    }); //-------------------------// function displayImage(input)             {        ...

Get Previous Month's First and Last Date -C#

Here this code, your can get previous month's first and last day. It's simple   static void Main( string [] arg) {    var year = DateTime .Today.Year;    var month = DateTime .Today.Month;    var firstDate = new DateTime (year, month, 1).AddMonths(-1);    var lastDate = new DateTime (year, month, 1).AddDays(-1);    Console .WriteLine( "First day Previous Month: {0}" , firstDate);    Console .WriteLine( "Last day Previous Month: {0}" , lastDate);    var  lastday =  GetMonthLastDate ( 2015 , 6 );   Console .WriteLine( "Last day : {0}" , lastDate);    Console .ReadLine(); } You can use below method to get last date of any month of the any year. private DateTime GetMonthLastDate( int year, int month) {     return new DateTime (year, month, DateTime .DaysInMonth(year, month)); }