MSSQL: Remove duplicate records from a table, leaving a single unique record
We ran into a situation where a table held multiple records for items that should have had only one. Each JobID can have many TaskIDs, but a given TaskID should appear only once per JobID. We found that a lot of JobIDs had the same TaskID recorded twice. In short: many TaskIDs per JobID is fine, but each TaskID must be unique within its JobID.
First we find which TaskIDs are duplicated.
SELECT DISTINCT TaskID FROM (
SELECT JobID, TaskID, COUNT(*) AS DupCount FROM JobTable
WHERE TaskID IS NOT NULL
GROUP BY JobID, TaskID HAVING COUNT(*) > 1) X
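To try the query without touching real data, a small temporary table can stand in for JobTable. This is only a minimal sketch: the #JobTable name, the column types and the sample rows are assumptions made up for illustration, not the real schema. It also returns the JobID, the number of copies and the range of IDs involved, which makes it easier to see what a later delete would touch.

```sql
-- Hypothetical stand-in for JobTable; types and rows are illustrative only.
CREATE TABLE #JobTable (
    ID     INT IDENTITY(1,1) PRIMARY KEY,
    JobID  INT NOT NULL,
    TaskID INT NULL
);

INSERT INTO #JobTable (JobID, TaskID)
VALUES (1, 10), (1, 11), (2, 10), (1, 10), (2, 12), (2, 10);

-- List each (JobID, TaskID) pair that occurs more than once,
-- with the number of copies and the lowest and highest ID in the group.
SELECT JobID,
       TaskID,
       COUNT(*) AS DupCount,
       MIN(ID)  AS FirstID,
       MAX(ID)  AS LastID
FROM #JobTable
WHERE TaskID IS NOT NULL
GROUP BY JobID, TaskID
HAVING COUNT(*) > 1;

DROP TABLE #JobTable;
```

With these sample rows, two pairs come back: JobID 1 with TaskID 10 and JobID 2 with TaskID 10, each with a DupCount of 2.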
We are going to remove the second entry for each duplicated TaskID. In this case the original row is the correct one. The second row is superfluous and, because it was added later, has the larger identity key (ID). Let's pull up the duplicate rows with the larger ID.
SELECT T1.ID
FROM JobTable T1
JOIN
-- Highest ID in every (JobID, TaskID) group that has more than one row
(SELECT JobID, TaskID, MAX(ID) AS ID FROM JobTable
GROUP BY JobID, TaskID
HAVING COUNT(*) > 1) AS T2
ON T1.TaskID = T2.TaskID AND T1.JobID = T2.JobID AND T1.ID = T2.ID
This pulls up what we want, so let's get rid of them.
DELETE FROM JobTable WHERE ID IN (
SELECT T1.ID
FROM JobTable T1
JOIN
-- Highest ID in every (JobID, TaskID) group that has more than one row
(SELECT JobID, TaskID, MAX(ID) AS ID FROM JobTable
GROUP BY JobID, TaskID
HAVING COUNT(*) > 1) AS T2
ON T1.TaskID = T2.TaskID AND T1.JobID = T2.JobID AND T1.ID = T2.ID
)
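The delete above removes only the row with the highest ID in each duplicated group, so a TaskID that appeared three or more times in one job would need the statement run more than once. As an alternative, and not the method used above, here is a sketch using ROW_NUMBER() that keeps the lowest ID in each (JobID, TaskID) group and removes every other copy in one pass. It assumes the same three columns (ID, JobID, TaskID) and runs inside a transaction that is rolled back at the end, so nothing changes until you deliberately swap in a commit.

```sql
BEGIN TRANSACTION;

-- Number the rows in each (JobID, TaskID) group, lowest ID first.
WITH Ranked AS (
    SELECT ID,
           ROW_NUMBER() OVER (PARTITION BY JobID, TaskID ORDER BY ID) AS rn
    FROM JobTable
    WHERE TaskID IS NOT NULL
)
-- Deleting through the CTE removes the underlying JobTable rows.
DELETE FROM Ranked
WHERE rn > 1;

-- Should return no rows if every duplicate is gone.
SELECT JobID, TaskID, COUNT(*) AS DupCount
FROM JobTable
WHERE TaskID IS NOT NULL
GROUP BY JobID, TaskID
HAVING COUNT(*) > 1;

-- Safe default: undo the delete. Replace with COMMIT TRANSACTION
-- once the check above looks right.
ROLLBACK TRANSACTION;
```

Ordering by ID inside each partition matches the assumption stated earlier that the row added first is the correct one.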
Done! Now we have unique TaskIDs for each JobID.