How to Detect and Remove Duplicate Records in SQL Server

Duplicate records in SQL Server can lead to inaccurate reporting, data inconsistencies, and performance issues. In this article, we’ll go over how to identify and safely remove duplicate rows while keeping at least one unique record.

Detecting Duplicates

To find duplicate records in a table, use the GROUP BY and HAVING clauses to count occurrences of each unique combination of values:

SELECT column1, column2, COUNT(*)
FROM YourTable
GROUP BY column1, column2
HAVING COUNT(*) > 1;

Replace column1, column2 with the columns that define a duplicate in your dataset.

If you need to see the actual duplicate rows, use a ROW_NUMBER() approach:

SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS row_num
    FROM YourTable
) t
WHERE row_num > 1;

Here, id should be a unique column to order the duplicates.

Removing Duplicates

Method 1: Using ROW_NUMBER()

The safest way to remove duplicates while keeping one unique record is by using ROW_NUMBER().

WITH CTE AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS row_num
    FROM YourTable
)
DELETE FROM CTE WHERE row_num > 1;

This deletes all duplicate records while keeping the first occurrence.

Method 2: Using DISTINCT INTO a New Table

If you want to be extra cautious, create a new table with only unique records:

SELECT DISTINCT * INTO NewTable FROM YourTable;

Then, drop the old table and rename NewTable back to YourTable.

Final Thoughts

Always backup your data before running delete operations to prevent accidental data loss. By regularly cleaning up duplicates, you can keep your SQL Server database efficient and reliable.

3
31

Related

String interpolation, introduced in C# 6.0, provides a more readable and concise way to format strings compared to traditional concatenation (+) or string.Format(). Instead of manually inserting variables or placeholders, you can use the $ symbol before a string to directly embed expressions inside brackets.

string name = "Walt";
string job = 'Software Engineer';

string message = $"Hello, my name is {name} and I am a {job}";
Console.WriteLine(message);

This would produce the final output of:

Hello, my name is Walt and I am a Software Engineer

String interpolation can also be chained together into a multiline string (@) for even cleaner more concise results:

string name = "Walt";
string html = $@"
    <div>
        <h1>Welcome, {name}!</h1>
    </div>";
37
148

Raw string literals in C# provide a flexible way to work with multiline strings, with some interesting rules around how quotes work.

The key insight is that you can use any number of double quotes (three or more) to delimit your string, as long as the opening and closing sequences have the same number of quotes.

The Basic Rules

  1. You must use at least three double quotes (""") to start and end a raw string literal
  2. The opening and closing quotes must have the same count
  3. The closing quotes must be on their own line for proper indentation
  4. If your string content contains a sequence of double quotes, you need to use more quotes in your delimiter than the longest sequence in your content

Examples with Different Quote Counts

// Three quotes - most common usage
string basic = """
    This is a basic
    multiline string
    """;

// Four quotes - when your content has three quotes
string withThreeQuotes = """"
    Here's some text with """quoted""" content
    """";

// Five quotes - when your content has four quotes
string withFourQuotes = """""
    Here's text with """"nested"""" quotes
    """"";

// Six quotes - for even more complex scenarios
string withFiveQuotes = """"""
    Look at these """""nested""""" quotes!
    """""";

The N+1 Rule

The general rule is that if your string content contains N consecutive double quotes, you need to wrap the entire string with at least N+1 quotes. This ensures the compiler can properly distinguish between your content and the string's delimiters.

// Example demonstrating the N+1 rule
string example1 = """
    No quotes inside
    """; // 3 quotes is fine

string example2 = """"
    Contains """three quotes"""
    """"; // Needs 4 quotes (3+1)

string example3 = """""
    Has """"four quotes""""
    """""; // Needs 5 quotes (4+1)

Practical Tips

  • Start with three quotes (""") as your default
  • Only increase the quote count when you actually need to embed quote sequences in your content
  • The closing quotes must be on their own line and should line up with the indentation you want
  • Any whitespace to the left of the closing quotes defines the baseline indentation
// Indentation example
string properlyIndented = """
    {
        "property": "value",
        "nested": {
            "deeper": "content"
        }
    }
    """; // This line's position determines the indentation

This flexibility with quote counts makes raw string literals extremely versatile, especially when dealing with content that itself contains quotes, like JSON, XML, or other structured text formats.

2
75

Primary constructors, introduced in C# 12, offer a more concise way to define class parameters and initialize fields.

This feature reduces boilerplate code and makes classes more readable.

Traditional Approach vs Primary Constructor

Before primary constructors, you would likely write something like the following:

public class UserService
{
    private readonly ILogger _logger;
    private readonly IUserRepository _repository;

    public UserService(ILogger logger, IUserRepository repository)
    {
        _logger = logger;
        _repository = repository;
    }

    public async Task<User> GetUserById(int id)
    {
        _logger.LogInformation("Fetching user {Id}", id);
        return await _repository.GetByIdAsync(id);
    }
}

With primary constructors, this becomes:

public class UserService(ILogger logger, IUserRepository repository)
{
    public async Task<User> GetUserById(int id)
    {
        logger.LogInformation("Fetching user {Id}", id);
        return await repository.GetByIdAsync(id);
    }
}

Key Benefits

  1. Reduced Boilerplate: No need to declare private fields and write constructor assignments
  2. Parameters Available Throughout: Constructor parameters are accessible in all instance methods
  3. Immutability by Default: Parameters are effectively readonly without explicit declaration

Real-World Example

Here's a practical example using primary constructors with dependency injection:

public class OrderProcessor(
    IOrderRepository orderRepo,
    IPaymentService paymentService,
    ILogger<OrderProcessor> logger)
{
    public async Task<OrderResult> ProcessOrder(Order order)
    {
        try
        {
            logger.LogInformation("Processing order {OrderId}", order.Id);
            
            var paymentResult = await paymentService.ProcessPayment(order.Payment);
            if (!paymentResult.Success)
            {
                return new OrderResult(false, "Payment failed");
            }

            await orderRepo.SaveOrder(order);
            return new OrderResult(true, "Order processed successfully");
        }
        catch (Exception ex)
        {
            logger.LogError(ex, "Failed to process order {OrderId}", order.Id);
            throw;
        }
    }
}

Tips and Best Practices

  1. Use primary constructors when the class primarily needs dependencies for its methods
  2. Combine with records for immutable data types:
public record Customer(string Name, string Email)
{
    public string FormattedEmail => $"{Name} <{Email}>";
}
  1. Consider traditional constructors for complex initialization logic

Primary constructors provide a cleaner, more maintainable way to write C# classes, especially when working with dependency injection and simple data objects.

1
69