SQL DISTINCT Explained — Remove Duplicates with Examples



The SQL DISTINCT keyword is a simple but powerful tool to remove duplicate rows from query results. When you want unique values — for example, a list of unique countries from a customer table — DISTINCT is your friend. This guide explains how DISTINCT works, common use cases, how it interacts with COUNT() and grouping, and when you might prefer alternatives for performance.



What Does DISTINCT Do?



At its core, DISTINCT filters duplicate rows from the result set. It considers all columns in the SELECT list when determining duplicates.



SELECT DISTINCT Country
FROM Customers;
  


This returns a list of countries found in your Customers table — each country appears only once.
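To see the effect on actual rows, here is a minimal sketch that runs the query above in an in-memory SQLite database through Python's built-in sqlite3 module. The sample customers and the Name column are made up for illustration; only the Customers table and Country column come from the example above.

```python
import sqlite3

# Throwaway in-memory database; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Country TEXT)"
)
conn.executemany(
    "INSERT INTO Customers (Name, Country) VALUES (?, ?)",
    [("Ana", "Brazil"), ("Ben", "Canada"), ("Caio", "Brazil"),
     ("Dee", "Canada"), ("Eli", "Japan")],
)

# Without DISTINCT: one row per customer, so countries repeat.
all_rows = [r[0] for r in conn.execute("SELECT Country FROM Customers")]

# With DISTINCT: each country appears once (sorted here because row order is not guaranteed).
unique_rows = sorted(r[0] for r in conn.execute("SELECT DISTINCT Country FROM Customers"))

print(len(all_rows))  # 5
print(unique_rows)    # ['Brazil', 'Canada', 'Japan']
```

The later sketches in this guide use the same sandbox approach, so you can paste any of them into a Python file and run it without installing a database.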


Distinct on Multiple Columns



If you include multiple columns in DISTINCT, the combination of those columns determines uniqueness.



SELECT DISTINCT Country, State
FROM Customers;
  


Rows are considered duplicates only if both Country and State match — useful when you want unique pairs.
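A small sketch of the pair-wise behavior, again using an in-memory SQLite database from Python. The rows and the Name column are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, Country TEXT, State TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Ana", "USA", "TX"),
        ("Ben", "USA", "TX"),     # same (Country, State) pair as Ana: collapses
        ("Cy", "USA", "CA"),      # same country, different state: kept
        ("Dee", "Canada", "ON"),
    ],
)

# Uniqueness is judged on the whole (Country, State) pair.
pairs = sorted(conn.execute("SELECT DISTINCT Country, State FROM Customers"))

# Uniqueness judged on Country alone gives fewer rows.
countries = sorted(conn.execute("SELECT DISTINCT Country FROM Customers"))

print(pairs)      # [('Canada', 'ON'), ('USA', 'CA'), ('USA', 'TX')]
print(countries)  # [('Canada',), ('USA',)]
```

Note that USA appears twice in the first result: DISTINCT applies to the row as a whole, not to each column separately.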



DISTINCT with COUNT()



A very common pattern is counting unique values:



SELECT COUNT(DISTINCT CustomerID) AS UniqueCustomers
FROM Orders;
  


Use this when you want to know how many unique customers placed orders (not how many orders total).
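The difference between counting rows and counting unique values is easiest to see side by side. The sketch below assumes a tiny, made-up Orders table (including one order with no customer) and runs in SQLite; COUNT(DISTINCT ...) skips NULLs, just like COUNT(column) does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO Orders (CustomerID) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (None,)],  # None is stored as NULL
)

total_orders, orders_with_customer, unique_customers = conn.execute(
    """
    SELECT COUNT(*),                    -- every row
           COUNT(CustomerID),           -- rows where CustomerID is not NULL
           COUNT(DISTINCT CustomerID)   -- unique non-NULL customer IDs
    FROM Orders
    """
).fetchone()

print(total_orders, orders_with_customer, unique_customers)  # 6 5 3
```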



When DISTINCT and GROUP BY Overlap



Sometimes DISTINCT and GROUP BY produce similar results, but they serve different purposes.



-- DISTINCT
SELECT DISTINCT Category FROM Products;

-- GROUP BY (returns one row per category too)
SELECT Category FROM Products GROUP BY Category;
  


Use GROUP BY when you need aggregation (SUM, COUNT, AVG) per group; use DISTINCT when you only need unique rows.
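The sketch below illustrates both halves of that rule on a hypothetical Products table (the Name and Price columns and all rows are assumptions): the two queries above return the same rows, but only GROUP BY can attach per-category aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Name TEXT, Category TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("Pen", "Office", 2.0), ("Stapler", "Office", 8.0), ("Mug", "Kitchen", 6.0)],
)

# For plain uniqueness, both forms return one row per category.
distinct_rows = sorted(conn.execute("SELECT DISTINCT Category FROM Products"))
grouped_rows = sorted(conn.execute("SELECT Category FROM Products GROUP BY Category"))
print(distinct_rows == grouped_rows)  # True

# GROUP BY can also compute aggregates for each group.
summary = conn.execute(
    "SELECT Category, COUNT(*), AVG(Price) FROM Products "
    "GROUP BY Category ORDER BY Category"
).fetchall()
print(summary)  # [('Kitchen', 1, 6.0), ('Office', 2, 5.0)]
```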



Performance Considerations



DISTINCT usually requires sorting or hashing under the hood, which can be expensive on large datasets unless an index already supplies the values in order. Here are tips to improve performance:



  • Only select the columns you need — fewer columns means fewer comparisons.


  • Use indexes on the columns you’re selecting if possible.


  • Consider GROUP BY with aggregates if you also need counts or sums.


  • For large tables, test both approaches (DISTINCT vs. GROUP BY vs. window functions) to see which is faster in your DBMS.


Example: DISTINCT vs GROUP BY Performance



-- DISTINCT
SELECT DISTINCT CustomerID FROM Orders;

-- GROUP BY
SELECT CustomerID FROM Orders GROUP BY CustomerID;
  


Both return unique customer IDs; performance depends on indexes and DB engine optimization.
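One way to test this on your own system is to compare query plans before and after adding an index. The sketch below does that in SQLite with a generated, hypothetical Orders table and a made-up index name (idx_orders_customer). EXPLAIN QUERY PLAN is SQLite-specific and its wording varies by version; other databases have their own EXPLAIN syntax, so treat this as a template rather than a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER)")
# 10,000 generated orders spread over 100 hypothetical customers.
conn.executemany(
    "INSERT INTO Orders (CustomerID) VALUES (?)",
    [(i % 100,) for i in range(10_000)],
)

QUERIES = {
    "DISTINCT": "SELECT DISTINCT CustomerID FROM Orders",
    "GROUP BY": "SELECT CustomerID FROM Orders GROUP BY CustomerID",
}

def show_plans(label):
    """Print SQLite's plan for each query; the last column holds the description."""
    for name, sql in QUERIES.items():
        plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
        print(label, name, plan)

show_plans("no index:")
conn.execute("CREATE INDEX idx_orders_customer ON Orders (CustomerID)")
show_plans("with index:")

# Whatever the plans look like, both queries return the same set of IDs.
results = {name: sorted(conn.execute(sql)) for name, sql in QUERIES.items()}
print(results["DISTINCT"] == results["GROUP BY"])  # True
```

For real decisions, run the equivalent comparison against production-sized data in your own DBMS and measure actual execution time.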



Common Uses of DISTINCT



  • Lists of unique values (countries, categories, tags)


  • Removing duplicates before exporting data


  • Counting unique entities with COUNT(DISTINCT ...)


  • When data contains accidental duplicates and you need a clean list


Examples — Real Queries



1. Unique customer emails



SELECT DISTINCT Email
FROM Customers
WHERE Email IS NOT NULL;
  


2. Unique combinations (city + state)



SELECT DISTINCT City, State
FROM Customers;
  


3. Count unique products ordered



SELECT COUNT(DISTINCT ProductID) AS UniqueProducts
FROM OrderLines;
  


When NOT to Use DISTINCT



Sometimes DISTINCT hides underlying data issues. Consider alternatives:



  • Fix data duplicates at the source if they are accidental.


  • Use ROW_NUMBER() with window functions to pick a single row per group when additional logic is needed (for example, pick the latest record per customer).


  • Use GROUP BY when you need aggregated metrics besides uniqueness.


Alternative: ROW_NUMBER() approach



WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY CreatedAt DESC) AS rn
  FROM CustomerLogins
)
SELECT * FROM ranked WHERE rn = 1;
  


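This returns the most recent login per customer, which is more flexible than a simple DISTINCT because you decide which row survives. If two logins can share the same CreatedAt, add a tie-breaker column to the ORDER BY so the chosen row is deterministic.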
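Here is a runnable sketch of the ROW_NUMBER() approach in SQLite (window functions need SQLite 3.25 or newer). The Device column and the login rows are hypothetical; the final SELECT lists columns explicitly so the helper column rn is left out of the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomerLogins (CustomerID INTEGER, CreatedAt TEXT, Device TEXT)")
conn.executemany(
    "INSERT INTO CustomerLogins VALUES (?, ?, ?)",
    [
        (1, "2024-01-05", "web"),
        (1, "2024-03-10", "mobile"),  # latest login for customer 1
        (2, "2024-02-01", "web"),     # latest login for customer 2
        (2, "2024-01-15", "web"),
    ],
)

# Number each customer's logins from newest to oldest, then keep number 1.
latest = conn.execute(
    """
    WITH ranked AS (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY CreatedAt DESC) AS rn
      FROM CustomerLogins
    )
    SELECT CustomerID, CreatedAt, Device
    FROM ranked
    WHERE rn = 1
    ORDER BY CustomerID
    """
).fetchall()

# DISTINCT cannot do this: every login row is already unique, so nothing collapses.
distinct_rows = conn.execute(
    "SELECT DISTINCT CustomerID, CreatedAt, Device FROM CustomerLogins"
).fetchall()

print(latest)              # [(1, '2024-03-10', 'mobile'), (2, '2024-02-01', 'web')]
print(len(distinct_rows))  # 4
```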



Edge Cases & Gotchas



  • NULLs:

    DISTINCT treats NULLs as equal to each other, so all NULL values collapse into a single NULL row in the result.


  • Whitespace and case:

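    'USA' and 'usa' are different unless you normalize case or use a case-insensitive collation. Leading or trailing spaces can likewise make otherwise identical values count as different, depending on the DBMS and column type, so trim values when that matters.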


  • Large text columns:

    DISTINCT on large VARCHAR/TEXT columns may be slow.


  • Unintended duplicates:

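    A one-to-many join repeats each row from the "one" side once per match. If you only need values from that side, select just those columns and apply DISTINCT to them, rather than adding DISTINCT to a wide SELECT and hoping the duplicates disappear.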
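The NULL and case/whitespace gotchas can be reproduced in a few lines. This sketch uses SQLite with its default case-sensitive text comparison and made-up values; other databases or collations may compare strings differently, so check the behavior in your own DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?)",
    [("USA",), ("usa",), (" USA ",), (None,), (None,)],  # None is stored as NULL
)

# Raw values: three spellings of the same country, plus the two NULLs merged into one.
raw = conn.execute("SELECT DISTINCT Country FROM Customers").fetchall()

# Normalizing case and whitespace before DISTINCT merges the three spellings.
normalized = conn.execute(
    "SELECT DISTINCT UPPER(TRIM(Country)) FROM Customers"
).fetchall()

print(len(raw))         # 4
print(len(normalized))  # 2
```

Normalizing inside the query is a quick fix; cleaning the stored values is usually the better long-term option.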


Practical Exercises



Try these:



  1. Return the list of distinct product categories from the Products table.


  2. Count how many unique customers ordered in the last 30 days.


  3. Using a join, return distinct product names sold to a specific customer.
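If you get stuck on the third exercise, here is one possible approach, sketched in SQLite. The schema is an assumption: the ProductName and OrderID columns and all sample rows are invented, so adapt the names to your own tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    CREATE TABLE OrderLines (OrderID INTEGER, ProductID INTEGER);
    INSERT INTO Products VALUES (1, 'Pen'), (2, 'Mug'), (3, 'Lamp');
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO OrderLines VALUES (10, 1), (10, 2), (11, 1), (12, 3);
    """
)

# Same join with and without DISTINCT; {distinct} is filled in below.
sql = """
    SELECT {distinct} p.ProductName
    FROM Orders o
    JOIN OrderLines ol ON ol.OrderID = o.OrderID
    JOIN Products p ON p.ProductID = ol.ProductID
    WHERE o.CustomerID = ?
    ORDER BY p.ProductName
"""

# Customer 1 bought a Pen on two different orders, so the plain join repeats it.
with_dupes = [r[0] for r in conn.execute(sql.format(distinct=""), (1,))]
unique_names = [r[0] for r in conn.execute(sql.format(distinct="DISTINCT"), (1,))]

print(with_dupes)    # ['Mug', 'Pen', 'Pen']
print(unique_names)  # ['Mug', 'Pen']
```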


FAQ — DISTINCT



Q: Does DISTINCT remove duplicate rows entirely?



A: It removes them from the result set only; the table itself is not changed. One copy of each unique row is kept, and duplicates are determined by all columns in the SELECT list.



Q: Is DISTINCT the same as GROUP BY?



A: They can produce the same output for simple uniqueness queries, but GROUP BY is used for aggregation. DISTINCT is simpler when you only want unique rows.



Q: How does DISTINCT interact with ORDER BY?



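A: You can combine them. DISTINCT runs conceptually before ORDER BY, so you get unique rows which you can then sort. Some databases (PostgreSQL and SQL Server, for example) also require the ORDER BY expressions to appear in the SELECT list when DISTINCT is used.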
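A quick sketch of DISTINCT combined with ORDER BY, run in SQLite on hypothetical customer rows. The sort column is also the selected column, which keeps the query portable across databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Ana", "Japan"), ("Ben", "Brazil"), ("Cy", "Japan"), ("Dee", "Canada")],
)

# Duplicates are removed first, then the unique countries are sorted Z to A.
ordered = [
    r[0]
    for r in conn.execute("SELECT DISTINCT Country FROM Customers ORDER BY Country DESC")
]
print(ordered)  # ['Japan', 'Canada', 'Brazil']
```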



Conclusion



DISTINCT is an essential SQL tool for cleaning and summarizing results quickly. Use it for unique lists, and combine it with COUNT() to measure unique users or products. For complex de-duplication or performance-sensitive scenarios, consider window functions or grouping as scalable alternatives. Practice the examples above and use our exercises to build confidence.


