In Microsoft Excel, duplicate values refer to two or more instances of the same data in a given range of cells. For example, if you have a list of names, and two or more of the names are the same, those are considered duplicates. Duplicates can be problematic in Excel because they can skew data analysis and lead to errors.
Finding and removing duplicates in Excel is important for two main reasons: data accuracy and data consistency. Duplicates can make it difficult to analyze data accurately. Removing them ensures that each piece of data is counted only once, which improves the accuracy of your analysis. Duplicates can also lead to inconsistencies in your spreadsheet. For example, if a list of product prices contains the same product more than once with different prices, it is hard to know which price is correct.
In this blog, we will learn how to find duplicates in Excel and how to remove them. Keep reading to find out.
How to Identify and Highlight the Duplicates?
Let’s start with a step-by-step guide on identifying the duplicate values in your Excel sheet.
- Open your Excel spreadsheet and select the range of cells where you want to check for duplicates.
- Click on the “Home” tab on the Ribbon at the top of the Excel window.
- Look for the “Conditional Formatting” button in the “Styles” section and click on it. A drop-down menu will appear.
- Select “Highlight Cells Rules” from the drop-down menu, and then select “Duplicate Values” from the sub-menu.
- In the “Duplicate Values” dialog box, choose a formatting option for the duplicates. For example, you can choose to highlight duplicates in red.
- Click on the “OK” button to apply the formatting.
Excel will now highlight all the duplicate values in the selected range of cells. You can easily see which cells contain duplicates by looking for the cells that are highlighted in the chosen formatting.
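To make the logic behind this rule concrete, here is a minimal Python sketch of the same idea outside Excel. It is not Excel's implementation. The function name `find_duplicates` and the sample names are made up for illustration, and the use of `casefold()` to ignore case is an assumption that only approximates how Excel compares text.

```python
from collections import Counter


def find_duplicates(values):
    """Return one boolean per value: True where the value occurs more than once.

    Text is compared without regard to case (an assumption that roughly
    mirrors the built-in Duplicate Values rule, which is not case-sensitive).
    """
    # Normalise text for comparison; leave numbers and other types untouched.
    keys = [v.casefold() if isinstance(v, str) else v for v in values]
    counts = Counter(keys)
    return [counts[k] > 1 for k in keys]


# Hypothetical sample data standing in for a column of names.
names = ["Asha", "Ben", "asha", "Chen", "Ben"]
for name, is_duplicate in zip(names, find_duplicates(names)):
    print(name, "<- duplicate" if is_duplicate else "")
```

Every occurrence of a repeated value is flagged, including the first one, which matches what the highlighting shows in the sheet.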
Wasn’t that easy? Now, what if uppercase and lowercase versions of the same text should count as different values? Let’s find out.
How to Find Case-Sensitive Duplicates in Excel
To find case-sensitive duplicates in Excel, you can use a combination of Excel’s built-in functions and conditional formatting. Here are the steps to follow:
- Select the range of cells where you want to search for duplicates.
- On the Home tab of the Excel ribbon, click on the Conditional Formatting button and select “New Rule”.
- In the “New Formatting Rule” dialog box, select “Use a formula to determine which cells to format”.
- In the “Format values where this formula is true” box, enter a formula that counts exact, case-sensitive matches, for example: =SUMPRODUCT(--EXACT($A$1:$A$10,A1))>1
- This formula assumes that you are searching for duplicates in column A, and that the range of cells you selected in step 1 starts at A1 and ends at A10. If your data is in a different range of cells, adjust the formula accordingly. EXACT compares each cell in the range with the current cell and treats uppercase and lowercase letters as different; SUMPRODUCT adds up the matches. The rule formats the cell if the count is greater than 1 (i.e., there are duplicates).
- Click on the “Format” button to select the formatting you want to apply to the duplicate values. For example, you could choose to highlight them in red.
- Click “OK” to close the “New Formatting Rule” dialog box, and then click “OK” again to apply the formatting to the selected range of cells.
Excel will now highlight all the case-sensitive duplicates in the selected range.
Note that this method only works for exact matches of case-sensitive text. If you need to find duplicates that are not case-sensitive, you can use the “Conditional Formatting > Highlight Cells Rules > Duplicate Values” option instead.
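The difference between the case-sensitive and the case-insensitive check can be seen in a short Python sketch. This is only an illustration of the counting logic, not something Excel runs; `case_sensitive_duplicates` and the sample values are hypothetical.

```python
from collections import Counter


def case_sensitive_duplicates(values):
    """Return one boolean per value: True where the exact same value repeats.

    Strings are compared as-is, so "Apple" and "apple" count as different.
    """
    counts = Counter(values)
    return [counts[v] > 1 for v in values]


# Hypothetical sample data: only the two exact "Apple" entries match.
items = ["Apple", "apple", "Apple", "APPLE"]
for item, is_duplicate in zip(items, case_sensitive_duplicates(items)):
    print(item, "<- duplicate" if is_duplicate else "")
```

A case-insensitive check would flag all four entries here, which is why the two methods can give different results on the same column.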
Now, what if you have to filter duplicate rows and columns? Let’s see.
How to Filter Duplicate Rows and Columns in Excel
To filter out duplicates in Excel, you can use the “Remove Duplicates” feature. It works on rows: it deletes rows whose values repeat in the columns you choose. Here are the steps to follow:
- Select the range of cells that you want to filter for duplicates.
- On the Data tab of the Excel ribbon, click on “Remove Duplicates” in the Data Tools group.
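- In the “Remove Duplicates” dialog box, select the columns that Excel should compare. To remove only rows that are identical in every column, leave all columns selected. To treat rows as duplicates when they match in certain columns only, select just those columns. The dialog box lists columns only; it has no option for comparing whole columns against each other.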
- Click on “OK” to apply the filter.
Excel will now remove the duplicate rows from the selected range and show a message telling you how many were removed.
Please note that “Remove Duplicates” keeps the first occurrence of each duplicate row and deletes the rest. If you want to keep the last occurrence instead, you can use a different method, such as a macro or a formula. Similarly, if you want to filter for duplicates based on specific criteria, you can use the “Advanced Filter” feature instead of “Remove Duplicates”.
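The keep-first versus keep-last behavior is easy to see in a small Python sketch. This is an illustration under stated assumptions rather than a description of how Excel works internally: `remove_duplicate_rows`, its `key_columns` and `keep` parameters, and the price rows are all hypothetical, and values are compared exactly, so differences in case count as different values.

```python
def remove_duplicate_rows(rows, key_columns=None, keep="first"):
    """Return rows with duplicates dropped, preserving the original order.

    rows        -- list of tuples, one tuple per spreadsheet row
    key_columns -- column indexes to compare; None compares every column
    keep        -- "first" or "last" occurrence of each duplicate
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")

    # To keep the last occurrence, walk the rows backwards and flip the result.
    ordered = rows if keep == "first" else list(reversed(rows))
    seen = set()
    kept = []
    for row in ordered:
        if key_columns is None:
            key = tuple(row)
        else:
            key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept if keep == "first" else list(reversed(kept))


# Hypothetical product/price rows.
rows = [("Pen", 1.50), ("Ink", 4.00), ("Pen", 1.50), ("Pen", 1.75)]
print(remove_duplicate_rows(rows))                                # compare all columns
print(remove_duplicate_rows(rows, key_columns=[0]))               # compare product only
print(remove_duplicate_rows(rows, key_columns=[0], keep="last"))  # keep the last price
```

Comparing all columns removes only the repeated ("Pen", 1.50) row, while comparing the product column alone leaves a single row per product.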
Lastly, now that we have learnt how to identify and remove duplicates, let’s see how to highlight the duplicates in your Excel sheet so that you can remove them immediately or later, as required.
How to Highlight Duplicates in Excel?
To highlight duplicates in Excel, you can use the “Conditional Formatting” feature. Here are the steps to follow:
- Select the range of cells that you want to highlight duplicates in.
- On the Home tab of the Excel ribbon, click on the “Conditional Formatting” button in the Styles group.
- Click on “Highlight Cells Rules”, and then select “Duplicate Values” from the submenu.
- In the “Duplicate Values” dialog box, choose the formatting style you want to use for the duplicates. For example, you could choose to highlight them in red.
- Click on “OK” to close the dialog box and apply the formatting.
Excel will now highlight all the duplicates in the selected range.
Note that Excel highlights every occurrence of a duplicated value, including the first one. The “Duplicate Values” dialog box only lets you choose between “Duplicate” and “Unique”, so it cannot highlight the second and later occurrences on their own. To do that, create a formula-based Conditional Formatting rule instead, for example =COUNTIF($A$1:$A1,$A1)>1 for data that starts in cell A1. Also, note that this method is not case-sensitive. If you want to highlight only case-sensitive duplicates, you can use a formula in the Conditional Formatting rule instead.
In conclusion, finding and removing duplicates in Excel is an important task for maintaining data accuracy and consistency. Whether you need to identify duplicates in a small range of cells or a large dataset, Excel provides several built-in features that can help you quickly and efficiently identify and remove duplicates. By using conditional formatting, Remove Duplicates, filtering, and other tools, you can streamline your workflow and ensure that your data is clean and reliable. With these techniques in your toolbox, you can save time and reduce errors in your Excel projects.
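Flagging only the second and later occurrences of a value, while leaving the first one alone, can be sketched in Python as a single pass that remembers what it has already seen. The function name `later_occurrences` and the sample names are hypothetical, and ignoring case with `casefold()` is an assumption made to stay close to Excel's case-insensitive comparison.

```python
def later_occurrences(values):
    """Return one boolean per value: True only for the 2nd and later occurrences."""
    seen = set()
    flags = []
    for value in values:
        # Ignore case for text (an assumption); other types are compared as-is.
        key = value.casefold() if isinstance(value, str) else value
        flags.append(key in seen)  # True if this value appeared earlier in the list
        seen.add(key)
    return flags


# Hypothetical sample data: the first "Asha" and the first "Ben" are not flagged.
names = ["Asha", "Ben", "asha", "Ben", "Chen"]
for name, is_repeat in zip(names, later_occurrences(names)):
    print(name, "<- repeat" if is_repeat else "")
```

Deleting only the flagged entries leaves exactly one copy of each value.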