914World.com > OT: SQL Question ...

914World.com > The 914 Forums > 914World Garage

SirAndy

Mar 17 2005, 07:11 PM

SQL Question:

what's the easiest way to find duplicates in a table?

let's say i have a table (tbl_Users) with 20000 rows and the table has a column for the users email addy (sEmail) and i want to list all the accounts that have duplicate email addys.

i could run a script that takes each row and runs a count(*) on that addy, but that'll take forever.
anything simpler than that ???

Andy

Sparky

Mar 17 2005, 07:15 PM

SELECT

FROM
tablename
GROUP BY

HAVING
count(*) > 1

But then again my sql is a bit rusty...

My best,
Mike D.

Part Pricer

Mar 17 2005, 07:17 PM

I've done this with MySQL tables. I've just exported the table and imported it into Excel. Then sort by the column you want. Then, add a column that contains a formula that is true if a value is the same as the value in the row above it.

john rogers

Mar 17 2005, 07:20 PM

Do it like this:

sqlplus> delete from userID.my_email_data a where rowid <
(select max(rowid)
from userID.my_email_data b
where b.address = a.address
and b.username = a.username);

Notes: userID is the schema owner, my_email_data is the name of the table and address, username are the fields to check for dupes to. You can not use any functions such as substr(....) or you'll get inconsistant results. Additionally, the email field entries have to be exacly alike, even to case or it will fail cause then they aren't really duplicates!

SirAndy

Mar 17 2005, 07:34 PM

QUOTE (Paul Heery @ Mar 17 2005, 05:17 PM)

Then sort by the column you want.

ok, how the heck do i sort a column in excel by name ????

Andy

SirAndy

Mar 17 2005, 07:37 PM

QUOTE (SirAndy @ Mar 17 2005, 05:34 PM)

ok, how the heck do i sort a column in excel by name ????

ok, got it ...

that worked!

Andy

Part Pricer

Mar 17 2005, 07:37 PM

Select the entire worksheet. Then go to Data then Sort. Choose the column you want to sort by.

JB 914

Mar 17 2005, 07:38 PM

QUOTE (SirAndy @ Mar 17 2005, 05:34 PM)

QUOTE (Paul Heery @ Mar 17 2005, 05:17 PM)

Then sort by the column you want.

ok, how the heck do i sort a column in excel by name ????

Andy

highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.

hope this helps

Part Pricer

Mar 17 2005, 07:41 PM

QUOTE (joe buckle @ Mar 17 2005, 08:38 PM)

highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.

Choosing Unique Records will hide the dupes.

JB 914

Mar 17 2005, 07:49 PM

QUOTE (Paul Heery @ Mar 17 2005, 05:41 PM)

QUOTE (joe buckle @ Mar 17 2005, 08:38 PM)

highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.

Choosing Unique Records will hide the dupes.

then you can import the now "good" records table back to SQL

Part Pricer

Mar 17 2005, 07:51 PM

I think Andy just wants to identify the dupes. I think he would cause some problems if he abitrarily deleted the record that appeared second for duplicate email addresses. What if he was deleting the wrong account?

JB 914

Mar 17 2005, 08:06 PM

this might help

http://office.microsoft.com/en-us/assistan...1366161033.aspx

SirAndy

Mar 17 2005, 08:21 PM

QUOTE (Paul Heery @ Mar 17 2005, 05:51 PM)

What if he was deleting the wrong account?

yeah, i have to go through them by hand and figure out which ones are used and which ones are the ghosts ...

i found 531 duplicate accounts ...

Andy

PS: i changed the code so now it catches duplicate email addys upon registration. should have done that in the first place. oh well ...

This is a "lo-fi" version of our main content. To view the full version with more information, formatting and images, please click here.