Help - Search - Members - Calendar
Full Version: OT: SQL Question ...
914World.com > The 914 Forums > 914World Garage
SirAndy
SQL Question:

what's the easiest way to find duplicates in a table?

let's say i have a table (tbl_Users) with 20000 rows and the table has a column for the users email addy (sEmail) and i want to list all the accounts that have duplicate email addys.

i could run a script that takes each row and runs a count(*) on that addy, but that'll take forever.
anything simpler than that ???

idea.gif Andy
Sparky
SELECT

FROM
tablename
GROUP BY

HAVING
count(*) > 1


But then again my sql is a bit rusty...

My best,
Mike D.
Part Pricer
I've done this with MySQL tables. I've just exported the table and imported it into Excel. Then sort by the column you want. Then, add a column that contains a formula that is true if a value is the same as the value in the row above it.
john rogers
Do it like this:

sqlplus> delete from userID.my_email_data a where rowid <
(select max(rowid)
from userID.my_email_data b
where b.address = a.address
and b.username = a.username);


Notes: userID is the schema owner, my_email_data is the name of the table and address, username are the fields to check for dupes to. You can not use any functions such as substr(....) or you'll get inconsistant results. Additionally, the email field entries have to be exacly alike, even to case or it will fail cause then they aren't really duplicates!
SirAndy
QUOTE (Paul Heery @ Mar 17 2005, 05:17 PM)
Then sort by the column you want.

ok, how the heck do i sort a column in excel by name ????

confused24.gif Andy
SirAndy
QUOTE (SirAndy @ Mar 17 2005, 05:34 PM)
ok, how the heck do i sort a column in excel by name ????

ok, got it ...

that worked!
beerchug.gif Andy
Part Pricer
Select the entire worksheet. Then go to Data then Sort. Choose the column you want to sort by.

JB 914
QUOTE (SirAndy @ Mar 17 2005, 05:34 PM)
QUOTE (Paul Heery @ Mar 17 2005, 05:17 PM)
Then sort by the column you want.

ok, how the heck do i sort a column in excel by name ????

confused24.gif Andy

highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.

hope this helps
Part Pricer
QUOTE (joe buckle @ Mar 17 2005, 08:38 PM)
highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.


Choosing Unique Records will hide the dupes.
JB 914
QUOTE (Paul Heery @ Mar 17 2005, 05:41 PM)
QUOTE (joe buckle @ Mar 17 2005, 08:38 PM)
highlight the rows ( without the column header row at the top) then use data\filter\advanced filter to sort the data. they have a check box for unique records.


Choosing Unique Records will hide the dupes.

then you can import the now "good" records table back to SQL
Part Pricer
I think Andy just wants to identify the dupes. I think he would cause some problems if he abitrarily deleted the record that appeared second for duplicate email addresses. What if he was deleting the wrong account?

JB 914
this might help

http://office.microsoft.com/en-us/assistan...1366161033.aspx

biggrin.gif
SirAndy
QUOTE (Paul Heery @ Mar 17 2005, 05:51 PM)
What if he was deleting the wrong account?

yeah, i have to go through them by hand and figure out which ones are used and which ones are the ghosts ...

i found 531 duplicate accounts ...
wacko.gif Andy

PS: i changed the code so now it catches duplicate email addys upon registration. should have done that in the first place. oh well ... headbang.gif
This is a "lo-fi" version of our main content. To view the full version with more information, formatting and images, please click here.
Invision Power Board © 2001-2024 Invision Power Services, Inc.