duplicate-data

MySQL duplicate data deletion

谁都会走 submitted on 2019-12-05 17:44:19
This shows me all the first names and last names that have exactly two identical entries:

    SELECT `firstname`, `lastname`, COUNT(*) AS Count
    FROM `people`
    GROUP BY `firstname`, `lastname`
    HAVING Count = 2

How do I turn this into a DELETE FROM ... WHERE statement with a LIMIT, so that only one of each pair of entries is removed and the other is kept? Okay, this appears to be way too technical; I'm just going to do it in a PHP while loop.

You can create a table with one record of each of the duplicates, then delete all the duplicate records from the people table, and then re-insert the duplicate records. -- Setup for
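The delete-the-extra-copy step can be done in a single statement without a helper table. A minimal sketch, using Python's stdlib `sqlite3` in place of MySQL (the `people` schema below is assumed from the question); the idea is to keep the row with the lowest internal rowid per name pair and delete the rest:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (firstname TEXT, lastname TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("Ann", "Lee"), ("Ann", "Lee"), ("Bob", "Ray")])

# Keep the first physical copy of each (firstname, lastname) pair;
# delete every other copy.
conn.execute("""
    DELETE FROM people
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM people GROUP BY firstname, lastname
    )
""")

remaining = conn.execute(
    "SELECT firstname, lastname FROM people ORDER BY lastname").fetchall()
```

In MySQL the same shape works with the table's primary key in place of `rowid`.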

Remove Duplicates with Caveats

左心房为你撑大大i submitted on 2019-12-05 05:12:42
Question: I have a table with rowID, longitude, latitude, businessName, url, caption. This might look like:

    rowID | long | lat | businessName | url     | caption
    1     | 20   | -20 | Pizza Hut    | yum.com | null

How do I delete all of the duplicates but keep only one: the one that has a URL (first priority), or, if no duplicate has a URL, the one that has a caption (second priority), and delete the rest?

Answer 1: Here's my looping technique. This will probably get voted down for not being mainstream, and I'm cool with that
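A set-based alternative to looping: rank the duplicates of each business so that having a URL beats having a caption, which beats having neither, and keep only the best-ranked row. A sketch with stdlib `sqlite3` (table and column names taken from the question; the tie-break on `rowID` is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE biz
    (rowID INTEGER, long REAL, lat REAL,
     businessName TEXT, url TEXT, caption TEXT)""")
conn.executemany("INSERT INTO biz VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 20, -20, "Pizza Hut", "yum.com", None),   # has a URL -> keep
    (2, 20, -20, "Pizza Hut", None, "tasty"),     # caption only
    (3, 20, -20, "Pizza Hut", None, None),        # neither
])

# For each businessName, find the single rowID that wins the priority
# ordering (url present, then caption present, then lowest rowID) and
# delete everything else.
conn.execute("""
    DELETE FROM biz
    WHERE rowID NOT IN (
        SELECT (SELECT b2.rowID FROM biz b2
                WHERE b2.businessName = b.businessName
                ORDER BY (b2.url IS NOT NULL) DESC,
                         (b2.caption IS NOT NULL) DESC,
                         b2.rowID
                LIMIT 1)
        FROM biz b GROUP BY b.businessName
    )
""")

kept = [r[0] for r in conn.execute("SELECT rowID FROM biz").fetchall()]
```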

SQL find duplicate records occurring within 1 minute of each other

半世苍凉 submitted on 2019-12-05 03:10:15
Question: I am checking website entries that are recorded in a database. Columns: browser, click_type_id, referrer, and datetime. If multiple rows have the same browser, click_type_id, and referrer, and are timestamped within 1 minute of one another, they are considered duplicates. I need a SQL statement that can query for these duplicates based on the above criteria. Any help is appreciated.

Answer 1:

    SELECT T1.browser, T1.click_type, T1.referrer, T1.datetime, T2.datetime
    FROM My_Table T1
    INNER JOIN My
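The truncated answer is a self-join; the complete shape joins the table to itself on the three matching columns plus a sub-minute timestamp difference. A runnable sketch with stdlib `sqlite3` (the table name, sample rows, and the `id < id` trick to avoid reporting each pair twice are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE clicks
    (id INTEGER PRIMARY KEY, browser TEXT, click_type_id INTEGER,
     referrer TEXT, dt TEXT)""")
conn.executemany(
    "INSERT INTO clicks (browser, click_type_id, referrer, dt) VALUES (?,?,?,?)", [
        ("Firefox", 1, "a.com", "2019-12-05 03:10:00"),
        ("Firefox", 1, "a.com", "2019-12-05 03:10:40"),  # 40 s later -> duplicate
        ("Firefox", 1, "a.com", "2019-12-05 03:15:00"),  # > 1 minute -> not
    ])

# Pairs that agree on browser/click_type_id/referrer and are timestamped
# less than 60 seconds apart; t1.id < t2.id reports each pair once.
dups = conn.execute("""
    SELECT t1.id, t2.id
    FROM clicks t1
    JOIN clicks t2
      ON t1.browser = t2.browser
     AND t1.click_type_id = t2.click_type_id
     AND t1.referrer = t2.referrer
     AND t1.id < t2.id
     AND ABS(strftime('%s', t1.dt) - strftime('%s', t2.dt)) < 60
""").fetchall()
```

In MySQL the timestamp condition would be `ABS(TIMESTAMPDIFF(SECOND, t1.dt, t2.dt)) < 60`.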

How to compare 2 lists and merge them in Python/MySQL?

橙三吉。 submitted on 2019-12-04 21:18:25
I want to merge data. Following are my MySQL tables. I want to use Python to traverse through both lists (one with dupe = 'x' and the other with null dupes). This is sample data; the actual data is humongous. For instance:

    a  b  c  d  e  f  key  dupe
    ---------------------------
    1  d  c  f  k  l  1    x
    2  g     h     j  1
    3  i     h  u  u  2
    4  u  r     t     2    x

From the above sample table, the desired output is:

    a  b  c  d  e  f  key  dupe
    ---------------------------
    2  g  c  h  k  j  1
    3  i  r  h  u  u  2

What I have so far:

    import string, os, sys
    import MySQLdb
    from EncryptedFile import EncryptedFile
    enc = EncryptedFile( os.getenv("HOME") + '/.py-encrypted-file
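Once both lists are fetched, the merge itself is a per-column coalesce: for each key, keep the non-dupe row's values and fill its gaps from the dupe row. A pure-Python sketch of that step, using the question's sample data with `None` for the blank cells (the pairing-by-key layout is an assumption; in practice the rows would come from two MySQLdb cursors):

```python
def merge_rows(keeper, dupe):
    """Coalesce column-wise: keeper's value wins, dupe fills the gaps."""
    return tuple(k if k is not None else d for k, d in zip(keeper, dupe))

# key -> (dupe-flagged row, keeper row), columns a..f with None = blank
pairs = {
    1: ((1, "d", "c", "f", "k", "l"), (2, "g", None, "h", None, "j")),
    2: ((4, "u", "r", None, "t", None), (3, "i", None, "h", "u", "u")),
}

merged = {key: merge_rows(keeper, dupe) for key, (dupe, keeper) in pairs.items()}
```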

How to identify duplicate items gathered from multiple feeds and link to them in a Database

谁说胖子不能爱 submitted on 2019-12-04 20:16:31
I have a database storing details of products which are taken from many sites and gathered through the individual sites' APIs. When I call a feed, the details are stored in a database table. The problem I'm having is that, because the exact same product is listed on many sites by the seller, I end up having duplicate items in my database, and then when I display them on a web page there are many duplicates. The problem is that the item doesn't have any obvious unique identifier; it has specific details of the item (of which there could be many), and then a description of the item from the
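One common approach when no shared ID exists is to build a normalized fingerprint from whatever fields are stable across sites and hash it; items with the same fingerprint are treated as one product. A sketch (the field names `title`, `brand`, and `model` are assumptions; real feeds would need fuzzier normalization):

```python
import hashlib
import re

def fingerprint(item):
    """Hash of the item's stable fields, lowercased with punctuation
    and whitespace stripped, so cosmetic differences collapse."""
    parts = [item.get("title", ""), item.get("brand", ""), str(item.get("model", ""))]
    normal = "|".join(re.sub(r"\W+", "", p).lower() for p in parts)
    return hashlib.sha256(normal.encode()).hexdigest()

feed = [
    {"title": "Widget 3000", "brand": "Acme", "model": 7, "site": "a"},
    {"title": "WIDGET-3000", "brand": "acme", "model": 7, "site": "b"},
]

seen, unique = {}, []
for item in feed:
    fp = fingerprint(item)
    if fp not in seen:       # first sighting of this product
        seen[fp] = item
        unique.append(item)
```

Storing the fingerprint in an indexed column lets the duplicate check happen at insert time instead of at display time.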

Is it possible for SQL to find records with duplicates?

故事扮演 submitted on 2019-12-04 08:38:14
Can I use a SQL query to find records where one field is identical in both? That is, can I use the following table and return 1,3 (the IDs) by comparing the name columns (and ignoring the phone)?

    ID | Name | Phone
    1  | Bob  | 5555555555
    2  | John | 1234567890
    3  | Bob  | 1515151515
    4  | Tim  | 5555555555

To get all names that exist more than once you can execute this statement:

    SELECT Name FROM People GROUP BY Name HAVING COUNT(*) > 1;

To get the IDs of the duplicates concatenated as "1,3", use GROUP_CONCAT:

    SELECT GROUP_CONCAT(ID SEPARATOR ',') FROM Table GROUP BY Name HAVING COUNT(*) > 1
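The same pair of queries can be checked end to end with stdlib `sqlite3` (SQLite's `GROUP_CONCAT` defaults to a comma separator, so the MySQL-only `SEPARATOR` clause is dropped):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INTEGER, Name TEXT, Phone TEXT)")
conn.executemany("INSERT INTO People VALUES (?, ?, ?)", [
    (1, "Bob", "5555555555"),
    (2, "John", "1234567890"),
    (3, "Bob", "1515151515"),
    (4, "Tim", "5555555555"),
])

# One row per duplicated name, IDs concatenated with commas.
rows = conn.execute("""
    SELECT GROUP_CONCAT(ID) FROM People
    GROUP BY Name HAVING COUNT(*) > 1
""").fetchall()
```

Only "Bob" appears twice, so the result is the single concatenated string of IDs 1 and 3.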

Finding duplicate in SQL Server Table

此生再无相见时 submitted on 2019-12-04 07:05:42
Question: I have a table

    +--------+--------+--------+--------+--------+
    | Market | Sales1 | Sales2 | Sales3 | Sales4 |
    +--------+--------+--------+--------+--------+
    |     68 |      1 |      2 |      3 |      4 |
    |    630 |      5 |      3 |      7 |      8 |
    |    190 |      9 |     10 |     11 |     12 |
    +--------+--------+--------+--------+--------+

I want to find duplicates across all of the above sales fields. In the example above, markets 68 and 630 share a duplicate sales value, 3. My problem is displaying the markets having duplicate sales.

Answer 1: This problem
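Comparing values across columns usually means unpivoting first: stack the four sales columns into (Market, value) rows with `UNION ALL`, then group by value. A sketch with stdlib `sqlite3` in place of SQL Server (table name assumed from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Sales
    (Market INTEGER, Sales1 INTEGER, Sales2 INTEGER,
     Sales3 INTEGER, Sales4 INTEGER)""")
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)", [
    (68, 1, 2, 3, 4),
    (630, 5, 3, 7, 8),
    (190, 9, 10, 11, 12),
])

# Unpivot the four sales columns, then keep values seen in >1 market.
rows = conn.execute("""
    WITH unpivoted AS (
        SELECT Market, Sales1 AS v FROM Sales
        UNION ALL SELECT Market, Sales2 FROM Sales
        UNION ALL SELECT Market, Sales3 FROM Sales
        UNION ALL SELECT Market, Sales4 FROM Sales
    )
    SELECT v, GROUP_CONCAT(Market) FROM unpivoted
    GROUP BY v HAVING COUNT(DISTINCT Market) > 1
""").fetchall()
```

In SQL Server proper, `CROSS APPLY (VALUES ...)` or `UNPIVOT` replaces the `UNION ALL` chain.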

SQL: Removing Duplicate records - Albeit different kind

≡放荡痞女 submitted on 2019-12-04 05:13:50
Consider the following table:

    TAB6
    A          B          C
    ---------- ---------- -
    1          2          A
    2          1          A
    2          3          C
    3          4          D

I consider the records {1, 2, A} and {2, 1, A} to be duplicates. I need to select and produce the record set below (either one of the duplicate rows may be kept):

    A  B  C        A  B  C
    1  2  A   or   2  1  A
    2  3  C        2  3  C
    3  4  D        3  4  D

I tried the queries below, but to no avail:

    select t1.*
    from t6 t1, t6 t2
    where t1.a <> t2.b
      and t1.b <> t2.a
      and t1.rowid <> t2.rowid
    /

    A          B          C
    ---------- ---------- -
    1          2          A
    2          1          A
    2          1          A
    2          3          C
    3          4          D
    3          4          D

    6 rows selected.

Or even this:

    select * from t6 t1 where exists (select * from t6 t2 where t1.a
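A standard fix for order-insensitive pairs is to canonicalize each row before grouping: `LEAST(a, b)` and `GREATEST(a, b)` (scalar `MIN`/`MAX` in SQLite) make {1, 2} and {2, 1} identical, after which ordinary keep-one-rowid-per-group dedup applies. A runnable sketch with stdlib `sqlite3`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t6 (a INTEGER, b INTEGER, c TEXT)")
conn.executemany("INSERT INTO t6 VALUES (?, ?, ?)",
                 [(1, 2, "A"), (2, 1, "A"), (2, 3, "C"), (3, 4, "D")])

# Group on the sorted pair (MIN(a,b), MAX(a,b)) plus c, so the two
# orderings of a duplicate pair fall in the same group; keep one row.
rows = conn.execute("""
    SELECT a, b, c FROM t6
    WHERE rowid IN (
        SELECT MIN(rowid) FROM t6
        GROUP BY MIN(a, b), MAX(a, b), c
    )
    ORDER BY rowid
""").fetchall()
```

In Oracle (which the `rowid`/`6 rows selected.` output suggests), the grouping keys would be `LEAST(a, b)` and `GREATEST(a, b)`.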

How can I find indices of each row of a matrix which has a duplicate in matlab?

拈花ヽ惹草 submitted on 2019-12-04 04:06:57
I want to find the indices of all the rows of a matrix which have duplicates. For example:

    A = [1 2 3 4
         1 2 3 4
         2 3 4 5
         1 2 3 4
         6 5 4 3]

The vector to be returned would be [1, 2, 4]. A lot of similar questions suggest using the unique function, which I've tried, but the closest I can get to what I want is:

    [C, ia, ic] = unique(A, 'rows')   % ia = [1 3 5]
    m = 5;
    setdiff(1:m, ia)                  % = [2, 4]

But using unique I can only extract the 2nd, 3rd, 4th, etc. instance of a row, and I also need to obtain the first. Is there any way I can do this? NB: it must be a method which doesn't involve looping through the rows, as I'm
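The missing piece is counting occurrences rather than picking first occurrences: any row whose count exceeds one is reported, first instance included. A pure-Python analogue of the task (1-based indices to match MATLAB; in MATLAB itself the same idea is `ic = unique(A, 'rows')` output plus `accumarray`/`histc` on `ic`, vectorized with no explicit row loop):

```python
from collections import Counter

A = [(1, 2, 3, 4),
     (1, 2, 3, 4),
     (2, 3, 4, 5),
     (1, 2, 3, 4),
     (6, 5, 4, 3)]

counts = Counter(A)  # how many times each distinct row appears

# 1-based indices of every row whose content occurs more than once,
# including its first occurrence.
dup_idx = [i + 1 for i, row in enumerate(A) if counts[row] > 1]
```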

Combine two data frames and remove duplicate columns

 ̄綄美尐妖づ submitted on 2019-12-04 00:54:17
Question: I want to cbind two data frames and remove duplicated columns. For example:

    df1 <- data.frame(var1=c('a','b','c'), var2=c(1,2,3))
    df2 <- data.frame(var1=c('a','b','c'), var3=c(2,4,6))
    cbind(df1, df2)   # this creates a data frame in which column var1 is duplicated

I want to create a data frame with columns var1, var2 and var3, in which column var1 is not repeated.

Answer 1: merge will do that work. Try:

    merge(df1, df2)

Answer 2: In case you inherit someone else's dataset and end up with duplicate columns
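What `merge(df1, df2)` does, conceptually, is join on the shared key column so it appears only once. A stdlib Python sketch of that behavior, using the question's data (dict-of-lists standing in for data frames; pandas users would write `df1.merge(df2, on="var1")`):

```python
# Columns-as-lists stand-ins for the two data frames.
df1 = {"var1": ["a", "b", "c"], "var2": [1, 2, 3]}
df2 = {"var1": ["a", "b", "c"], "var3": [2, 4, 6]}

# Join df2's var3 onto df1 by the shared key var1, so var1 appears once.
lookup = dict(zip(df2["var1"], df2["var3"]))
merged = {
    "var1": df1["var1"],
    "var2": df1["var2"],
    "var3": [lookup[k] for k in df1["var1"]],
}
```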