set-difference

Perform non-pairwise all-to-all comparisons between two unordered character vectors — The opposite of intersect — all-to-all setdiff

若如初见. Submitted on 2019-12-02 01:03:38
Question: EXAMPLE DATA v1 <- c("E82391", "X2329323", "C239923", "E1211", "N23932", "F93249232", "X93201", "X9023111", "O92311", "9000F", "K9232932", "L9232932", "X02311111") v2 <- c("L9232932", "C239923", "E1211", "E82391", "F93249232", "U82832") PROBLEM I want to extract only those items that are in one of the vectors and not in the other. I understand that setdiff is unable to compare two unordered character vectors and find all the differences between the two. Does, for example, %in% perform all-to
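As a point of comparison, here is a minimal Python sketch of the two one-sided differences the question asks for, using the same example data. The helper name `ordered_difference` is made up for illustration; it keeps the items in their original order and does not require either input to be sorted.

```python
# Example data copied from the question (R vectors rewritten as Python lists).
v1 = ["E82391", "X2329323", "C239923", "E1211", "N23932", "F93249232",
      "X93201", "X9023111", "O92311", "9000F", "K9232932", "L9232932",
      "X02311111"]
v2 = ["L9232932", "C239923", "E1211", "E82391", "F93249232", "U82832"]


def ordered_difference(left, right):
    """Return items of `left` that do not occur anywhere in `right`.

    Hypothetical helper: every item of `left` is checked against all of
    `right`, so the order of either input does not matter.
    """
    right_set = set(right)  # set lookup makes each membership test O(1)
    return [item for item in left if item not in right_set]


only_in_v1 = ordered_difference(v1, v2)
only_in_v2 = ordered_difference(v2, v1)
# Items that are in exactly one of the two vectors.
in_one_only = only_in_v1 + only_in_v2
```

Concatenating the two one-sided results gives the items that appear in exactly one vector, which is what the question describes as the opposite of intersect.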

All-to-all setdiff on two numeric vectors with a numeric threshold for accepting matches

狂风中的少年 Submitted on 2019-12-01 18:51:53
What I want to do is more or less a combination of the problems discussed in the following two threads: "Perform non-pairwise all-to-all comparisons between two unordered character vectors --- The opposite of intersect --- all-to-all setdiff" and "Merge data frames based on numeric rownames within a chosen threshold and keeping unmatched rows as well". I have two numeric vectors: b_1 <- c(543.4591, 489.36325, 12.03, 896.158, 1002.5698, 301.569) b_2 <- c(22.12, 53, 12.02, 543.4891, 5666.31, 100.1, 896.131, 489.37) I want to compare all elements in b_1 against all elements in b_2 and vice versa. If
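The snippet is cut off before the matching rule is stated, so the following Python sketch rests on assumptions: two values count as a match when their absolute difference is at most a tolerance `tol`, and the value 0.05 is purely hypothetical. The helper name `unmatched` is also illustrative.

```python
def unmatched(values, others, tol):
    """Return items of `values` with no item of `others` within `tol`.

    Assumed matching rule: |v - o| <= tol counts as a match.
    Every value is compared against every other value (all-to-all).
    """
    return [v for v in values if not any(abs(v - o) <= tol for o in others)]


# Vectors copied from the question.
b_1 = [543.4591, 489.36325, 12.03, 896.158, 1002.5698, 301.569]
b_2 = [22.12, 53, 12.02, 543.4891, 5666.31, 100.1, 896.131, 489.37]

tol = 0.05  # hypothetical threshold; the original value is not shown

only_in_b_1 = unmatched(b_1, b_2, tol)
only_in_b_2 = unmatched(b_2, b_1, tol)
```

This brute-force version does one comparison per pair of values. For long vectors, sorting one side and using binary search would reduce the work, at the cost of more code.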

How do I delete the intersection of sets A and B from A without sorting in MATLAB?

馋奶兔 Submitted on 2019-12-01 16:21:07
Question: Two matrices, A and B: A = [1 2 3; 9 7 5; 4 9 4; 1 4 7] B = [1 2 3; 1 4 7] All rows of matrix B are members of matrix A. I wish to delete the common rows of A and B from A without sorting. I have tried setdiff() but this sorts the output. For my particular problem (atomic coordinates in protein structures) maintaining the original order of the rows is important. Answer 1: Use ISMEMBER: %# find rows in A that are also in B commonRows = ismember(A,B,'rows'); %# remove those rows A(commonRows,:) = [];
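The same idea as the ISMEMBER answer (build a membership mask, then drop the flagged rows) can be sketched in Python. This assumes the flattened matrices in the question are 3-column matrices, which is the layout under which every row of B appears in A; the variable names are illustrative.

```python
# Rows stored as tuples so they can be placed in a set.
# Assumed layout: A is 4x3 and B is 2x3.
A = [(1, 2, 3), (9, 7, 5), (4, 9, 4), (1, 4, 7)]
B = [(1, 2, 3), (1, 4, 7)]

rows_to_drop = set(B)

# Keep rows of A that are not in B; a list comprehension walks A front to
# back, so the surviving rows stay in their original order (no sorting).
remaining = [row for row in A if row not in rows_to_drop]
```

Here `remaining` holds the rows (9, 7, 5) and (4, 9, 4), in the order they had in A.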

bash, Linux: Set difference between two text files

点点圈 Submitted on 2019-11-29 20:23:31
I have two files, A (nodes_to_delete) and B (nodes_to_keep). Each file has many lines with numeric ids. I want the list of numeric ids that are in nodes_to_delete but NOT in nodes_to_keep. Doing it within a PostgreSQL database is unreasonably slow. Any neat way to do it in bash using Linux CLI tools? UPDATE: This would seem to be a Pythonic job, but the files are really, really large. I have solved some similar problems using uniq, sort and some set theory techniques. This was about two or three orders of magnitude faster than the database equivalents. The comm command does
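For reference, the Python route mentioned in the update could look like the sketch below. It is not the bash answer, and it assumes the ids in nodes_to_keep fit in memory as a set while nodes_to_delete is streamed line by line. The function name is hypothetical, and `io.StringIO` objects stand in for the real files, which could be passed as objects returned by `open()` instead.

```python
import io


def ids_only_in_first(delete_lines, keep_lines):
    """Yield ids that occur in `delete_lines` but not in `keep_lines`.

    Both arguments are iterables of text lines (open files work).
    Only the keep side is held in memory; the delete side is streamed.
    """
    keep = {line.strip() for line in keep_lines}
    for line in delete_lines:
        node_id = line.strip()
        if node_id and node_id not in keep:
            yield node_id


# Small in-memory stand-ins for the two files (made-up ids).
nodes_to_delete = io.StringIO("101\n102\n103\n104\n")
nodes_to_keep = io.StringIO("102\n104\n999\n")

result = list(ids_only_in_first(nodes_to_delete, nodes_to_keep))
```

Unlike a sort-based pipeline, this keeps the ids in the order they appear in the first file and does not need sorted input.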

c++ STL set difference

孤街浪徒 Submitted on 2019-11-29 20:23:21
Does the C++ STL set data structure have a set difference operator? Yes there is, it is in <algorithm> and is called std::set_difference. The usage is: #include <algorithm> #include <set> #include <iterator> // ... std::set<int> s1, s2; // Fill in s1 and s2 with values std::set<int> result; std::set_difference(s1.begin(), s1.end(), s2.begin(), s2.end(), std::inserter(result, result.end())); In the end, the set result will contain s1 - s2. Yes, there is a set_difference function in the algorithm header. Edits: FYI, the set data structure is able to efficiently use that algorithm, as
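std::set_difference requires both input ranges to be sorted, which a std::set already guarantees when iterated. The Python sketch below illustrates the kind of single-pass merge such an algorithm can use on sorted input; it is an illustration of the idea, not the library's actual implementation, and the function name is made up.

```python
def sorted_difference(a, b):
    """Return the elements of sorted list `a` that are not in sorted list `b`.

    Walks both lists once with two indices, so the cost is linear in
    len(a) + len(b). Both inputs must already be sorted ascending.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            # a[i] cannot appear later in b, so it belongs to the result.
            result.append(a[i])
            i += 1
        elif b[j] < a[i]:
            # b[j] is smaller than everything left in a; skip it.
            j += 1
        else:
            # Equal elements cancel out; advance both sides.
            i += 1
            j += 1
    # Whatever remains in a has no counterpart left in b.
    result.extend(a[i:])
    return result


s1 = [1, 2, 3, 4, 5]
s2 = [2, 4, 6]
difference = sorted_difference(s1, s2)
```

With these inputs, `difference` contains 1, 3 and 5.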

c++ STL set difference

人盡茶涼 Submitted on 2019-11-28 16:25:00
Question: Does the C++ STL set data structure have a set difference operator? Answer 1: Yes there is, it is in <algorithm> and is called std::set_difference. The usage is: #include <algorithm> #include <set> #include <iterator> // ... std::set<int> s1, s2; // Fill in s1 and s2 with values std::set<int> result; std::set_difference(s1.begin(), s1.end(), s2.begin(), s2.end(), std::inserter(result, result.end())); In the end, the set result will contain s1 - s2. Answer 2: Yes, there is a set_difference function

bash, Linux: Set difference between two text files

主宰稳场 Submitted on 2019-11-28 16:24:57
Question: I have two files, A (nodes_to_delete) and B (nodes_to_keep). Each file has many lines with numeric ids. I want the list of numeric ids that are in nodes_to_delete but NOT in nodes_to_keep. Doing it within a PostgreSQL database is unreasonably slow. Any neat way to do it in bash using Linux CLI tools? UPDATE: This would seem to be a Pythonic job, but the files are really, really large. I have solved some similar problems using uniq, sort and some set theory techniques. This

MySQL: difference of two result sets

时光总嘲笑我的痴心妄想 Submitted on 2019-11-28 05:15:59
How can I get the set difference of two result sets? Say I have two result sets (just one column in each): result1: 'a' 'b' 'c' result2: 'b' 'c' I want to subtract result2 from result1 (result1 - result2) so that the result is: 'a' rjh: To perform result1 - result2, you can join result1 with result2, and only output items that exist in result1. For example: SELECT DISTINCT result1.column FROM result1 LEFT JOIN result2 ON result1.column = result2.column WHERE result2.column IS NULL Note that this is not a set difference, and won't output items in result2 that don't
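The LEFT JOIN ... IS NULL pattern from the answer can be tried out with Python's built-in sqlite3 module. This sketch runs against an in-memory SQLite database rather than MySQL, and the column name `val` is a hypothetical stand-in for the answer's `column` placeholder.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL result sets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result1 (val TEXT)")
conn.execute("CREATE TABLE result2 (val TEXT)")
conn.executemany("INSERT INTO result1 VALUES (?)", [("a",), ("b",), ("c",)])
conn.executemany("INSERT INTO result2 VALUES (?)", [("b",), ("c",)])

# Rows of result1 with no partner in result2 come out of the LEFT JOIN
# with result2.val set to NULL, and the WHERE clause keeps only those.
rows = conn.execute(
    """
    SELECT DISTINCT result1.val
    FROM result1
    LEFT JOIN result2 ON result1.val = result2.val
    WHERE result2.val IS NULL
    """
).fetchall()

difference = [row[0] for row in rows]
conn.close()
```

With the example data from the question, `difference` contains only 'a'.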

Find the set difference between two large arrays (matrices) in Python

时光怂恿深爱的人放手 Submitted on 2019-11-28 00:56:00
I have two large 2-d arrays and I'd like to find their set difference taking their rows as elements. In MATLAB, the code for this would be setdiff(A,B,'rows'). The arrays are large enough that the obvious looping methods I could think of take too long. This should work, but is currently broken in NumPy 1.6.1 due to an unavailable mergesort for the view being created. It works in the pre-release 1.7.0 version. This should be the fastest way possible, since the views don't have to copy any memory: >>> import numpy as np >>> a1 = np.array([[1,2,3],[4,5,6],[7,8,9]]) >>> a2 = np.array([[4,5,6],[7,8,9],
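The answer's code is cut off, so the following is only a sketch of one way to apply the view idea it describes, assuming a recent NumPy is installed. Each row is reinterpreted as a single structured element so that `np.setdiff1d` can compare whole rows. The function name `setdiff_rows` and the last row of `a2` are made up for illustration.

```python
import numpy as np


def setdiff_rows(a, b):
    """Return the rows of `a` that are not rows of `b` (sorted, unique).

    Hypothetical helper. Both arrays must be 2-d with the same dtype and
    the same number of columns.
    """
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    ncols = a.shape[1]
    # One unnamed field per column: each row becomes one structured item.
    row_dtype = [("", a.dtype)] * ncols
    a_rows = a.view(row_dtype)
    b_rows = b.view(row_dtype)
    diff = np.setdiff1d(a_rows, b_rows)
    # Convert the structured result back into an ordinary 2-d array.
    return diff.view(a.dtype).reshape(-1, ncols)


a1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
a2 = np.array([[4, 5, 6], [7, 8, 9], [10, 11, 12]])  # last row is made up
rows_only_in_a1 = setdiff_rows(a1, a2)
```

Like MATLAB's setdiff, `np.setdiff1d` returns sorted, unique results, so the original row order is not preserved.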

Exclude characters from a character class

喜欢而已 Submitted on 2019-11-27 05:16:27
Is there a simple way to match all characters in a class except a certain set of them? For example, in a language where I can use \w to match the set of all Unicode word characters, is there a way to exclude just one character, such as an underscore "_", from that match? The only idea that came to mind was to use negative lookahead/lookbehind around each character, but that seems more complex than necessary when I effectively just want to match a character against a positive match AND a negative match. For example, if & was an AND operator I could do this... ^(\w&[^_])+$ Martin Ender: It really depends on your
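The answer is cut off, so the following is not necessarily what it goes on to recommend. One widely used approach that works in Python's `re` module is a double negative: `[^\W_]` matches any character that is neither a non-word character nor an underscore, which amounts to `\w` minus `_`. The function name below is illustrative.

```python
import re

# [^\W_] = "not (non-word or underscore)" = word character except "_".
# For str patterns in Python 3, \w and \W are Unicode-aware by default.
WORD_NO_UNDERSCORE = re.compile(r"[^\W_]+")


def is_word_without_underscore(text):
    """Return True if `text` is non-empty and every character is a word
    character other than the underscore."""
    return WORD_NO_UNDERSCORE.fullmatch(text) is not None
```

For example, `is_word_without_underscore("abc123")` is true, while `is_word_without_underscore("abc_123")` is false because of the underscore.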