stream-compaction

Thrust: Removing duplicates in key-value arrays

守給你的承諾、 submitted on 2020-01-01 03:42:33
Question: I have a pair of arrays of equal size, which I will call keys and values. For example:

K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67

The keys are sorted, and the values associated with each key are sorted. How do I remove the duplicate values within each key, together with their corresponding keys? That is, I want to compact the above to:

1: 99
1: 100
1: 103
2: 103 <-- This should remain, since the key is different
2: 105
3: 45
3: 67

I looked at the stream compaction functions

Thrust: Removing duplicates in key-value arrays

情到浓时终转凉″ submitted on 2019-12-03 08:56:15
I have a pair of arrays of equal size, which I will call keys and values. For example:

K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67

The keys are sorted, and the values associated with each key are sorted. How do I remove the duplicate values within each key, together with their corresponding keys? That is, I want to compact the above to:

1: 99
1: 100
1: 103
2: 103 <-- This should remain, since the key is different
2: 105
3: 45
3: 67

I looked at the stream compaction functions available in Thrust, but was not able to find anything which does this. Is this possible with Thrust? Or do I
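To make the requested result concrete, here is a minimal CPU sketch in standard C++ (not Thrust code). It keeps element i only when its (key, value) pair differs from the previous pair, which is enough here because both arrays are sorted as the question states. The function name unique_pairs is hypothetical. In Thrust, the same idea could presumably be expressed by zipping the two arrays with thrust::make_zip_iterator and applying thrust::unique to the zipped range, but that variant is not shown or tested here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Thrust): drops every (key, value) pair
// that equals the pair immediately before it, keeping the first occurrence.
// Assumes both arrays have the same length and that equal pairs are adjacent,
// which holds when keys are sorted and values are sorted within each key.
std::pair<std::vector<int>, std::vector<int>>
unique_pairs(const std::vector<int>& keys, const std::vector<int>& values) {
    assert(keys.size() == values.size());
    std::vector<int> out_keys;
    std::vector<int> out_values;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        // Keep element i if it is the first element, or if it differs from
        // its predecessor in either the key or the value.
        const bool keep = (i == 0) ||
                          keys[i] != keys[i - 1] ||
                          values[i] != values[i - 1];
        if (keep) {
            out_keys.push_back(keys[i]);
            out_values.push_back(values[i]);
        }
    }
    return {out_keys, out_values};
}
```

Called on the arrays from the question, this should leave seven pairs, with 2: 103 retained because its key differs from the preceding 1: 103.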

efficient way to convert scatter indices into gather indices?

非 Y 不嫁゛ submitted on 2019-11-28 12:13:04
I'm trying to write a stream compaction (take an array and get rid of empty elements) with SIMD intrinsics. Each iteration of the loop processes 8 elements at a time (the SIMD width). With SSE intrinsics, I can do this fairly efficiently with _mm_shuffle_epi8(), which does a 16-entry table lookup (a gather, in parallel computing terminology). The shuffle indices are precomputed and looked up with a bit mask.

    for (i = 0; i < n; i += 8) {
        v8n_Data = _mm_load_si128(&data[i]);
        mask = _mm_movemask_epi8(&is_valid[i]) & 0xff; // is_valid is byte array
        v8n_Compacted = _mm_shuffle_epi8(v16n_ShuffleIndices
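The precomputed shuffle indices mentioned in the question can be illustrated with a scalar sketch. The code below derives gather indices from an 8-bit validity mask: output slot j reads from the position of the j-th set bit. All names (kUnused, gather_indices_for_mask, compact8) are hypothetical, and the loop in compact8 only emulates the table lookup that a shuffle would do in one instruction. It works on element indices; for 16-bit lanes, each element index would still need to be expanded into two byte indices before use with _mm_shuffle_epi8.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical marker for unused output slots. The high bit is set, which
// is the bit _mm_shuffle_epi8 interprets as "write zero to this byte".
constexpr std::uint8_t kUnused = 0x80;

// For an 8-bit validity mask, returns gather indices: slot j holds the
// position of the j-th set bit; remaining slots are marked kUnused.
std::array<std::uint8_t, 8> gather_indices_for_mask(unsigned mask) {
    std::array<std::uint8_t, 8> idx;
    idx.fill(kUnused);
    int out = 0;
    for (int i = 0; i < 8; ++i) {
        if (mask & (1u << i)) {
            idx[out++] = static_cast<std::uint8_t>(i);
        }
    }
    return idx;
}

// Scalar emulation of the gather step: out[j] = in[idx[j]] for every used
// slot. Returns the number of elements kept.
int compact8(const int* in, unsigned mask, int* out) {
    const std::array<std::uint8_t, 8> idx = gather_indices_for_mask(mask);
    int count = 0;
    for (int j = 0; j < 8 && idx[j] != kUnused; ++j) {
        out[count++] = in[idx[j]];
    }
    return count;
}
```

In a real kernel, gather_indices_for_mask would typically be evaluated once for all 256 masks and stored as a lookup table, so that the per-iteration cost is a single table load.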

efficient way to convert scatter indices into gather indices?

巧了我就是萌 submitted on 2019-11-27 06:50:35
Question: I'm trying to write a stream compaction (take an array and get rid of empty elements) with SIMD intrinsics. Each iteration of the loop processes 8 elements at a time (the SIMD width). With SSE intrinsics, I can do this fairly efficiently with _mm_shuffle_epi8(), which does a 16-entry table lookup (a gather, in parallel computing terminology). The shuffle indices are precomputed and looked up with a bit mask.

    for (i = 0; i < n; i += 8) {
        v8n_Data = _mm_load_si128(&data[i]);
        mask = _mm_movemask_epi8(