processing-efficiency

'mutate' to add two columns with a single fn-call in tidyverse in R

Deadly, submitted on 2021-01-29 09:21:37
Question: This is an R version 3.4.4 question. A voting function, voteOnBase, takes 2 arguments and returns a 2-element list: the WINNER and the VOTE.COUNT. I want to use it to add those two columns to notVotedYet, a tibble. The following code runs correctly:

    library(tidyverse)
    withVotes <- notVotedYet %>%
      group_by(BASE) %>%
      mutate(WINNER = voteOnBase(BASE, CODES)[[1]],
             VOTE.COUNT = voteOnBase(BASE, CODES)[[2]])

However, it calls voteOnBase twice on the same inputs. How can I eliminate the extra call?
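For comparison, the same compute-once idiom in Python/pandas; a minimal sketch in which vote_on_base is a hypothetical stand-in for the question's voteOnBase, called once per group with both columns filled from the single result:

    import pandas as pd

    def vote_on_base(base, codes):
        # Hypothetical stand-in for voteOnBase: returns (winner, vote_count).
        counts = codes.value_counts()
        return counts.idxmax(), int(counts.max())

    def add_votes(group):
        # One call per group; both columns come from the same result.
        winner, count = vote_on_base(group.name, group["CODES"])
        return group.assign(WINNER=winner, VOTE_COUNT=count)

    not_voted_yet = pd.DataFrame({
        "BASE":  ["b1", "b1", "b2"],
        "CODES": ["x", "x", "y"],
    })
    with_votes = not_voted_yet.groupby("BASE", group_keys=False).apply(add_votes)
    print(with_votes)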

Database efficiency - table per user vs. table of users

那年仲夏, submitted on 2020-12-15 04:58:27
Question: For a website with users, where each user can create any number of what we'll call "posts": efficiency-wise, is it better to create one table for all of the posts, saving for each post the user id of the user who created it, or to create a separate table per user holding just the posts created by that user? Answer 1: The database layout should not change when you add more data to it, so the user data should definitely be in one table. Also: having …
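A minimal sketch of the single-table layout, shown here in Python with sqlite3 (table and column names are illustrative, not from the question):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE posts (
            post_id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(user_id),
            body    TEXT NOT NULL
        );
        -- Index so "all posts by user X" stays fast as the table grows.
        CREATE INDEX idx_posts_user ON posts(user_id);
    """)
    conn.execute("INSERT INTO users (user_id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")
    print(conn.execute("SELECT body FROM posts WHERE user_id = ?", (1,)).fetchall())

With an index on user_id, fetching one user's posts does not get slower just because other users' posts share the table, which is the usual efficiency worry behind table-per-user designs.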

Is setting `turtle.speed(0)` necessary when we have `turtle.tracer(0)`?

若如初见., submitted on 2020-07-22 05:59:39
Question: Is there a difference between:

    import turtle
    turtle.tracer(0)
    turtle.speed(0)
    while True:
        turtle.goto(turtle.xcor() + 1, turtle.ycor())
        turtle.update()

and:

    import turtle
    turtle.tracer(0)
    while True:
        turtle.goto(turtle.xcor() + 1, turtle.ycor())
        turtle.update()

I've heard that setting turtle.speed(0) makes things faster, but if so, I don't see any difference. Answer 1: According to https://www.eg.bucknell.edu/~hyde/Python3/TurtleDirections.html, the tracer() method can be used to accelerate the drawing of complex graphics. Turning …
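A quick way to check for yourself; a minimal sketch that assumes a local display. With tracing off, nothing is drawn until update(), so the per-move animation speed that speed() controls never comes into play:

    import time
    import turtle

    # tracer(0) disables per-move animation; frames render only on update().
    turtle.tracer(0)

    start = time.perf_counter()
    for _ in range(1000):
        turtle.setx(turtle.xcor() + 1)   # move without triggering a redraw
    turtle.update()                       # render once at the end
    print("elapsed:", time.perf_counter() - start)

    turtle.bye()

Timing this loop with and without a preceding turtle.speed(0) call should produce essentially identical numbers.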

Parallel apply function on a df in Python

一笑奈何, submitted on 2020-06-16 20:46:38
Question: I have a function that goes over two lists, items and dates, and returns an updated list of items. For now it runs with apply, which is not that efficient on millions of rows. I want to make it more efficient by parallelizing it. The items in item_list are in chronological order, as are the entries of the corresponding date_list (item_list and date_list are the same size). This is the df:

    Date      item_list     date_list
    12/05/20  [I1,I3,I4]    [10/05/20, 11/05/20, 12/05/20]
    11/05/20  [I1,I3]       [11/05/20, 14/05/20]

…
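A common pattern is to split the frame into contiguous chunks and run the apply in a process pool; a minimal sketch, where update_items is a hypothetical stand-in for the question's per-row logic:

    import numpy as np
    import pandas as pd
    from multiprocessing import Pool

    def update_items(row):
        # Hypothetical stand-in for the question's per-row function:
        # here it just pairs every item with its date.
        return list(zip(row["item_list"], row["date_list"]))

    def apply_chunk(chunk):
        # Runs inside a worker process: a plain pandas apply on one slice.
        return chunk.apply(update_items, axis=1)

    def parallel_apply(df, n_workers=4):
        # Split into contiguous chunks, apply in parallel, reassemble in order.
        bounds = np.linspace(0, len(df), n_workers + 1, dtype=int)
        chunks = [df.iloc[bounds[i]:bounds[i + 1]] for i in range(n_workers)]
        with Pool(n_workers) as pool:
            return pd.concat(pool.map(apply_chunk, chunks))

    if __name__ == "__main__":
        df = pd.DataFrame({
            "Date": ["12/05/20", "11/05/20"],
            "item_list": [["I1", "I3", "I4"], ["I1", "I3"]],
            "date_list": [["10/05/20", "11/05/20", "12/05/20"], ["11/05/20", "14/05/20"]],
        })
        print(parallel_apply(df, n_workers=2))

Contiguous chunks preserve the chronological ordering the question relies on within each slice; the pool pays a one-time cost to pickle each chunk to its worker, so this wins only when the per-row work outweighs that overhead.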

Make LEFT JOIN query more efficient

。_饼干妹妹, submitted on 2020-01-16 18:12:48
Question: The following query with LEFT JOIN is consuming too much memory (~4 GB), but the host only allows about 120 MB for this process.

    SELECT grades.grade, grades.evaluation_id,
           evaluations.evaluation_name, evaluations.value, evaluations.maximum
    FROM grades
    LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id
    WHERE grades.registrar_id = ?

Create table syntax for grades:

    CREATE TABLE `grades` (
      `grade_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `evaluation_id` int(10) unsigned …
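When memory rather than speed is the constraint, one common fix is to stream rows instead of materializing the whole result set at once (in MySQL terms, an unbuffered cursor; an index on grades.registrar_id helps the speed side). A minimal sketch of the streaming idea in Python, using sqlite3 as a stand-in and assuming a school.db file that contains the question's two tables:

    import sqlite3

    conn = sqlite3.connect("school.db")
    cur = conn.execute("""
        SELECT grades.grade, grades.evaluation_id,
               evaluations.evaluation_name, evaluations.value, evaluations.maximum
        FROM grades
        LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id
        WHERE grades.registrar_id = ?
    """, (42,))

    # Iterating the cursor pulls rows incrementally instead of calling
    # fetchall(), so memory use stays flat regardless of the row count.
    for grade, evaluation_id, name, value, maximum in cur:
        pass  # handle one row at a time here

    conn.close()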

Equal loading for parallel task distribution

守給你的承諾、, submitted on 2020-01-07 08:19:08
Question: I have a large number of independent tasks I would like to run, and I would like to distribute them on a parallel system such that each processor does the same amount of work, maximizing my efficiency. I would like to know if there is a general approach to this problem, or possibly just a good solution to my exact case. I have T = 150 tasks to run, and the time taken by task i is t = i. That is, task 1 takes 1 unit of time, task 2 takes 2 units of time, …
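This is the classic multiprocessor scheduling (makespan minimization) problem, and the standard greedy answer is the longest-processing-time (LPT) heuristic: sort tasks by decreasing cost and always give the next one to the least-loaded processor. A minimal sketch using the question's costs t = i; the 8-processor count is an assumption for illustration:

    import heapq

    def lpt_schedule(times, n_procs):
        """Greedy LPT: assign each task (longest first) to the least-loaded processor."""
        heap = [(0, p) for p in range(n_procs)]        # (current_load, processor)
        assignment = [[] for _ in range(n_procs)]
        for task, t in sorted(enumerate(times, start=1), key=lambda kv: -kv[1]):
            load, p = heapq.heappop(heap)              # least-loaded processor
            assignment[p].append(task)
            heapq.heappush(heap, (load + t, p))
        return assignment, max(load for load, _ in heap)

    # Task i takes i units of time, i = 1..150, spread over 8 processors.
    times = list(range(1, 151))
    assignment, makespan = lpt_schedule(times, 8)
    print("makespan:", makespan)   # ideal lower bound is sum(times)/8 = 1415.625

For costs as regular as 1..150 this lands essentially on the ideal balance; LPT is guaranteed to be within 4/3 of the optimal makespan in general.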

Circular buffer in MATLAB, **without** copying old data

夙愿已清, submitted on 2020-01-03 17:17:06
Question: There are some good posts on here (such as this one) on how to make a circular buffer in MATLAB. However, from looking at them, I do not believe they fit my application, because what I am seeking is a circular-buffer solution in MATLAB that does NOT involve any copying of old data. To use a simple example, let us say that I am processing 50 samples at a time, and I read in 10 samples each iteration. I would first run through 5 iterations, fill up my buffer, and in the end, process my 50 samples …
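The usual no-copy trick is a fixed array plus a moving write index with modular wrap-around, so old samples are overwritten in place rather than shifted. The question asks for MATLAB; this Python/NumPy sketch only illustrates the index arithmetic, which carries over directly:

    import numpy as np

    class RingBuffer:
        """Fixed-size circular buffer: writes wrap around, nothing is ever shifted."""
        def __init__(self, capacity):
            self.buf = np.zeros(capacity)
            self.head = 0       # next write position
            self.count = 0      # how many valid samples are stored

        def write(self, samples):
            n = len(samples)
            idx = (self.head + np.arange(n)) % len(self.buf)
            self.buf[idx] = samples          # in place; old data is never moved
            self.head = (self.head + n) % len(self.buf)
            self.count = min(self.count + n, len(self.buf))

        def read_all(self):
            # Valid samples in chronological order; only this read makes a
            # contiguous copy, the writes themselves never do.
            start = (self.head - self.count) % len(self.buf)
            idx = (start + np.arange(self.count)) % len(self.buf)
            return self.buf[idx]

    rb = RingBuffer(50)
    for i in range(5):
        rb.write(np.arange(10) + 10 * i)   # 10 new samples per iteration
    print(rb.read_all())                    # samples 0..49, buffer now full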

Is there a faster method to convert bitmap pixels to greyscale?

自古美人都是妖i, submitted on 2020-01-02 07:06:14
Question: At the moment, I am using the SetPixel() method to change the colour of every pixel in a bitmap. This works fine on images with small dimensions, but on large images it takes a while. I haven't worked with images in VB.Net before, so I might just be overlooking something obvious. I'm doing this to make a program which converts an image to greyscale. This produces the right result but at a low speed, and during this time the UI freezes, so I'm keen to maximize the …
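The question is about VB.Net, where the usual fix is Bitmap.LockBits (direct access to the pixel buffer) instead of one SetPixel call per pixel. The underlying principle, replacing per-pixel calls with a single bulk operation, looks like this as a Python/NumPy sketch; "photo.png" is a placeholder filename:

    import numpy as np
    from PIL import Image

    # Bulk conversion: one vectorized pass instead of a call per pixel.
    img = np.asarray(Image.open("photo.png").convert("RGB"), dtype=np.float32)
    grey = img @ np.array([0.299, 0.587, 0.114], dtype=np.float32)  # BT.601 luma weights
    Image.fromarray(grey.astype(np.uint8)).save("photo_grey.png")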

The most reliable and efficient UDP packet size?

浪尽此生, submitted on 2019-12-29 03:27:25
Question: Would sending lots of small packets by UDP take more resources (CPU, compression by zlib, etc.)? I read here that sending one big packet of ~65 kBytes by UDP would probably fail, so I thought that sending lots of smaller packets would succeed more often, but then comes the computational overhead of using more processing power (or at least that's what I'm assuming). The question is basically this: what is the best scenario for sending the maximum number of successful packets while keeping computation down?
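The usual compromise is to keep each datagram's payload at or below the path MTU so the IP layer never fragments it; fragmentation, not CPU cost, is the main reason very large UDP datagrams get dropped. A minimal sketch: 1472 assumes the common 1500-byte Ethernet MTU minus 20 bytes of IPv4 header and 8 bytes of UDP header, and the address is a placeholder:

    import socket

    # 1500 (Ethernet MTU) - 20 (IPv4 header) - 8 (UDP header) = 1472 payload bytes.
    MAX_PAYLOAD = 1472

    def send_in_chunks(data, addr=("127.0.0.1", 9999)):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for i in range(0, len(data), MAX_PAYLOAD):
            sock.sendto(data[i:i + MAX_PAYLOAD], addr)
        sock.close()

    send_in_chunks(b"x" * 100_000)   # ~100 kB split into 68 datagrams

Fewer, MTU-sized datagrams keep the per-packet syscall and header overhead low while avoiding the reliability cliff of fragmented jumbo datagrams; anything much smaller just multiplies the per-packet cost for no reliability gain.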