processing-efficiency

'mutate' to add two columns with a single fn-call in tidyverse in R

Deadly, submitted on 2021-01-29 09:21:37
Question: This is an R version 3.4.4 question. A voting function, voteOnBase, takes 2 arguments and returns a 2-element list: the WINNER and the VOTE.COUNT. I want to use it to add those two columns to notVotedYet, a tibble. The following code runs correctly:

    library(tidyverse)
    withVotes <- notVotedYet %>%
      group_by(BASE) %>%
      mutate(WINNER = voteOnBase(BASE, CODES)[[1]],
             VOTE.COUNT = voteOnBase(BASE, CODES)[[2]])

However, it calls voteOnBase twice on the same inputs. How can I eliminate the extra call?
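For comparison, the same compute-once idiom in Python/pandas; a minimal sketch in which vote_on_base is a hypothetical stand-in for the question's voteOnBase, called once per group with both columns filled from the single result:

    import pandas as pd

    def vote_on_base(base, codes):
        # Hypothetical stand-in for voteOnBase: returns (winner, vote_count).
        counts = codes.value_counts()
        return counts.idxmax(), int(counts.max())

    def add_votes(group):
        # One call per group; both columns come from the same result.
        winner, count = vote_on_base(group.name, group["CODES"])
        return group.assign(WINNER=winner, VOTE_COUNT=count)

    not_voted_yet = pd.DataFrame({
        "BASE":  ["b1", "b1", "b2"],
        "CODES": ["x", "x", "y"],
    })
    with_votes = not_voted_yet.groupby("BASE", group_keys=False).apply(add_votes)
    print(with_votes)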

Database efficiency - table per user vs. table of users

那年仲夏, submitted on 2020-12-15 04:58:27
Question: For a website with users, where each user can create any number of what we'll call "posts": efficiency-wise, is it better to create one table for all of the posts, saving for each post the user id of the user who created it, or to create a separate table per user holding just the posts created by that user? Answer 1: The database layout should not change when you add more data to it, so the user data should definitely be in one table. Also: having …
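A minimal sketch of the single-table layout, shown here in Python with sqlite3 (table and column names are illustrative, not from the question):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE posts (
            post_id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(user_id),
            body    TEXT NOT NULL
        );
        -- Index so "all posts by user X" stays fast as the table grows.
        CREATE INDEX idx_posts_user ON posts(user_id);
    """)
    conn.execute("INSERT INTO users (user_id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")
    print(conn.execute("SELECT body FROM posts WHERE user_id = ?", (1,)).fetchall())

With an index on user_id, fetching one user's posts does not get slower just because other users' posts share the table, which is the usual efficiency worry behind table-per-user designs.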

Is setting `turtle.speed(0)` necessary when we have `turtle.tracer(0)`?

若如初见., submitted on 2020-07-22 05:59:39
Question: Is there a difference between:

    import turtle
    turtle.tracer(0)
    turtle.speed(0)
    while True:
        turtle.goto(turtle.xcor() + 1, turtle.ycor())
        turtle.update()

and:

    import turtle
    turtle.tracer(0)
    while True:
        turtle.goto(turtle.xcor() + 1, turtle.ycor())
        turtle.update()

I've heard that setting turtle.speed(0) makes things faster, but if so, I don't see any difference. Answer 1: According to https://www.eg.bucknell.edu/~hyde/Python3/TurtleDirections.html, the tracer() method can be used to accelerate the drawing of complex graphics. Turning …
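A quick way to check for yourself; a minimal sketch that assumes a local display. With tracing off, nothing is drawn until update(), so the per-move animation speed that speed() controls never comes into play:

    import time
    import turtle

    # tracer(0) disables per-move animation; frames render only on update().
    turtle.tracer(0)

    start = time.perf_counter()
    for _ in range(1000):
        turtle.setx(turtle.xcor() + 1)   # move without triggering a redraw
    turtle.update()                       # render once at the end
    print("elapsed:", time.perf_counter() - start)

    turtle.bye()

Timing this loop with and without a preceding turtle.speed(0) call should produce essentially identical numbers.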

Parallel apply function on a df in Python

一笑奈何, submitted on 2020-06-16 20:46:38
Question: I have a function that goes over two lists, items and dates, and returns an updated list of items. For now it runs with apply, which is not that efficient on millions of rows. I want to make it more efficient by parallelizing it. The items in item_list are in chronological order, as are the entries of the corresponding date_list (item_list and date_list are the same size). This is the df:

    Date      item_list     date_list
    12/05/20  [I1,I3,I4]    [10/05/20, 11/05/20, 12/05/20]
    11/05/20  [I1,I3]       [11/05/20, 14/05/20]

…
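A common pattern is to split the frame into contiguous chunks and run the apply in a process pool; a minimal sketch, where update_items is a hypothetical stand-in for the question's per-row logic:

    import numpy as np
    import pandas as pd
    from multiprocessing import Pool

    def update_items(row):
        # Hypothetical stand-in for the question's per-row function:
        # here it just pairs every item with its date.
        return list(zip(row["item_list"], row["date_list"]))

    def apply_chunk(chunk):
        # Runs inside a worker process: a plain pandas apply on one slice.
        return chunk.apply(update_items, axis=1)

    def parallel_apply(df, n_workers=4):
        # Split into contiguous chunks, apply in parallel, reassemble in order.
        bounds = np.linspace(0, len(df), n_workers + 1, dtype=int)
        chunks = [df.iloc[bounds[i]:bounds[i + 1]] for i in range(n_workers)]
        with Pool(n_workers) as pool:
            return pd.concat(pool.map(apply_chunk, chunks))

    if __name__ == "__main__":
        df = pd.DataFrame({
            "Date": ["12/05/20", "11/05/20"],
            "item_list": [["I1", "I3", "I4"], ["I1", "I3"]],
            "date_list": [["10/05/20", "11/05/20", "12/05/20"], ["11/05/20", "14/05/20"]],
        })
        print(parallel_apply(df, n_workers=2))

Contiguous chunks preserve the chronological ordering the question relies on within each slice; the pool pays a one-time cost to pickle each chunk to its worker, so this wins only when the per-row work outweighs that overhead.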

Make LEFT JOIN query more efficient

。_饼干妹妹, submitted on 2020-01-16 18:12:48
Question: The following query with LEFT JOIN is consuming too much memory (~4 GB), but the host only allows about 120 MB for this process.

    SELECT grades.grade, grades.evaluation_id,
           evaluations.evaluation_name, evaluations.value, evaluations.maximum
    FROM grades
    LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id
    WHERE grades.registrar_id = ?

Create table syntax for grades:

    CREATE TABLE `grades` (
      `grade_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `evaluation_id` int(10) unsigned …
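When memory rather than speed is the constraint, one common fix is to stream rows instead of materializing the whole result set at once (in MySQL terms, an unbuffered cursor; an index on grades.registrar_id helps the speed side). A minimal sketch of the streaming idea in Python, using sqlite3 as a stand-in and assuming a school.db file that contains the question's two tables:

    import sqlite3

    conn = sqlite3.connect("school.db")
    cur = conn.execute("""
        SELECT grades.grade, grades.evaluation_id,
               evaluations.evaluation_name, evaluations.value, evaluations.maximum
        FROM grades
        LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id
        WHERE grades.registrar_id = ?
    """, (42,))

    # Iterating the cursor pulls rows incrementally instead of calling
    # fetchall(), so memory use stays flat regardless of the row count.
    for grade, evaluation_id, name, value, maximum in cur:
        pass  # handle one row at a time here

    conn.close()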

Equal loading for parallel task distribution

守給你的承諾、, submitted on 2020-01-07 08:19:08
Question: I have a large number of independent tasks I would like to run, and I would like to distribute them on a parallel system such that each processor does the same amount of work, maximizing my efficiency. I would like to know if there is a general approach to this problem, or possibly just a good solution to my exact case. I have T = 150 tasks to run, and the time taken by task i is t = i. That is, task 1 takes 1 unit of time, task 2 takes 2 units of time, …
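This is the classic multiprocessor scheduling (makespan minimization) problem, and the standard greedy answer is the longest-processing-time (LPT) heuristic: sort tasks by decreasing cost and always give the next one to the least-loaded processor. A minimal sketch using the question's costs t = i; the 8-processor count is an assumption for illustration:

    import heapq

    def lpt_schedule(times, n_procs):
        """Greedy LPT: assign each task (longest first) to the least-loaded processor."""
        heap = [(0, p) for p in range(n_procs)]        # (current_load, processor)
        assignment = [[] for _ in range(n_procs)]
        for task, t in sorted(enumerate(times, start=1), key=lambda kv: -kv[1]):
            load, p = heapq.heappop(heap)              # least-loaded processor
            assignment[p].append(task)
            heapq.heappush(heap, (load + t, p))
        return assignment, max(load for load, _ in heap)

    # Task i takes i units of time, i = 1..150, spread over 8 processors.
    times = list(range(1, 151))
    assignment, makespan = lpt_schedule(times, 8)
    print("makespan:", makespan)   # ideal lower bound is sum(times)/8 = 1415.625

For costs as regular as 1..150 this lands essentially on the ideal balance; LPT is guaranteed to be within 4/3 of the optimal makespan in general.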

Circular buffer in MATLAB, **without** copying old data

夙愿已清, submitted on 2020-01-03 17:17:06
Question: There are some good posts on here (such as this one) on how to make a circular buffer in MATLAB. However, from looking at them, I do not believe they fit my application, because what I am seeking is a circular-buffer solution in MATLAB that does NOT involve any copying of old data. To use a simple example, let us say that I am processing 50 samples at a time, and I read in 10 samples each iteration. I would first run through 5 iterations, fill up my buffer, and in the end, process my 50 samples …
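The usual no-copy trick is a fixed array plus a moving write index with modular wrap-around, so old samples are overwritten in place rather than shifted. The question asks for MATLAB; this Python/NumPy sketch only illustrates the index arithmetic, which carries over directly:

    import numpy as np

    class RingBuffer:
        """Fixed-size circular buffer: writes wrap around, nothing is ever shifted."""
        def __init__(self, capacity):
            self.buf = np.zeros(capacity)
            self.head = 0       # next write position
            self.count = 0      # how many valid samples are stored

        def write(self, samples):
            n = len(samples)
            idx = (self.head + np.arange(n)) % len(self.buf)
            self.buf[idx] = samples          # in place; old data is never moved
            self.head = (self.head + n) % len(self.buf)
            self.count = min(self.count + n, len(self.buf))

        def read_all(self):
            # Valid samples in chronological order; only this read makes a
            # contiguous copy, the writes themselves never do.
            start = (self.head - self.count) % len(self.buf)
            idx = (start + np.arange(self.count)) % len(self.buf)
            return self.buf[idx]

    rb = RingBuffer(50)
    for i in range(5):
        rb.write(np.arange(10) + 10 * i)   # 10 new samples per iteration
    print(rb.read_all())                    # samples 0..49, buffer now full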

Is there a faster method to convert bitmap pixels to greyscale?

自古美人都是妖i, submitted on 2020-01-02 07:06:14
Question: At the moment, I am using the SetPixel() method to change the colour of every pixel in a bitmap. This works fine on images with small dimensions, but on large images it takes a while. I haven't worked with images in VB.Net before, so I might just be overlooking something obvious. I'm doing this to make a program which converts an image to greyscale. This produces the right result but at a low speed, and during this time the UI freezes, so I'm keen to maximize the …
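The question is about VB.Net, where the usual fix is Bitmap.LockBits (direct access to the pixel buffer) instead of one SetPixel call per pixel. The underlying principle, replacing per-pixel calls with a single bulk operation, looks like this as a Python/NumPy sketch; "photo.png" is a placeholder filename:

    import numpy as np
    from PIL import Image

    # Bulk conversion: one vectorized pass instead of a call per pixel.
    img = np.asarray(Image.open("photo.png").convert("RGB"), dtype=np.float32)
    grey = img @ np.array([0.299, 0.587, 0.114], dtype=np.float32)  # BT.601 luma weights
    Image.fromarray(grey.astype(np.uint8)).save("photo_grey.png")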

The most reliable and efficient UDP packet size?

浪尽此生, submitted on 2019-12-29 03:27:25
Question: Would sending lots of small packets by UDP take more resources (CPU, compression by zlib, etc.)? I read here that sending one big packet of ~65 kBytes by UDP would probably fail, so I thought that sending lots of smaller packets would succeed more often, but then comes the computational overhead of using more processing power (or at least that's what I'm assuming). The question is basically this: what is the best scenario for sending the maximum number of successful packets while keeping computation down?
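The usual compromise is to keep each datagram's payload at or below the path MTU so the IP layer never fragments it; fragmentation, not CPU cost, is the main reason very large UDP datagrams get dropped. A minimal sketch: 1472 assumes the common 1500-byte Ethernet MTU minus 20 bytes of IPv4 header and 8 bytes of UDP header, and the address is a placeholder:

    import socket

    # 1500 (Ethernet MTU) - 20 (IPv4 header) - 8 (UDP header) = 1472 payload bytes.
    MAX_PAYLOAD = 1472

    def send_in_chunks(data, addr=("127.0.0.1", 9999)):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for i in range(0, len(data), MAX_PAYLOAD):
            sock.sendto(data[i:i + MAX_PAYLOAD], addr)
        sock.close()

    send_in_chunks(b"x" * 100_000)   # ~100 kB split into 68 datagrams

Fewer, MTU-sized datagrams keep the per-packet syscall and header overhead low while avoiding the reliability cliff of fragmented jumbo datagrams; anything much smaller just multiplies the per-packet cost for no reliability gain.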