data-munging | 易学教程

openxlsx::write.xlsx overwriting existing worksheet instead of appending

Question: The openxlsx::write.xlsx function is overwriting the spreadsheet instead of adding another tab. I tried following some suggestions from Stack Overflow, but without success.

    dt.escrita <- format(Sys.time(), '%Y%m%d%H%M%S')
    write.xlsx(
      tbl.messages
      ,file = paste('.\\2_Datasets\\messages_',dt.escrita,'.xlsx')
      ,sheetName = format(Sys.time(), '%d-%m-%y')
      ,append = FALSE)
    write.xlsx(
      tbl.dic.dados
      ,file = paste('.\\2_Datasets\\messages_',dt.escrita,'.xlsx')
      ,sheetName = 'Dicionario_Dados'
      ,append = TRUE)

wide to long data table transformation with variables in columns and rows

Question: I have a CSV containing multiple tables, with variables stored in both rows and columns. About this CSV:

- I want to go from "wide" to "long".
- There are multiple "data frames" in one CSV.
- There are different types of variables for each "data frame".

    > df3
       V1          V2    V3    V4     V5     V6    V7    V8
    1 nyc 123 main st month     1      2      3     4     5
    2 nyc 123 main st     x 58568 567567 567909 35876 56943
    3 nyc 123 main st     y  5345   3673   3453  3467   788
    4 nyc 123 main st     z 53223 563894 564456 32409 56155
    5
    6  la  63 main st month     1      2      3     4     5
    7  la  63 main st

melt column by substring of the columns name in pandas (python)

Question: I have a dataframe:

    subject  A_target_word_gd  A_target_word_fd  B_target_word_gd  B_target_word_fd  subject_type
    1        1                 2                 3                 4                 mild
    2        11                12                13                14                moderate

And I want to melt it into a dataframe that looks like this:

    cond  subject  subject_type  value_type  value
    A     1        mild          gd          1
    A     1        mild          fd          2
    B     1        mild          gd          3
    B     1        mild          fd          4
    A     2        moderate      gd          11
    A     2        moderate      fd          12
    B     2        moderate      gd          13
    B     2        moderate      fd          14
    ...   ...

That is, I want to melt based on the delimiter in the column names. What is the best way to do that?

Answer 1: One more approach
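The answer excerpt is cut off, so the following is an independent minimal sketch rather than the answerer's approach. It assumes every value column is named `<cond>_target_word_<value_type>` as in the question, melts with `DataFrame.melt`, and then splits the old column name with a regular expression. The names `long_df` and `parts` are illustrative.

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({
    "subject": [1, 2],
    "A_target_word_gd": [1, 11],
    "A_target_word_fd": [2, 12],
    "B_target_word_gd": [3, 13],
    "B_target_word_fd": [4, 14],
    "subject_type": ["mild", "moderate"],
})

# Step 1: go from wide to long, keeping the identifier columns.
long_df = df.melt(
    id_vars=["subject", "subject_type"],
    var_name="column",
    value_name="value",
)

# Step 2: split the old column name into its first and last
# underscore-separated parts (assumed naming scheme).
parts = long_df["column"].str.extract(
    r"^(?P<cond>[^_]+)_.*_(?P<value_type>[^_]+)$"
)
long_df = pd.concat([parts, long_df.drop(columns="column")], axis=1)

# Step 3: order columns and rows as in the desired output.
long_df = (
    long_df[["cond", "subject", "subject_type", "value_type", "value"]]
    .sort_values("subject", kind="stable")
    .reset_index(drop=True)
)
print(long_df)
```

The stable sort keeps the original column order (A before B, gd before fd) within each subject.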

Retaining the previous date in R

Question: I got stuck on a fairly easy data munging task. I have a transactional data frame in R that resembles this one:

    id<-c(11,11,22,22,22)
    dates<-as.Date(c('2013-11-15','2013-11-16','2013-11-15','2013-11-16','2013-11-17'), "%Y-%m-%d")
    example<-data.frame(id=id,dates=dates)

      id      dates
    1 11 2013-11-15
    2 11 2013-11-16
    3 22 2013-11-15
    4 22 2013-11-16
    5 22 2013-11-17

I'm looking for a way to retain the date of the previous transaction. The resulting table would look like this:

    previous_dates<-as.Date(c('

Expanding pandas Data Frame rows based on number and group ID (Python 3).

Question: I have been struggling to find a way to expand/clone observation rows based on a predetermined number and a grouping variable (id). For context, here is an example data frame using pandas and numpy (Python 3).

    df = pd.DataFrame([[1, 15], [2, 20]], columns = ['id', 'num'])
    df
    Out[54]:
       id  num
    0   1   15
    1   2   20

I want to expand/clone the rows by the number given in the "num" variable based on their ID group. In this case, I would want 15 rows for id = 1 and 20 rows for id = 2. This is probably
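One possible way to do this expansion, as a minimal sketch: repeat each index label `num` times with `Index.repeat` and select those labels with `.loc`. The names `expanded` and `counts` are illustrative and do not come from the question.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame([[1, 15], [2, 20]], columns=["id", "num"])

# Repeat every index label as many times as its "num" value,
# then pull the matching rows; each row is cloned "num" times.
expanded = df.loc[df.index.repeat(df["num"])].reset_index(drop=True)

# Sanity check: number of rows per id.
counts = expanded["id"].value_counts().to_dict()
print(len(expanded), counts)
```

This assumes `num` holds non-negative integers; a row with `num` equal to 0 would simply disappear from the result.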

How to efficiently rearrange pandas data as follows?

Question: I need some help with a concise and, above all, efficient formulation in pandas of the following operation. Given a data frame of the format

    id   a   b  c  d
    1    0  -1  1  1
    42   0   1  0  0
    128  1  -1  0  1

construct a data frame of the format:

    id   one_entries
    1    "c d"
    42   "b"
    128  "a d"

That is, the column "one_entries" contains the concatenated names of the columns for which the entry in the original frame is 1.

Answer 1: Here's one way, using a boolean rule and applying a lambda function.

    In [58]: df
    Out[58]:
       id a b c d
    0 1 0
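The answer's code is cut off above, so here is a separate minimal sketch of the same idea, not the answerer's exact code: build a boolean mask of the cells equal to 1, then join the names of the flagged columns for each row. The names `value_cols`, `mask`, and `result` are illustrative.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

value_cols = ["a", "b", "c", "d"]

# True wherever the entry is exactly 1.
mask = df[value_cols].eq(1)

# For each row, join the names of the columns flagged True.
one_entries = [
    " ".join(col for col, flag in zip(value_cols, row) if flag)
    for row in mask.to_numpy()
]

result = pd.DataFrame({"id": df["id"], "one_entries": one_entries})
print(result)
```

Iterating over the NumPy array of the mask avoids a row-wise `apply`, which tends to be slower on large frames.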

How to convert a python datetime.datetime to excel serial date number

Question: I need to convert dates into Excel serial numbers for a data munging script I am writing. By playing with dates in my OpenOffice Calc workbook, I was able to deduce that '1-Jan 1899 00:00:00' maps to the number zero. I wrote the following function to convert from a Python datetime object into an Excel serial number:

    def excel_date(date1):
        temp=dt.datetime.strptime('18990101', '%Y%m%d')
        delta=date1-temp
        total_seconds = delta.days * 86400 + delta.seconds
        return total_seconds

However, when I try
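For comparison, a minimal sketch of one common conversion. It assumes the 1900 date system with day zero at 1899-12-30, which differs from the epoch deduced in the question, and it returns fractional days rather than seconds. The names `EXCEL_EPOCH` and `excel_serial` are illustrative.

```python
from datetime import datetime, timedelta

# Assumption: day zero is 1899-12-30. This matches Excel's 1900 date
# system for dates from 1900-03-01 onward (earlier dates are off by one
# in Excel because it treats 1900 as a leap year).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial(moment: datetime) -> float:
    """Return the serial number of `moment` as days since EXCEL_EPOCH."""
    delta = moment - EXCEL_EPOCH
    # Dividing two timedeltas yields a float number of days,
    # including the time of day as a fraction.
    return delta / timedelta(days=1)


print(excel_serial(datetime(2000, 1, 1)))
print(excel_serial(datetime(2000, 1, 1, 12, 0)))
```

The function expects a naive `datetime`; a timezone-aware value would need to be converted first, since it cannot be subtracted from the naive epoch.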