Exploding date range as rows in R
I have a data set of the form:
df <- data.frame(var1 = c('1976-07-04', '1980-07-04', '1984-07-04'),
                 var2 = c('d', 'e', 'f'),
                 freq = 1:3)
I can expand this data.frame very quickly using indexing by:
df.expanded <- df[rep(seq_len(nrow(df)), df$freq), ]
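To make the indexing step concrete, here is a minimal sketch on the same data. The name idx is only introduced here for illustration.

```r
df <- data.frame(var1 = c('1976-07-04', '1980-07-04', '1984-07-04'),
                 var2 = c('d', 'e', 'f'),
                 freq = 1:3)

# Each row index i is repeated df$freq[i] times.
idx <- rep(seq_len(nrow(df)), df$freq)
print(idx)  # 1 2 2 3 3 3

# Indexing with the repeated row numbers replicates whole rows.
df.expanded <- df[idx, ]
print(nrow(df.expanded))  # 6, i.e. sum(df$freq)
```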
However, I want to create a sequence on the date instead of a replicate, with freq giving the length of that sequence. E.g., for row 3 I can create the entries to fill the exploded data.frame with:
seq(as.Date('1984-7-4'), by = 'days', length = 3)
Can anyone suggest a fast method for doing this? My current method is to use various lapply functions.
I used a combination of Gavin Simpson's answer and a previous idea for my solution.
ExtendedSeq <- function(df, freq.col, date.col, period = 'day') {
  #' An R function to take a data frame that has a frequency column and explode
  #' the data frame to have that number of rows, based on a date sequence.
  #' Args:
  #'   df: A data.frame to be exploded.
  #'   freq.col: A column variable indicating the number of replicates in the
  #'             new dataset to make.
  #'   date.col: A column variable indicating the name or position of the date
  #'             variable.
  #'   period: The periodicity to apply to the date.

  # Replicate expanded data form
  df.expanded <- df[rep(seq_len(nrow(df)), df[[freq.col]]), ]

  DateExpand <- function(row, df.ex, freq, col.date, period) {
    #' An inner function to explode a data set and build out a date sequence.
    #' Args:
    #'   row: Index of the row of the data set to expand
    #'   df.ex: A data.frame to expand
    #'   freq: Column indicating the number of replicates to make.
    #'   col.date: Column indicating the date variable
    #'   period: The step passed to seq() as `by`
    #' Output:
    #'   A date sequence of length freq starting at the row's date.
    times <- df.ex[row, freq]
    # period is the same for every row; it could become row / data driven later.
    # as.Date() lets the date column be stored as character or Date.
    date.ex <- seq(as.Date(df.ex[row, col.date]), by = period, length.out = times)
    return(date.ex)
  }

  dates <- lapply(seq_len(nrow(df)),
                  FUN = DateExpand,
                  df.ex = df,
                  freq = freq.col,
                  col.date = date.col,
                  period = period)
  # unlist() drops the Date class, so convert the day counts back to Date.
  df.expanded[[date.col]] <- as.Date(unlist(dates), origin = '1970-01-01')
  row.names(df.expanded) <- NULL
  return(df.expanded)
}
Personally, I don't like having to convert the dates back from the list and supply an origin for that conversion, in case this changes in the future, but I really appreciate the ideas and help.
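One possible way around the origin, sketched here as an assumption rather than something taken from the thread: combining the list with do.call(c, ...) keeps the Date class, so no conversion from numbers is needed. For daily steps only, adding day offsets built with sequence() avoids the lapply altogether. The names all.dates and vec.dates are hypothetical, and var1 is assumed to already be a Date column.

```r
# Example data from the question, with the date column stored as Date.
df <- data.frame(var1 = as.Date(c('1976-07-04', '1980-07-04', '1984-07-04')),
                 var2 = c('d', 'e', 'f'),
                 freq = 1:3)

# Build one Date sequence per row.
dates <- lapply(seq_len(nrow(df)), function(i) {
  seq(df$var1[i], by = 'day', length.out = df$freq[i])
})

# c() on Date objects keeps the Date class, so no origin is needed.
all.dates <- do.call(c, dates)

df.expanded <- df[rep(seq_len(nrow(df)), df$freq), ]
df.expanded$var1 <- all.dates
row.names(df.expanded) <- NULL

# Daily steps only: replicate each start date and add offsets 0, 1, 2, ...
vec.dates <- rep(df$var1, df$freq) + (sequence(df$freq) - 1)
```

The do.call version works for any period that seq() accepts, while the sequence() version relies on Date arithmetic being in days, so it does not carry over to months or years.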