2 years ago
#49450
Chris
If with two conditions involving is.na() returning unexpected output in R tibble?
Please forgive me, but I can't figure out how to create a tibble (I could only create a data.frame), so I'm pasting my data (I know, bad form).
# A tibble: 13 x 2
`Date CADD Rec'd` `CADD Completed`
<dttm> <dttm>
1 2015-01-20 00:00:00 2015-01-19 00:00:00
2 2015-01-29 00:00:00 2017-04-16 00:00:00
3 2016-12-21 00:00:00 2017-12-20 00:00:00
4 2017-01-03 00:00:00 2018-01-03 00:00:00
5 2017-01-03 00:00:00 2017-01-03 00:00:00
6 2021-08-23 00:00:00 NA
7 2021-08-23 00:00:00 2021-12-15 00:00:00
8 2021-08-23 00:00:00 2021-11-23 00:00:00
9 2021-12-25 00:00:00 2021-12-27 00:00:00
10 2022-01-02 00:00:00 NA
11 2022-01-02 00:00:00 NA
12 2022-01-03 00:00:00 2022-01-04 00:00:00
13 2022-01-07 00:00:00 NA
The desired output is: if the second column's date is before the first column's, set the first column equal to the second column. If it is later, or NA, do not modify the first column. I can get it to run, but I am shown the message "the condition has length > 1 and only the first element will be used". Does "the first element" refer to the first row, or to the first test in my two conditions (the is.na() and < tests)? When I run this I get the desired change on the first row, but I still do not understand why the rows with NA in the second column also behave unexpectedly: whether I use is.na() or !is.na() in the condition, the NAs are forced onto the first column.
issue = "Date CADD Rec'd"
complete = "CADD Completed"
if ((is.na(df[complete])) & (df[complete] < df[issue])) {
  df[issue] <- df[complete]
} else {
  df[issue] <- df[issue]
}
Resulting in this:
# A tibble: 13 x 2
`Date CADD Rec'd` `CADD Completed`
<dttm> <dttm>
1 2015-01-19 00:00:00 2015-01-19 00:00:00
2 2017-04-16 00:00:00 2017-04-16 00:00:00
3 2017-12-20 00:00:00 2017-12-20 00:00:00
4 2018-01-03 00:00:00 2018-01-03 00:00:00
5 2017-01-03 00:00:00 2017-01-03 00:00:00
6 NA NA
7 2021-12-15 00:00:00 2021-12-15 00:00:00
8 2021-11-23 00:00:00 2021-11-23 00:00:00
9 2021-12-27 00:00:00 2021-12-27 00:00:00
10 NA NA
11 NA NA
12 2022-01-04 00:00:00 2022-01-04 00:00:00
13 NA NA
I'm at a loss as to what's happening with how NA is working, or perhaps it's my lack of understanding of what the error message means and how it applies. The result I expect is that only the first row will have the date in the first column changed, and the 4 rows with NA will not have NA forced into the first column (they'll keep their values, since the second column is NA). I also don't understand why I'm getting the opposite effects with is.na().
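For reference, the rule described above (overwrite the first column only where the second column is present and earlier) can be sketched as a row-wise, vectorized assignment rather than a single if. This is only a minimal sketch under some assumptions: it uses rows 1, 2 and 6 of the pasted data, builds them as a base data.frame with UTC date-times so it runs without extra packages, and introduces a helper variable named earlier that is not in the original code. The background is that if expects one TRUE/FALSE value; given a whole column of conditions, older versions of R looked only at the first element (the first row of the combined condition) and then ran the chosen branch on the entire column, while R 4.2.0 and later stop with an error instead.

```r
# Sample of the posted data (rows 1, 2 and 6). The base data.frame and the
# UTC time zone are assumptions made only to keep the sketch self-contained.
df <- data.frame(
  a = as.POSIXct(c("2015-01-20", "2015-01-29", "2021-08-23"), tz = "UTC"),
  b = as.POSIXct(c("2015-01-19", "2017-04-16", NA), tz = "UTC")
)
names(df) <- c("Date CADD Rec'd", "CADD Completed")

issue <- "Date CADD Rec'd"
complete <- "CADD Completed"

# One TRUE/FALSE per row: TRUE only where the completed date exists AND is
# earlier than the received date. [[ ]] extracts the column as a plain vector;
# single [ ] would return a one-column table instead.
# "earlier" is a hypothetical helper name, not part of the original code.
earlier <- !is.na(df[[complete]]) & df[[complete]] < df[[issue]]

# Overwrite only the rows where the condition holds; every other row,
# including those with NA in the second column, keeps its original value.
df[[issue]][earlier] <- df[[complete]][earlier]

print(df)
```

In this sample only the first row satisfies the condition, so only its first-column date is replaced. The row with NA in the second column is skipped because !is.na() is FALSE there, and FALSE & NA evaluates to FALSE.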
r
na