R - error "variable lengths differ"
This happens because in your first step you created a separate variable outside of your data frame, transLOT<-log(LengthofTimemin). When you remove a row from the data, transLOT is unchanged. Even worse than differing lengths, your data doesn't line up any more - if the different lengths were ignored, every row after the one you removed would be "off by one" compared to the response.
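To make the mismatch concrete, here is a minimal sketch using made-up data. The names toy, toy_trim and log_y are hypothetical stand-ins for resdata, rdata and transLOT, not the asker's actual data. It also suggests why na.action=na.exclude does not help: that setting only governs rows containing NA values, and the length check fails before it comes into play.

```r
# Toy data (hypothetical), standing in for resdata
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.7, 7.4, 20.1, 54.6, 148.4)
)

# A free-standing variable in the workspace, NOT a column of toy
log_y <- log(toy$y)

# Drop one row from the data frame; log_y is untouched
toy_trim <- toy[-3, ]

length(log_y)   # still 5
nrow(toy_trim)  # now 4

# lm() takes x from toy_trim (4 values) but falls back to the
# workspace for log_y (5 values), so building the model frame fails.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  lm(log_y ~ x, data = toy_trim, na.action = na.exclude),
  error = function(e) conditionMessage(e)
)
print(result)
```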
The simple solution is to create your transLOT variable in the data frame. Then, whenever you do things to the data (like remove rows), the same thing is done to transLOT.
resdata$transLOT <- log(resdata$LengthofTimemin)
Note that I also use resdata$LengthofTimemin rather than the LengthofTimemin which you seem to have in your workspace. Did you use attach() at some point? You shouldn't use attach for exactly this reason. Keep variables in the data frame!
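As a rough sketch of the whole workflow with the variable kept inside the data frame (again on made-up data, with toy, toy_trim and log_y as hypothetical names), the transformed column travels with its rows, so the refit works after a row is dropped:

```r
# Toy data (hypothetical), standing in for resdata
toy <- data.frame(
  x = 1:6,
  y = c(2.7, 7.4, 20.1, 54.6, 148.4, 403.4)
)

# Create the transformed response as a column of the data frame
toy$log_y <- log(toy$y)

# First fit on the full data
fit <- lm(log_y ~ x, data = toy)

# Remove a row; the matching log_y value is removed along with it
toy_trim <- toy[-3, ]

# Refit: every variable in the formula now comes from toy_trim
newfit <- lm(log_y ~ x, data = toy_trim)

nobs(fit)     # 6
nobs(newfit)  # 5
```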
Kelsey Spencer
Updated on January 19, 2020
Comments
-
Kelsey Spencer almost 3 years
> #transforming length of time
> transLOT<-log(LengthofTimemin)
>
> #checking for outliers
> fit<-lm(transLOT~DielEnd+TideEnd+TideStart+Moonphase+TideStart*Moonphase, data=resdata)
> outlierTest(fit)
    rstudent unadjusted p-value Bonferonni p
295 4.445284         1.1025e-05    0.0052808
>
> #getting rid of the outlier data in row 295
> rdata<-resdata[-295, ]
> print(rdata[294:296,5:10])
# A tibble: 3 × 6
  DepartureDate       DepartureTime        LengthofTime LengthofTimemin EventLengthCategories
         <dttm>              <dttm>              <dttm>           <dbl>                 <chr>
1    2016-09-19 1899-12-30 23:46:46 1899-12-30 00:05:49        5.816667                  5-15
2    2016-09-20 1899-12-30 01:55:28 1899-12-30 00:01:20        1.333333                    <5
3    2016-09-20 1899-12-30 04:07:28 1899-12-30 00:01:21        1.350000                    <5
> newfit<-lm(transLOT~DielEnd+TideEnd+TideStart+Moonphase+TideStart*Moonphase, na.action=na.exclude, data=rdata)
Error in model.frame.default(formula = transLOT ~ DielEnd + TideEnd +  :
  variable lengths differ (found for 'DielEnd')
> #now all of a sudden the variable lengths differ
I understand that the problem occurs with the removal of the row of data but I assumed that na.exclude would account for it. After thoroughly searching, I am unable to determine why this error is occurring.
-
Kelsey Spencer almost 6 years
Thank you very much! I did indeed use attach() earlier in the code. I had previously not been instructed against it, but from now on I will use $ to link my data frame.
-
Gregor Thomas almost 6 years
Other good options exist - you seem to be using dplyr already (at least your rdata is a tibble), so mutate is a nice alternative that both keeps data in your data frame and saves you from having to re-type the name of the data frame hundreds of times.
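
A minimal sketch of the mutate approach mentioned in this comment, assuming the dplyr package is installed. The data and the names toy, toy_trim and log_y are made up for illustration and are not the asker's data:

```r
library(dplyr)

# Toy tibble (hypothetical), standing in for resdata
toy <- tibble(
  x = 1:6,
  y = c(2.7, 7.4, 20.1, 54.6, 148.4, 403.4)
)

# mutate() adds the transformed column inside the data frame,
# without repeating the data frame name for each variable
toy <- toy %>% mutate(log_y = log(y))

# Drop row 3; log_y is dropped for that row too
toy_trim <- toy %>% slice(-3)

# All variables in the formula are found in toy_trim
newfit <- lm(log_y ~ x, data = toy_trim)
nobs(newfit)  # 5
```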