Find string in data.frame
Solution 1
What about:
which(df == "horse", arr.ind = TRUE)
# row col
# [1,] 2 1
# [2,] 3 1
# [3,] 5 2
# [4,] 4 3
Solution 2
Another way around:
l <- sapply(colnames(df), function(x) grep("horse", df[,x]))
$animal
[1] 2 3
$level
[1] 5
$length
[1] 4
If you want the output to be matrix:
sapply(l,'[',1:max(lengths(l)))
animal level length
[1,] 2 5 4
[2,] 3 NA NA
Solution 3
Another way to do it is the following:
library(data.table)
library(zoo)
library(dplyr)
library(timeDate)
library(reshape2)
data frame name = tbl_account
first,Transpose it :
temp = t(tbl_Account)
Then, put it in to a list :
temp = list(temp)
This essentially puts every single observation in a data frame in to one massive string, allowing you to search the whole data frame in one go.
then do the searching :
temp[[1]][grep("Horse",temp[[1]])] #brings back the actual value occurrences
grep("Horse", temp[[1]]) # brings back the position of the element in a list it occurs in
hope this helps :)
Jonas Lindeløv
Former researcher. These days I'm mostly using R and SQL but I have an extended history with Python and Matlab. Also did some projects in HTML5 (javascript) and PHP. Linux fan pressured to use Windows.
Updated on October 05, 2020Comments
-
Jonas Lindeløv over 3 years
How do I search for a string in a data.frame? As a minimal example, how do I find the locations (columns and rows) of 'horse' in this data.frame?
> df = data.frame(animal=c('goat','horse','horse','two', 'five'), level=c('five','one','three',30,'horse'), length=c(10, 20, 30, 'horse', 'eight')) > df animal level length 1 goat five 10 2 horse one 20 3 horse three 30 4 two 30 horse 5 five horse eight
... so row 4 and 5 have the wrong order. Any output that would allow me to identify that 'horse' has shifted to the
level
column in row 5 and to thelength
column in row 4 is good. Maybe:> magic_function(df, 'horse') col row 'animal', 2 'animal', 3 'length', 4 'level', 5
Here's what I want to use this for: I have a very large data frame (around 60 columns, 20.000 rows) in which some columns are messed up for some rows. It's too large to eyeball in order to identify the different ways that order can be wrong, so searching would be nice. I will use this info to move data to the correct columns for these rows.