R: regular expression to specify end of string char is a letter

Solution 1

Use a positive lookahead:

> string = c("Hello-", "HelloA", "Helloa")
> grep('Hello(?=[A-Za-z])', string, perl=TRUE)
[1] 2 3

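The positive lookahead (?=[A-Za-z]) asserts that the character immediately after the string Hello is a letter, without making that letter part of the match. Lookaheads are supported only by the PCRE engine, so the perl argument must be set to TRUE.

Or, without a lookahead: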

> grep('Hello[A-Za-z]', string)
[1] 2 3
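
Both patterns return the same indices from grep(), but they differ in what they actually match. A minimal sketch of that difference, using regmatches() and regexpr() from base R (the variable names m_look and m_plain are only illustrative):

```r
string <- c("Hello-", "HelloA", "Helloa")

# Lookahead: the letter is checked but is not included in the match.
m_look <- regmatches(string, regexpr("Hello(?=[A-Za-z])", string, perl = TRUE))

# Plain character class: the letter is consumed as part of the match.
m_plain <- regmatches(string, regexpr("Hello[A-Za-z]", string))

m_look   # "Hello" "Hello"
m_plain  # "HelloA" "Helloa"
```

For grep(), which only reports whether an element matched, the two forms are interchangeable; the distinction matters mostly when extracting or replacing text.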

Add a $ to the regex if exactly one letter follows the string Hello and that letter must be the last character. $ asserts that the match position is at the end of the string.

> grep('Hello[A-Za-z]$', string)
[1] 2 3
> grep('Hello(?=[A-Za-z]$)', string, perl=TRUE)
[1] 2 3
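
With the example vector every letter happens to be the final character, so the anchor makes no visible difference. A small sketch with one extra, made-up element ("HelloAB") shows where the two forms diverge:

```r
# "HelloAB" is a hypothetical extra element, not part of the original question.
x <- c("Hello-", "HelloA", "HelloAB")

# Anchored: exactly one letter after Hello, and then the string must end.
with_anchor <- grep("Hello[A-Za-z]$", x)

# Unanchored: at least one letter after Hello, anything may follow.
without_anchor <- grep("Hello[A-Za-z]", x)

with_anchor     # 2
without_anchor  # 2 3
```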

Solution 2

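The "$" is the symbol for the end of the string, so you need to remove it. Placed before the character class, it demands a letter after the end of the string, which can never match.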

string = c("Hello-", "HelloA", "Helloa")
grep("Hello[A-z]", string)
#[1] 2 3
?regex  # help page documenting the POSIX character classes, e.g. [:alpha:]

grep("Hello[[:alpha:]]", string)
#[1] 2 3

The second one is preferable because the range "A-z" is ambiguous: its meaning depends on the locale's collation order, which may not correspond to "alphabetic", and in plain ASCII order it also covers the six punctuation characters that sit between Z and a ([ \ ] ^ _ `).
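
A short sketch of how the range can over-match. It assumes perl = TRUE so that the range is interpreted by code point regardless of locale; the test strings containing "_" and "^" are made up for illustration:

```r
# Hypothetical inputs: two end in ASCII punctuation that lies between "Z" and "a".
x <- c("Hello_", "Hello^", "HelloA", "Helloa")

# With PCRE the range A-z runs over code points 65..122, punctuation included.
range_hits <- grep("Hello[A-z]", x, perl = TRUE)

# The POSIX class matches alphabetic characters only.
alpha_hits <- grep("Hello[[:alpha:]]", x, perl = TRUE)

range_hits  # 1 2 3 4
alpha_hits  # 3 4
```

Without perl = TRUE the result of the range version may vary with the locale, which is the reason ?regex recommends avoiding character ranges.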

Author by

Adrian

Updated on June 16, 2022

Comments

  • Adrian
    Adrian almost 2 years
        string = c("Hello-", "HelloA", "Helloa")
        grep("Hello$[A-z]", string)
    

    I wish to find the indices of the strings in which the next character after the word "Hello" is a letter (case insensitive). The code above doesn't work, but I would like grep() to return indices 2 and 3, since those strings have a letter after "Hello".

  • Avinash Raj
    Avinash Raj over 9 years
    You could add a case insensitive modifier like grep('Hello(?i)[A-Z]', string, perl=T)
  • Laxmi Agarwal
    Laxmi Agarwal over 2 years
    Hey! I want to extract the word that comes after "operating System :" and before "Ultra" in a string. How do I write a regex for this? It is in the middle of the string.
  • Avinash Raj
    Avinash Raj over 2 years
    @LaxmiAgarwal ask it as a new question.
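
The case-insensitive modifier suggested in the comments can be sketched as follows. The inline (?i) form needs perl = TRUE and only affects the part of the pattern that comes after it, whereas grep()'s ignore.case argument applies to the whole pattern; the extra element "helloB" is made up to show the difference:

```r
# "helloB" is a hypothetical extra element with a lowercase "h".
string <- c("Hello-", "HelloA", "Helloa", "helloB")

# Inline modifier: "Hello" stays case-sensitive, only [A-Z] ignores case.
inline <- grep("Hello(?i)[A-Z]", string, perl = TRUE)

# ignore.case: the entire pattern, including "Hello", ignores case.
whole <- grep("Hello[A-Z]", string, ignore.case = TRUE)

inline  # 2 3
whole   # 2 3 4
```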