scanf regex - C

27,512

Solution 1

scanf allows regular expressions as far as I know

Unfortunately, it does not allow regular expressions: the syntax is misleadingly close, but the implementation of scanf contains nothing even remotely similar to a regex engine. All it supports is an analogue of regex character classes (scansets), so %[<something>] is treated implicitly as [<something>]+, that is, a non-empty run of characters from the set. That's why your call of scanf translates into "skip leading whitespace, then read a string consisting of characters other than '(', ')', 'x', and '\n'".
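To see the scanset behaviour in isolation, here is a minimal sketch that applies the format string from the question to an in-memory string with sscanf. The helper name scan_with_question_format and the 64-byte buffer are assumptions made for illustration; a field width of 63 is added so the example cannot overflow.

```c
#include <stdio.h>

/* Hypothetical helper: applies the question's format string to `text`
 * and stores whatever the scanset matched in `out` (64 bytes).
 * Returns sscanf's result: 1 if the scanset matched at least one
 * character, 0 if the first non-whitespace character is excluded. */
int scan_with_question_format(const char *text, char out[64])
{
    out[0] = '\0';
    /* The leading space skips whitespace; %63[...] stores at most
     * 63 characters plus the terminating '\0'. */
    return sscanf(text, " %63[^(\nx\n)]", out);
}
```

With the input "hello world\nx\n" this stores "hello world" and stops at the first newline. With "text\n" it stores only "te", because 'x' is simply one of the excluded characters, not part of a pattern. With "x\n" nothing matches and the call returns 0.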

To solve your problem at hand, you can set up a loop that reads the input character by character. Every time you get a '\n', check that

  • You have at least three characters in the input that you've seen so far,
  • That the character immediately before '\n' is an 'x', and
  • That the character before the 'x' is another '\n'

If all of the above are true, you have reached the end of your anticipated input sequence; otherwise, your loop should continue.
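The loop described above could be sketched as follows. The function name read_until_x_line, the growable buffer, and the found flag are assumptions made for illustration, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads from `in` until the sequence "\nx\n" has
 * been seen or EOF is reached. Returns a malloc'd, NUL-terminated buffer
 * holding everything read (terminator included when present), or NULL on
 * allocation failure. *found is set to 1 if the terminator was seen. */
char *read_until_x_line(FILE *in, int *found)
{
    size_t len = 0, cap = 128;
    char *buf = malloc(cap);
    int c;

    *found = 0;
    if (buf == NULL)
        return NULL;

    while ((c = getc(in)) != EOF) {
        if (len + 1 >= cap) {           /* keep room for the final '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;

        /* The three checks: at least three characters so far, an 'x'
         * right before this '\n', and another '\n' before that 'x'. */
        if (c == '\n' && len >= 3 &&
            buf[len - 2] == 'x' && buf[len - 3] == '\n') {
            *found = 1;
            break;
        }
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would pass stdin and free the returned buffer when done. Because the check requires a '\n' before the 'x', an input whose very first line is "x" is not treated as the terminator in this sketch.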

Solution 2

scanf does not support regular expressions. It has limited support for character classes but that's not at all the same thing.

Never use scanf, fscanf, or sscanf, because:

  1. Numeric overflow triggers undefined behavior. The C runtime is allowed to crash your program just because someone typed too many digits.
  2. Some format specifiers (notably %s), when used without a maximum field width, are unsafe in exactly the same way gets is unsafe, i.e. they will cheerfully write past the end of the provided buffer and crash your program.
  3. They make it extremely difficult to handle malformed input robustly.
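As one possible alternative for numeric input, a line can be read into a buffer first and then converted with strtol, which reports overflow through errno instead of invoking undefined behavior. The helper name parse_int and its return codes are assumptions made for this sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses one decimal int from `line`.
 * Returns 0 on success, -1 if the text is not a number or has trailing
 * junk, -2 if the value does not fit in an int. */
int parse_int(const char *line, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(line, &end, 10);
    if (end == line)
        return -1;                      /* no digits were consumed */

    /* Allow trailing whitespace such as the newline left by fgets. */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0')
        return -1;                      /* trailing junk, e.g. "12abc" */

    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -2;                      /* out of range, reported safely */

    *out = (int)v;
    return 0;
}
```

The line itself would typically come from fgets(buf, sizeof buf, stdin), which never writes past the buffer, so each failure case can be detected and handled by the caller.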

You don't need regular expressions for this case; read a line at a time with getline and stop when the line read is just "x" (getline keeps the trailing newline, so in practice compare against "x\n"). If you do need regular expressions elsewhere, the standard (POSIX, not ISO C) regular expression library routines are called regcomp and regexec.
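A minimal sketch of the getline approach, assuming a POSIX system. The function name read_lines_until_x and the choice to accumulate the lines into a single string are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: appends lines from `in` to a malloc'd string
 * until a line consisting of just "x" is read, or until EOF. The
 * terminator line itself is not stored. Returns NULL on allocation
 * failure; the caller frees the result. */
char *read_lines_until_x(FILE *in)
{
    char *line = NULL;      /* getline allocates and grows this buffer */
    size_t cap = 0;
    ssize_t n;
    size_t len = 0;
    char *text = malloc(1);

    if (text == NULL)
        return NULL;
    text[0] = '\0';

    while ((n = getline(&line, &cap, in)) != -1) {
        /* getline keeps the newline; also accept "x" at EOF without one. */
        if (strcmp(line, "x\n") == 0 || strcmp(line, "x") == 0)
            break;

        char *tmp = realloc(text, len + (size_t)n + 1);
        if (tmp == NULL) {
            free(text);
            free(line);
            return NULL;
        }
        text = tmp;
        memcpy(text + len, line, (size_t)n + 1);   /* copies the '\0' too */
        len += (size_t)n;
    }
    free(line);
    return text;
}
```

Unlike a strict search for "\nx\n", this version also stops when the very first line is "x"; whether that is desirable depends on the input format.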

Author by

pasadinhas

Computer Engineering student at Instituto Superior Técnico Github: @pasadinhas

Updated on March 28, 2020

Comments

  • pasadinhas
    pasadinhas about 4 years

    I needed to read a string until the following sequence is written: \nx\n :

    (.....)\n
    x\n
    

    \n is the new line character and (.....) can be any characters that may include other \n characters.

    scanf allows regular expressions as far as I know, but I can't make it read a string until this pattern. Can you help me with the scanf format string?


    I was trying something like:

    char input[50000];
    scanf(" %[^(\nx\n)]", input);
    

    but it doesn't work.

  • Peter Cordes
    Peter Cordes over 7 years
    Note that most (all?) real implementations of scanf (including on GNU systems) do not crash your program or do anything nasty on integer overflow. Discussion here suggests that the standard could be re-worded to require sane behaviour and probably no implementation would have to change. (specifically Keith Thompson's post). However, with the standard worded as it is, scanf on bogus input is only safe on "good" C implementations, and isn't portable.
  • Peter Cordes
    Peter Cordes over 7 years
    And even without UB, I agree with point 3: recovering from matching failures is often difficult.
  • Spikatrix
    Spikatrix about 7 years
    Note: Problem #2 can be avoided by specifying a maximum field width.
  • zwol
    zwol about 7 years
    @CoolGuy I generally think, if you have to take extra, optional steps to avoid shooting yourself in the foot, it's a badly designed API.