Convert string with explicit escape sequence into relative character

Solution 1

I think you have to write such a function yourself, since escape sequences are a compile-time feature: when you write "\n", the compiler replaces the \n sequence with the newline character. The resulting string has length 1 (excluding the terminating null character).

In your case, the string "\\n" has length 2 (again excluding the terminating null character) and contains \ and n.
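The difference in length can be checked directly. The following is only a small sketch; the variable names are made up for illustration. Note that sizeof on a string literal counts the terminating null character, so it reports one more than the string length.

```cpp
#include <string>

// sizeof on a literal includes the terminating null character.
static_assert(sizeof("\n") == 2, "newline + terminator");
static_assert(sizeof("\\n") == 3, "backslash + 'n' + terminator");

// Hypothetical names, used only for this illustration.
const std::string compiled = "\n";      // one character: the newline
const std::string spelled_out = "\\n";  // two characters: '\' and 'n'
```

Here compiled.size() is 1 and spelled_out.size() is 2, which is why the conversion has to happen at run time.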

You need to scan the string and, on encountering \, check the following character. If it is one of the legal escapes, replace both characters with the corresponding character; otherwise either skip them or leave them both as is.

For example (http://ideone.com/BvcDE):

#include <string>
using std::string;

// Returns a copy of s with the escape sequences \\, \n and \t replaced
// by the characters they stand for.
string unescape(const string& s)
{
  string res;
  string::const_iterator it = s.begin();
  while (it != s.end())
  {
    char c = *it++;
    // A backslash at the very end of the string is copied as is.
    if (c == '\\' && it != s.end())
    {
      switch (*it++) {
      case '\\': c = '\\'; break;
      case 'n': c = '\n'; break;
      case 't': c = '\t'; break;
      // all other escapes
      default:
        // invalid escape sequence - skip it. Alternatively you can copy it as is, throw an exception...
        continue;
      }
    }
    res += c;
  }

  return res;
}
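The "all other escapes" placeholder could be filled in along these lines. This is only a sketch under stated assumptions, not a complete implementation of the C++ escape rules: the names unescape_extended and hex_value are hypothetical, \x is limited to at most two hex digits (real C++ literals accept more), and unknown escapes are copied through unchanged rather than skipped.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: numeric value of a hex digit (caller checks isxdigit).
static int hex_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return c - 'A' + 10;
}

// Hypothetical name; a single left-to-right pass over the input.
std::string unescape_extended(const std::string& s)
{
  std::string res;
  res.reserve(s.size());
  for (std::string::size_type i = 0; i < s.size(); ++i)
  {
    char c = s[i];
    if (c != '\\' || i + 1 == s.size())
    {
      res += c;  // ordinary character, or a trailing backslash
      continue;
    }
    char e = s[++i];  // the character after the backslash
    switch (e)
    {
    case '\\': res += '\\'; break;
    case '"':  res += '"';  break;
    case '\'': res += '\''; break;
    case 'n':  res += '\n'; break;
    case 't':  res += '\t'; break;
    case 'r':  res += '\r'; break;
    case 'x':
    {
      // Assumption: read at most two hex digits.
      int value = 0, digits = 0;
      while (digits < 2 && i + 1 < s.size() &&
             std::isxdigit(static_cast<unsigned char>(s[i + 1])))
      {
        value = value * 16 + hex_value(s[++i]);
        ++digits;
      }
      if (digits == 0) res += "\\x";  // no digits: keep the text as is
      else res += static_cast<char>(value);
      break;
    }
    default:
      res += '\\';  // unknown escape: keep both characters
      res += e;
    }
  }
  return res;
}
```

Copying unknown escapes through is a design choice; skipping them or throwing an exception would be equally defensible, depending on where the input comes from.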

Solution 2

You can do that fairly easily using the Boost String Algorithms library. For example:

#include <string>
#include <iostream>
#include <boost/algorithm/string.hpp>

void escape(std::string& str)
{
  boost::replace_all(str, "\\\\", "\\");
  boost::replace_all(str, "\\t",  "\t");
  boost::replace_all(str, "\\n",  "\n");
  // ... add others here ...
}

int main()
{
  std::string str = "This\\tis\\n \\\\a test\\n123";

  std::cout << str << std::endl << std::endl;
  escape(str);
  std::cout << str << std::endl;

  return 0;
}

This is surely not the most efficient way to do it (because it iterates over the string multiple times), but it is compact and easy to understand.

Update: As ybungalobill has pointed out, this implementation gives wrong results whenever a replacement produces a character sequence that a later replacement searches for, or whenever a replacement removes or modifies a character sequence that should have been replaced.

An example of the first case is "\\\\n" -> "\\n" -> "\n". If you put the "\\\\" -> "\\" replacement last (which looks like the fix at first glance), you get an example of the second case: "\\\\n" -> "\\\n". There is no simple fix for this problem, which makes the technique feasible only for very simple escape sequences.
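Both failure cases can be reproduced without Boost. In the sketch below, replace_every is a hypothetical stand-in for boost::replace_all built on std::string::find and std::string::replace, and backslash_first and backslash_last are made-up names for the two orderings discussed above.

```cpp
#include <string>

// Hypothetical stand-in for boost::replace_all: replaces every occurrence
// of `from` with `to`, left to right, without rescanning inserted text.
void replace_every(std::string& str, const std::string& from, const std::string& to)
{
  if (from.empty()) return;
  std::string::size_type pos = 0;
  while ((pos = str.find(from, pos)) != std::string::npos)
  {
    str.replace(pos, from.size(), to);
    pos += to.size();
  }
}

// Backslash replacement first, the same order as in escape().
std::string backslash_first(std::string str)
{
  replace_every(str, "\\\\", "\\");
  replace_every(str, "\\n", "\n");
  return str;
}

// Backslash replacement last.
std::string backslash_last(std::string str)
{
  replace_every(str, "\\n", "\n");
  replace_every(str, "\\\\", "\\");
  return str;
}
```

For the input "\\\\n" the correct result would be "\\n" (a backslash followed by n). backslash_first returns "\n" (a lone newline) and backslash_last returns "\\\n" (a backslash followed by a newline), so neither ordering is right.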

If you need a generic (and more efficient) solution, you should implement a state machine that iterates the string, as proposed by davka.

Solution 3

I'm sure someone has written one, but it's so trivial that I doubt it has been specifically published anywhere.

Just recreate it yourself from the various "find"/"replace"-esque algorithms in the standard library.
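One possible way to build it from std::string::find is sketched below. The name unescape_with_find is hypothetical, only \\, \n and \t are handled, and unknown escapes are copied through unchanged. Because each backslash is consumed exactly once, in order, the sketch avoids the ordering problems of repeated whole-string replacements.

```cpp
#include <string>

// Hypothetical name; handles only \\, \n and \t.
std::string unescape_with_find(const std::string& s)
{
  std::string res;
  std::string::size_type start = 0;
  for (;;)
  {
    // Locate the next backslash at or after `start`.
    std::string::size_type pos = s.find('\\', start);
    if (pos == std::string::npos || pos + 1 == s.size())
    {
      // No further escape (or a trailing backslash): copy the tail and stop.
      res.append(s, start, std::string::npos);
      return res;
    }
    res.append(s, start, pos - start);  // text before the backslash
    switch (s[pos + 1])
    {
    case '\\': res += '\\'; break;
    case 'n':  res += '\n'; break;
    case 't':  res += '\t'; break;
    default:   res.append(s, pos, 2);   // unknown escape: keep both characters
    }
    start = pos + 2;  // resume after the two-character escape
  }
}
```

With this version, unescape_with_find("\\\\t") yields "\\t", a backslash followed by t.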

Author: Michele De Pascalis

Updated on May 28, 2022

Comments

  • Michele De Pascalis almost 2 years

    I need a function to convert "explicit" escape sequences into the corresponding non-printable characters. E.g.:

    char str[] = "\\n";
    cout << "Line1" << convert_esc(str) << "Line2" << endl;
    

    would give this output:

    Line1
    
    Line2
    

    Is there any function that does this?

  • Michele De Pascalis about 13 years
    In fact, I was wondering whether that function had already been written.
  • Lightness Races in Orbit about 13 years
    How would you use printf to do this?
  • ollb about 13 years
    I think printf can't do this, because the replacement of escape sequences is done by the compiler and not by printf.
  • Lightness Races in Orbit about 13 years
    @Gnafoo: Actually it's the pre-processor, if you're talking about string literals. But that's still completely irrelevant to the question, which involves translating a string {'\\', 'n'} into a string {'\n'}.
  • davka about 13 years
    @Glaedr: I am sure it's been written hundreds of times, but it's too simple to spend much time looking for. I'd do a quick Google search and, if I didn't find it right away, I'd write it myself; it's a nice exercise. If you copy the result to a new string it is even simpler, as you don't need to shrink your original string and you can accept const input.
  • Michele De Pascalis about 13 years
    All right, I set up a function like the one above, and everything works, thanks!
  • Yakov Galka almost 11 years
    This is completely wrong. escape("\\\\t") will return "\t" instead of "\\t".
  • ollb almost 11 years
    @ybungalobill Thanks for your comment. I've updated the answer, so that it also points out the limitations of the technique.
  • Brent Bradburn about 7 years
    You could similarly do this by generating a little C++ program to print the string, compiling, and running it. That approach is probably a little heavier though.
  • Algoman over 5 years
    What about \x?
  • v.oddou about 5 years
    "It's too simple": it's never simple. When you reinvent the wheel you almost always break canonicalization. There are always obscure features in specifications. For example, how will you handle the \u Unicode sequences, or the IO/OS-specific \n that can get hijacked by the OS, especially in file streams in text mode...