Parsing string C#

Solution 1

Try this:

using System;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var inString = LireFichier(@"C:\temp\file.txt");
        Console.WriteLine(ParseString(inString));
        Console.ReadKey();
    }

    // Read the file and return its text as a single string.
    public static string LireFichier(string FilePath)
    {
        // The using block disposes (and closes) the reader, so no explicit Close() is needed.
        using (StreamReader streamReader = new StreamReader(FilePath))
        {
            return streamReader.ReadToEnd();
        }
    }

    public static string ParseString(string input)
    {
        // Strip line breaks and spaces so that only commas separate the words.
        // Note: Environment.NewLine is "\r\n" on Windows, which matches files saved there.
        input = input.Replace(Environment.NewLine, string.Empty);
        input = input.Replace(" ", string.Empty);
        string[] chunks = input.Split(',');
        StringBuilder sb = new StringBuilder();
        foreach (string s in chunks)
        {
            sb.Append(s);
            sb.Append(";");
        }
        // Drop the last ';' appended by the loop.
        return sb.ToString(0, sb.Length - 1);
    }
}

Or this:

public static string ParseFile(string FilePath)
{
    using (var streamReader = new StreamReader(FilePath))
    {
        return streamReader.ReadToEnd().Replace(Environment.NewLine, string.Empty).Replace(" ", string.Empty).Replace(',', ';');
    }
}

Solution 2

I think your main issue is the empty entries that Split returns between adjacent delimiters. Ask it to drop them:

  string[] words = phrase.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);

Solution 3

You could try using the string splitting option to remove empty entries for you:

string[] words = phrase.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);

See the documentation for StringSplitOptions.RemoveEmptyEntries.
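To see what the option changes, here is a minimal sketch. The class name SplitOptionsDemo and the helper CountEntries are made up for this illustration; only string.Split and StringSplitOptions come from the framework. It counts the entries returned for a short phrase in which a comma is directly followed by a space.

```csharp
using System;

public static class SplitOptionsDemo
{
    // Hypothetical helper (not from the question): counts the entries
    // that Split returns for the given options.
    public static int CountEntries(string phrase, StringSplitOptions options)
    {
        char[] delimiterChars = { ' ', ',' };
        return phrase.Split(delimiterChars, options).Length;
    }

    public static void Main()
    {
        string phrase = "toto, tata, titi";

        // Each ", " pair yields an empty string between the two delimiters.
        Console.WriteLine(CountEntries(phrase, StringSplitOptions.None)); // prints 5

        // With RemoveEmptyEntries only the three words remain.
        Console.WriteLine(CountEntries(phrase, StringSplitOptions.RemoveEmptyEntries)); // prints 3
    }
}
```

Those empty strings are what turn into the doubled ";;" once each entry gets a ";" appended.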

Solution 4

Your main problem is that you are splitting on \n, but the linebreaks read from your file are \r\n.

Your output string does contain all of your items, but the \r characters left in it cause later "lines" to overwrite earlier "lines" on the console.

(\r is a "return to start of line" instruction; without the \n "move to the next line" instruction your words from line 1 are being overwritten by those in line 2, then line 3 and line 4.)

In addition to splitting on \r as well as \n, you need to check that a string is not null or empty before adding it to your output (or, preferably, use StringSplitOptions.RemoveEmptyEntries as others have mentioned).
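Putting both points together, a rough sketch could look like this. The class LineEndingDemo and the method JoinWords are names invented for the example, and the input string is the question's file content written out with \r\n line breaks (an assumption about how the file was saved).

```csharp
using System;

public static class LineEndingDemo
{
    // Hypothetical helper (not from the question): splits on spaces, commas
    // and both line-break characters, drops empty entries, joins with ';'.
    public static string JoinWords(string phrase)
    {
        char[] delimiterChars = { ' ', ',', '\r', '\n' };
        string[] words = phrase.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(";", words);
    }

    public static void Main()
    {
        // Assumed file content, with Windows-style \r\n line breaks.
        string phrase = "truc,\r\nohoh,\r\ntoto, tata, titi, tutu,\r\ntete,\r\n";
        Console.WriteLine(JoinWords(phrase)); // prints truc;ohoh;toto;tata;titi;tutu;tete
    }
}
```

Because '\r' is now a delimiter, no carriage return survives into the output, so nothing gets overwritten on the console. string.Join puts no ";" after the last word; append one if the trailing separator from the question's expected output is required.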

Author by

WizLiz

Degree in computer science (French Baccalauréat +3 years). Currently training myself with all Microsoft technologies : C# .NETv4.5, ADO.NET, asp.NET, EntityFramework, WPF Applications and Silverlight

Updated on July 05, 2022

Comments

  • WizLiz
    WizLiz almost 2 years

    So here is my problem: I'm trying to get the content of a text file as a string, then parse it. What I want is an array containing each word and only words (no blanks, no backspaces, no \n ...). I'm using a function LireFichier that returns a string containing the text from the file (this works fine, because the text is displayed correctly). But when I try to parse it, the parsing fails and starts doing seemingly random concatenation on my string, and I don't get why. Here is the content of the text file I'm using:

    truc,
    ohoh,
    toto, tata, titi, tutu,
    tete,
    

    and here's my final string:

    ;tete;;titi;;tata;;titi;;tutu;
    

    which should be:

    truc;ohoh;toto;tata;titi;tutu;tete;
    

    Here is the code I wrote (all the using directives are in place):

    namespace ConsoleApplication1{
    
    class Program
    {
        static void Main(string[] args)
        {
            string chemin = "MYPATH";
            string res = LireFichier(chemin);
            Console.WriteLine("End of reading...");
            Console.WriteLine("{0}",res);// The result at this point is good
            Console.WriteLine("...starting parsing");
            res = parseString(res);
            Console.WriteLine("Chaine finale : {0}", res);//The result here is awful
            Console.ReadLine();//pause
        }
    
        public static string LireFichier(string FilePath) //Read the file, send back a string with the text
        {
            StreamReader streamReader = new StreamReader(FilePath);
            string text = streamReader.ReadToEnd();
            streamReader.Close();
            return text;
        }
    
        public static string parseString(string phrase)//is supposed to parse the string
        {
            string fin="\n";
            char[] delimiterChars = { ' ','\n',',','\0'};
            string[] words = phrase.Split(delimiterChars);
    
            TabToString(words);//I check the content of my array
    
            for(int i=0;i<words.Length;i++)
            {
                if (words[i] != null)
                {
                    fin += words[i] +";";
                    Console.WriteLine(fin);//help for debug
                }
            }
            return fin;
        }
    
        public static void TabToString(string[] montab)//display the content of my array
        {
            foreach(string s in montab)
            {
                Console.WriteLine(s);
            }
        }
    }//End of class Program
    }
    
  • WizLiz
    WizLiz about 12 years
    Actually that nearly did the trick: it fixes the issue with the double ;; in the final string, but there are still mistakes, like some words missing from the txt file:
  • Rawling
    Rawling about 12 years
    @WizardLizard See mine or StaWho's answers for the missing words problem.
  • WizLiz
    WizLiz about 12 years
    @Henk Holterman: I still don't get why "truc" and "ohoh" are disappearing and why "tete" places itself at the beginning of the string
  • WizLiz
    WizLiz about 12 years
    @Rawling Thanks a lot, StaWho gave me the solution. Thanks to everyone for the fast answers
  • WizLiz
    WizLiz about 12 years
    That did the trick, thanks a lot. I'm studying what you did at the moment
  • Rawling
    Rawling about 12 years
    @Wiz If this is the answer that helped you, mark it as accepted by clicking the grey check mark near the voting buttons. This will give the author some reputation. It will also make his answer more visible relative to the other, highly-voted but incomplete, answers, so it is more likely other people will give him some rep too.