Parsing string C#
Solution 1
Try this:
using System;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var inString = LireFichier(@"C:\temp\file.txt");
        Console.WriteLine(ParseString(inString));
        Console.ReadKey();
    }

    // Read the file and return its text as a single string.
    public static string LireFichier(string FilePath)
    {
        // The using block disposes (and closes) the reader, so no explicit Close() is needed.
        using (StreamReader streamReader = new StreamReader(FilePath))
        {
            return streamReader.ReadToEnd();
        }
    }

    // Strip line breaks and spaces, then turn the comma-separated items into a ';'-separated string.
    public static string ParseString(string input)
    {
        input = input.Replace(Environment.NewLine, string.Empty);
        input = input.Replace(" ", string.Empty);
        string[] chunks = input.Split(',');
        StringBuilder sb = new StringBuilder();
        foreach (string s in chunks)
        {
            sb.Append(s);
            sb.Append(";");
        }
        // Drop the final ';' appended by the loop.
        return sb.ToString(0, sb.Length - 1);
    }
}
Or this:
public static string ParseFile(string FilePath)
{
    using (var streamReader = new StreamReader(FilePath))
    {
        return streamReader.ReadToEnd()
            .Replace(Environment.NewLine, string.Empty)
            .Replace(" ", string.Empty)
            .Replace(',', ';');
    }
}
Solution 2
I think your main issue is the empty entries left by the split. Try:
string[] words = phrase.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
Solution 3
You could try using the string splitting option to remove empty entries for you:
string[] words = phrase.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
See the documentation for StringSplitOptions.
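As a rough illustration of what that option changes, here is a minimal sketch. The sample string and the names SplitOptionsDemo, SplitKeepingEmpties and SplitDroppingEmpties are made up for this example; they are not from the question's code.

```csharp
using System;

static class SplitOptionsDemo
{
    // Hypothetical sample input; the real file content may differ.
    public const string Sample = "truc, ohoh, toto,";

    static readonly char[] Delimiters = { ' ', ',' };

    // Default behaviour: two adjacent delimiters (", ") produce an empty string.
    public static string[] SplitKeepingEmpties(string phrase)
    {
        return phrase.Split(Delimiters);
    }

    // With RemoveEmptyEntries the empty strings are dropped from the result.
    public static string[] SplitDroppingEmpties(string phrase)
    {
        return phrase.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        Console.WriteLine(SplitKeepingEmpties(Sample).Length);             // 6
        Console.WriteLine(SplitDroppingEmpties(Sample).Length);            // 3
        Console.WriteLine(string.Join(";", SplitDroppingEmpties(Sample))); // truc;ohoh;toto
    }
}
```

The empty entries in the first call are what end up as doubled ";;" in the joined output.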
Solution 4
Your main problem is that you are splitting on \n, but the line breaks read from your file are \r\n.
Your output string does contain all of your items, but the \r characters left in it cause later "lines" to overwrite earlier "lines" on the console.
(\r is a "return to start of line" instruction; without the \n "move to the next line" instruction, your words from line 1 are being overwritten by those in line 2, then line 3 and line 4.)
You need to split on \r as well as \n, and you need to check that a string is not null or empty before adding it to your output (or, preferably, use StringSplitOptions.RemoveEmptyEntries as others have mentioned).
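The effect of the stray \r can be sketched as follows. This is only an illustration under assumptions: the sample text with Windows line endings is invented, and LineBreakDemo and Parse are hypothetical names, not the asker's code.

```csharp
using System;

static class LineBreakDemo
{
    // Hypothetical file content with Windows line endings (\r\n).
    public const string Sample = "truc, ohoh,\r\ntoto, tata,\r\n";

    // Splits on the given delimiters, drops empty entries,
    // and joins the remaining words with ';' (plus a trailing ';').
    public static string Parse(string phrase, char[] delimiters)
    {
        string[] words = phrase.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(";", words) + ";";
    }

    static void Main()
    {
        char[] withoutCr = { ' ', ',', '\n' };
        char[] withCr = { ' ', ',', '\r', '\n' };

        // When '\r' is not a delimiter it survives as its own "word".
        Console.WriteLine(Parse(Sample, withoutCr).Contains("\r")); // True

        // With '\r' among the delimiters the output is clean.
        Console.WriteLine(Parse(Sample, withCr)); // truc;ohoh;toto;tata;
    }
}
```

Printing the first result directly to a console would move the cursor back to the start of the line at each \r, which is why earlier words appear to vanish.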
WizLiz
Degree in computer science (French Baccalauréat +3 years). Currently training myself with all Microsoft technologies : C# .NETv4.5, ADO.NET, asp.NET, EntityFramework, WPF Applications and Silverlight
Updated on July 05, 2022
Comments
-
WizLiz almost 2 years
So here is my problem: I'm trying to get the content of a text file as a string, then parse it. What I want is an array containing each word and only words (no blanks, no backspaces, no \n ...). I'm using a function, LireFichier, that returns a string containing the text from the file (this works fine, because the text is displayed correctly), but when I try to parse it, it fails and starts doing random concatenation on my string, and I don't get why. Here is the content of the text file I'm using:
truc, ohoh, toto, tata, titi, tutu, tete,
and here's my final string:
;tete;;titi;;tata;;titi;;tutu;
which should be:
truc;ohoh;toto;tata;titi;tutu;tete;
Here is the code I wrote (all the using directives are fine):
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string chemin = "MYPATH";
            string res = LireFichier(chemin);
            Console.WriteLine("End of reading...");
            Console.WriteLine("{0}", res); // The result at this point is good
            Console.WriteLine("...starting parsing");
            res = parseString(res);
            Console.WriteLine("Chaine finale : {0}", res); // The result here is awful
            Console.ReadLine(); // pause
        }

        public static string LireFichier(string FilePath) // Read the file, send back a string with the text
        {
            StreamReader streamReader = new StreamReader(FilePath);
            string text = streamReader.ReadToEnd();
            streamReader.Close();
            return text;
        }

        public static string parseString(string phrase) // is supposed to parse the string
        {
            string fin = "\n";
            char[] delimiterChars = { ' ', '\n', ',', '\0' };
            string[] words = phrase.Split(delimiterChars);
            TabToString(words); // I check the content of my array
            for (int i = 0; i < words.Length; i++)
            {
                if (words[i] != null)
                {
                    fin += words[i] + ";";
                    Console.WriteLine(fin); // help for debug
                }
            }
            return fin;
        }

        public static void TabToString(string[] montab) // display the content of my array
        {
            foreach (string s in montab)
            {
                Console.WriteLine(s);
            }
        }
    } // End of class Program
}
-
WizLiz about 12 years
Actually, that nearly did the trick: it fixes the double ;; in the final string, but there are still mistakes, such as some words missing from the text file.
-
Rawling about 12 years
@WizardLizard See my answer or StaWho's for the missing words problem.
-
WizLiz about 12 years
@Henk Holterman: I still don't get why "truc" and "ohoh" are disappearing and "tete" places itself at the beginning of the string.
-
WizLiz about 12 years
@Rawling Thanks a lot, StaWho gave me the solution. Thanks to everyone for the fast answers.
-
WizLiz about 12 years
That did the trick, thanks a lot. I'm studying what you did at the moment.
-
Rawling about 12 years
@Wiz If this is the answer that helped you, mark it as accepted by clicking the grey check mark near the voting buttons. This will give the author some reputation. It will also make his answer more visible relative to the other, highly-voted but incomplete, answers, so it is more likely other people will give him some rep too.