c# Fastest way to compare strings
Solution 1
string
s operator equals does the length check before comparing the chars. So you do not save the comparison of the contents with this trick. You might still save a few CPU cycles because your length check assumes that the strings are not null, while the BCL must check that. So if the lengths are not equal most of the time, you will short-circuit a few instructions.
I might just be wrong here, though. Maybe the operator gets inlined and the checks optimized out. Who knows for sure? (He who measures.)
If you care about saving every cycle you can you should probably use a different strategy in the first place. Maybe managed code is not even the right choice. Given that, I recommend to use the shorter form and not use the additional check.
Solution 2
String.Equality Operator or ==
internally calls string.Equals
, so use string.Equals
or ==
provided by the framework. It is already optimized enough.
It first compare references, then length and then actual characters.
You can find the source code here
Code: (Source: http://www.dotnetframework.org/default.aspx/4@0/4@0/DEVDIV_TFS/Dev10/Releases/RTMRel/ndp/clr/src/BCL/System/String@cs/1305376/String@cs)
// Determines whether two strings match.
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
public override bool Equals(Object obj) {
if (this == null) //this is necessary to guard against reverse-pinvokes and
throw new NullReferenceException(); //other callers who do not use the callvirt instruction
String str = obj as String;
if (str == null)
return false;
if (Object.ReferenceEquals(this, obj))
return true;
return EqualsHelper(this, str);
}
and
[System.Security.SecuritySafeCritical] // auto-generated
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
private unsafe static bool EqualsHelper(String strA, String strB)
{
Contract.Requires(strA != null);
Contract.Requires(strB != null);
int length = strA.Length;
if (length != strB.Length) return false;
fixed (char* ap = &strA.m_firstChar) fixed (char* bp = &strB.m_firstChar)
{
char* a = ap;
char* b = bp;
// unroll the loop
#if AMD64
// for AMD64 bit platform we unroll by 12 and
// check 3 qword at a time. This is less code
// than the 32 bit case and is shorter
// pathlength
while (length >= 12)
{
if (*(long*)a != *(long*)b) break;
if (*(long*)(a+4) != *(long*)(b+4)) break;
if (*(long*)(a+8) != *(long*)(b+8)) break;
a += 12; b += 12; length -= 12;
}
#else
while (length >= 10)
{
if (*(int*)a != *(int*)b) break;
if (*(int*)(a+2) != *(int*)(b+2)) break;
if (*(int*)(a+4) != *(int*)(b+4)) break;
if (*(int*)(a+6) != *(int*)(b+6)) break;
if (*(int*)(a+8) != *(int*)(b+8)) break;
a += 10; b += 10; length -= 10;
}
#endif
// This depends on the fact that the String objects are
// always zero terminated and that the terminating zero is not included
// in the length. For odd string sizes, the last compare will include
// the zero terminator.
while (length > 0)
{
if (*(int*)a != *(int*)b) break;
a += 2; b += 2; length -= 2;
}
return (length <= 0);
}
}
Solution 3
My test results
Compare 10000 strings to 10000 other strings all the same length (256)
Time (s1 == s2): 32536889 ticks
Time (s1.Length == s2.Length) && (s1 == s2): 37380529 ticks
Compare 10000 strings to 10000 other strings Random length max 256
Time (s1 == s2): 27223517 ticks
Time (s1.Length == s2.Length) && (s1 == s2): 23419529 ticks
Compare 10000 strings to 10000 other strings Random length min 256 max 512
Time (s1 == s2): 28904898 ticks
Time (s1.Length == s2.Length) && (s1 == s2): 25442710 ticks
What I find mind boggling is that a compare of 10000 equal length strings will take longer than comparing the same amount of data that is larger.
All these test have been done with exactly the same data.
Solution 4
For the geeks among us, here's a page which does a great job at benchmarking numerous ways to compare strings.
In a nutshell, the fastest method appears to be the CompareOrdinal:
if (string.CompareOrdinal(stringsWeWantToSeeIfMatches[x], stringsWeAreComparingAgainst[x]) == 0)
{
//they're equal
}
The second best way seems to be using either a Dictionary or Hashset with the "key" as the string you want to compare.
Makes for an interesting read.
Solution 5
According ILSpy, the string ==
operator is defined as:
public static bool operator ==(string a, string b)
{
return string.Equals(a, b);
}
Which is defined as
public static bool Equals(string a, string b)
{
return a == b || (a != null && b != null && a.Length == b.Length && string.EqualsHelper(a, b));
}
I assume that first a == b
is actually a reference equality check (ILSpy is just rendering it as ==
), otherwise this would be an infinitely recursive method.
This means that ==
already checks the lengths of the strings before actually comparing their characters.
CoolCodeBro
Updated on July 21, 2022Comments
-
CoolCodeBro almost 2 years
I've noticed that
string1.Length == string2.Length && string1 == string2
on large strings is slightly faster than just
string1 == string2
Is this true? And is this a good practice to compare large string lengths before comparing actual strings?