Check if a string contains a word but only in specific position?
Solution 1
I like Sertac's idea of deleting the strings enclosed in brackets and then searching what is left. Here is a code sample that extends it with whole-word and case-sensitive search options:
// requires StrUtils (SearchBuf, PosEx, TStringSearchOptions)
function ContainsWord(const AText, AWord: string; AWholeWord: Boolean = True;
  ACaseSensitive: Boolean = False): Boolean;
var
  S: string;
  BracketEnd: Integer;
  BracketStart: Integer;
  SearchOptions: TStringSearchOptions;
begin
  S := AText;
  // remove every "[...]" span; the closing bracket is looked up after the
  // opening one, so a stray "]" in front of a "[" cannot stall the loop
  BracketStart := Pos('[', S);
  BracketEnd := PosEx(']', S, BracketStart + 1);
  while (BracketStart > 0) and (BracketEnd > 0) do
  begin
    Delete(S, BracketStart, BracketEnd - BracketStart + 1);
    BracketStart := Pos('[', S);
    BracketEnd := PosEx(']', S, BracketStart + 1);
  end;
  SearchOptions := [soDown];
  if AWholeWord then
    Include(SearchOptions, soWholeWord);
  if ACaseSensitive then
    Include(SearchOptions, soMatchCase);
  // search what is left of the text outside the brackets
  Result := Assigned(SearchBuf(PChar(S), StrLen(PChar(S)), 0, 0, AWord,
    SearchOptions));
end;
Here is an optimized version of the function, which iterates over the characters through pointers and does no string manipulation. Compared with a previous version, this one handles a string with a missing closing bracket, for instance My [favorite color is. Such a string evaluates to True because of the missing bracket.
The principle is to walk through the whole string character by character. When an opening bracket is found, check whether it has a closing bracket to pair with. If it does, check whether the substring from the stored position up to the opening bracket contains the searched word; if so, exit the function, otherwise move the stored position to the closing bracket. If the opening bracket has no closing pair, search for the word from the stored position to the end of the whole string and exit the function.
For a commented version of this code, follow this link.
function ContainsWord(const AText, AWord: string; AWholeWord: Boolean = True;
  ACaseSensitive: Boolean = False): Boolean;
var
  CurrChr: PChar;
  TokenChr: PChar;
  TokenLen: Integer;
  SubstrChr: PChar;
  SubstrLen: Integer;
  SearchOptions: TStringSearchOptions;
begin
  Result := False;
  if (Length(AText) = 0) or (Length(AWord) = 0) then
    Exit;
  SearchOptions := [soDown];
  if AWholeWord then
    Include(SearchOptions, soWholeWord);
  if ACaseSensitive then
    Include(SearchOptions, soMatchCase);
  // SubstrChr and SubstrLen delimit the text outside brackets seen so far
  CurrChr := PChar(AText);
  SubstrChr := CurrChr;
  SubstrLen := 0;
  while CurrChr^ <> #0 do
  begin
    if CurrChr^ = '[' then
    begin
      // look ahead for the closing bracket that pairs with this one
      TokenChr := CurrChr;
      TokenLen := 0;
      while (TokenChr^ <> #0) and (TokenChr^ <> ']') do
      begin
        Inc(TokenChr);
        Inc(TokenLen);
      end;
      // no closing bracket: the rest of the string counts as outside text
      if TokenChr^ = #0 then
        SubstrLen := SubstrLen + TokenLen;
      Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord,
        SearchOptions));
      if Result or (TokenChr^ = #0) then
        Exit;
      // not found yet: restart the outside text at the closing bracket
      CurrChr := TokenChr;
      SubstrChr := CurrChr;
      SubstrLen := 0;
    end
    else
    begin
      Inc(CurrChr);
      Inc(SubstrLen);
    end;
  end;
  // search the text that follows the last bracketed part
  Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord,
    SearchOptions));
end;
Solution 2
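In regular expressions there is a feature called look-around that you could use. In your case a negative lookbehind solves it: you want "favorite" unless it is preceded by an opening bracket that has not been closed yet. Note that this lookbehind has variable length, which not every engine supports (.NET does; PCRE, which Delphi's TRegEx is built on, does not). It could look like this: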
(?<!\[[^\[\]]*)favorite
Step by step: (?<! is the negative lookbehind prefix; inside it we look for \[ followed by zero or more characters that are neither opening nor closing brackets, [^\[\]]*; the lookbehind is closed with ); and favorite comes right after.
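For Delphi itself, the same idea can be sketched with a negative lookahead instead, which PCRE-based engines accept at any length. This is only a sketch under assumptions: the function name FavoriteOutsideBrackets and the Pattern constant are made up for illustration, the word is hard-coded, the match is case-insensitive and whole-word, and brackets are assumed not to be nested. The pattern accepts "favorite" only when it is not followed by non-bracket text and then a closing bracket.

```delphi
program RegexOutsideBrackets;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

const
  // hypothetical pattern: whole word "favorite" that is NOT followed by
  // a run of non-bracket characters and then a closing bracket
  Pattern = '\bfavorite\b(?![^\[\]]*\])';

// hypothetical helper name, not part of the answers above
function FavoriteOutsideBrackets(const AText: string): Boolean;
begin
  Result := TRegEx.IsMatch(AText, Pattern, [roIgnoreCase]);
end;

begin
  // the three samples from the question
  Writeln(FavoriteOutsideBrackets(
    'What is your favorite color? my [my favorite] color is blue'));
  Writeln(FavoriteOutsideBrackets(
    'What is your favorite color? my [blah blah] color is blue'));
  Writeln(FavoriteOutsideBrackets(
    'What is your blah blah color? my [some favorite] color is blue'));
end.
```

An unclosed bracket, as in My [favorite color is, counts as outside text with this pattern, which agrees with the pointer-based function in Solution 1. A stray closing bracket without an opening one would hide the words in front of it, so the sketch is only suitable for balanced input.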
Admin
Updated on June 09, 2022
Comments
-
Admin almost 2 years
How can I check if a string contains a substring, but only in a specific position?
Example string:
What is your favorite color? my [favorite] color is blue
If I wanted to check if the string contained a specific word I usually do this:
var
  S: string;
begin
  S := 'What is your favorite color? my [favorite] color is blue';
  if (Pos('favorite', S) > 0) then
  begin
    //
  end;
end;
What I need is to determine whether the word favorite exists in the string while ignoring any occurrence inside the [ ] symbols, which the code sample above clearly does not do.
So if we put the code into a boolean function, some sample results would look like this:
TRUE: What is your favorite color? my [my favorite] color is blue
TRUE: What is your favorite color? my [blah blah] color is blue
FALSE: What is your blah blah color? my [some favorite] color is blue
The first two samples above are true because the word favorite is found outside of the [ ] symbols, whether or not it also appears inside them.
The 3rd sample is false because, even though the word favorite is there, it only appears inside the [ ] symbols; we should only check whether it exists outside of the symbols.
So I need a function to determine whether or not a word (favorite in this example) appears in a string, ignoring any occurrence of the word enclosed in [ ] symbols.
-
Admin over 11 years
Great answer, especially useful is the link to the answer with comments, makes it a little easier to digest and understand what is happening.
-
TLama over 11 years
Thanks! Anyway, regex is the right way to do what you need (and surely easier), but on the other hand this solution is more direct for this specific task (and more efficient, I'd say, since a regex engine at least needs to parse the expression before it starts matching). If you're not going to build, say, a parser with many similar matching tasks, this solution might be lighter than including regex. But the main reason I posted it is that none of the answers here used pure Delphi.
-
diegoaguilar almost 11 years
I think yours is an elegant and proper solution