PHP mb_substr() not working correctly?

php string utf-8 multibyte mbstring

22,152

Solution 1

Try passing the encoding parameter to mb_substr, as such:

print mb_substr('éxxx', 0, 1, 'utf-8');

The encoding is never detected automatically; if you omit it, mb_substr falls back to the internal encoding, which may not be UTF-8.
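To illustrate what the encoding argument changes, here is a minimal sketch. It assumes the mbstring extension is loaded, and it writes é as its UTF-8 bytes (\xC3\xA9) so the result does not depend on how the source file is saved. The variable names are only illustrative.

```php
<?php
// "é" written as its two UTF-8 bytes, followed by "xxx".
$text = "\xC3\xA9xxx";

// With an explicit UTF-8 encoding, one character is one code point.
$first = mb_substr($text, 0, 1, 'UTF-8');

// With a single-byte encoding, one "character" is just one byte,
// so the two-byte "é" is cut in half.
$broken = mb_substr($text, 0, 1, 'ISO-8859-1');

echo bin2hex($first), "\n";  // c3a9
echo bin2hex($broken), "\n"; // c3
```

The second call is roughly what happens in the question when the default encoding is a single-byte one: only half of é comes back, which displays as a blank or broken character.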

Solution 2

In practice I've found that, in some systems, multi-byte functions default to ISO-8859-1 for internal encoding. That effectively ruins their ability to handle multi-byte text.

Setting a good default will probably fix this and some other issues:

mb_internal_encoding('UTF-8');
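As a rough sketch of how that plays out, assuming the mbstring extension is loaded and the call is made once early on (for example in a bootstrap file, which is my assumption rather than something the answer specifies):

```php
<?php
// Set the default once, before any other mb_* calls.
mb_internal_encoding('UTF-8');

$text = "\xC3\xA9xxx"; // "éxxx" as UTF-8 bytes

// No encoding argument: the internal encoding set above is used.
$first  = mb_substr($text, 0, 1);
$length = mb_strlen($text);

echo $first, "\n";  // é
echo $length, "\n"; // 4
```

After this, every mb_* call that omits the encoding argument treats its input as UTF-8, so the argument does not have to be repeated at each call site.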

Author by

Alex

Updated on December 20, 2020

Comments

Alex over 3 years

This code

print mb_substr('éxxx', 0, 1);

prints an empty space :(

It is supposed to print the first character, é. This seems to work, however:

print mb_substr('éxxx', 0, 2);

But it's not right, because (0, 2) means 2 characters...
Gromski over 11 years

The encoding is never detected automatically; it just always defaults to something.
Alvin Wong over 11 years

Could it be a better idea if you use mb_detect_encoding to actually try to detect the encoding?
Gromski over 11 years

@AlvinWong No. Know what encoding you're working with; there's no other way.
povilasp over 11 years

@Alvin Wong, that would be more correct, yes, but I could also say that using anything but utf-8 can be considered adventurous and marginal :)
povilasp over 11 years

@deceze, wasn't sure, but thanks for the clarification, I updated the answer.
Alex over 11 years

Thanks, that works. Can mb_substr work like substr($string, 1), without passing mb_strlen() as the length argument?
povilasp over 11 years

@Alex, I think that is another question, but my guess would be yes, because the length parameter is optional, as it is in substr.
Alex over 11 years

Yes, but the encoding argument has to come after the length argument. Anyway, never mind, I'll just use mb_strlen.
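For what it's worth, the length does not have to be computed just to reach the encoding argument. A minimal sketch, assuming PHP 5.4.8 or later (where, as far as I know, a null length means "to the end of the string") and the mbstring extension:

```php
<?php
$text = "\xC3\xA9xyz"; // "éxyz" as UTF-8 bytes

// null as the length means "up to the end of the string",
// which leaves room for the encoding as the fourth argument.
$rest = mb_substr($text, 1, null, 'UTF-8');

echo $rest, "\n"; // xyz
```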
Alvin Wong over 11 years

OK, then how about using mb_internal_encoding instead of passing "utf-8" to every mb_* function, as Álvaro G. Vicario has pointed out?
povilasp over 11 years

@AlvinWong is right: it's better to use mb_internal_encoding if this is not the only call and you plan to use a lot of mb_* functions throughout your code.

Related

mb_detect_encoding detects ASCII as UTF-8?

PHP and character encoding problem with Â character

PHP: mb_strtoupper not working

How do I find the number of bytes within UTF-8 string with PHP?

How to check if there are only spaces in string in PHP?

PHP: Converting Unicode strings to ANSI strings

Replacing invalid UTF-8 characters by question marks, mbstring.substitute_character seems ignored

php iconv translit for removing accents: not working as excepted?

Comparing UTF-8 String

call to undefined function mb_strimwidth