PHP, preg_split() and utf-8
Sunday, 20 November 2005
This is just a quickie for anyone who’s battling with the UTF-8 support (or lack thereof) in PHP4.
According to the online docs, the PCRE family of functions can be made UTF-8 aware by adding a u modifier to the pattern you’re using. Since the standard explode function doesn’t support UTF-8, you might think of using the preg_split function with an empty pattern to split a UTF-8 string into an array of characters:
[php] $characters = preg_split('//u', $source); [/php]
Unfortunately, it seems that in PHP4 preg_split is the only PCRE function that doesn’t support the u modifier. Instead, you’ll have to use preg_match_all like this to get the same effect:
[php] preg_match_all('/./u', $source, $matches); $characters = $matches[0]; [/php]
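As a rough sketch, the preg_match_all approach can be wrapped in a small helper. The function name utf8_chars is made up for illustration, and the extra s modifier is my own addition: without it, the . in the pattern skips newline characters, so they would be missing from the result. Treat this as one possible way to package the idea rather than a definitive implementation.

[php]
// Hypothetical helper (the name is an assumption, not part of PHP):
// split a UTF-8 string into an array of characters.
function utf8_chars($source)
{
    // 'u' treats the pattern and subject as UTF-8;
    // 's' (an addition to the one-liner) lets '.' match newlines as well.
    $count = preg_match_all('/./us', $source, $matches);
    if ($count === false) {
        // preg_match_all signals failure with false (for example on
        // input that is not valid UTF-8); pass that on to the caller.
        return false;
    }
    // $matches[0] holds one full match per character.
    return $matches[0];
}

// "h\xC3\xA9llo" is "héllo" written out as UTF-8 bytes.
$characters = utf8_chars("h\xC3\xA9llo");
// $characters now has 5 elements: h, é, l, l, o
[/php]

Checking for false before touching $matches means a badly encoded string shows up as an explicit failure instead of an empty or partial array.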
JL | Tuesday, 21 February 2006 | 6:33 pm
That helped me, thanks.
Fabio Varesano | Tuesday, 23 October 2007 | 12:00 pm
Thanks for this! You saved me from getting crazy!!!
Sergey | Thursday, 21 May 2009 | 2:13 pm
Yet another “thank you”
Trevor | Saturday, 19 September 2009 | 1:39 am
Interesting. This works for me. Does it not work for you?
$chars = preg_split('//u', $unistr);