PHP, preg_split() and utf-8
Sunday, 20 November 2005
This is just a quickie for anyone who’s battling with the UTF-8 support (or lack thereof) in PHP4.
According to the online docs, the PCRE family of functions can be made UTF-8 aware by adding a u modifier to the pattern you’re using. The standard explode function works on bytes and won’t accept an empty delimiter, so it can’t break a string into characters. You might therefore think of using the preg_split function like this to split a UTF-8 string into an array of characters:
[php] $characters = preg_split('//u', $source); [/php]
Unfortunately it seems that preg_split is the only PCRE function that doesn’t support the u modifier. Instead, you’ll have to use preg_match_all like this to get the same effect:
[php]
// u enables UTF-8 mode; s lets . match newline characters too
preg_match_all('/./us', $source, $matches);
$characters = $matches[0];
[/php]
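If you need this in more than one place, it can be wrapped in a small helper. The following is only a sketch: the function name utf8_chars is made up for this post, and the false check assumes a PHP version where preg_match_all returns false when the subject isn’t valid UTF-8 (current versions do). The s modifier is added so that . also matches newlines, which it doesn’t by default.

[php]
// Hypothetical helper (the name is not part of PHP): split a UTF-8
// string into an array of characters.
function utf8_chars($source)
{
    // u = treat pattern and subject as UTF-8, s = let . match newlines
    $count = preg_match_all('/./us', $source, $matches);

    // Assumption: preg_match_all returns false on error,
    // for example when $source is not valid UTF-8.
    if ($count === false) {
        return false;
    }

    // $matches[0] holds every full match, i.e. one entry per character.
    return $matches[0];
}

// "\xC3\xA9" is the two-byte UTF-8 encoding of é
$characters = utf8_chars("h\xC3\xA9llo");
// count($characters) is 5, whereas strlen("h\xC3\xA9llo") is 6
[/php]

An empty string simply gives back an empty array, so the only special case the caller has to handle is the false return.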





