Swift UTF8 encoding and non UTF8 character
I've found a solution.
UTF-8 works in 8-bit code units and UTF-16 in 16-bit code units. The fix was simply to switch my function to UTF-16:
func stringToUTF16String (stringaDaConvertire stringa: String) -> String {
    let encodedData = stringa.dataUsingEncoding(NSUTF16StringEncoding)!
    let attributedOptions = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType]
    let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!
    //println(attributedString.string)
    return attributedString.string
}
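A rough sketch of why the byte encoding matters here, written in current Swift rather than the Swift 1.x used above. It assumes, without confirmation from the documentation, that a decoder which is not told the encoding falls back to a single-byte encoding such as ISO Latin-1. The names `misread` and `restored` are chosen for this illustration only.

```swift
import Foundation

let text = "àèìòù"

// UTF-8 stores each of these accented letters as two bytes.
let utf8Data = text.data(using: .utf8)!
print(utf8Data.count)

// Assumption: a decoder that is not told the encoding treats every
// byte as one ISO Latin-1 character, so each letter becomes two.
let misread = String(data: utf8Data, encoding: .isoLatin1)!
print(misread == text) // false: the text is garbled

// Decoding with the encoding that produced the bytes restores the text.
let restored = String(data: utf8Data, encoding: .utf8)!
print(restored == text) // true
```

The bytes themselves are fine in both cases. What goes wrong is the mismatch between the encoding used to write them and the one used to read them.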
Author by
luca carboni
Updated on June 04, 2022
Comments
-
luca carboni almost 2 years
I have some text from a JSON file. I applied UTF-8 encoding to it, but the encoder doesn't recognize non-standard characters such as àèìòù and their capital forms. Is there a way to clean up my string?
My function:
func stringToUTF8String (stringaDaConvertire stringa: String) -> String {
    let encodedData = stringa.dataUsingEncoding(NSUTF8StringEncoding)!
    let attributedOptions = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType]
    let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!
    //println(attributedString.string)
    return attributedString.string
}
-
Yuming Cao about 9 years
Yes, this works, but I still don't know why dataUsingEncoding fails to identify the characters when using NSUTF8StringEncoding. In my case, I verified that my file is stored as UTF-8, so encodedData should contain the right content. My guess is that NSAttributedString uses UTF-16 encoding; after all, that is the only encoding supported by NSString. The documentation is not clear about this, though.
-
samwize almost 8 years
I was having the same problem and worked out that it must be due to NSAttributedString. The documentation never specifies what encoding the parameter data should have, but I think we have verified that it MUST be NSUTF16StringEncoding. Internally they probably decode with that.
-
ctietze over 6 years
The foundational NSString is represented using UTF-16, so that default would make sense. That being said, you can specify options: [characterEncoding: NSUTF8StringEncoding] to match the incoming data.
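The suggestion in this last comment could look roughly like the following in current Swift on an Apple platform. This is only a sketch: the function name `htmlToPlainText` is hypothetical, and it assumes the call happens on the main thread, which the HTML importer expects.

```swift
import Foundation
#if canImport(AppKit)
import AppKit
#elseif canImport(UIKit)
import UIKit
#endif

// Hypothetical helper: converts an HTML string to plain text while
// telling the importer explicitly that the bytes are UTF-8.
func htmlToPlainText(_ html: String) -> String? {
    guard let data = html.data(using: .utf8) else { return nil }
    let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
        .documentType: NSAttributedString.DocumentType.html,
        // Match the encoding used to produce `data` above.
        .characterEncoding: String.Encoding.utf8.rawValue
    ]
    let attributed = try? NSAttributedString(data: data,
                                             options: options,
                                             documentAttributes: nil)
    return attributed?.string
}
```

With the encoding stated in the options, the data can stay UTF-8 and there is no need to re-encode it as UTF-16 first.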