C++ - Convert UTF8 encoded byte buffer to wstring?


Does the C++ Standard Template Library (STL) provide any method to convert a UTF-8 encoded byte buffer into a wstring?

For example:

    const unsigned char* szbuf = (const unsigned char*) "d\xc3\xa9j\xc3\xa0 vu";
    std::wstring str = method(szbuf); // should assign "déjà vu" to str

I want to avoid having to implement my own UTF-8 conversion code, like this:

    const unsigned char* pch = szbuf;
    while (*pch != 0)
    {
        if ((*pch & 0x80) == 0)
        {
            str += *pch++;
        }
        else if ((*pch & 0xe0) == 0xc0 && (pch[1] & 0xc0) == 0x80)
        {
            wchar_t ch = (((*pch & 0x1f) >> 2) << 8) +
                ((*pch & 0x03) << 6) +
                (pch[1] & 0x3f);
            str += ch;
            pch += 2;
        }
        else if (...)
        {
            // other cases omitted
        }
    }

Edit: Thanks for the comments and the answer. This code fragment performs the desired conversion:

    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> convert;
    str = convert.from_bytes((const char*)szbuf);
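For readers who want something they can compile as-is, the fragment above can be wrapped into a small function. This is only a sketch: the name `utf8_to_wstring` is made up for illustration, and it assumes a standard library that still ships `<codecvt>` (the header and `std::wstring_convert` are deprecated as of C++17, so some compilers will warn).

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper name (not from the original post).
// Converts a UTF-8 encoded byte string to a wide string.
std::wstring utf8_to_wstring(const std::string& bytes)
{
    // codecvt_utf8<wchar_t> yields UCS-2 or UTF-32 depending on the
    // width of wchar_t on the platform.
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> convert;

    // from_bytes throws std::range_error if the input is not valid UTF-8.
    return convert.from_bytes(bytes);
}
```

Calling `utf8_to_wstring("d\xc3\xa9j\xc3\xa0 vu")` should give the same wide string as the fragment above. On platforms where `wchar_t` is 16 bits, `std::codecvt_utf8_utf16<wchar_t>` may be the better facet if characters outside the Basic Multilingual Plane matter.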

In C++11 you can use std::codecvt_utf8. If you don't have that, you may be able to persuade iconv to do what you want. Unfortunately, iconv is not ubiquitous either, not all implementations of it support UTF-8, and I'm not aware of any way to find out the appropriate thing to pass to iconv_open for a conversion to wchar_t.

If you don't have either of those things, your best bet is a third-party library such as ICU. Surprisingly, Boost does not appear to have anything for this purpose, although I could have missed it.
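If none of those options is available, the manual approach from the question can be completed by hand. The following is a minimal sketch, not a vetted implementation: the name `decode_utf8` is hypothetical, errors are reported with `std::invalid_argument` as an arbitrary choice, and code points above U+FFFF are written as surrogate pairs when `wchar_t` is only 16 bits wide.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): decodes UTF-8 into a
// wide string, rejecting malformed, overlong, and out-of-range sequences.
std::wstring decode_utf8(const std::string& in)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp;
        std::size_t len;

        // The lead byte determines the sequence length and the initial bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xe0) == 0xc0) { cp = lead & 0x1f; len = 2; }
        else if ((lead & 0xf0) == 0xe0) { cp = lead & 0x0f; len = 3; }
        else if ((lead & 0xf8) == 0xf0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds six bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xc0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3f);
        }

        // Reject overlong encodings, surrogates, and values past U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
            throw std::invalid_argument("invalid UTF-8 code point");

        if (sizeof(wchar_t) == 2 && cp > 0xffff) {
            // 16-bit wchar_t: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xd800 + (cp >> 10));
            out += static_cast<wchar_t>(0xdc00 + (cp & 0x3ff));
        } else {
            out += static_cast<wchar_t>(cp);
        }
        i += len;
    }
    return out;
}
```

Taking a `std::string` instead of a null-terminated pointer lets the function check for truncated sequences without reading past the end of the buffer.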

