Archived Forum Post



Bullet symbol not properly being converted from HTML to XML

Mar 08 '15 at 04:24


I am trying to convert this simple HTML file, which contains just a centered bullet symbol, using the CkHtmlToXmlW class.

With the ConvertFile method, I get the "Replacement Character" instead of the bullet. With the ToXml method, I get a quotation mark instead.

How can I get the bullet converted correctly?

I am using VS2010, targeting x86.

Thanks in advance.


The HTML file contains this:


The HTML needs a META tag indicating the utf-8 charset. If you examine the HTML in a hex editor, the bullet char is composed of 3 bytes in the utf-8 encoding. When no charset is specified in an HTML meta tag, the default choice is ANSI, so the bytes that compose the bullet are interpreted according to the 1-byte-per-char ANSI encoding of whatever computer you happen to be running on...
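To see why the charset matters, here is a minimal C++ sketch of the two interpretations of the same bytes. It does not use the Chilkat API. The function names (decodeUtf8, decodeSingleByte, demo) are made up for illustration, and the decoder assumes well-formed input. The bullet is U+2022, stored in utf-8 as the bytes E2 80 A2. A meta tag such as <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> is the usual way to declare that encoding in the HTML.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a utf-8 byte string into Unicode code points.
// Illustrative only: assumes well-formed input and does no validation.
std::vector<std::uint32_t> decodeUtf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < bytes.size(); --extra, ++i) {
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}

// Treat the same bytes as a 1-byte-per-char encoding: one character per byte.
// Returns the raw byte values; what each byte displays as depends on the
// ANSI code page (in Windows-1252, E2 80 A2 shows as the three chars "â€¢").
std::vector<std::uint32_t> decodeSingleByte(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        out.push_back(static_cast<unsigned char>(bytes[i]));
    }
    return out;
}

// Usage: the same three bytes give one bullet or three unrelated characters.
void demo() {
    const std::string bullet = "\xE2\x80\xA2";

    std::vector<std::uint32_t> asUtf8 = decodeUtf8(bullet);
    assert(asUtf8.size() == 1);
    assert(asUtf8[0] == 0x2022);  // U+2022 BULLET

    std::vector<std::uint32_t> asAnsi = decodeSingleByte(bullet);
    assert(asAnsi.size() == 3);   // three separate 1-byte characters
}
```

Read as utf-8, the three bytes collapse into the single bullet code point. Read one byte per character, they become three characters whose appearance varies with the machine's ANSI code page, which is the kind of garbling described above.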