Nightly 20070202 and new encoding switch
Dirk
vss2svn at nogga.de
Mon Feb 5 11:45:10 EST 2007
Hi Jonathan,
> Please note that XML discourages or forbids some Unicode codepoints,
> not bytes in specific codepages. Specifically, windows-1252 does not
> map any byte to a codepoint in the range [0x80-0x9F].
> (see http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx)
> For example, 0x80 in windows-1252 maps to Unicode 0x20AC (Euro sign).
>
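The mapping described in the quoted text can be checked with a short Python sketch. It assumes Python's built-in "cp1252" codec is an acceptable stand-in for the windows-1252 table referenced above; the helper name cp1252_high_range is made up for illustration. The codec leaves a few bytes in this range undefined, and the sketch skips those.

```python
def cp1252_high_range():
    """Map each defined windows-1252 byte in 0x80-0x9F to its Unicode codepoint.

    Assumption: Python's "cp1252" codec stands in for the windows-1252 table.
    """
    mapping = {}
    for byte in range(0x80, 0xA0):
        try:
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # This byte has no assignment in Python's cp1252 codec; skip it.
            continue
        mapping[byte] = ord(char)
    return mapping


mapping = cp1252_high_range()

# Byte 0x80 decodes to the Euro sign, U+20AC.
print(hex(mapping[0x80]))  # 0x20ac

# No defined byte lands on a codepoint in the range U+0080-U+009F.
print(any(0x80 <= cp <= 0x9F for cp in mapping.values()))  # False
```

Under that assumption, a byte such as 0x80 in a windows-1252 file and the codepoint U+0080 are different things, which is the distinction the quoted text draws.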
This is interesting to read. I was fairly new to XML (and I'm still
not an expert) when I researched this, and I expected the behavior that
you describe above. I made an XML file with "encoding=windows-1252" and
entered a few of the problematic bytes/characters (in the windows-1252
codepage). I expected the file to be valid XML, since all bytes are
allowed and defined in the encoding I used. To verify, I opened the
file in XMLSpy, and the tool complained about invalid characters,
regardless of whether I used the direct character or the XML byte
encoding. I therefore concluded that the issue was "problematic bytes"
and not "problematic codepoints".
Perhaps I did something else completely wrong at the time.
Thanks for the info and the clarification.
Dirk