Converting non-Latin characters to UTF-8 as Proxomitron can't read Unicode UTF-16
Jun. 14, 2009, 04:08 PM (This post was last modified: Jun. 14, 2009 04:16 PM by sidki3003.)
Post: #10
RE: Proxomitron cannot read Unicode UTF-16 Hebrew
Actually, you can filter UTF-16 with Proxomitron. I'm doing so.

However, you need a webfilter to convert the page (I don't have a config-independent version to post), *and* you lose any double-byte information that doesn't fit into a single byte. For the little-endian case that means the second byte of each pair is supposed to be x00. If it isn't, the double byte is replaced by a dummy char.
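
To make the loss concrete, here is a minimal sketch of that kind of conversion in Python. It is not the actual webfilter and not Proxomitron filter syntax; the function name `squash_utf16le` and the `?` dummy char are just assumptions for illustration.

```python
def squash_utf16le(data: bytes, dummy: bytes = b"?") -> bytes:
    """Lossy sketch: reduce UTF-16LE input to one byte per 2-byte unit.

    Hypothetical helper, not part of Proxomitron or any config.
    """
    # Skip a little-endian byte order mark if one is present.
    if data[:2] == b"\xff\xfe":
        data = data[2:]
    out = bytearray()
    # Walk the input two bytes at a time; a trailing odd byte is ignored.
    for i in range(0, len(data) - 1, 2):
        low, high = data[i], data[i + 1]
        if high == 0x00:
            # Second byte is x00: the first byte carries all the information.
            out.append(low)
        else:
            # Anything else can't be kept, so substitute the dummy char.
            out += dummy
    return bytes(out)


latin = "<b>Hi</b>".encode("utf-16-le")
hebrew = "שלום".encode("utf-16-le")
print(squash_utf16le(latin))   # b'<b>Hi</b>'
print(squash_utf16le(hebrew))  # b'????'
```

The markup survives, so the page can be filtered, but every Hebrew letter has a non-zero second byte and comes out as the dummy char.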

Luckily, most little-endian and all big-endian pages I've seen are indeed using just one byte for char information. But not in your example. I once wrote a UTF-16 example to test with Proxomitron (little-endian).


Messages In This Thread
RE: Proxomitron cannot read Unicode UTF-16 Hebrew - sidki3003 - Jun. 14, 2009 04:08 PM
