60 Commits

Author SHA1 Message Date
Xueming Shen
3c65bb6343 6847092: (cs) CharsetEncoder.isLegalReplacement of US_ASCII behaves differently since
Updated the US_ASCII and ISO-8859-1 charsets to fix the failure.

Reviewed-by: alanb, martin
2009-06-22 19:22:47 -07:00
Xueming Shen
721a90bda5 6299219: euro sign failed to be printed in Console on Localized Windows platform with GBK encoding
4891024: EUC-KR and JOHAB converters need to be updated to include two new characters
4287467: Character converter generator tool

Migrated some of the doublebyte charsets to the new implementation.

Reviewed-by: okutsu
2009-06-19 14:39:06 -07:00
Xueming Shen
f625a6d545 6843578: Re-implement IBM doublebyte charsets
6639450: IBM949C encoder modifies state of IBM949 encoder
6569191: Cp943 io converter returns U+0000 and U+FFFD for unconvertable character
6577466: Character encoder IBM970 throws a BufferOverflowException
5065777: CharsetEncoder canEncode() methods often incorrectly return false

Re-write 11 IBM doublebyte charsets. Thanks Ulf.Zibis for the code review!

Reviewed-by: martin
2009-05-21 23:32:46 -07:00
Xueming Shen
15baf98a0a 6843079: Putback for the new EUC_TW is not complete
Put back the files missed in the last putback

Reviewed-by: alanb
2009-05-19 16:03:02 -07:00
Xueming Shen
a1958b22ef 6831794: charset EUC_TW is 12.6% of the total size of charsets.jar
6229811: Several codepoints in EUC_TW failed in roundtrip conversion

Re-write EUC_TW charset to address the size and roundtrip issue.

Reviewed-by: alanb
2009-05-19 15:25:29 -07:00
Xueming Shen
3f0b988cfc 6636323: Optimize handling of builtin charsets
6636319: Encoders should implement isLegalReplacement(byte[] repl)

Optimized new String(byte[], cs/csn) and String.getBytes(cs/csn) for speed and memory consumption in singlebyte case.

Reviewed-by: alanb
2009-03-23 09:19:23 -07:00
Xueming Shen
790bc3042d 4849617: (cs) Revise Charset spec to allow '+' in names
Update the spec and code to accept '+' as a charset name character

Reviewed-by: alanb
2008-08-27 10:12:22 -07:00
Xueming Shen
630d73eb0a 4486841: UTF-8 decoder should adhere to corrigendum to Unicode 3.0.1
6636317: Optimize UTF-8 coder for ASCII input

Re-write the UTF-8 charset to obey the standard and improve the performance

Reviewed-by: alanb
2008-08-22 14:37:46 -07:00
Xueming Shen
dd2dfec9f5 6675856: Open charset tests
Moved non-confidential test cases from the closed repo to the open repo

Reviewed-by: martin
2008-06-30 14:06:34 -07:00
J. Duke
319a3b9947 Initial load
2007-12-01 00:00:00 +00:00