4752069: (cs spec) BOM should not be ignored in UTF-16 charsets

API doc update regarding BOM handling in UTF-16 charsets
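
The documented behavior can be checked with a small sketch like the one below. It is not part of this change. The class name BomDemo and its helpers are hypothetical, and it assumes a runtime that provides java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // FE FF is the big-endian byte-order mark; 00 41 is 'A' in big-endian order.
    static final byte[] BE_BOM_A = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
    // FF FE is the little-endian byte-order mark; 41 00 is 'A' in little-endian order.
    static final byte[] LE_BOM_A = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};

    // Hypothetical helper: decode a whole byte array with the given charset.
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    // Hypothetical helper: encode a string with the given charset.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    // Renders each char as U+XXXX so that the invisible U+FEFF becomes visible.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-16BE / UTF-16LE: an initial BOM is decoded as ZERO-WIDTH NON-BREAKING SPACE.
        System.out.println("UTF-16BE: " + codeUnits(decode(BE_BOM_A, StandardCharsets.UTF_16BE)));
        System.out.println("UTF-16LE: " + codeUnits(decode(LE_BOM_A, StandardCharsets.UTF_16LE)));

        // UTF-16: an initial BOM selects the byte order and is not part of the result.
        System.out.println("UTF-16 (BE BOM): " + codeUnits(decode(BE_BOM_A, StandardCharsets.UTF_16)));
        System.out.println("UTF-16 (LE BOM): " + codeUnits(decode(LE_BOM_A, StandardCharsets.UTF_16)));

        // UTF-16 without a BOM falls back to big-endian.
        byte[] noBom = {0x00, 0x41};
        System.out.println("UTF-16 (no BOM): " + codeUnits(decode(noBom, StandardCharsets.UTF_16)));

        // Encoding: UTF-16 writes a big-endian BOM, UTF-16BE writes none.
        System.out.println("UTF-16 encoded length:   " + encode("A", StandardCharsets.UTF_16).length);
        System.out.println("UTF-16BE encoded length: " + encode("A", StandardCharsets.UTF_16BE).length);
    }
}
```

Callers that decode with UTF-16BE or UTF-16LE and do not want a leading U+FEFF in the result would need to strip it themselves.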

Reviewed-by: alanb
Xueming Shen 2008-06-25 08:27:06 -07:00
parent 2ec317a88e
commit d1ff48eb56

@@ -188,21 +188,22 @@ import sun.security.action.GetPropertyAction;
  * <ul>
  *
  *   <li><p> When decoding, the <tt>UTF-16BE</tt> and <tt>UTF-16LE</tt>
- *   charsets ignore byte-order marks; when encoding, they do not write
+ *   charsets interpret the initial byte-order marks as a <small>ZERO-WIDTH
+ *   NON-BREAKING SPACE</small>; when encoding, they do not write
  *   byte-order marks. </p></li>
  *
- *   <li><p> When decoding, the <tt>UTF-16</tt> charset interprets a byte-order
- *   mark to indicate the byte order of the stream but defaults to big-endian
- *   if there is no byte-order mark; when encoding, it uses big-endian byte
- *   order and writes a big-endian byte-order mark. </p></li>
+ *   <li><p> When decoding, the <tt>UTF-16</tt> charset interprets the
+ *   byte-order mark at the beginning of the input stream to indicate the
+ *   byte-order of the stream but defaults to big-endian if there is no
+ *   byte-order mark; when encoding, it uses big-endian byte order and writes
+ *   a big-endian byte-order mark. </p></li>
  *
  * </ul>
  *
- * In any case, when a byte-order mark is read at the beginning of a decoding
- * operation it is omitted from the resulting sequence of characters. Byte
- * order marks occuring after the first element of an input sequence are not
- * omitted since the same code is used to represent <small>ZERO-WIDTH
- * NON-BREAKING SPACE</small>.
+ * In any case, byte order marks occuring after the first element of an
+ * input sequence are not omitted since the same code is used to represent
+ * <small>ZERO-WIDTH NON-BREAKING SPACE</small>.
  *
  * <p> Every instance of the Java virtual machine has a default charset, which
  * may or may not be one of the standard charsets. The default charset is