The following is reposted from my Baidu Space; it is material I collected. If the formatting looks poor here, you can read the article directly in my space:
Some material on UTF-8
2008-06-13, Friday, 08:17
1. Most important: converting between UTF-8 and Unicode
UTF-8 is a widely used encoding. It aims to bring all of the world's languages into a single unified encoding, and several Asian languages have already been incorporated. UTF stands for UCS Transformation Format.
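UTF-8 represents characters with a variable number of bytes, in theory up to 6 bytes long, although the code point range in use today (up to 10 FFFF) needs at most 4. UTF-8 is compatible with ASCII (0-127): the UTF-8 encoding of an ASCII character is identical to its ASCII encoding. Characters that need more than one byte use the following encoding rules: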
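The number of leading 1 bits in the first (leftmost) byte gives the number of bytes in the character's encoding. For example, a two-byte character has the form 110xxxxx 10xxxxxx; a three-byte character has the form 1110xxxx 10xxxxxx 10xxxxxx; and so on, up to the six-byte form 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx. The x positions are filled with the bits of the binary representation of the character's code. Only the shortest multi-byte sequence that is long enough to express the character's code is used. For example: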
Unicode character 00 A9 (copyright sign) = 1010 1001; UTF-8 encoding: 11000010 10101001 = 0xC2 0xA9
Unicode character 22 60 (not-equal sign) = 0010 0010 0110 0000; UTF-8 encoding: 11100010 10001001 10100000 = 0xE2 0x89 0xA0
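The conversion examples above have been verified as correct, so there is no need to doubt them. If they do not make sense yet, work through the bits again carefully.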
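The bit-filling rule described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for the built-in codec: the function name `utf8_encode` is made up for this note, and it assumes the modern limit of 10 FFFF (at most 4 bytes) rather than the theoretical 6-byte forms. The range boundaries follow from how many x bits each pattern holds (7, 11, 16 and 21).

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by filling the x bits of the shortest pattern.

    Hypothetical helper for illustration; assumes code points up to 0x10FFFF.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not encoded in UTF-8")
    if code_point <= 0x7F:
        # 0xxxxxxx: identical to ASCII
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx: 5 + 6 = 11 payload bits
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx: 4 + 6 + 6 = 16 payload bits
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 3 + 6 + 6 + 6 = 21 payload bits
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# The two worked examples from the text.
print(utf8_encode(0x00A9).hex())  # c2a9
print(utf8_encode(0x2260).hex())  # e289a0
```

A quick way to check the sketch is to compare it with Python's own codec, for example `utf8_encode(0x2260) == "\u2260".encode("utf-8")`.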
Correspondence table between Unicode code points and UTF-8 encodings
The table below summarizes the format of these different octet types.
The letter x indicates bits available for encoding bits of the
character number.
Char. number range  |        UTF-8 octet sequence
   (hexadecimal)    |              (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
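This table gives the correspondence between Unicode code points and UTF-8 encodings. The commonly used Chinese characters fall in the 0000 0800-0000 FFFF range, so each of them takes three bytes in UTF-8.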
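The table can also be read in the other direction: the lead byte says how many bytes belong to the character, and the x bits are concatenated to recover the code point. The following is a rough sketch of that idea. The name `utf8_decode` is made up for this note, and the sketch deliberately skips some checks a real decoder performs (it does not reject overlong forms or surrogates).

```python
from typing import List


def utf8_decode(data: bytes) -> List[int]:
    """Decode UTF-8 bytes into code points using the lead-byte patterns.

    Hypothetical helper for illustration; does not reject overlong
    encodings or surrogate code points.
    """
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:              # 0xxxxxxx
            length, value = 1, lead
        elif lead >> 5 == 0b110:     # 110xxxxx
            length, value = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:    # 1110xxxx
            length, value = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:   # 11110xxx
            length, value = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        tail = data[i + 1:i + length]
        # Every following byte must look like 10xxxxxx.
        if len(tail) != length - 1 or any(b >> 6 != 0b10 for b in tail):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        for b in tail:
            value = (value << 6) | (b & 0x3F)
        code_points.append(value)
        i += length
    return code_points


# A Chinese character from the 0800-FFFF row: three bytes, one code point.
print([hex(cp) for cp in utf8_decode("乙".encode("utf-8"))])  # ['0x4e59']
```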
2. About the BOM
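UTF-8 uses the byte as its code unit, so it has no byte-order problem. UTF-16 uses two bytes as its code unit, so before interpreting a UTF-16 text you first have to know the byte order of each code unit. For example, the Unicode code of "奎" is 594E and the Unicode code of "乙" is 4E59. If we receive the UTF-16 byte stream "594E", is it "奎" or "乙"?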
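The ambiguity is easy to reproduce. The short sketch below decodes the same two bytes with Python's standard `utf-16-be` and `utf-16-le` codecs; the variable names are only for illustration.

```python
# The same two bytes, read with each byte order.
raw = bytes([0x59, 0x4E])

as_big_endian = raw.decode("utf-16-be")     # bytes taken as 59 4E
as_little_endian = raw.decode("utf-16-le")  # bytes swapped to 4E 59

print(f"big-endian:    U+{ord(as_big_endian):04X} {as_big_endian}")
print(f"little-endian: U+{ord(as_little_endian):04X} {as_little_endian}")
```

Read as big-endian the bytes give U+594E (奎); read as little-endian they give U+4E59 (乙). Nothing in the two bytes themselves says which reading was intended.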
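The method the Unicode specification recommends for marking byte order is the BOM. BOM here does not mean "Bill Of Material"; it means Byte Order Mark. The BOM is a rather clever idea:
UCS has a character called "ZERO WIDTH NO-BREAK SPACE", whose code is FEFF. FFFE, on the other hand, is not a character in UCS, so it should never appear in actual transmission. The UCS specification recommends that we ...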
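As a rough illustration of why FEFF and FFFE are useful here, the sketch below assumes the common convention that a sender puts the character FEFF at the very start of a UTF-16 stream. That convention is an assumption of this example and is not spelled out in the text above, and the helper names are made up. If the first two bytes arrive as FE FF, the stream is big-endian; if they arrive as FF FE, the bytes are swapped and the stream is little-endian.

```python
def detect_utf16_byte_order(data: bytes) -> str:
    """Guess byte order from a leading BOM (hypothetical helper).

    Assumes the sender wrote U+FEFF as the first code unit.
    """
    if data[:2] == b"\xfe\xff":
        return "big"
    if data[:2] == b"\xff\xfe":
        return "little"
    return "unknown"


def decode_utf16_with_bom(data: bytes) -> str:
    """Strip the BOM and decode the rest with the matching byte order."""
    order = detect_utf16_byte_order(data)
    if order == "unknown":
        raise ValueError("no BOM found; byte order must be known some other way")
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    return data[2:].decode(codec)


# The ambiguous bytes 59 4E, now preceded by a BOM.
print(decode_utf16_with_bom(b"\xfe\xff\x59\x4e"))  # 奎
print(decode_utf16_with_bom(b"\xff\xfe\x59\x4e"))  # 乙
```

Python's generic `utf-16` codec applies the same BOM check when decoding, so `b"\xff\xfe\x59\x4e".decode("utf-16")` gives the same result as the second call.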
I hope this helps. I spent ages on it and still could not get it all written out...