Very old fj.kanji discussion 558/622

Path: titcca!ccut!tomo!wada
From: wada@tomo.t.u-tokyo.ac.jp (Eiiti Wada)
Newsgroups: fj.kanji
Subject: Re: alphebet numeric and symbol cahr.
Message-ID: <18257@ccut.cc.u-tokyo.ac.jp>
Date: 26 May 90 08:06:19 GMT
Sender: news@ccut.cc.u-tokyo.ac.jp
Reply-To: wada@tomo.t.u-tokyo.ac.jp (Eiiti Wada)
Distribution: fj
Organization: Wada Lab., Dept. of Mathematical Engineering, Univ. of Tokyo
Lines: 55


This is Wada (Dept. of Mathematical Engineering, Univ. of Tokyo)
;wada@tomo.t.u-tokyo.ac.jp,wada@wadalab.t.u-tokyo.ac.jp
;(03)812-2111 ex 7410

In article <15413@rena.dit.co.jp>, void@rena.dit.co.jp (Youichi Kusakabe) writes:
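> 	Some time ago, in this newsgroup (or possibly somewhere else),
> 	I remember seeing mention of a recommendation to the effect that
> 	"in the non-kanji part of the JIS kanji code, characters that
> 	  also exist in the 1-byte character set should be avoided where
> 	  possible (use the 1-byte version instead)".
> 
> 	If anyone knows about this, please tell me the details.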
>   ヘ_ヘ
> ミ・・ ミ    void@rena.dit.co.jp
>  (     )〜  Kusakabe@DIT
> ---------------------------------
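A report published two years ago by the Japanese Language Functions Committee of
the Information Processing Society of Japan recommends, as a base code system, that one

invoke ASCII into G0 (the left half of the 8-bit code table, b8 (the most significant bit) = 0) and
invoke JIS X 0208-83 into G1 (the right half of the 8-bit code table, b8 (the most significant bit) = 1)
and use them that way.

That is the proposal.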
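The G0/G1 split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's "euc_jp" codec is used as a stand-in for "ASCII in G0, JIS X 0208 in G1" (the report is not quoted as naming any particular encoding), the function name classify_bytes is made up for this example, and single shifts are ignored.

```python
# Assumption: Python's "euc_jp" codec stands in for the recommended arrangement
# (ASCII in G0, JIS X 0208 in G1). classify_bytes is a hypothetical helper.
# Simplification: the single-shift bytes 0x8E and 0x8F are not handled.

def classify_bytes(data):
    """Split a byte string into (set, bytes) units by looking at b8, the top bit."""
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80 == 0:
            units.append(("G0", data[i:i + 1]))  # b8 = 0: one ASCII byte
            i += 1
        else:
            units.append(("G1", data[i:i + 2]))  # b8 = 1: two-byte JIS X 0208 character
            i += 2
    return units

# ASCII "A" followed by the "A" from row 3 of JIS X 0208
# (Python exposes the latter as U+FF21, which is a Unicode convention).
sample = "A\uff21".encode("euc_jp")
for set_name, raw in classify_bytes(sample):
    print(set_name, raw.hex())
```

The two units printed for the sample are the same letter coded once from each half of the table.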
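Used this way, the Latin letter A is present both in ASCII and in row 3 of JIS X 0208.
The question is whether to regard these as different characters or as the same character.
Under the naming rules for graphic characters decided by SC2, both are named
"Latin Capital Letter A", so they are the same character.
On the other hand, the purpose of a code table is to put characters and codes in
one-to-one correspondence, so having the same character in two or more places is
undesirable. ISO 2022 prohibits the same character from appearing in codes that are
invoked at the same time, but when various code tables are invoked it is sometimes
unavoidable that the same character ends up in two or more places. ISO 4873 therefore says: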
9.2 Unique coding of characters
In a version the same character may occur in more than one of the G0, G1, G2
and G3 sets. Such a character shall be regarded as the same character as a
character in another of those sets if both characters have the same name
within the specification, or ISO International Register entries, that 
respectively define the two sets. (This is where "if the names are the same ..." comes in.)
If the same character has been allocated to more than one of the G0, G1, G2
and G3 sets, either within the set itself or within the character repertoire
associated with that set, then that character shall be represented by the coded
representation taken from the lowest numbered set (in the sequence G0, G1, G2,
G3) in which the character has been allocated.
A coded representation for such a character within one of the other, higher
numbered sets shall not be used, even if the higher numbered set is already
invoked and the lowest numbered set in which the character is allocated is
not currently invoked.
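This provision is not yet in ISO 2022, but it is expected to be included in the next
revision of ISO 2022. It means that if ASCII is invoked in G0 and JIS X 0208 in G1,
a character common to both is taken from G0, that is, from ASCII.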
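As a rough sketch of the clause 9.2 rule (same name means same character, and the lowest-numbered set wins), one could write something like the following. The tables are hypothetical toy excerpts keyed by character name, with only a few entries filled in, and the function name preferred_code is invented for this example.

```python
# Hypothetical toy tables: character name -> code within that set.
# Only a handful of entries are shown for illustration.
G0_ASCII = {
    "LATIN CAPITAL LETTER A": 0x41,
    "DIGIT ONE": 0x31,
}
G1_JIS_X_0208 = {
    "LATIN CAPITAL LETTER A": 0x2341,  # row 3 of JIS X 0208
    "DIGIT ONE": 0x2331,
    "HIRAGANA LETTER A": 0x2422,
}

# Sets listed in the order G0, G1 (G2 and G3 would follow if used).
SETS = [("G0", G0_ASCII), ("G1", G1_JIS_X_0208)]

def preferred_code(name, sets):
    """Return (set label, code) from the lowest-numbered set that allocates `name`."""
    for label, table in sets:
        if name in table:
            return label, table[name]
    raise KeyError(name)

print(preferred_code("LATIN CAPITAL LETTER A", SETS))  # taken from G0 (ASCII)
print(preferred_code("HIRAGANA LETTER A", SETS))       # only in G1
```

Characters are matched purely by name here, which mirrors the "same name" criterion in the quoted clause; a real implementation would need the full name lists of both sets.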
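Most of the characters in ASCII are also in JIS X 0208. The only one said to be missing
is the double quote, because in JIS the double quote exists as separate left and right marks.
As for single quotes, if the left one is converted to the ASCII backquote, then both
are available in ASCII.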

Is this enough of an answer?