Bug 2968 - Globalization : Error in JUnit execution in the Japanese locale.
Status: RESOLVED FIXED
Product: Java WS Core
Component: globus_wsrf_core
Version: 3.9.5
Platform: PC Linux
Importance: P3 normal
 
Reported: 2005-03-17 04:00 by
Modified: 2005-06-09 09:53 (History)


Description From 2005-03-17 04:00:50
When the Red Hat Linux system is in the Japanese locale, an error occurs in 
the JUnit tests for the "handlers" package of WS-core.

Steps:
1. Switch locale to Japanese -- in this case it is "ja_JP.eucJP".
2. Run the JUnit tests for WS-core.

Results:
An error is thrown in the "testDIINoWSDLService" test case of the 
handlers package. The error message is given below; its untranslated Japanese 
portion reads "Invalid byte 1 of 1-byte UTF-8 sequence."

Additional Info:
When the locale is switched back to en_US, the JUnit tests pass.

************** Error Message thrown ****************

WSDLドキュメントの処理エラー: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。 / [en]-(Error processing WSDL document: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。)

javax.xml.rpc.ServiceException: WSDLドキュメントの処理エラー: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。 / [en]-(Error processing WSDL document: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。)
at org.apache.axis.client.Service.initService(Service.java:239)
at org.apache.axis.client.Service.<init>(Service.java:154)
at org.apache.axis.client.ServiceFactory.createService(ServiceFactory.java:198)
at org.globus.wsrf.handlers.DIITests.testDIINoWSDLService(DIITests.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:79)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java(Compiled Code))
at org.globus.wsrf.test.GridTestSuite.run(GridTestSuite.java:64)

************* Error Message Thrown ***************************************
------- Comment #1 From 2005-03-23 13:59:10 -------
Is that the only test that fails in Japanese locale? 
------- Comment #2 From 2005-03-28 00:01:03 -------
yes -- that's the only one that fails.
------- Comment #3 From 2005-04-12 10:46:04 -------
Via email:

> Hello Sam,
> 
> I am working on fixing bug #2968 (Globalization : Error in JUnit execution
> in the Japanese locale).
> 
> After having duplicated Satish's working environment, I received the same
> test error message as he did.
> 
> However, I noticed one interesting thing.
> This error occurs for Japanese locale set as "ja_JP.EUC-JP EUC-JP", but it
> does not occur for Japanese locale set as "ja_JP.UTF-8 UTF-8".
> I am not Japanese, nor do I speak Japanese, but UTF-8, as a Unicode
> encoding, seems to me more modern than EUC.
> 
------- Comment #4 From 2005-04-12 10:46:34 -------
How important is it to get this working in the EUC locale rather than the
UTF-8 locale?

/Sam
------- Comment #5 From 2005-04-12 14:25:30 -------
can you run this command and post the results?
locale -a | grep jp
------- Comment #6 From 2005-04-12 14:30:39 -------
Tom,

Who do you want to run the commands, us or Satish?

/Sam
------- Comment #7 From 2005-04-13 02:11:44 -------
Well, on our machine the result is thus: 
$ locale -a | grep jp
ja_JP.eucjp
------- Comment #8 From 2005-04-13 03:19:24 -------
I think that the command specified by Tom is not accurate enough:

chris $ locale -a  |grep jp
ja_JP.eucjp

More accurate results are produced by the following command:

chris $ locale -a  |grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc
------- Comment #9 From 2005-04-14 08:25:54 -------
Satish, I am surprised that there is just one ja locale on your box. Are you
running on a RH AS 2.x box? (We don't plan to support RH AS 2.x.) This is the
output on my box:

cat /etc/redhat-release 
Red Hat Enterprise Linux AS release 3 (Taroon Update 3)

$  locale -a | grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc

Please run the same test with the locale set to ja_JP.utf8, ujis, and eucjp (if
available) on RH AS 3 and SUSE SLES 9, and report the results along with the
output of cat /etc/SuSE-release and locale -a | grep ja

------- Comment #10 From 2005-04-14 11:14:46 -------
Tom,

The reason I got just one "ja" result is that, as Chris has stated, I had used 
$ locale -a | grep jp
The other locales are all present.

Here are the correct results from the box I checked this on:

$ cat /etc/redhat-release
Red Hat Enterprise Linux AS release 3 (Taroon Update 3)

$ locale -a | grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc

Also, with the locale set to:

ja_JP.utf8                 -- The error does not occur.
ja_JP.eucjp and ja_JP.ujis -- The error occurs.

I have not tested this on SLES 9 machines as yet. The above results are for 
Red Hat alone.

------- Comment #11 From 2005-04-21 09:40:37 -------
Here are the results from the SLES 9 box:

$ cat /etc/SuSE-release
SUSE LINUX Enterprise Server 9 (i586)
VERSION = 9

$ locale -a | grep ja
ja_JP.eucjp
ja_JP.sjis
ja_JP.utf8

Also, with the locale set to:

ja_JP.utf8                 -- The error does not occur.
ja_JP.eucjp and ja_JP.sjis -- The error occurs.
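
Both boxes show the same split: the UTF-8 locale passes and the EUC/SJIS locales fail, which suggests the encoding matters rather than the language. A minimal Java sketch for checking which charset the JVM picked up from the locale could look like the following. The class name DefaultCharsetCheck is made up for illustration, and on recent JVMs the default charset may be UTF-8 regardless of the locale (the native.encoding property may also be absent on older ones).

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class; not part of WS-core or Axis.
final class DefaultCharsetCheck {

    /** Summarizes the charset the JVM uses when no charset is passed explicitly. */
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding")
                // native.encoding may be null on JVMs that do not define it
                + ", native.encoding = " + System.getProperty("native.encoding");
    }

    public static void main(String[] args) {
        // Run once per locale, e.g. with LANG set to ja_JP.eucjp and ja_JP.utf8
        System.out.println(describe());
    }
}
```

Running it under each locale in turn shows whether the JVM's default charset actually changes between the passing and failing configurations.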
------- Comment #12 From 2005-04-22 15:50:16 -------
It looks like only the UTF-8 encodings of the various locales are supported by 
GT/Axis at this point. Do you need support for other (non-Unicode) encodings?
------- Comment #13 From 2005-04-27 11:07:26 -------
Yes, we do need support for other non-UTF-8 locales. There are two concerns: 
1) Non-UTF locales may be the norm for many languages. For example, EUC and SJIS
are popular encodings for the Japanese language.

2) This problem may show up in languages other than Japanese.
------- Comment #14 From 2005-04-28 11:12:19 -------
Jarek,

Do you think that the lack of support for non-UTF-8 locale encodings (e.g. EUC)
is an inherent feature of the current GT and Apache Axis (the version bundled
with GT), or only a matter of configuration?
------- Comment #15 From 2005-05-01 23:26:16 -------
Chris,

I think it is mostly an Axis problem, but GT code in certain cases also assumes 
UTF-8 encoding. I think we need somewhat better tests to verify this: a test 
that returns a SOAP message/fault containing non-Unicode characters, and a test 
that uses WSDL with non-Unicode characters.
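
A check along the lines suggested above might be sketched as follows. This is only an illustration under stated assumptions: the class name NonAsciiWsdlParseCheck and the tiny WSDL-like document are hypothetical, the JDK's built-in XML parser stands in for whatever parser Axis uses, and the EUC-JP charset is assumed to be available in the JVM. The document carries no encoding declaration, so the parser treats the bytes as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical test helper; not taken from the GT or Axis test suites.
final class NonAsciiWsdlParseCheck {

    // WSDL-like stand-in containing Japanese text, written with Unicode
    // escapes so the source file itself is encoding-independent.
    static final String XML =
            "<definitions><documentation>\u30b5\u30fc\u30d3\u30b9</documentation></definitions>";

    /** Returns true if the document, encoded with the given charset, parses cleanly. */
    static boolean parses(Charset encoding) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(encoding)));
            return true;
        } catch (SAXException | IOException e) {
            // Malformed byte sequences surface here as a parse failure
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes parse:  " + parses(StandardCharsets.UTF_8));
        System.out.println("EUC-JP bytes parse: " + parses(Charset.forName("EUC-JP")));
    }
}
```

The UTF-8 bytes should parse, while the EUC-JP bytes should be rejected as a malformed UTF-8 sequence, which is the same kind of failure reported in the description.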
------- Comment #16 From 2005-05-01 23:29:38 -------
Tom,

OK. What other languages, for example? (I just want to check the non-Unicode 
theory.)
------- Comment #17 From 2005-05-02 17:23:24 -------
Some additional choices:

zh_CN.gb18030 - for Simplified Chinese PRC

ko_KR.euckr   - for Korean
------- Comment #18 From 2005-06-09 09:53:45 -------
This should be fixed now. The fix was committed to the CVS trunk and 
globus_40_branch (it will make the 4.0.1 release). As it turns out, this was a 
bug in our standalone container when requesting the WSDL of a service. The WSDL 
string was converted to bytes using the platform's default charset, but it 
should have been converted using the UTF-8 charset. 
Thanks to Chris for finding the key to this problem.
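
The difference between the two conversions can be illustrated with a short Java sketch. It is not the actual container code: the class and method names are hypothetical, the platform default charset is passed in as a parameter so the demo behaves the same in any locale, and EUC-JP is assumed to be available in the JVM.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names do not come from the Globus sources.
final class WsdlEncodingSketch {

    // WSDL-like fragment with Japanese text, written with Unicode escapes.
    static final String WSDL = "<documentation>\u30b5\u30fc\u30d3\u30b9</documentation>";

    /** Mimics the bug: bytes depend on the platform's default charset. */
    static byte[] encodeWithPlatformCharset(String wsdl, Charset platformDefault) {
        return wsdl.getBytes(platformDefault);
    }

    /** Mimics the fix: bytes are always UTF-8, whatever the locale. */
    static byte[] encodeAsUtf8(String wsdl) {
        return wsdl.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] eucBytes = encodeWithPlatformCharset(WSDL, Charset.forName("EUC-JP"));
        byte[] utf8Bytes = encodeAsUtf8(WSDL);
        System.out.println("EUC-JP bytes valid as UTF-8: " + isValidUtf8(eucBytes));
        System.out.println("UTF-8 bytes valid as UTF-8:  " + isValidUtf8(utf8Bytes));
    }
}
```

A client that reads the WSDL as UTF-8 can only decode the second byte array, which would explain why the test passed in ja_JP.utf8 (where the default charset already was UTF-8) and failed in the EUC and SJIS locales.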