<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://bugzilla.globus.org/bugzilla/bugzilla.dtd">

<bugzilla version="3.2.3"
          urlbase="http://bugzilla.globus.org/bugzilla/"
          maintainer="bacon@mcs.anl.gov"
>

    <bug>
          <bug_id>2968</bug_id>
          
          <creation_ts>2005-03-17 04:00</creation_ts>
          <short_desc>Globalization : Error in JUnit execution in the Japanese locale.</short_desc>
          <delta_ts>2005-06-09 09:53:45</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>Java WS Core</product>
          <component>globus_wsrf_core</component>
          <version>3.9.5</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          
          
          <priority>P3</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Satish S Belawadi">sbelawadi@in.ibm.com</reporter>
          <assigned_to name="Jarek Gawor">gawor@mcs.anl.gov</assigned_to>
          <cc>ankumari@in.ibm.com</cc>
    
    <cc>childers@mcs.anl.gov</cc>
    
    <cc>chris@gridwisetech.com</cc>
    
    <cc>gawor@mcs.anl.gov</cc>
    
    <cc>hsherick@us.ibm.com</cc>
    
    <cc>meder@mcs.anl.gov</cc>
    
    <cc>millerjj@us.ibm.com</cc>
    
    <cc>paxhia@us.ibm.com</cc>
    
    <cc>seelbach@us.ibm.com</cc>

      

      
          <long_desc isprivate="0">
            <who name="Satish S Belawadi">sbelawadi@in.ibm.com</who>
            <bug_when>2005-03-17 04:00:50</bug_when>
            <thetext>When the Red Hat Linux system is in the Japanese locale, an error occurs in
the JUnit tests for the &quot;handlers&quot; package of WS-core.

Steps:
1. Switch locale to Japanese -- in this case it is &quot;ja_JP.eucJP&quot;.
2. Run JUnits for WS-core.

Results:
There is an error thrown in the &quot;testDIINoWSDLService&quot; testcase of the 
handlers package. The error message is given below.

Additional Info:
When the locale is switched back to en_US the JUnits work fine.

************** Error Message thrown ****************

WSDLドキュメントの処理エラー: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。 / [en]-(Error processing WSDL document: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。)

javax.xml.rpc.ServiceException: WSDLドキュメントの処理エラー: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。 / [en]-(Error processing WSDL document: java.io.UTFDataFormatException: 1 バイトの UTF-8 シーケンスのバイト 1 が無効です。)
at org.apache.axis.client.Service.initService(Service.java:239)
at org.apache.axis.client.Service.&lt;init&gt;(Service.java:154)
at org.apache.axis.client.ServiceFactory.createService(ServiceFactory.java:198)
at org.globus.wsrf.handlers.DIITests.testDIINoWSDLService(DIITests.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:79)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java(Compiled Code))
at org.globus.wsrf.test.GridTestSuite.run(GridTestSuite.java:64)

************* Error Message Thrown ***************************************</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jarek Gawor">gawor@mcs.anl.gov</who>
            <bug_when>2005-03-23 13:59:10</bug_when>
            <thetext>Is that the only test that fails in Japanese locale? </thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Satish S Belawadi">sbelawadi@in.ibm.com</who>
            <bug_when>2005-03-28 00:01:03</bug_when>
            <thetext>yes -- that&apos;s the only one that fails.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Sam Meder">meder@mcs.anl.gov</who>
            <bug_when>2005-04-12 10:46:04</bug_when>
            <thetext>Via email:

&gt; Hello Sam,
&gt; 
&gt; I am working on fixing bug #2968 (Globalization : Error in JUnit execution
&gt; in the Japanese locale).
&gt; 
&gt; After having duplicated Satish&apos;s working environment, I received the same
&gt; test error message as he did.
&gt; 
&gt; However, I noticed one interesting thing.
&gt; This error occurs for Japanese locale set as &quot;ja_JP.EUC-JP EUC-JP&quot;, but it
&gt; does not occur for Japanese locale set as &quot;ja_JP.UTF-8 UTF-8&quot;.
&gt; I am not Japanese, nor do I speak Japanese, but UTF-8, a Unicode encoding,
&gt; seems to me more modern than EUC.
&gt; </thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Sam Meder">meder@mcs.anl.gov</who>
            <bug_when>2005-04-12 10:46:34</bug_when>
            <thetext>How important is it to get this working in the EUC locale rather than the UTF-8
locale?

/Sam</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Tom Seelbach">seelbach@us.ibm.com</who>
            <bug_when>2005-04-12 14:25:30</bug_when>
            <thetext>Can you run this command and post the results?
locale -a | grep jp</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Sam Meder">meder@mcs.anl.gov</who>
            <bug_when>2005-04-12 14:30:39</bug_when>
            <thetext>Tom,

Who do you want to run the commands, us or Satish?

/Sam</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Satish S Belawadi">sbelawadi@in.ibm.com</who>
            <bug_when>2005-04-13 02:11:44</bug_when>
            <thetext>Well, on our machine the result is thus: 
$ locale -a | grep jp
ja_JP.eucjp
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Chris Wilk">chris@gridwisetech.com</who>
            <bug_when>2005-04-13 03:19:24</bug_when>
            <thetext>I think that the command specified by Tom is not accurate enough:

chris $ locale -a  |grep jp
ja_JP.eucjp

More accurate results are produced by the following command:

chris $ locale -a  |grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Tom Seelbach">seelbach@us.ibm.com</who>
            <bug_when>2005-04-14 08:25:54</bug_when>
            <thetext>Satish, I am surprised that there is just one ja locale on your box. Are you
running on a RH AS 2.x box? (We don&apos;t plan to support RH AS 2.x.) This is the
output on my box:

cat /etc/redhat-release 
Red Hat Enterprise Linux AS release 3 (Taroon Update 3)

$  locale -a | grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc

Please run the same test with the locale set to ja_JP.utf8, ja_JP.ujis, and ja_JP.eucjp (if
available) on RH AS 3 and SUSE SLES 9, and report the results along with the
output of cat /etc/SuSE-release and locale -a | grep ja

</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Satish S Belawadi">sbelawadi@in.ibm.com</who>
            <bug_when>2005-04-14 11:14:46</bug_when>
            <thetext>Tom,

I got just one &quot;ja&quot; result because, as Chris pointed out, I had run
$ locale -a | grep jp
The other locales are in fact all present.

Here are the correct results from the box I tested on:

$ cat /etc/redhat-release
Red Hat Enterprise Linux AS release 3 (Taroon Update 3)

$ locale -a | grep ja
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc

Also, with the locale set to:

ja_JP.utf8                 -- The error does not occur.
ja_JP.eucjp and ja_JP.ujis -- The error occurs.

I have not tested this on SLES 9 machines yet. The above results are for
Red Hat only.

</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Satish S Belawadi">sbelawadi@in.ibm.com</who>
            <bug_when>2005-04-21 09:40:37</bug_when>
            <thetext>
Here are the results from the SLES 9 box:

$ cat /etc/SuSE-release
SUSE LINUX Enterprise Server 9 (i586)
VERSION = 9

$ locale -a | grep ja
ja_JP.eucjp
ja_JP.sjis
ja_JP.utf8

Also, with the locale set to:

ja_JP.utf8                 -- The error does not occur.
ja_JP.eucjp and ja_JP.sjis -- The error occurs.
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jarek Gawor">gawor@mcs.anl.gov</who>
            <bug_when>2005-04-22 15:50:16</bug_when>
            <thetext>It looks like GT/Axis currently supports only the UTF-8 encoding of each
locale. Do you need support for other (non-Unicode) encodings?
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Tom Seelbach">seelbach@us.ibm.com</who>
            <bug_when>2005-04-27 11:07:26</bug_when>
            <thetext>Yes, we do need support for non-UTF-8 locales. There are two concerns:

1) Non-UTF-8 locales may be the norm for many languages. For example, EUC and SJIS
are popular encodings for Japanese.

2) This problem may show up in languages other than Japanese.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Chris Wilk">chris@gridwisetech.com</who>
            <bug_when>2005-04-28 11:12:19</bug_when>
            <thetext>Jarek,

Do you think that the lack of support for non-UTF-8 locale encodings (e.g. EUC) is
an inherent limitation of the current GT and Apache Axis (the version bundled with GT), or
only a matter of configuration?</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jarek Gawor">gawor@mcs.anl.gov</who>
            <bug_when>2005-05-01 23:26:16</bug_when>
            <thetext>Chris,

I think it is mostly an Axis problem, but GT code in certain cases also assumes
UTF-8 encoding. I think we need somewhat better tests to verify this: a test
that returns a SOAP message/fault containing non-Unicode characters, and a
test that uses WSDL with non-Unicode characters.
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jarek Gawor">gawor@mcs.anl.gov</who>
            <bug_when>2005-05-01 23:29:38</bug_when>
            <thetext>Tom,

OK. Which other languages, for example? (I just want to check the non-Unicode
theory.)
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Harry Sherick">hsherick@us.ibm.com</who>
            <bug_when>2005-05-02 17:23:24</bug_when>
            <thetext>Some additional choices:

zh_CN.gb18030 - for Simplified Chinese PRC

ko_KR.euckr   - for Korean
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jarek Gawor">gawor@mcs.anl.gov</who>
            <bug_when>2005-06-09 09:53:45</bug_when>
            <thetext>This should be fixed now. The fix was committed to the CVS trunk and to globus_40_branch
(it will make the 4.0.1 release). As it turns out, this was a bug in our standalone
container when requesting the WSDL of a service. The WSDL string was converted to
bytes using the platform&apos;s default charset, but it should have been converted using
the UTF-8 charset.
Thanks to Chris for finding out the key to this problem.
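
The class of bug described above can be illustrated with a minimal Java sketch.
This is not the actual container code: the class and method names are
hypothetical, the sample text is invented, and ISO-8859-1 stands in for a
non-UTF-8 platform default such as EUC-JP only because every JVM is guaranteed
to provide it. The sketch encodes the same string two ways and then decodes
each result strictly as UTF-8, roughly as a parser expecting UTF-8 would.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class WsdlEncodingSketch {

    // Hypothetical stand-in for the code that turns a WSDL string into bytes.
    // Calling wsdl.getBytes() with no argument would use the JVM default
    // charset, which is the pattern the fix replaced with an explicit charset.
    static byte[] encode(String wsdl, Charset charset) {
        return wsdl.getBytes(charset);
    }

    // Strict UTF-8 decoding: malformed input is reported instead of replaced.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Invented sample text with one non-ASCII character (e with acute accent).
        String sample = "caf\u00e9 service";

        // Assumption: ISO-8859-1 plays the role of the non-UTF-8 default charset.
        byte[] legacy = encode(sample, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(sample, StandardCharsets.UTF_8);

        System.out.println("legacy bytes valid UTF-8: " + isValidUtf8(legacy));
        System.out.println("UTF-8 bytes valid UTF-8:  " + isValidUtf8(utf8));
    }
}
```

Passing the charset explicitly makes the output independent of the locale the
JVM was started in, which is consistent with the failure appearing only under
the eucjp, ujis, and sjis locales and not under ja_JP.utf8.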
</thetext>
          </long_desc>
      
      

    </bug>

</bugzilla>