Bug 4721 - misconfigured wsrf-query script?
Status: RESOLVED FIXED
Product: Java WS Core
Component: globus_wsrf_core
Version: 4.0.3
Platform: PC Linux
Importance: P3 normal
Target Milestone: ---
Reported: 2006-09-20 17:00
Modified: 2006-10-02 21:19


Attachments
The update.xml Ant script (406 bytes, application/xml)
2006-09-21 15:12, Jarek Gawor
Latest update.xml (987 bytes, application/xml)
2006-10-02 21:19, Jarek Gawor

Description From 2006-09-20 17:00:56
I have been trying to track down why wsrf-query was taking so long for us. It
seems in all our installs, the wsrf-query is configured to call java with the
following argument: 

-Djava.security.egd=/dev/urandom 

These guides indicate that this should be a url: 

http://java.sun.com/j2se/1.4.2/docs/guide/security/jaas/JAASRefGuide.html
http://java.sun.com/j2se/1.5.0/docs/guide/security/jaas/JAASRefGuide.html

When I change the script to use -Djava.security.egd=file:///dev/urandom (i.e.
use a URL) instead, I get a considerable speedup (about 6 seconds). I am not
sure if I am misconfiguring it by doing this, but the longer duration also
occurs when I point it at nonexistent files, so I believe the URL is the
correct form.

My apologies if I have misunderstood and this is not a bug.  

Phil
------- Comment #1 From 2006-09-20 17:23:17 -------
I see the speedup in globus-start-container using
EGD_DEVICE="file:///dev/urandom"
------- Comment #2 From 2006-09-20 17:47:49 -------
(In reply to comment #1)
> I see the speedup in globus-start-container using
> EGD_DEVICE="file:///dev/urandom"
> 

As a clarification, if you simply change the top of the script to 

EGD_DEVICE="file:///dev/urandom"

then you will effectively remove the configuration, because further down the
script checks whether the file exists:

if [ -c "$EGD_DEVICE" -a  -r "$EGD_DEVICE" ]; then
    updateOptions "java.security.egd" "$EGD_DEVICE"
fi
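
To illustrate, here is a minimal sketch of how that guard treats the two
values. The helper name egd_guard_passes is made up for this example and is not
part of the Globus scripts; only the test expression is taken from the script.

```shell
# Hypothetical helper: reports whether the script's guard would pass a value on.
egd_guard_passes() {
    # Same test as in the script: the value must name a readable character device.
    if [ -c "$1" -a -r "$1" ]; then
        echo "passed"
    else
        echo "skipped"
    fi
}

egd_guard_passes "/dev/urandom"          # a plain device path
egd_guard_passes "file:///dev/urandom"   # a URL, which is not a path on disk
```

On a typical Linux system the first call should report "passed" and the second
"skipped", since file:///dev/urandom does not name a character device.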

This seems to have the speedup as well, but the speedup also occurs when you
modify the script to actually use -Djava.security.egd=/dev/urandom in the java
options. 
------- Comment #3 From 2006-09-20 17:57:55 -------

Sorry the previous comment should read:

This seems to have the speedup as well, but the speedup also occurs when you
modify the script to actually use -Djava.security.egd=file:///dev/urandom in
the java options.
------- Comment #4 From 2006-09-20 20:36:20 -------
Oops, thank you for pointing this out.  I confirmed (using bash -x) that when
simply changing the EGD_DEVICE value, the new value is not passed to the JVM,
so the JVM reverts to its default setting.

What happened here was that I was using Sun's Java 1.5, which defaults to
file:/dev/urandom.

Sun's Java 1.4 defaults to file:/dev/random, which is what we had problems with
in the past (Bug 2651).

For Sun JVMs at least, you can check the default with:

grep -C 5 securerandom ./jre/lib/security/java.security
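
As a rough sketch of the same check in script form, the helper below looks for
the securerandom.source entry under a given Java home. The function name
show_securerandom_source is hypothetical, and the lib/security and
conf/security locations are assumptions about other JRE and JDK layouts that
go beyond what this report covers.

```shell
# Hypothetical helper: print the securerandom.source line for a Java install.
show_securerandom_source() {
    java_home="$1"
    # Try the JDK layout used above first, then other layouts (assumed).
    for f in "$java_home/jre/lib/security/java.security" \
             "$java_home/lib/security/java.security" \
             "$java_home/conf/security/java.security"; do
        if [ -r "$f" ]; then
            grep '^securerandom\.source=' "$f"
            return
        fi
    done
    echo "java.security not found under $java_home" >&2
    return 1
}

# Example call (not run here): show_securerandom_source "$JAVA_HOME"
```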


So to me, it looks like Java core should 

a) make this the default EGD_DEVICE value:

EGD_DEVICE="file:/dev/urandom"

b) and remove the test block so it is always passed to the JVM:

updateOptions "java.security.egd" "$EGD_DEVICE"


As an aside, instead of using bash -x, you can also confirm the setting is
passed by adding this to your container-log4j.properties:

log4j.category.org.globus.wsrf.container.ServiceContainer=DEBUG
------- Comment #5 From 2006-09-20 22:58:08 -------
I was curious why this would make a difference.  grep-ing through the java 1.5
source, I found two things of interest.  

One is a comment in sun.security.provider.Sun about the result of an exact
string match on "file:/dev/urandom":  'Register these first to speed up "new
SecureRandom()", which iterates through the list of algorithms'

It doesn't seem to make too much of a difference if "file:///dev/urandom" is
used, so that doesn't seem like the culprit...

The second thing of interest is in sun.security.provider.SeedGenerator, the
static initializer first makes exact string comparisons on "file:/dev/random"
and "file:/dev/urandom" and creates a NativeSeedGenerator instance.  

If those string comparisons fail, it falls back to a URLSeedGenerator
instance, passing in whatever is in our "java.security.egd" property.

If that instantiation fails, it uses a ThreadedSeedGenerator instance. 

And it turns out that NativeSeedGenerator is just an implementation-less
extension of the URLSeedGenerator class, so they are effectively equivalent.

So what is happening is that sending the "file:" or "file://" forms gets us the
native seed generator, but sending just "/dev/urandom" causes a
MalformedURLException and triggers the ThreadedSeedGenerator instance to be
used.  

I didn't time it or anything, but seeing this difference I'd say that the lag
we're experiencing is coming from the initialization or use of this in-JVM seed
generator.

It's also using an extra thread (from kill -QUIT: "SeedGenerator Thread" daemon
prio=1 ...).

So while the Bug 2651 fix does stop the container from using /dev/random and
stops those hangs, the reason is not because /dev/urandom is being used.

To see, add "-Djava.security.debug=provider" to the list of JVM options being
sent to the container.


With "file:/dev/urandom":

provider: Using operating system seed generator


With "file:///dev/urandom":

provider: Using URL seed generator reading from file:///dev/urandom

(this is functionally equivalent to the last one)


With "/dev/urandom":

provider: Failed to create seed generator with /dev/urandom:
java.net.MalformedURLException: no protocol: /dev/urandom
provider: Using default threaded seed generator



Java 1.4 shows the same behavior, a MalformedURLException and extra thread, but
there is no distinction between operating system seed generator and URL seed
generator there.
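
The selection order described in this comment can be modelled roughly as
follows. This is a simplified sketch, not the JVM's code: the function name
seed_generator_for is made up, only file: URLs are considered, and the case
where a well-formed URL cannot be opened is ignored.

```shell
# Hypothetical model of which seed generator a java.security.egd value selects.
seed_generator_for() {
    case "$1" in
        file:/dev/random|file:/dev/urandom)
            # Exact string match: the "operating system" (native) generator.
            echo "native" ;;
        file:*)
            # Any other file: URL parses, so the URL seed generator is tried.
            echo "url" ;;
        *)
            # No protocol: MalformedURLException, so the threaded fallback is used.
            echo "threaded" ;;
    esac
}

for value in "file:/dev/urandom" "file:///dev/urandom" "/dev/urandom"; do
    echo "$value -> $(seed_generator_for "$value")"
done
```

The three values map to "native", "url" and "threaded" respectively, matching
the three debug outputs quoted above.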
------- Comment #6 From 2006-09-21 13:03:08 -------
Phil, thanks for finding this problem! It looks like it was misconfigured like
that for over a year! Tim, thanks for debugging this further.

I committed fixes for this problem to globus_4_0_branch and trunk.

I will send out an announcement about this issue to our mailing lists.
------- Comment #7 From 2006-09-21 15:11:34 -------
Btw, here's the fix. I changed the following line: 

updateOptions "java.security.egd" "$EGD_DEVICE"

to:

updateOptions "java.security.egd" "file://$EGD_DEVICE"
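
For illustration only, here is a small sketch of what the changed line hands to
the JVM. The updateOptions function below is a stand-in that just prints the
option; the real function's body is not shown in this report, and the
surrounding device check is left out. EGD_DEVICE is assumed to stay a plain
device path.

```shell
# Stand-in for the script's updateOptions; the real implementation is not shown
# in this report. Here it only prints the JVM option it would add.
updateOptions() {
    echo "-D$1=$2"
}

EGD_DEVICE="/dev/urandom"   # assumed: still a plain path, not a URL

# The added prefix turns the device path into a URL the JVM can parse.
updateOptions "java.security.egd" "file://$EGD_DEVICE"
```

With that value the resulting option is -Djava.security.egd=file:///dev/urandom,
the form that Comment #5 reports as using the URL seed generator.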

I also wrote an Ant script (attached to this bug) that can update a set of
scripts at once. Usage:

ant -f update.xml -Ddir=$GLOBUS_LOCATION/bin

Make sure to back up the existing files (in the bin/ directory) in case the
script breaks things.
------- Comment #8 From 2006-09-21 15:12:28 -------
Created an attachment (id=1054)
The update.xml Ant script
------- Comment #9 From 2006-10-02 21:19:13 -------
Created an attachment (id=1063)
Latest update.xml

To apply the fix, first ensure your $GLOBUS_LOCATION is set properly and then
execute:

ant -f update.xml