Bugzilla – Bug 4721
misconfigured wsrf-query script?
Last modified: 2006-10-02 21:19:13
I have been trying to track down why wsrf-query was taking so long for us. In all our installs, the wsrf-query script is configured to call java with the following argument:

    -Djava.security.egd=/dev/urandom

These guides indicate that the value should be a URL:

    http://java.sun.com/j2se/1.4.2/docs/guide/security/jaas/JAASRefGuide.html
    http://java.sun.com/j2se/1.5.0/docs/guide/security/jaas/JAASRefGuide.html

When I change the script to use -Djava.security.egd=file:///dev/urandom (i.e. a URL) instead, I get a considerable speedup (about 6 seconds). I am not sure whether I am misconfiguring it by doing this, but the longer duration also occurs when I point it at non-existent files, so I believe the URL is the correct form. My apologies if I have misunderstood and this is not a bug.

Phil
I see the speedup in globus-start-container using EGD_DEVICE="file:///dev/urandom"
(In reply to comment #1)
> I see the speedup in globus-start-container using
> EGD_DEVICE="file:///dev/urandom"

As a clarification, if you simply change the top of the script to

    EGD_DEVICE="file:///dev/urandom"

then you will effectively remove the configuration, because further down the script checks whether the file exists:

    if [ -c "$EGD_DEVICE" -a -r "$EGD_DEVICE" ]; then
        updateOptions "java.security.egd" "$EGD_DEVICE"
    fi

This seems to have the speedup as well, but the speedup also occurs when you modify the script to actually use -Djava.security.egd=/dev/urandom in the java options.
Sorry the previous comment should read: This seems to have the speedup as well, but the speedup also occurs when you modify the script to actually use -Djava.security.egd=file:///dev/urandom in the java options.
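To make the effect of that guard concrete, here is a minimal sketch. It assumes the guard is exactly as quoted; the updateOptions function below is a hypothetical stand-in that only records the option, not the real helper from the Globus scripts, and applyEgdDevice is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's updateOptions helper: it only
# records the JVM option that would be added.
JAVA_OPTS=""
updateOptions() {
    JAVA_OPTS="$JAVA_OPTS -D$1=$2"
}

# Same guard as quoted in the comment: the option is only passed when the
# value names a readable character device on disk.
applyEgdDevice() {
    EGD_DEVICE="$1"
    JAVA_OPTS=""
    if [ -c "$EGD_DEVICE" -a -r "$EGD_DEVICE" ]; then
        updateOptions "java.security.egd" "$EGD_DEVICE"
    fi
    echo "EGD_DEVICE=$EGD_DEVICE -> JAVA_OPTS='$JAVA_OPTS'"
}

applyEgdDevice "/dev/urandom"
applyEgdDevice "file:///dev/urandom"
```

With the URL form, the -c test is applied to the literal string "file:///dev/urandom", which is not a device path, so no java.security.egd option is recorded and the JVM would fall back to its own default.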
Oops, thank you for pointing this out. I confirmed (using bash -x) that when simply changing the EGD_DEVICE value, the new value is not passed to the JVM, so it reverts to the default JVM setting. What happened here was that I was using Sun's Java 1.5, which defaults to file:/dev/urandom. Sun's Java 1.4 defaults to file:/dev/random, which is what we had problems with in the past (Bug 2651).

For Sun JVMs at least, you can check the default with:

    grep -C 5 securerandom ./jre/lib/security/java.security

So to me, it looks like Java core should

a) make this the default EGD_DEVICE value:

    EGD_DEVICE="file:/dev/urandom"

b) and remove the test block so it is always passed to the JVM:

    updateOptions "java.security.egd" "$EGD_DEVICE"

As an aside, instead of using bash -x, you can also confirm the setting is passed by adding this to your container-log4j.properties:

    log4j.category.org.globus.wsrf.container.ServiceContainer=DEBUG
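If you only want the configured value rather than the surrounding context, the grep can be narrowed. This is a sketch that assumes the setting appears as a securerandom.source= line in java.security; the defaultSeedSource function name and the JAVA_HOME fallback are illustrative choices, not part of any Globus or Sun tooling.

```shell
#!/bin/sh
# Print the securerandom.source value from a java.security file.
defaultSeedSource() {
    # Commented-out lines do not match the anchored pattern; keep the last
    # active setting and print everything after the first "=".
    grep '^securerandom\.source=' "$1" | tail -n 1 | cut -d= -f2-
}

# Assumption: JAVA_HOME points at the Java install; otherwise fall back to
# the current directory, as in the grep command in the comment.
SECURITY_FILE="${JAVA_HOME:-.}/jre/lib/security/java.security"
if [ -r "$SECURITY_FILE" ]; then
    echo "default seed source: $(defaultSeedSource "$SECURITY_FILE")"
else
    echo "no readable java.security at $SECURITY_FILE"
fi
```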
I was curious why this would make a difference. Grepping through the Java 1.5 source, I found two things of interest.

One is a comment in sun.security.provider.Sun about the result of an exact string match on "file:/dev/urandom":

    "Register these first to speed up "new SecureRandom() which iterates through the list of algorithms"

It doesn't seem to make too much of a difference if "file:///dev/urandom" is used, so that doesn't seem like the culprit...

The second thing of interest is in sun.security.provider.SeedGenerator. The static initializer first makes exact string comparisons on "file:/dev/random" and "file:/dev/urandom" and, on a match, creates a NativeSeedGenerator instance. If those string comparisons fail, it falls back to a URLSeedGenerator instance, passing in whatever is in our "java.security.egd" property. If that instantiation fails, it uses a ThreadedSeedGenerator instance. And it turns out that NativeSeedGenerator is just an implementation-less extension of the URLSeedGenerator class, so they are effectively equivalent.

So what is happening is that sending the "file:" or "file://" forms gets us the native seed generator (or the equivalent URL seed generator), but sending just "/dev/urandom" causes a MalformedURLException and triggers the ThreadedSeedGenerator instance to be used. I didn't time it or anything, but seeing this difference I'd say that the lag we're experiencing is coming from the initialization or use of this in-JVM seed generator. It's also using an extra thread (from kill -QUIT: "SeedGenerator Thread" daemon prio=1 ...).

So while the Bug 2651 fix does stop the container from using /dev/random and stops those hangs, the reason is not that /dev/urandom is being used. To see this, add "-Djava.security.debug=provider" to the list of JVM options being sent to the container.
With "file:/dev/urandom":

    provider: Using operating system seed generator

With "file:///dev/urandom":

    provider: Using URL seed generator reading from file:///dev/urandom

(this is functionally equivalent to the previous one)

With "/dev/urandom":

    provider: Failed to create seed generator with /dev/urandom: java.net.MalformedURLException: no protocol: /dev/urandom
    provider: Using default threaded seed generator

Java 1.4 shows the same behavior, a MalformedURLException and extra thread, but there is no distinction between operating system seed generator and URL seed generator there.
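The three cases above can be summarized in a small lookup. This is only a rough sketch of the observed Java 1.5 behavior, not how the JVM actually decides: seedGeneratorFor is a made-up name, it recognizes only the "file:" scheme, and it ignores the case where a well-formed URL cannot be opened (which, per the description above, would also fall back to the threaded generator).

```shell
#!/bin/sh
# Map a java.security.egd value to the seed generator reported in the
# debug output above. Illustrative only; the real logic lives in
# sun.security.provider.SeedGenerator.
seedGeneratorFor() {
    case "$1" in
        file:/dev/random|file:/dev/urandom)
            # Exact string match: native generator.
            echo "operating system seed generator" ;;
        file:*)
            # Any other file: URL, e.g. file:///dev/urandom.
            echo "URL seed generator" ;;
        *)
            # No protocol: MalformedURLException, then the fallback.
            echo "default threaded seed generator" ;;
    esac
}

for value in "file:/dev/urandom" "file:///dev/urandom" "/dev/urandom"; do
    echo "$value -> $(seedGeneratorFor "$value")"
done
```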
Phil, thanks for finding this problem! It looks like it was misconfigured like that for over a year! Tim, thanks for debugging this further. I committed fixes for this problem to globus_4_0_branch and trunk. I will send out an announcement about this issue to our mailing lists.
Btw, here's the fix. I changed the following line:

    updateOptions "java.security.egd" "$EGD_DEVICE"

to:

    updateOptions "java.security.egd" "file://$EGD_DEVICE"

I also wrote an Ant script (attached to this bug) that can update a set of scripts at once. Usage:

    ant -f update.xml -Ddir=$GLOBUS_LOCATION/bin

Make sure to back up the existing files (in the bin/ directory) in case the script breaks things.
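For reference, a sketch of what the patched logic amounts to. It assumes the device check quoted earlier in this bug stays in place and that EGD_DEVICE remains a plain path; egdOption is a hypothetical helper written for this example, not a function from the Globus scripts.

```shell
#!/bin/sh
# Print the JVM option for a given device path, or nothing if the path is
# not a readable character device.
egdOption() {
    if [ -c "$1" -a -r "$1" ]; then
        # The fix: prefix the path with file:// so the JVM receives a URL.
        echo "-Djava.security.egd=file://$1"
    fi
}

EGD_DEVICE="/dev/urandom"
echo "option passed to java: $(egdOption "$EGD_DEVICE")"
```

Because the path is absolute, "file://" plus "/dev/urandom" yields the three-slash form file:///dev/urandom discussed above.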
Created an attachment (id=1054): The update.xml Ant script
Created an attachment (id=1063): Latest update.xml

To apply the fix, first ensure your $GLOBUS_LOCATION is set properly and then execute:

    ant -f update.xml