Bug 5322 - GRAM/Derby db.lck file not cleaned up after WS container stopped
Status: RESOLVED FIXED
Product: Java WS Core
Component: globus_wsrf_core
Version: alpha
Platform: IA64 Linux
Importance: P1 minor
Target Milestone: 4.2.0
Reported: 2007-05-22 12:56
Modified: 2008-01-16 15:19


Description From 2007-05-22 12:56:20
If the WS container has previously been run as root and then restarted as a
different user, job submission fails with the following error:

AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode: 
faultString: java.rmi.RemoteException: Job creation failed.; nested exception
is: 
        org.globus.wsrf.ResourceException: ; nested exception is: 
        SQL Exception: An SQL data change is not permitted for a read-only
connection, user or database.

The problem is caused by the db.lck file in
$GLOBUS_HOME/var/gram/ResourceDatabase persisting and being owned by root.
Deleting the file, or chowning it to the user running the container, fixes
the problem after a container restart.

This problem occurred when submitting jobs with both globusrun-ws and the JSDL
development GlobusRun client. 

The container was started and stopped using the globus-start-container-detached
and globus-stop-container-detached commands.


The problem could be resolved by deleting the db.lck file when the container is
shut down.
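One way to do this (a sketch only; `DerbyLockCleanup` and its methods are hypothetical names, not part of the toolkit) is a JVM shutdown hook, which runs on normal exit and on Ctrl-C:

```java
import java.io.File;

public class DerbyLockCleanup {

    /** Deletes Derby's lock files from the given database directory.
     *  db.lck (and dbex.lck) are left behind when the JVM that booted
     *  the database exits without shutting Derby down cleanly. */
    public static void cleanup(File databaseDir) {
        new File(databaseDir, "db.lck").delete();
        new File(databaseDir, "dbex.lck").delete();
    }

    /** Registers cleanup() as a JVM shutdown hook so the lock files
     *  are removed when the container process exits. */
    public static void register(final File databaseDir) {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                cleanup(databaseDir);
            }
        });
    }
}
```

Note that a hook like this only removes the stale lock; it does not by itself fix the root-owned `tmp` subdirectory mentioned below.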

Note: the ResourceDatabase directory also contains a subdirectory tmp which has
its ownership switched to root after root has run the container. The presence
of this directory doesn't seem to cause a problem, but it possibly should also
be removed when the container is stopped.
------- Comment #1 From 2007-05-22 13:44:57 -------
One approach may be to pass a cleanup message to services when the
container's signal handler catches SIGINT.
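A minimal sketch of that idea (all names here are hypothetical, and in the JVM `Runtime.addShutdownHook` is the portable way to run code when SIGINT is received):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Services register a cleanup callback; the container delivers the
// "cleanup message" to all of them when the JVM shuts down.
public class CleanupBroadcaster {

    private final List<Runnable> listeners = new CopyOnWriteArrayList<Runnable>();

    public CleanupBroadcaster() {
        // Shutdown hooks run on normal exit and on SIGINT (Ctrl-C).
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                fireCleanup();
            }
        });
    }

    public void addListener(Runnable listener) {
        listeners.add(listener);
    }

    /** Delivers the cleanup notification to every registered service;
     *  one misbehaving listener does not block the others. */
    public void fireCleanup() {
        for (Runnable listener : listeners) {
            try {
                listener.run();
            } catch (RuntimeException ignored) {
                // keep cleaning up the remaining services
            }
        }
    }
}
```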
------- Comment #2 From 2008-01-03 16:36:00 -------
A solution discussed has been to provide a single Derby installation and use
JNDI in core to provide access to it for services. 
------- Comment #3 From 2008-01-16 15:19:34 -------
I have gone ahead and added a new interface called ContainerLifecycleListener
which executes actions at major events in the container lifecycle.  I've made
one called DerbyContainerLifecycleListener which cleanly starts Derby before
the container starts and cleanly shuts it down after the container finishes.
This should prevent lock artifacts from remaining after the container has shut
down.  However, if the container goes down in an unnatural way (e.g. pulling
the plug on the machine) there is no guarantee that the filesystem resources
will be cleaned up.  In that case, you should remove the db.lck file by hand.
Note that a Ctrl-C is considered a natural way of shutting down, so the
resources should be cleaned up in that situation.  This replaces the need for a
single shared db.  Different services can reference their own databases, and
all Derby databases should be cleaned up.
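The shape of the fix might look roughly like the following sketch (the interface and class names come from the comment above, but the method names and bodies are assumptions, not the actual Java WS Core code). Shutting down the embedded Derby engine with the `jdbc:derby:;shutdown=true` URL releases db.lck for every database the JVM has booted; Derby signals a successful shutdown by throwing an SQLException with SQLState XJ015.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical shape of the lifecycle interface described above.
interface ContainerLifecycleListener {
    void containerStarted() throws Exception;
    void containerStopped() throws Exception;
}

class DerbyContainerLifecycleListener implements ContainerLifecycleListener {

    public void containerStarted() throws Exception {
        // Boot the embedded Derby engine before any service touches it.
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
    }

    public void containerStopped() {
        try {
            // Engine-wide shutdown: cleanly closes every booted database
            // and releases its db.lck file.
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException e) {
            // Derby reports a *successful* shutdown as SQLState XJ015;
            // anything else is a real error.
            if (!"XJ015".equals(e.getSQLState())) {
                e.printStackTrace();
            }
        }
    }
}
```

Because the listener runs inside the container's own lifecycle, each service can keep its own Derby database and still get a clean shutdown, which is what makes the single shared database unnecessary.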