Bug 5959 - globusrun-ws error in job with invalid staging request
Status: RESOLVED FIXED
: GRAM
wsrf managed execution job service
: alpha
: Macintosh All
: P2 blocker
: 4.2
Reported: 2008-03-27 13:49 by
Modified: 2008-05-14 07:55 (History)



Description From 2008-03-27 13:49:43
The following error occurs when I submit a job with a staging request
where the file to be staged does not exist. I am not sure whether this
happens only in my environment.

[martin@osg-test1 tmp]$ globusrun-ws -submit -S -f jobMitFileStaging.xml 
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:35d8cd2a-fc2e-11dc-8a9d-0013d4c3b957
Termination time: 03/27/3008 18:47 GMT
Current job state: StageIn
*** glibc detected *** double free or corruption (fasttop): 0x08a81e20 ***
Aborted
------- Comment #1 From 2008-03-31 21:54:13 -------
This happens only with local invocations from GRAM to RFT,
when a transfer fails. The fault in the failed-notification message
from RFT to GRAM looks a bit odd, and cannot be serialized or cloned:
"java.io.IOException: Non nillable element 'timestamp' is null."

I tried some things, and the first point at which it cannot be serialized
is in ResourcePropertyTopic.set(..), after RFT sets the fault.

I found a while ago that those nillable error messages are due to
Axis being stricter than it was in 4.0.
------- Comment #2 From 2008-04-04 14:49:08 -------
Additional info on this:

It does not happen when submitting with the Java GlobusRun program.

You can reproduce it by:
  Turning on local invocation on the GRAM service
  Submitting a job with a staging error, like the job description below.


<job>
   <executable>/bin/sleep</executable>
   <argument>0</argument>
   <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
   <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
   <fileStageIn>
       <transfer>
           <sourceUrl>gsiftp://osg-test1.unl.edu:2811/bin/foobar</sourceUrl>
           <destinationUrl>gsiftp://osg-test1.unl.edu:2811/tmp/myEcho</destinationUrl>
       </transfer>
   </fileStageIn>
   <fileStageOut>
       <transfer>
           <sourceUrl>gsiftp://osg-test1.unl.edu:2811/tmp/myEcho</sourceUrl>
           <destinationUrl>gsiftp://osg-test1.unl.edu:2811/tmp/myEcho.sik</destinationUrl>
       </transfer>
   </fileStageOut>
</job>
------- Comment #3 From 2008-04-18 14:04:17 -------
What I receive when I try to do this contains an invalid XML fault notification
message:

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <soapenv:Header>
        <wsa:MessageID>uuid:e0e5b710-0d62-11dd-bf9f-f2294a2cccec</wsa:MessageID>
        <wsa:To>https://localhost:38004/wsrf/services/NotificationConsumerService</wsa:To>
        <wsa:Action>http://docs.oasis-open.org/wsn/bw-2/NotificationConsumer/NotifyRequest</wsa:Action>
        <From xmlns="http://www.w3.org/2005/08/addressing">
            <Address>http://www.w3.org/2005/08/addressing/anonymous</Address>
        </From>
        <ns06:ResourceID xmlns:ns04="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ns05="http://www.w3.org/2001/XMLSchema"
xmlns:ns06="http://docs.oasis-open.org/wsn/bw-2/provider"
ns04:type="ns05:string"
wsa:IsReferenceParameter="true">df49358a-0d62-11dd-8df2-0017f2cb7d49</ns06:ResourceID>
    </soapenv:Header>
    <soapenv:Body>
        <Notify xmlns="http://docs.oasis-open.org/wsn/b-2">
            <NotificationMessage xmlns:ns1="http://docs.oasis-open.org/wsn/b-2"
xsi:type="ns1:NotificationMessageHolderType">
                <Topic
xmlns:ns2="http://www.globus.org/namespaces/2008/03/gram/job"
Dialect="http://docs.oasis-open.org/wsn/t-1/TopicExpression/Simple">ns2:stateChangeInformation</Topic>
                <ProducerReference>
                    <wsa:Address>https://mactop.local:38000/wsrf/services/ManagedJobFactoryService</wsa:Address>
                    <wsa:ReferenceParameters>
                        <ns3:ResourceID
xmlns:ns3="http://www.globus.org/namespaces/2008/03/gram/job">df73fea0-0d62-11dd-bf9f-f2294a2cccec</ns3:ResourceID>
                    </wsa:ReferenceParameters>
                </ProducerReference>
                <Message>
                    <ns4:stateChangeNotificationMessage
xmlns:ns4="http://www.globus.org/namespaces/2008/03/gram/job">
                        <ns5:state
xmlns:ns5="http://www.globus.org/namespaces/2008/03/gram/job/types">Failed</ns5:state>
                        <ns6:fault
xmlns:ns6="http://www.globus.org/namespaces/2008/03/gram/job/faults"
xsi:type="ns6:FaultResourcePropertyType">
                            <ns6:stagingFault>
                                <ns7:Timestamp
xmlns:ns7="http://docs.oasis-open.org/wsrf/bf-2">2008-04-18T16:17:00.928Z</ns7:Timestamp>
                                <ns8:Originator
xmlns:ns8="http://docs.oasis-open.org/wsrf/bf-2">
                                    <wsa:Address>https://mactop.local:38000/wsrf/services/ManagedJobFactoryService</wsa:Address>
                                    <wsa:ReferenceParameters>
                                        <ns4:ResourceID>df73fea0-0d62-11dd-bf9f-f2294a2cccec</ns4:ResourceID>
                                    </wsa:ReferenceParameters>
                                </ns8:Originator>
                                <ns9:Description
xmlns:ns9="http://docs.oasis-open.org/wsrf/bf-2">Staging error for RSL element
fileStageIn. </ns9:Description>
                                <ns10:FaultCause
xmlns:ns10="http://docs.oasis-open.org/wsrf/bf-2"
xsi:type="ns10:BaseFaultType">
                                    <ns10:Timestamp>2008-04-18T16:17:00.928Z</ns10:Timestamp>
                                    <ns10:ErrorCode
dialect="http://www.globus.org/fault/stacktrace">
        at sun.reflect.GeneratedConstructorAccessor146.newInstance(Unknown
Source)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
        at java.lang.Class.newInstance0(Class.java:350)
        at java.lang.Class.newInstance(Class.java:303)
        at org.globus.exec.utils.FaultUtils.makeFault(FaultUtils.java:530)
        at
org.globus.exec.utils.FaultUtils.createStagingFault(FaultUtils.java:386)
        at
org.globus.exec.service.exec.processing.handler.StagingHandler.processCompletedStagingJob(StagingHandler.java:101)
        at
org.globus.exec.service.exec.processing.handler.StageInResponseStateHandler.process(StageInResponseStateHandler.java:76)
        at
org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49)
        at
org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:118)
        at
org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:67)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:613)
</ns10:ErrorCode>
                                    <ns10:Description>org.globus.exec.generated.StagingFaultType</ns10:Description>
                                    <ns10:FaultCause
xmlns:ns11="http://www.globus.org/namespaces/2004/10/rft"
xsi:type="ns11:TransferFaultType">
                                        <ns10:Timestamp>2008-04-18T16:17:00.928Z</ns10:Timestamp>
                                        <ns10:FaultCause
xsi:type="ns10:BaseFaultType">
                                            <ns10:Timestamp>2008-04-18T16:17:00.928Z</ns10:Timestamp>
                                            <ns10:ErrorCode
dialect="http://www.globus.org/fault/stacktrace">
        at sun.reflect.GeneratedConstructorAccessor104.newInstance(Unknown
Source)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
        at java.lang.Class.newInstance0(Class.java:350)
        at java.lang.Class.newInstance(Class.java:303)
        at
org.apache.axis.encoding.ser.BeanDeserializer.&lt;init&gt;(BeanDeserializer.java:105)
        at
org.apache.axis.encoding.ser.BeanDeserializer.&lt;init&gt;(BeanDeserializer.java:91)
        at
org.globus.rft.generated.TransferFaultType.getDeserializer(TransferFaultType.java:99)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at
org.apache.axis.encoding.DeserializationContext.getDeserializerForClass(DeserializationContext.java:510)
        at
org.apache.axis.encoding.ser.BeanDeserializer.onStartChild(BeanDeserializer.java:302)
        at
org.apache.axis.encoding.DeserializationContext.startElement(DeserializationContext.java:1035)
        at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown
Source)
        at
org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
        at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
Source)
        at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
        at javax.xml.parsers.SAXParser.parse(Unknown Source)
        at
org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
        at
org.globus.wsrf.encoding.ObjectDeserializer.toObject(ObjectDeserializer.java:118)
        at
org.globus.wsrf.encoding.ObjectDeserializer.getFirstAsObject(ObjectDeserializer.java:216)
        at
org.globus.exec.service.exec.processing.StagingListener.deliver(StagingListener.java:183)
        at
org.globus.wsrf.impl.SimpleSubscriptionTopicListener.notify(SimpleSubscriptionTopicListener.java:230)
        at
org.globus.wsrf.impl.SimpleSubscriptionTopicListener.topicChanged(SimpleSubscriptionTopicListener.java:120)
        at org.globus.wsrf.impl.SimpleTopic.topicChanged(SimpleTopic.java:205)
        at org.globus.wsrf.impl.SimpleTopic.notify(SimpleTopic.java:112)
        at
org.globus.wsrf.impl.ResourcePropertyTopic.fireNotification(ResourcePropertyTopic.java:213)
        at
org.globus.wsrf.impl.ResourcePropertyTopic.set(ResourcePropertyTopic.java:241)
        at
org.globus.transfer.reliable.service.TransferWork.setRequestStatus(TransferWork.java:264)
        at
org.globus.transfer.reliable.service.TransferWork.statusChanged(TransferWork.java:218)
        at
org.globus.transfer.reliable.service.TransferWork.processStates(TransferWork.java:583)
        at
org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:852)
        at
org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:376)
        at java.lang.Thread.run(Thread.java:613)
</ns10:ErrorCode>
                                            <ns10:Description>org.globus.rft.generated.TransferFaultType</ns10:Description>
                                        </ns10:FaultCause>
                                        <Timestamp xmlns=""
xsi:type="xsd:dateTime">2008-04-18T16:17:00.914Z</Timestamp>
                                        <Originator xmlns=""
xmlns:ns3="http://www.w3.org/2005/08/addressing"
xsi:type="ns3:EndpointReferenceType">
                                            <Address xmlns=""
xsi:type="ns3:AttributedURIType">https://mactop.local:38000/wsrf/services/ReliableFileTransferFactoryService</Address>
                                            <ReferenceParameters xmlns=""
xsi:type="ns3:ReferenceParametersType">
                                                <ns2:TransferKey
xmlns:ns2="http://www.globus.org/namespaces/2004/10/rft"
xsi:type="xsd:string">27</ns2:TransferKey>
                                            </ReferenceParameters>
                                            <Metadata xmlns=""
xsi:type="ns3:MetadataType" xsi:nil="true"/>
                                        </Originator>
                                        <ErrorCode xmlns=""
dialect="http://www.globus.org/fault/stacktrace">Error updating Permissions of
a file/garbage/echo. Caused by org.globus.ftp.exception.ServerException: Server
refused performing the request. Custom message: Server refused MLST command
(error code 1) [Nested exception message:  Custom message: Unexpected reply:
500-Command failed : System error in stat: No such file or directory
500-A system call failed: No such file or directory
500 End.].  Nested exception is
org.globus.ftp.exception.UnexpectedReplyCodeException:  Custom message:
Unexpected reply: 500-Command failed : System error in stat: No such file or
directory
500-A system call failed: No such file or directory
500 End.
        at
org.globus.ftp.vanilla.FTPControlChannel.execute(FTPControlChannel.java:415)
        at org.globus.ftp.FTPClient.mlst(FTPClient.java:616)
        at
org.globus.transfer.reliable.service.TransferClient.updatePermissions(TransferClient.java:639)
        at
org.globus.transfer.reliable.service.TransferClient.transfer(TransferClient.java:711)
        at
org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:819)
        at
org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:376)
        at java.lang.Thread.run(Thread.java:613)</ErrorCode>
                                        <Description xmlns="">Error updating
Permissions of a file/garbage/echo [Caused by: Server refused performing the
request. Custom message: Server refused MLST command (error code 1) [Nested
exception message:  Custom message: Unexpected reply: 500-Command failed :
System error in stat: No such file or directory
500-A system call failed: No such file or directory
500 End.]]</Description>
                                        <FaultCause xmlns=""
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
soapenc:arrayType="xsd:anyType[1]" xsi:type="soapenc:Array">
                                            <FaultCause xmlns=""
xmlns:ns4="http://docs.oasis-open.org/wsrf/bf-2" xsi:type="ns4:BaseFaultType">
                                                <Timestamp xmlns=""
xsi:type="xsd:dateTime">2008-04-18T16:17:00.915Z</Timestamp>
                                                <Originator xmlns=""
xmlns:ns5="http://www.w3.org/2005/08/addressing"
xsi:type="ns5:EndpointReferenceType" xsi:nil="true"/>
                                                <ErrorCode xmlns=""
xsi:nil="true"/>
                                                <Description xmlns="">Error
updating Permissions of a file/garbage/echo [Caused by: Server refused
performing the request. Custom message: Server refused MLST command (error code
1) [Nested exception message:  Custom message: Unexpected reply: 500-Command
failed : System error in stat: No such file or directory
500-A system call failed: No such file or directory
500 End.]]</Description>
                                                <FaultCause xmlns=""
xsi:type="ns4:BaseFaultType">
                                                    <Timestamp xmlns=""
xsi:type="xsd:dateTime">2008-04-18T16:17:00.915Z</Timestamp>
                                                    <Originator xmlns=""
xmlns:ns6="http://www.w3.org/2005/08/addressing"
xsi:type="ns6:EndpointReferenceType" xsi:nil="true"/>
                                                    <ErrorCode xmlns=""
dialect="http://www.globus.org/fault/stacktrace">Error updating Permissions of
a file/garbage/echo. Caused by org.globus.ftp.exception.ServerException: Server
refused performing the request. Custom message: Server refused MLST command
(error code 1) [Nested exception message:  Custom message: Unexpected reply:
500-Command failed : System error in stat: No such file or directory
500-A system call failed: No such file or directory
500 End.].  Nested exception is
org.globus.ftp.exception.UnexpectedReplyCodeException:  Custom message:
Unexpected reply: 500-Command failed : System error in stat: No such file or
directory
500-A system call failed: No such file or directory
500 End.
        at
org.globus.ftp.vanilla.FTPControlChannel.execute(FTPControlChannel.java:415)
        at org.globus.ftp.FTPClient.mlst(FTPClient.java:616)
        at
org.globus.transfer.reliable.service.TransferClient.updatePermissions(TransferClient.java:639)
        at
org.globus.transfer.reliable.service.TransferClient.transfer(TransferClient.java:711)
        at
org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:819)
        at
org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:376)
        at java.lang.Thread.run(Thread.java:613)</ErrorCode>
                                                    <Description
xmlns="">org.globus.transfer.reliable.service.exception.RftException</Description>
                                                    <FaultCause xmlns=""
xsi:type="xsd:anyType" xsi:nil="true"/>
                                                </FaultCause>
                                            </FaultCause>
                                        </FaultCause>
                                    </ns10:FaultCause>
                                </ns10:FaultCause>
                                <ns6:stateWhenFailureOccurred>StageIn</ns6:stateWhenFailureOccurred>
                                <ns6:gt2ErrorCode>0</ns6:gt2ErrorCode>
                                <ns6:attribute>fileStageIn</ns6:attribute>
                            </ns6:stagingFault>
                        </ns6:fault>
                        <ns12:exitCode
xmlns:ns12="http://www.globus.org/namespaces/2008/03/gram/job/types">0</ns12:exitCode>
                        <ns13:holding
xmlns:ns13="http://www.globus.org/namespaces/2008/03/gram/job/types">false</ns13:holding>
                    </ns4:stateChangeNotificationMessage>
                </Message>
            </NotificationMessage>
        </Notify>
    </soapenv:Body>
</soapenv:Envelope>
------- Comment #4 From 2008-04-18 14:05:40 -------
Note at one point there are elements like:

<Timestamp xmlns=""
xsi:type="xsd:dateTime">2008-04-18T16:17:00.914Z</Timestamp>

which contain no namespace and don't belong at that part of the fault type
sequence.
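
The point about unqualified elements can be checked mechanically. The
following is a minimal sketch, not part of GRAM or Java WS Core: it parses a
trimmed-down fragment modeled on the fault above with the JDK's
namespace-aware DOM parser and counts the Timestamp elements that are in no
namespace. The class name NamespaceCheck, the SAMPLE fragment and the helper
names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class NamespaceCheck {
    // WS-BaseFaults namespace used by the qualified elements in the fault.
    static final String BF2 = "http://docs.oasis-open.org/wsrf/bf-2";

    // Hypothetical, shortened fragment: one qualified Timestamp and one
    // Timestamp that xmlns="" places in no namespace.
    static final String SAMPLE =
          "<ns10:FaultCause xmlns:ns10=\"" + BF2 + "\">"
        + "<ns10:Timestamp>2008-04-18T16:17:00.928Z</ns10:Timestamp>"
        + "<Timestamp xmlns=\"\">2008-04-18T16:17:00.914Z</Timestamp>"
        + "</ns10:FaultCause>";

    // Parses the XML with namespace processing switched on.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Counts elements with the given local name whose namespace URI is null,
    // walking the whole subtree below (and including) the given node.
    static int countUnqualified(Node node, String localName) {
        int count = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE
                && localName.equals(node.getLocalName())
                && node.getNamespaceURI() == null) {
            count++;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countUnqualified(child, localName);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        int n = countUnqualified(parse(SAMPLE), "Timestamp");
        System.out.println("Timestamp elements without a namespace: " + n);
    }
}
```

Running the same walk over the full notification message would flag every
element that was serialized with xmlns="".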
------- Comment #5 From 2008-04-18 14:20:50 -------
Reassigning this as a job service bug.
------- Comment #6 From 2008-05-14 07:53:01 -------
Mike D'Arcy made a change in Java WS Core that fixed this issue.
------- Comment #7 From 2008-05-14 07:55:03 -------
Here's what he did:

"I think I may have found a solution to this.  Attached is a diff
which shows the change required.

Here it is in a nutshell: in local invocation mode,
SimpleSubscriptionTopicListener.notify() method needs to create a
MessageContext before serializing the notification message and calling
the registered callback's deliver() method.  Doing this causes the
correct TypeMappingRegistry to be found, and the serialization errors
disappear.

In addition to the above code change, BaseFaultTypeFaultCause needs to
have a typemapping in client-config.wsdd.  This is because the core API
that I used to create a MC (ContainerConfig.getContext()) uses an Axis
client engine rather than a server engine and thus reads from
client-config.wsdd rather than server-config.wsdd."
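
The mechanism described in the quote can be illustrated with a toy model.
This is only a sketch under stated assumptions and does not use the Axis or
Java WS Core APIs: it assumes the serializer finds its type mappings through
a per-thread "current context", so a local call made without one falls back
to a registry that lacks the fault type. Every name here
(ContextLookupSketch, Registry, CURRENT, notifyLocally) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ContextLookupSketch {
    // Hypothetical stand-in for a type mapping registry: type name -> serializer name.
    static final class Registry {
        private final Map<String, String> mappings = new HashMap<>();

        Registry with(String type, String serializer) {
            mappings.put(type, serializer);
            return this;
        }

        String serializerFor(String type) {
            String serializer = mappings.get(type);
            if (serializer == null) {
                throw new IllegalStateException("No serializer registered for " + type);
            }
            return serializer;
        }
    }

    // Fallback registry used when no context is set; it knows no fault types.
    static final Registry DEFAULT = new Registry();
    // Registry that a properly created context would carry.
    static final Registry CONFIGURED = new Registry().with("BaseFaultType", "BeanSerializer");

    // Hypothetical stand-in for the per-thread current message context.
    static final ThreadLocal<Registry> CURRENT = new ThreadLocal<>();

    // Looks up the serializer through whatever context the thread has.
    static String serialize(String type) {
        Registry registry = CURRENT.get();
        if (registry == null) {
            registry = DEFAULT;
        }
        return registry.serializerFor(type);
    }

    // Models the fix: establish a context before serializing, then restore
    // whatever was there before.
    static String notifyLocally(String type) {
        Registry previous = CURRENT.get();
        CURRENT.set(CONFIGURED);
        try {
            return serialize(type);
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    public static void main(String[] args) {
        try {
            serialize("BaseFaultType");
        } catch (IllegalStateException e) {
            System.out.println("Without a context: " + e.getMessage());
        }
        System.out.println("With a context: " + notifyLocally("BaseFaultType"));
    }
}
```

The try/finally in notifyLocally is a design choice of the sketch: it keeps
the thread's previous context intact so that a nested or later call is not
affected by the temporary one.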