Bugzilla – Bug 3751
convert persistence data store from files to a database
Last modified: 2007-10-26 21:46:36
People: Ravi (developer), Stu & Peter (advisory/design review)

Component impact: WS GRAM (MJFS, MEJS)

Definition: Convert MJFS and MEJS to maintain their job state information in a DB.

Benefits:

Provide the capability for future enhancements to efficiently support clients doing monitoring and control operations that apply to multiple jobs:
- kill all of Ravi’s jobs
- show me all jobs that have been running more than 2 hours

Provide the capability for future enhancements to efficiently support aggregated views of service information:
- job submission rate
- job failure rate

Improve throughput:
- job state updates should be more efficient and faster. Instead of rewriting a file for each update, only update a field in a row.

Improve reliability:
- less thrashing from clients' polling
  + currently job state polling can cause a read of the state file from disk. Converting to DB queries should be more efficient.

Tasks:
1. Replace resource persistence calls with equivalent DB calls
2. Make sure all queries are SQL 92 (Intermediate-level conformance) compliant
3. Use an embedded database by default for usability (use Derby, leverage RFT work)
4. Other databases should be easily configurable for use instead of Derby
5. Document new configuration - admin guide
6. Testing: run these tests for combinations of the default GT container, Tomcat, Derby and other DBs (PostgreSQL, Oracle, MySQL):
   a) globus_wsrf_gram_scheduler_test
   b) recoverability test
   c) throughput test results for TRUNK and campaign branch with monitoring via notification and polling

Resources: 1 developer for 4 weeks
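As a rough illustration of the benefits listed above, here is a minimal sketch of a single-field state update and the two multi-job queries. It uses Python's built-in sqlite3 module as a stand-in for Derby, and the table name job_state and its columns are hypothetical, not taken from the attached schemas.

```python
import sqlite3

# sqlite3 stands in for Derby; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job_state (
        job_id     VARCHAR(256) PRIMARY KEY,
        owner      VARCHAR(64)  NOT NULL,
        state      VARCHAR(11)  NOT NULL,
        start_time BIGINT       NOT NULL
    )
    """
)

HOUR_MS = 60 * 60 * 1000
now_ms = 10 * HOUR_MS  # fixed "current time" (epoch ms) so the example is deterministic

conn.executemany(
    "INSERT INTO job_state VALUES (?, ?, ?, ?)",
    [
        ("job-1", "ravi", "Active", now_ms - 3 * HOUR_MS),
        ("job-2", "ravi", "Pending", now_ms - 1 * HOUR_MS),
        ("job-3", "stu", "Active", now_ms - HOUR_MS // 2),
    ],
)

# A state change touches one field in one row instead of rewriting a file.
conn.execute("UPDATE job_state SET state = ? WHERE job_id = ?", ("Active", "job-2"))

# "Show me all jobs that have been running more than 2 hours."
long_running = [
    row[0]
    for row in conn.execute(
        "SELECT job_id FROM job_state "
        "WHERE state = 'Active' AND start_time < ? ORDER BY job_id",
        (now_ms - 2 * HOUR_MS,),
    )
]

# "Kill all of Ravi's jobs": one query selects every target job.
ravi_jobs = [
    row[0]
    for row in conn.execute(
        "SELECT job_id FROM job_state WHERE owner = ? ORDER BY job_id", ("ravi",)
    )
]
```

With file-based persistence, both selections would require opening and parsing every job's state file; here each is a single query.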
*** Bug 3697 has been marked as a duplicate of this bug. ***
Btw, please let me know if this campaign will deal with persisting the subscriptions to a DB. This is something that should probably be part of core as other people might reuse that functionality.
Jarek, I was thinking about doing that for RFT and, if I get it to work, maybe doing it for GRAM. My intention was just to play around with it, but we can talk about making it more general.
Created an attachment (id=705) [details] gram database schema
For some reason my comments were lost when I attached the gram database schema. Here they are again: This is the initial version of the gram db schema I came up with. If you think some things should be there and are missing, let me know. I will start working off this schema and add stuff as I traverse the code base. Let me know if you have comments.
The schema needs to contain all of the fields of the *ResourceDataType complexTypes. These types are defined in the *_data.xsd files. ManagedExecutableJobResourceDataType and ManagedMultiJobResourceDataType inherit elements from ManagedJobResourceDataType. It probably makes sense to create two tables, one for exec jobs and one for multi jobs (both contain the same base type fields).
I will attach version 2 of the db schema. Please comment.
Created an attachment (id=707) [details] db schema version 2 Version 2 of database schema from GRAM
I compare the XML schema types, but it looks good. Just a couple of comments:

-- JOB_DESCRIPTION is RSL in the form of a STRING

Try not to use "RSL" anywhere. We're trying to leave RSL in GT2 since it defined a proprietary language. I think JOB_DESCRIPTION is self explanatory; but if you need a description, I'd recommend something like "the XML job description document".

-- STATE is more likely be one of the values listed in InternalStateEnumeration

INTERNAL_STATE is of this type. STATE is of type StateEnumeration. This is the current *external* state (i.e. what the client sees).

"MANAGED_JOB_EXEC_JOB_DATA" should be "MANAGED_EXEC_JOB_DATA".
I *didn't* compare the schemas, rather.
Is the job id always a UUID? UUIDs are of constant size - ~32 characters if I remember correctly (in the text form). Also, why are the different *_TIME fields defined as DOUBLE? BIGINT would probably make more sense.
IIRC, you have to give the UUID in the 32char format. I think I helped someone using the GramJob class and had problems getting it working because he used an arbitrary string for the job id.
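For reference on the sizes being discussed, a small sketch using Python's standard uuid module (not the GRAM client code) shows the two common text forms of a UUID, plus a timestamp held as a whole number of milliseconds, which is the kind of value a BIGINT column stores without needing floating point.

```python
import time
import uuid

job_id = uuid.uuid4()

canonical = str(job_id)  # hyphenated text form: 8-4-4-4-12 hex digits
compact = job_id.hex     # the same value as hex digits only

# A timestamp as integer milliseconds since the epoch; fits a 64-bit BIGINT.
creation_time_ms = int(time.time() * 1000)
```

The hyphenated form is 36 characters and the bare hex form is 32, which is where the "32-36 characters" range comes from.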
Looks like I'm taking this over...
Created an attachment (id=792) [details] db schema version 3
1) What about the primary & foreign keys?
2) What about JOBID being 32-36 characters long?
3) What about *_TIME fields being defined as DOUBLE?
4) Are all the 'STATE' fields different or the same? Some are defined as VARCHAR(11) and some VARCHAR(256). If they are all the same, why not define it as an int?
5) What are the SUBSCRIPTION_ENDPOINTS & TOPIC_LISTENERS tables used for & how?
1) What about the primary & foreign keys?

I wasn't sure Derby supported that. I'm having trouble finding good documentation on that.

2) What about JOBID being 32-36 characters long?

I thought job IDs could be arbitrarily long.

3) What about *_TIME fields being defined as DOUBLE?

Good catch. I was seeing this as LONG and didn't change it. No need to use a f.p. number here.

4) Are all the 'STATE' fields different or the same? Some are defined as VARCHAR(11) and some VARCHAR(256). If they are all the same, why not define it as an int?

They're not all the same, but yeah, VARCHAR(256) is stupid. I should make those INTEGERs. The first STATE field should hold a string, and I determined the maximum size to be 11 characters.

5) What are the SUBSCRIPTION_ENDPOINTS & TOPIC_LISTENERS tables used for & how?

IIRC, the TOPIC_LISTENERS are used to know who was subscribed. I think the SUBSCRIPTION_ENDPOINTS are used for restoring the subscription to RFT.
Isn't a JOBID in GRAM always a UUID? I was wondering about the SUBSCRIPTION_ENDPOINTS & TOPIC_LISTENERS tables because they only have 1 field. There is no mapping, for example, between a JOBID and a specific endpoint row or topic listener. See: http://db.apache.org/derby/docs/10.1/ref/
Job IDs can be specified during the submission by the client, and there's nothing dictating that it has to be a UUID AFAIK. I did notice the missing JOB_ID keys and fixed those. I just haven't put up a new attachment.
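To make the key discussion concrete, here is a minimal sketch of a job table with a primary key and a listener table that maps each row back to a job through a foreign key. It runs on Python's built-in sqlite3 as a stand-in for Derby. The table names come from this thread, but every column definition is an assumption and not the attached schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Column names, types and sizes below are assumptions for illustration.
conn.executescript(
    """
    CREATE TABLE MANAGED_EXEC_JOB_DATA (
        JOB_ID          VARCHAR(256) NOT NULL PRIMARY KEY,
        STATE           VARCHAR(11)  NOT NULL,
        JOB_DESCRIPTION VARCHAR(4000),
        CREATION_TIME   BIGINT
    );
    CREATE TABLE TOPIC_LISTENERS (
        JOB_ID   VARCHAR(256)  NOT NULL REFERENCES MANAGED_EXEC_JOB_DATA (JOB_ID),
        LISTENER VARCHAR(1024) NOT NULL
    );
    """
)

# The job ID is a client-chosen string here, not necessarily a UUID.
conn.execute(
    "INSERT INTO MANAGED_EXEC_JOB_DATA VALUES (?, ?, ?, ?)",
    ("my-job", "Pending", "<job/>", 0),
)
conn.execute("INSERT INTO TOPIC_LISTENERS VALUES (?, ?)", ("my-job", "listener-a"))

# A listener row that points at a nonexistent job is rejected by the foreign key.
try:
    conn.execute("INSERT INTO TOPIC_LISTENERS VALUES (?, ?)", ("no-such-job", "listener-b"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

listeners = [
    row[0]
    for row in conn.execute(
        "SELECT LISTENER FROM TOPIC_LISTENERS WHERE JOB_ID = ?", ("my-job",)
    )
]
```

With the JOB_ID column in place, the listeners for a given job can be looked up directly, which a single-field table cannot do.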
Oh, ok. Cool.
Created an attachment (id=803) [details] latest DB schema
The new schema looks good. But I think all the 'STATE VARCHAR(11)' would probably work better as an integer.
What do you mean by "work better"? I'd have to create some arbitrary integer map and convert to and from that every time I save or retrieve one of those values. It's not that big of a deal, but what's the benefit?
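The trade-off in question can be sketched as follows. This is an illustration in Python, not the GRAM code, and the state names are assumed for the example. Storing the name needs no mapping table, while storing an integer needs a mapping that must stay fixed once rows have been persisted.

```python
from enum import IntEnum


class JobState(IntEnum):
    # State names and integer codes are assumptions for illustration.
    Unsubmitted = 0
    Pending = 1
    Active = 2
    Done = 3
    Failed = 4


def to_int_column(state: JobState) -> int:
    """Value to store if STATE is an INTEGER column."""
    return int(state)


def from_int_column(value: int) -> JobState:
    """Convert a stored integer back; raises ValueError for unknown codes."""
    return JobState(value)


def to_varchar_column(state: JobState) -> str:
    """Value to store if STATE is a VARCHAR column."""
    return state.name


def from_varchar_column(value: str) -> JobState:
    """Convert a stored name back; raises KeyError for unknown names."""
    return JobState[value]


# Width a VARCHAR column would need to hold the longest name in this enum.
longest_name = max(len(state.name) for state in JobState)
```

Either way a conversion happens on every save and load. The integer form gives a fixed-width column at the cost of rows that are unreadable without the map, and renumbering the enum later would silently change the meaning of existing rows.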
Basic single jobs are working with the new DB persistence code. If all goes well with testing then all that will be left to do is update the documentation. Since we still don't have a development documentation repository I will probably have to stall the campaign.
Scheduler tests for Fork and Multi are passing. I have yet to run recoverability or throughput tests. I have also yet to test in Tomcat or even create a DB schema for another DB application other than Derby.
Recoverability was broken partly because of the core change from a single, shared container security descriptor per JVM to one per container. This requires a message context to be able to get the correct one associated with a particular container. The managed job factory home caches the security descriptor while it's running, but upon restart this value is null and remains null unless another createManagedJob() call comes in. I've hacked a fix for this, but it requires that the container's cred subject be in the grid-mapfile. I find this unacceptable, so I'm consulting with Jarek and Rachana to find out if there's a better way to do this. I've committed the fix to the branch and will resume testing, but hopefully I can go back and put a better fix in later.
Ok, the solution to comment #25 turned out to be "Peter, you're dumb." It seems that there are all sorts of places to get that context. After tracking down a missing fault bug and changing the code to ignore file deletions if it's just because the file did not exist (it logs a warning, though), recoverability tests are passing now. Thus, all that's left is to test in Tomcat and test with a different DB app. Again, I'm assuming the doc won't get updated and the campaign will then be stalled.
I successfully tested with PostgreSQL. For when I'm able to write the documentation, here's what I changed the driver config to:

    <parameter>
        <name>driverClassName</name>
        <value>org.postgresql.Driver</value>
    </parameter>
    <parameter>
        <name>url</name>
        <value>jdbc:postgresql:gramDatabase</value>
    </parameter>
    <parameter>
        <name>username</name>
        <value>lane</value>
    </parameter>
I can't get Tomcat to work. I filed a core bug since globus_4_0_branch works fine but my installs from HEAD and this campaign branch don't. I've listed that bug as a dependency of this campaign.
I finally got throughput tests running. I was having trouble with a threading problem hosing the container if the job parallelism got too big. It looks like this is actually in the connection pool implementation for DB connections: http://forum.java.sun.com/thread.jspa?threadID=658047&messageID=3866157 If I increase the maxActive value in the jndi-config.xml to match the number of run queue threads, then I don't seem to have any more problems. I'll have to put a warning comment above both of those settings until someone fixes this in the JVM (if that ever happens) to say that the connection pool size should be equal to the number of run queue threads (a larger pool size would be pointless since all access to the DB should be restricted by the number of run queue threads).
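The sizing rule can be illustrated with a toy model in Python. It does not use the real connection pool or the container's run queue. The constants RUN_QUEUE_THREADS and MAX_ACTIVE are hypothetical stand-ins for the two settings, and the "connections" are placeholder strings.

```python
import queue
import threading

RUN_QUEUE_THREADS = 4           # hypothetical stand-in for the run queue thread count
MAX_ACTIVE = RUN_QUEUE_THREADS  # pool size matched to the worker count

pool = queue.Queue()
for n in range(MAX_ACTIVE):
    pool.put(f"conn-{n}")       # placeholder objects, not real DB connections

jobs = queue.Queue()
for n in range(20):
    jobs.put(n)

lock = threading.Lock()
stats = {"in_use": 0, "peak": 0}
completed = []


def worker():
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        conn = pool.get()       # would block only if every connection were checked out
        with lock:
            stats["in_use"] += 1
            stats["peak"] = max(stats["peak"], stats["in_use"])
        try:
            with lock:
                completed.append(job)  # stands in for the DB work done with conn
        finally:
            with lock:
                stats["in_use"] -= 1
            pool.put(conn)      # always return the connection to the pool


threads = [threading.Thread(target=worker) for _ in range(RUN_QUEUE_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker holds at most one connection at a time, so with the pool sized to the worker count no worker ever waits on the pool, and connections beyond that count would never be checked out.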
Bug #4193 was fixed and I've been able to get the branch to work with Tomcat. Things look good there. I'm running throughput tests right now. It looks like the throughput has dropped about 10 jobs per minute, so the design I implemented that pulls resource data and property values from the DB each time might not have been the best idea. It will probably be worth it to cache all these values at some point.
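One possible shape for such caching is a read-through, write-through cache in front of the table. The sketch below is in Python with sqlite3 standing in for the real database; the class, table and column names are hypothetical, and this is not the design that was implemented on the branch.

```python
import sqlite3


class CachedJobStates:
    """Serve reads from memory after the first lookup; send every write to the DB."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}
        self.db_reads = 0  # counts how often a read actually hit the database

    def get(self, job_id):
        if job_id not in self._cache:
            self.db_reads += 1
            row = self._conn.execute(
                "SELECT state FROM job_state WHERE job_id = ?", (job_id,)
            ).fetchone()
            if row is None:
                raise KeyError(job_id)
            self._cache[job_id] = row[0]
        return self._cache[job_id]

    def set(self, job_id, state):
        # Write to the DB first so a restart can recover, then refresh the cache.
        self._conn.execute(
            "UPDATE job_state SET state = ? WHERE job_id = ?", (state, job_id)
        )
        self._conn.commit()
        self._cache[job_id] = state


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_state (job_id VARCHAR(256) PRIMARY KEY, state VARCHAR(11))")
conn.execute("INSERT INTO job_state VALUES ('job-1', 'Pending')")
conn.commit()

states = CachedJobStates(conn)
polled = [states.get("job-1") for _ in range(5)]  # five polls, one DB read
states.set("job-1", "Active")
after_update = states.get("job-1")
```

This only stays correct if the service is the sole writer to the table; another process updating the same rows would leave the cache stale.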
Here are some throughput numbers for the campaign_540_branch code: http://www-unix.mcs.anl.gov/~lane/Test-reports/GRAM/Throughput/campaign_540_branch/ The comparison numbers for HEAD are here: http://www-unix.mcs.anl.gov/~lane/Test-reports/GRAM/Throughput/HEAD/ As one can see, there was a drop of about 10 jobs/min from the average. Because of this we need another campaign to figure out how to improve the performance before we merge the campaign branch to the trunk. I'm stalling this campaign again because of that and the fact that a 4.2 documentation branch still isn't ready.
The 4.2-drafts doc branch is in place, but I haven't gotten a chance to get back to the doc part of this campaign. I'll update the status when I'm prepared to work on it.
Removed DB persistence from trunk a while ago. There had been some problems with it that were hard to track down, and it added too much complexity to the GRAM logic. It looks like Core will provide something for that in the short or medium term.