Bugzilla – Bug 4902
GRAM4 auditing on events
Last modified: 2007-06-27 09:51:06
Title: GRAM4 auditing on events (not just end of job)

Technologies: Globus Resource Allocation Manager (GRAM)

Definition:
GRAM4 auditing outputs an audit record for a job only at the end of the job. There is information in the audit record that could be of value to a user/admin/LRM interface during the execution of the job. The proposed change is to insert the job audit record at the beginning of the job, update it after the job submission to the Local Resource Manager (LRM) succeeds, and update it again at the end of the job.

Here are the current audit schema fields:
  job_grid_id
  local_job_id
  subject_name
  username
  idempotence_id
  creation_time
  queued_time
  stage_in_grid_id
  stage_out_grid_id
  clean_up_grid_id
  globus_toolkit_version
  resource_manager_type
  job_description
  success_flag

These are the fields that will be inserted on the initial update after the createManagedJob operation:
  job_grid_id
  subject_name
  username
  idempotence_id
  creation_time
  resource_manager_type
  job_description

These are the fields that will be added after the LRM job submission is done:
  local_job_id
  queued_time
  stage_in_grid_id

These are the fields that will be added at the end of the job:
  stage_out_grid_id
  clean_up_grid_id
  success_flag

Deliverables:
1) A patch to GRAM4 auditing code that applies to the community branch
2) Unit test(s) that exercise the new code

Tasks:
1) Build the Globus Toolkit from the globus_4_0_community branch
2) Test the installation by verifying that the current end-of-job audit capabilities work
3) Modify code to insert the initial audit record after the reply to the createManagedJob (after the MEJR is created)
4) Modify code to augment the audit record with new information after the LRM job submission is done
5) Modify code to augment the audit record with new information after the job has completed
6) Write unit tests for the code
7) Test the code
8) Test the performance impact of the change (these could be done by us - GRAM team)
   a. sequential tests (for 20 iterations)
      - Measure the time for a globusrun-ws job to complete before changes
      - Measure the time for a globusrun-ws job to complete after changes
   b. concurrent tests (for 3 iterations)
      - Measure the time for a globusrun-ws job to complete before changes
      - Measure the time for a globusrun-ws job to complete after changes
9) Create a patch for these audit enhancements and unit tests
10) Post to the bugzilla campaign
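
The three-stage record lifecycle in the definition can be sketched in Java as a mapping from stage to the columns that become known at that stage. This is only an illustration, not the GRAM4 code: the class name AuditPhases, the Phase names, and the generated SQL are assumptions, and the table name gram_audit_table is taken from a later comment in this thread. The sketch follows the three field lists exactly as written in the description, so globus_toolkit_version is not assigned to any stage. It uses List.of, which needs Java 9 or later.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: which audit columns are written at each stage of a job. */
final class AuditPhases {
    /** Stage names are illustrative, not GRAM4 identifiers. */
    enum Phase { JOB_CREATED, LRM_SUBMITTED, JOB_FINISHED }

    // Table name as used later in this thread; treat it as an assumption.
    private static final String TABLE = "gram_audit_table";

    static final Map<Phase, List<String>> FIELDS = new EnumMap<>(Phase.class);
    static {
        FIELDS.put(Phase.JOB_CREATED, List.of(
                "job_grid_id", "subject_name", "username", "idempotence_id",
                "creation_time", "resource_manager_type", "job_description"));
        FIELDS.put(Phase.LRM_SUBMITTED, List.of(
                "local_job_id", "queued_time", "stage_in_grid_id"));
        FIELDS.put(Phase.JOB_FINISHED, List.of(
                "stage_out_grid_id", "clean_up_grid_id", "success_flag"));
    }

    /** Builds a parameterized statement: INSERT for the first stage, UPDATE afterwards. */
    static String sqlFor(Phase phase) {
        List<String> cols = FIELDS.get(phase);
        if (phase == Phase.JOB_CREATED) {
            String marks = String.join(", ", Collections.nCopies(cols.size(), "?"));
            return "INSERT INTO " + TABLE + " (" + String.join(", ", cols)
                    + ") VALUES (" + marks + ")";
        }
        // Later stages fill in columns on the existing row, keyed by job_grid_id.
        return "UPDATE " + TABLE + " SET " + String.join(" = ?, ", cols)
                + " = ? WHERE job_grid_id = ?";
    }
}
```

Calling AuditPhases.sqlFor(AuditPhases.Phase.LRM_SUBMITTED) yields an UPDATE that sets only the three LRM columns, which is the point of the proposal: each stage writes only what has just become known.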
Just so you know, you'll have to completely redo how audit records are written. The DB appender thing isn't going to cut it. That's why we didn't do this from the beginning.
(In reply to comment #1)
> Just so you know, you'll have to completely redo how audit records are written.
> The DB appender thing isn't going to cut it. That's why we didn't do this from
> the beginning.

I didn't quite understand what you meant by "you'll have to completely redo how audit records are written." I was able to create a quick, simple patch to the StateMachine.logAuditRecord() method that lets the AuditDatabaseAppender choose whether to do an insert or an update on the gram_audit_database table. Attached is the patch. Is this acceptable to everyone?

The only issue I have is writing a JUnit test for this. Can you point me to a reusable JUnit test you have written for the WSGRAM audit patch? How do you want me to test the patch? Should I be testing for GRAM state changes and their effect on the entries in the DB?
Created an attachment (id=1144)
patch which updates auditDatabase every time the user's job changes its state
WRT comment #2, I just meant that at some point it becomes silly to use log4j for database access. Once you start differentiating between types of database updates, it would be better to just scrap log4j and go with a straight method call to a DB class. That said, I wasn't aware at the time I wrote my comment that this was assigned to you, Gerson. If you already did this then I'm not going to complain. If we add this to the 4.0 code base, though, we ought to also port it to HEAD. The HEAD version is significantly different and would be easier to convert to not be based on log4j. The only reason I originally went the log4j route is because they had an appender that hid the DB calls. In HEAD, I do all the DB calls explicitly. Anyway, I'll try to take a look at the patch you submitted today.
(In reply to comment #4) Have you already had a look at the audit patch I created for the GT4 community branch? If you want this patch not to be based on log4j, do you have any suggestions on how users will disable the audit database logging if we are not going to use the log4j appender?
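
One possible shape for the direct-call approach raised in comment #4, including a way to switch auditing off without log4j as asked in comment #5, is sketched below. Everything here is hypothetical: AuditStore, DisabledAuditStore, InMemoryAuditStore, AuditStores, and the auditEnabled flag are illustrative names, not GRAM4 classes or settings, and the in-memory map merely stands in for real database calls.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical interface: audit writes as plain method calls instead of log4j messages. */
interface AuditStore {
    void insert(String jobGridId, Map<String, String> fields);
    void update(String jobGridId, Map<String, String> fields);
}

/** No-op implementation; selecting it would disable auditing. */
final class DisabledAuditStore implements AuditStore {
    @Override public void insert(String jobGridId, Map<String, String> fields) { }
    @Override public void update(String jobGridId, Map<String, String> fields) { }
}

/** In-memory stand-in for a database-backed store, used here only for illustration. */
final class InMemoryAuditStore implements AuditStore {
    final Map<String, Map<String, String>> rows = new HashMap<>();

    @Override public void insert(String jobGridId, Map<String, String> fields) {
        Map<String, String> row = new LinkedHashMap<>(fields);
        row.put("job_grid_id", jobGridId);
        rows.put(jobGridId, row);
    }

    @Override public void update(String jobGridId, Map<String, String> fields) {
        Map<String, String> row = rows.get(jobGridId);
        if (row == null) {
            throw new IllegalStateException("no audit record for " + jobGridId);
        }
        row.putAll(fields); // add or overwrite only the supplied columns
    }
}

final class AuditStores {
    /** auditEnabled is a hypothetical configuration flag, not an existing GRAM4 option. */
    static AuditStore create(boolean auditEnabled) {
        return auditEnabled ? new InMemoryAuditStore() : new DisabledAuditStore();
    }
}
```

With this shape the job state machine would call insert once and update twice per job, and the choice between a real store and the no-op store would be made in one place at startup.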
Gerson,

The patch looks good except for the fact that it doesn't seem to accommodate job manager audit logging. The main() method takes CSV records that are generated by the GT2 job manager and generates logging statements. But it doesn't look like main() was updated to prepend "insert:" so that it will work properly. This should be trivial to fix. If you can make that minor change, I'll commit it today.

As for the HEAD version, you can just do something similar to what you did for the community branch for now. I'm not sure it's going to be high on anybody's priority list to change things not to use log4j anytime soon.

Peter
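
As a rough illustration of the prefix convention mentioned above, the sketch below splits a logged message into an operation and a payload. Only the "insert:" prefix is named in this thread; the "update:" prefix, the AuditMessage class, and its fields are assumptions made for the example, and the actual patch may encode the operation differently.

```java
/** Hypothetical sketch of dispatching on a message prefix inside an appender. */
final class AuditMessage {
    enum Kind { INSERT, UPDATE }

    private static final String INSERT_PREFIX = "insert:";
    // Assumed counterpart to "insert:"; not confirmed by the thread.
    private static final String UPDATE_PREFIX = "update:";

    final Kind kind;
    final String payload;

    private AuditMessage(Kind kind, String payload) {
        this.kind = kind;
        this.payload = payload;
    }

    /** Parses a prefixed log message; rejects messages without a known prefix. */
    static AuditMessage parse(String message) {
        if (message.startsWith(INSERT_PREFIX)) {
            return new AuditMessage(Kind.INSERT,
                    message.substring(INSERT_PREFIX.length()).trim());
        }
        if (message.startsWith(UPDATE_PREFIX)) {
            return new AuditMessage(Kind.UPDATE,
                    message.substring(UPDATE_PREFIX.length()).trim());
        }
        throw new IllegalArgumentException("audit message has no known prefix");
    }
}
```

A record produced without the prefix, as main() did before the fix requested here, would be rejected by parse rather than silently written.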
Created an attachment (id=1160)
patch which updates auditDatabase every time the user's job changes its state

Hi Peter,

Sorry for the delayed response. I was away on holidays for the past couple of weeks.

I basically just added an "insert: " string to the Logger.info() method call and inserted another line in the output of the "--help" option describing the finished_flag column that I added to the gram_audit_table table. This will probably require you guys to modify the GT2 source that outputs the CSV file that you were talking about. Alternatively, I can modify the source again so that it assumes everything inserted into gram_audit_table has a finished_flag value of true. Let me know if you still want me to make any changes to the source.

With regard to patching the audit stuff on the HEAD branch, I might need help doing that. I will need to get the container working first before I can start hacking the source. I sent an email to the gram-dev list on 12/15/2006 with the subject "Audit Patch on GT4's HEAD branch" but still haven't received any response. The jobs that I send to the container (HEAD version) fail because of some RFT issues that I couldn't fix. I am still using postgres because I want to avoid installing Derby just to get HEAD working.
(In reply to comment #7)
> Created an attachment (id=1160)
> patch which updates auditDatabase everytime the user's job changes its state
>
> Hi Peter,
>
> Sorry for the delayed response. I was away on holidays for the past couple of
> weeks.

So were we, so no big deal.

> I basically just added an "insert: " string to the Logger.info() method call
> and inserted another line to the output of the "--help" option describing the
> finished_flag column that I added to the gram_audit_table table. This probably
> will require you guys to modify the GT2 source that outputs the CSV file that
> you were talking about. Or, I can again modify the source so it assumes that
> all the info that gets inserted in the gram_audit_table has a finished_flag
> value of true. Let me know if you still want me to make any changes to the
> source.

This can be assigned over to Joe Bester when you're done so he can add the extra column.

> With regard to patching the audit stuff on the HEAD branch, I might need help
> on doing that. I will need to get the container working first before I can
> start hacking the source. I sent an email to the gram-dev list 12/15/2006 with
> the subject "Audit Patch on GT4's HEAD branch" but still haven't received any
> response from it. The jobs that I send to the container (HEAD version) fails
> because of some RFT issues that I couldn't fix. I am still using postgres
> because I want to avoid installing Derby just to get the HEAD working.

I'll look for your gram-dev message. Also, you don't have to install Derby since it's already deployed with the GT in HEAD. That's the whole point of using it (i.e. out-of-box DB functionality).
(In reply to comment #7)
> With regard to patching the audit stuff on the HEAD branch, I might need help
> on doing that. I will need to get the container working first before I can
> start hacking the source. I sent an email to the gram-dev list 12/15/2006 with
> the subject "Audit Patch on GT4's HEAD branch" but still haven't received any
> response from it. The jobs that I send to the container (HEAD version) fails
> because of some RFT issues that I couldn't fix. I am still using postgres
> because I want to avoid installing Derby just to get the HEAD working.

I can't seem to find the email you speak of. I show nothing but the automated test emails on gram-dev around the date you specify. I can't find it in gram-user, discuss, friends, gt-user, or gt-dev either. I also searched my trash and junk folders.
(In reply to comment #9) I just forwarded you the mail I sent to the gram-dev list about the problem I was having configuring the HEAD to use postgres. I also tried using derby but had problems getting it to work. Details are in the mail I sent to the gram-user/dev list.
Created an attachment (id=1202)
GT4 HEAD WSGRAM start-of-job audit patch

Here's the patch which will enable the start-of-job audit info logging on GT4's HEAD branch.
I had a look, didn't have any complaints and applied that patch to the code from HEAD
The new audit logging model, which logs information about the job
* at the very beginning of job processing
* after the job has been submitted to the LRM
* at the end of processing (failed or done)
is also in the 4.0 branch and hence will be part of 4.0.5, which will probably be sent out on Monday.

Gerson: I think we're done with that then, right? If so, can you close this campaign?