Bugzilla – Bug 4182
Improve Condor/Fork Job Monitoring for reliability and security
Last modified: 2012-09-05 13:38:55
Title: Improve Condor/Fork Job Monitoring for reliability and security
Projects: VDT, OSG, TG
Technologies: WS-GRAM

Definition:
The SEG modules for Condor and Fork that GT 4.0.1 uses require that the job log files be world writable. This can present a situation where malicious users may cause spurious job state changes which can affect any other user's jobs. There is also a potential that a malicious user could cause otherwise unpredictable behavior by writing incorrectly-formatted job state change messages in the log file.

Deliverables:
1) Updated Job Monitoring software for WS-GRAM which includes new monitoring code which can handle having multiple different log files per scheduler.
2) Updated Condor and Fork scheduler modules which handle per-user scheduler logs.
3) Updated setup packages and administrator documentation for configuring/securing an installation.

Tasks:
1) Modify the Job State Monitor to include subscriptions based on user name + Job ID, so that it can verify the origin of SEG events.
2) Update the SEG protocol to include a user name in the events so that the SEG can communicate the origin of SEG events to the Java code.
3) Update test cases for SEG to include tests for multiple log files, verifying ownership.
4) Update recovery code in WS-GRAM to handle recovering state from multiple log files if applicable. Update SEG executable to include a way to restart from different timestamps for different log files if necessary.
5) Update the setup package for WS-GRAM to create a world-writable, sticky directory for log files. Update the condor and fork scheduler setup packages and scripts to look to that directory for log files.
6) Update Sys Admin and SEG documentation.

Time Estimate: 3-4 weeks
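The origin check described in tasks 1 and 3 could look roughly like the sketch below. It is written in Python for brevity and is not the WS-GRAM Java code or the SEG protocol. The names `JobStateMonitor`, `handle_event` and `is_trusted_log` are hypothetical, users are identified by numeric uid rather than user name, and the rule that a per-user log must not be group- or world-writable is an assumption about how the logs would be secured.

```python
import os
import stat


def is_trusted_log(path, expected_uid):
    """Return True if path is a regular file owned by expected_uid
    that no other user can write to (assumed policy, see lead-in)."""
    st = os.lstat(path)  # lstat: do not follow a symlink planted by another user
    if not stat.S_ISREG(st.st_mode):
        return False
    if st.st_uid != expected_uid:
        return False
    return not (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))


class JobStateMonitor:
    """Hypothetical monitor with subscriptions keyed by (uid, job ID)."""

    def __init__(self):
        self._subscriptions = set()
        self.states = {}

    def subscribe(self, uid, job_id):
        self._subscriptions.add((uid, job_id))

    def handle_event(self, event_uid, job_id, state):
        """Apply a state change only if it originates from the subscribed user.

        event_uid is the owner of the log file the event was read from.
        """
        key = (event_uid, job_id)
        if key not in self._subscriptions:
            return False  # event from another user's log: ignore it
        self.states[key] = state
        return True
```

With this shape, an event written into one user's log cannot change the state of a job subscribed under a different uid, because the (uid, job ID) pair will not match any subscription.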
Doing some Bugzilla cleanup... Resolving old GRAM3 and GRAM4 issues that are no longer relevant since we've moved on to GRAM5. Also, we're now tracking issues in Jira. Any new issues should be added here: http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363