Sane patch schedule for Windows 2003 cluster

We’ve got a cluster of 75 Win2k3 nodes at work in a coarse grained compute cluster. The cluster is behind a mountain of firewalls and resides in its own VLAN. Jobs of all sizes and types run on the cluster and all of the executables running are custom-made.

(ed: additional notes on our executables) The jobs range from 30 seconds to 7 days in duration, and may contain one executable or 2000 sub-jobs (of short duration). Obviously we are trying to avoid the situation where our IT schedules a reboot during a 7 day production job.

We have scheduling software which accommodates all of the normal tasks for a coarse grained cluster, and we can control which machines are active for submission, etc. If WSUS were in some way scriptable (or the client could state its availability for shutdown) we could coordinate the two systems and help out.

Currently, the patch schedule is the Sunday after Super Tuesday regardless of what is running on the cluster. We have to ask for an exemption every time we want to delay patching a machine for a long running production job. In short, while our group is responsible for the machines, we have little control over IT’s patch schedule.

  • Is patching monthly with MS’s schedule sane for a production Windows cluster?

  • Are there software hooks in WSUS where we could say, “please don’t reboot just yet”?

1. Is patching monthly with MS’s schedule sane for a production Windows cluster?

Yes, however a cluster should not have any downtime associated with a patch, as it should fail the jobs over to another node. I would NOT patch the entire cluster at the same time (that would be insane).

2. Are there software hooks in WSUS where we could say, “please don’t reboot just yet”?

There is no way for end users to stop a WSUS update or reboot, but it sounds to me like you have a real communication problem between your group and the IT group. However, you should be able to lose 1 node at a time with little impact to production.
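
To make the "lose 1 node at a time" idea concrete, here is a minimal Python sketch of how a rolling patch window could be planned from the scheduler's side. It is only an illustration under assumptions: the Node fields and the pick_nodes_to_patch function are hypothetical names, not part of WSUS or of any particular scheduler, and the job counts would have to come from whatever your scheduling software can report. The actual patch and reboot steps are not shown and would still need to be agreed with IT.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a compute node as reported by the job scheduler."""
    name: str
    running_jobs: int      # number of jobs currently running on this node
    patched: bool = False  # True once the node has been patched this cycle


def pick_nodes_to_patch(nodes, max_offline=1):
    """Return the names of up to max_offline nodes that are safe to patch now.

    A node is considered safe only if it has no running jobs and has not
    already been patched in this cycle. Busy nodes are skipped, so a long
    production job is never interrupted by this selection.
    """
    idle = [n for n in nodes if n.running_jobs == 0 and not n.patched]
    return [n.name for n in idle[:max_offline]]


if __name__ == "__main__":
    cluster = [
        Node("node01", running_jobs=3),               # busy, must be skipped
        Node("node02", running_jobs=0),               # idle, candidate
        Node("node03", running_jobs=0),               # idle, candidate
        Node("node04", running_jobs=0, patched=True)  # already done
    ]
    print(pick_nodes_to_patch(cluster))
    print(pick_nodes_to_patch(cluster, max_offline=2))
```

Running the selection repeatedly (for example, each time a node comes back from its reboot) walks through the cluster without ever taking more than max_offline nodes out of service at once. Nodes tied up with a 7 day job simply wait until the job finishes.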