Recovery and failure notifications – Apple Compressor (4.0) User Manual

Chapter 8
Use Apple Qmaster to set up a distributed processing system
Stage 3: Mount the media storage volumes
Follow the steps below so that every computer in the cluster mounts every media volume in
the cluster.
1. On each computer, log in as the administrator.
   (The first user account you create when you set up OS X is an administrator account.)
2. On each computer in the group, use the Connect to Server command from the Finder’s Go menu
   to mount each media volume.
3. Enter another computer’s name in the Connect to Server dialog and click Connect.
4. Choose the associated media volume as the volume you want to mount.
5. Repeat steps 1 through 4 until all the computers are mounting all the media volumes in the cluster.
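As a planning aid for the steps above, the following Python sketch lists the mounts a cluster needs when every computer must mount every media volume. It is only an illustration: the computer and volume names are hypothetical, it assumes each computer hosts exactly one media volume that it does not need to mount itself, and it does not mount anything.

```python
from itertools import permutations


def required_mounts(volumes_by_host):
    """Return (computer, remote_host, volume) triples for a fully cross-mounted cluster.

    volumes_by_host maps a (hypothetical) computer name to the media volume it hosts.
    """
    mounts = []
    # Every ordered pair of distinct computers: the first mounts the second's volume.
    for computer, remote_host in permutations(sorted(volumes_by_host), 2):
        mounts.append((computer, remote_host, volumes_by_host[remote_host]))
    return mounts


if __name__ == "__main__":
    # Hypothetical three-computer cluster.
    cluster = {"node-a": "Media1", "node-b": "Media2", "node-c": "Media3"}
    for computer, remote_host, volume in required_mounts(cluster):
        print(f"On {computer}: Connect to Server -> {remote_host}, mount {volume}")
```

Under these assumptions, a cluster of N computers needs N × (N − 1) mounts in total, which is why step 5 asks you to repeat the procedure on each computer.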
After you finish the three tasks above, each one of these computers can be used to submit jobs
for distributed processing. Because of the way access has been configured, all file pathnames are
conveniently consistent and simple for the purposes of specifying them in Compressor, in Shake
scripts, and in Apple Qmaster, assuming that:
• Users place the source media on a mounted media volume.
• Users place the Shake scripts on a mounted media volume.
• All folders and files on the shared media volumes have read-and-write access enabled for
  everyone (for Owner, Group, and Others). You can configure this access setting by selecting the
  folder or file and choosing File > Get Info.
These three assumptions are important because they ensure that all the computers have
read-and-write access to all the source files and output destinations.
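To check the third assumption across a whole volume rather than one item at a time in Get Info, a short script can help. The following Python sketch walks a folder and reports every item that lacks read-and-write access for Owner, Group, or Others. The function name and the example path are hypothetical, and the sketch looks only at standard POSIX permission bits, not at access control lists.

```python
import os
import stat

# Read and write bits for Owner, Group, and Others.
RW_EVERYONE = (
    stat.S_IRUSR | stat.S_IWUSR
    | stat.S_IRGRP | stat.S_IWGRP
    | stat.S_IROTH | stat.S_IWOTH
)


def items_missing_rw(root):
    """Return paths under root (root included) that lack read-and-write access for everyone."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the folder itself, then each file directly inside it.
        for path in [dirpath] + [os.path.join(dirpath, name) for name in filenames]:
            mode = os.stat(path).st_mode
            if (mode & RW_EVERYONE) != RW_EVERYONE:
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical mount point of a shared media volume.
    for path in items_missing_rw("/Volumes/Media1"):
        print("Needs read-and-write access for everyone:", path)
```

An empty result suggests that the permission bits match the assumption; any listed item can then be corrected in Get Info.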
Recovery and failure notifications
The Apple Qmaster distributed processing system has a number of built-in features designed to
attempt recovery if there’s a problem and to notify you when the system attempts a recovery.
Recovery features
The recovery actions described next happen automatically when a failure occurs in the Apple Qmaster
distributed processing system. There’s no need for you, as the administrator, to enable or
configure these features.
• If a service stops unexpectedly: If either the cluster controller service or the processing enabled
  on a service node stops unexpectedly, the Apple Qmaster distributed processing system
  restarts the service. To avoid the risk of endless stopping and restarting, the system restarts the
  failed service a maximum of four times. The first two times, it restarts the service right away. If
  the service stops abruptly a third or fourth time, the system restarts the service only if it had
  been running for at least 10 seconds before it stopped.
• If a batch is interrupted: When a service stops suddenly while in the middle of processing an
  Apple Qmaster batch, the cluster controller resubmits the interrupted batch in a way that
  prevents the reprocessing of any batch segments that were complete before the service
  stopped. The cluster controller delays resuming the batch for about a minute from the time it
  loses contact with the service.
• If a batch fails: When the service is running, but one batch fails to process, a service exception
  occurs. When this happens, the cluster controller resubmits the batch immediately. The
  cluster controller resubmits the batch a maximum of two times. If the job fails on the third
  submission, the distributed processing system stops resubmitting the job. In Share Monitor,
  the job’s status is set to Failed.
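The restart and resubmission limits described above can be read as two small decision rules. The following Python sketch models them for illustration only; it is not Apple Qmaster code. The function names, the process_batch callable, and the "Completed" status label are hypothetical, and only the numeric limits and the Failed status come from the text.

```python
MAX_RESTARTS = 4          # a failed service is restarted at most four times
IMMEDIATE_RESTARTS = 2    # the first two restarts happen right away
MIN_UPTIME_SECONDS = 10   # uptime required before a third or fourth restart
MAX_RESUBMISSIONS = 2     # a failed batch is resubmitted at most twice


def should_restart(restarts_so_far, uptime_seconds):
    """Decide whether a service that just stopped would be restarted.

    restarts_so_far counts restarts already performed for this service;
    uptime_seconds is how long it ran before this stop.
    """
    if restarts_so_far >= MAX_RESTARTS:
        return False
    if restarts_so_far < IMMEDIATE_RESTARTS:
        return True
    return uptime_seconds >= MIN_UPTIME_SECONDS


def process_with_resubmission(process_batch):
    """Submit a batch, resubmitting on failure up to the limit.

    process_batch is a hypothetical callable that returns True on success.
    Returns (status, number_of_submissions).
    """
    total_submissions = 1 + MAX_RESUBMISSIONS  # one initial submission plus two resubmissions
    for submission in range(1, total_submissions + 1):
        if process_batch():
            return "Completed", submission  # placeholder label, not from the manual
    return "Failed", total_submissions
```

In this reading, a service that stops a third time after running for only 5 seconds is not restarted (should_restart(2, 5) returns False), and a batch that fails on its third submission ends with the Failed status.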