Recovering from a failed service (Solaris 10)

(Taken from http://www.sun.com/bigadmin/xperts/sessions/19_smf/#8)

Q: How does an admin easily restart the entire svc init train after a boot time failure without actually rebooting? For example, if a file system fails to mount, nearly all network services never get started. What's the simple one-line command to take another stab at getting SMF to restore or start services after such a condition is found and repaired?

A: The short answer: You just need to tell smf(5) that you've repaired the file system service. Just use svcadm clear for the file system service that was in the maintenance state, and all of the services waiting for the file system to be mounted will automatically start.

The longer answer:

If services aren't being started, you can ask the system what's wrong by running svcs -x. With no other arguments, svcs -x reports the services that smf(5) considers to be in an unusual state: enabled but not running, or keeping another service from running.

That is, the svcs -x command attempts to trace service failures to their root cause, rather than just telling you everything that's broken. If you include the -v option (svcs -xv), you'll also see the list of impacted services for each root cause.
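
As a minimal sketch of the two invocations, the helper below wraps them in one function. The function name show_smf_problems and its "verbose" argument are hypothetical, and the guard simply assumes the script might be run on a host without SMF installed.

```shell
#!/bin/sh
# Hypothetical helper: ask SMF for root-cause diagnoses.
# Pass "verbose" to also list the services each root cause impacts.
show_smf_problems() {
    # Bail out early if the svcs command is not available on this host.
    if ! command -v svcs >/dev/null 2>&1; then
        echo "svcs not found; is this a Solaris 10 host?" >&2
        return 1
    fi
    if [ "$1" = "verbose" ]; then
        svcs -xv    # root causes plus impacted services
    else
        svcs -x     # root causes only
    fi
}
```

Calling show_smf_problems with no argument runs the plain diagnosis; show_smf_problems verbose adds the impacted-service list.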

In the specific case described, a file system fails to mount, so the corresponding file system service goes into the maintenance state. If you run svcs -x, you'll see that many services aren't running because that file system service (e.g. svc:/system/filesystem/local:default) is in the maintenance state.

Services in the maintenance state are known by smf(5) to need administrator attention. So, once you've repaired the file system, you just need to let smf(5) know that you believe you've corrected the error and it should continue on with boot. You do this with svcadm(1M). If, for example, it was filesystem/local that was in maintenance, you'd run:

svcadm clear filesystem/local

Then smf(5) would make sure the file systems were OK and continue on with the boot process, starting up all the services that were blocked behind the service in maintenance. A full restart of all services isn't necessary, since smf(5) knows the precise dependency relationships among the services.
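
The repair step can be sketched as a small guard that clears a service only when SMF actually reports it in maintenance. The function name clear_if_maintenance is hypothetical; the sketch assumes it is run with sufficient privileges and only after the underlying file system problem has been fixed.

```shell
#!/bin/sh
# Hypothetical helper: clear a service only if it is in maintenance.
clear_if_maintenance() {
    fmri=$1
    # -H omits the header line; -o state prints only the state column.
    state=$(svcs -H -o state "$fmri" 2>/dev/null) || return 1
    if [ "$state" = "maintenance" ]; then
        # Tell smf(5) the fault is believed to be repaired.
        svcadm clear "$fmri"
    else
        echo "$fmri is in state '$state'; nothing to clear"
    fi
}
```

For the example above you would call clear_if_maintenance filesystem/local, then rerun svcs -x to check whether anything is still blocked.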