Developer Summit

Upstart service readiness

2012-05-08 16:15..17:00 in G. Ballroom F

Upstart currently considers a service "ready" (fully initialised) once:

  • [Services] The process has forked the expected number of times (0-2)
  • [Tasks] The process has been exec'd successfully

For daemons therefore, "service readiness" is inextricably linked to the overloaded 'expect' stanza which is also used for PID tracking.

The problem is that some services (such as cups) are not ready once they have forked 'n' times.

The proposal is to introduce a new 'ready on' stanza coupled with a 'ready' event that would allow explicit control over when Upstart deems a service to be in a usable state:

http://people.canonical.com/~jhunt/blueprints/upstart-service-readiness-table.html

Summary:

  • No change to existing 'expect' behaviour.
  • If no 'ready on' condition specified, 'ready' event emitted immediately after 'started'.
  • If 'ready on' condition specified, 'ready' event emitted if and when condition becomes true.
  • 'ready' event can optionally be used by other services as a more reliable way to know when a service is fully initialized and thus usable.
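
The summary above can be modelled in a few lines of Python. This is only a toy sketch of the proposed semantics, not Upstart code: the Job class, its poll() method and the "socket_open" flag are all hypothetical names made up for illustration.

```python
from typing import Callable, List, Optional


class Job:
    """Toy model of a job under the proposed 'ready on' semantics."""

    def __init__(self, name: str,
                 ready_on: Optional[Callable[[], bool]] = None) -> None:
        self.name = name
        self.ready_on = ready_on      # None models a job with no 'ready on' stanza
        self.started = False
        self.ready = False
        self.events: List[str] = []   # events emitted so far, in order

    def start(self) -> None:
        """Emit 'started', then check straight away whether 'ready' applies."""
        self.started = True
        self.events.append("started")
        self.poll()

    def poll(self) -> None:
        """Emit 'ready' at most once, after 'started', when the condition holds."""
        if not self.started or self.ready:
            return
        if self.ready_on is None or self.ready_on():
            self.ready = True
            self.events.append("ready")


# A job without 'ready on' becomes ready immediately after 'started'.
plain = Job("plain")
plain.start()

# A job with a condition becomes ready only once the condition is true.
state = {"socket_open": False}        # hypothetical readiness source
gated = Job("gated", ready_on=lambda: state["socket_open"])
gated.start()
events_before = list(gated.events)    # snapshot while the condition is still false
state["socket_open"] = True
gated.poll()
```

In this model 'expect' is not involved at all, which mirrors the first point of the summary: readiness is decided separately from PID tracking.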

Observations:

  • possible to specify multiple values in 'ready on' condition such as:
      "ready on (dbus NAME=org.bar.foo and file FPATH=/var/log/myapp.log and socket PROTO=inet PORT=80)"
      "ready on stopped myjob and started myjob2"
  • upstart-socket-bridge will be retained but with advent of (C), no longer necessary to modify any daemons as is required by systemd for "socket activation".

Advantages:

  • No change to existing 'expect' behaviour.
  • Solves the readiness problem since .conf files would have a rich palette of sources of readiness to choose from which should cover 99% of all cases (udev, dbus, file, socket).
  • More reliable behaviour.
  • Would allow for simplification for jobs that currently fail to work solely via ptrace (for example, see gross hacks in /etc/init/cups.conf).

Work required:

  • Finish (C).
  • Implement (D) and (E).
  • Modify upstart-udev-bridge to look at "ready on" job stanzas to allow "ready on [HTML_REMOVED]".

Concerns:

  • (D) would need to be accepted into the upstream kernel.
  • (D) would not currently work in LXC containers since netlink is effectively disabled (as it is not namespace-aware). Correct fix would presumably be to make netlink ns-aware?
  • (D) ties this feature to Linux rather heavily (could provide a very crude /proc/net/{tcp,udp} implementation but performance would be poor as the files must be continually re-read).
  • (C) would need to use inotify (or fsnotify, to avoid the complexities of working around racy behaviour in inotify recursive watches) but could be ported to other systems (such as FreeBSD using kqueue).
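
To make the "very crude /proc/net/{tcp,udp} implementation" concern concrete, here is a rough Python sketch of what such polling might look like. It is an illustration only: the function names and the SAMPLE table are invented, it covers IPv4 TCP only (not /proc/net/tcp6 or UDP), and it relies on the Linux /proc/net/tcp layout where the local address is HEXADDR:HEXPORT and state 0A means LISTEN.

```python
import time
from typing import Callable

LISTEN_STATE = "0A"  # TCP LISTEN as printed in the 'st' column


def port_is_listening(table: str, port: int) -> bool:
    """Return True if the /proc/net/tcp text shows a listener on the port."""
    for line in table.splitlines()[1:]:        # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        hex_port = local_address.rsplit(":", 1)[1]
        if state == LISTEN_STATE and int(hex_port, 16) == port:
            return True
    return False


def read_proc_net_tcp() -> str:
    """Read the whole table; this full re-read is the cost noted above."""
    with open("/proc/net/tcp") as f:
        return f.read()


def wait_for_listener(port: int,
                      read_table: Callable[[], str] = read_proc_net_tcp,
                      interval: float = 0.1,
                      timeout: float = 5.0) -> bool:
    """Poll until something listens on the port or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if port_is_listening(read_table(), port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Made-up sample in /proc/net/tcp layout: a listener on port 80 (0x0050)
# and an established connection from local port 8080 (0x1F90).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:1F90 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 23456 1 0000000000000000 20 4 30 10 -1
"""
```

Every call to wait_for_listener() re-reads and re-parses the entire table on each iteration, which is exactly why this approach scales badly compared with an event-driven kernel interface.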

Alternative idea (from apw): put the onus on the daemons to inform Upstart when they are ready.

This is in fact already possible using 'expect stop' where Upstart waits for the application to send SIGSTOP before considering it ready. It could be extended to obtain the PID directly via sigaction(2) to avoid the need to obtain it via ptrace(2). Could go a stage further and provide some sort of formal API rather than a signal to allow a daemon to indicate readiness (coupled with a utility command to do the same).
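
The SIGSTOP handshake behind 'expect stop' can be sketched as follows. This is a toy supervisor in Python, not Upstart's implementation: signal_ready(), spawn_and_wait_ready() and example_daemon() are hypothetical names, and because the supervisor forks the child directly it already knows the PID, so the sketch does not address the PID-tracking problem for daemons that fork again.

```python
import os
import signal
from typing import Callable


def signal_ready() -> None:
    """Daemon side: stop ourselves so the parent can observe readiness."""
    os.kill(os.getpid(), signal.SIGSTOP)


def spawn_and_wait_ready(daemon_main: Callable[[], None]) -> int:
    """Supervisor side: fork, wait for the child to stop itself, resume it.

    Returns the child's PID once it has signalled readiness.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the daemon body and never return into the caller.
        try:
            daemon_main()
        finally:
            os._exit(0)
    # WUNTRACED makes waitpid() report stopped children, not only exited ones.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status) or os.WSTOPSIG(status) != signal.SIGSTOP:
        raise RuntimeError("child ended or stopped unexpectedly before readiness")
    os.kill(pid, signal.SIGCONT)   # let the now-ready daemon carry on
    return pid


def example_daemon() -> None:
    # Real initialisation (opening sockets, reading config) would go here.
    signal_ready()
    # After SIGCONT a real daemon would enter its main loop; this toy exits.
```

A supervisor would call spawn_and_wait_ready(example_daemon) and treat the job as ready once it returns. The daemon-side change is a single call, which is the "simple" advantage claimed for this approach, but it still has to be added to every daemon.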

Advantages:

  • simple.
  • puts onus on daemons rather than Upstart.
  • potentially removes the need to use ptrace for PID tracking.
  • if the API idea were selected, this could be used with SysV jobs too (by providing a NOP implementation for the traditional SysV init).
  • no kernel support required (so would map across to other systems such as BSD/Hurd, if desirable).
  • could be standardized as part of the LSB since it would be init-system-agnostic.

Disadvantages:

  • daemons may ignore the standard behaviour.
  • we would need to modify every daemon in the archive to work with this model.
  • highly unlikely that commercial vendors would modify their products unless it were an approved standard.
  • putting control in the hands of the daemons is not necessarily desirable: consider if they go haywire - Upstart would not be able to control the problem as it may not yet know the PID.