core: WaitForCommunicator - more robust wait for boot

This is a new built-in middleware that waits for boot more robustly.
The "max_tries" configuration is now gone; waiting is timeout based.
Future commits will make this even better, as the SSH communicator
will implement the new "wait_for_ready" in a better way.
Mitchell Hashimoto 2013-08-29 16:27:00 -07:00
parent 2b7a1297c8
commit 261d0ef6cd
12 changed files with 143 additions and 38 deletions

@ -5,8 +5,7 @@ Vagrant.configure("2") do |config|
config.ssh.forward_x11 = false config.ssh.forward_x11 = false
config.ssh.guest_port = 22 config.ssh.guest_port = 22
config.ssh.keep_alive = true config.ssh.keep_alive = true
config.ssh.max_tries = 100 config.ssh.timeout = 300
config.ssh.timeout = 30
config.ssh.shell = "bash -l" config.ssh.shell = "bash -l"
config.ssh.default.username = "vagrant" config.ssh.default.username = "vagrant"

@ -24,6 +24,7 @@ module Vagrant
autoload :SetHostname, "vagrant/action/builtin/set_hostname" autoload :SetHostname, "vagrant/action/builtin/set_hostname"
autoload :SSHExec, "vagrant/action/builtin/ssh_exec" autoload :SSHExec, "vagrant/action/builtin/ssh_exec"
autoload :SSHRun, "vagrant/action/builtin/ssh_run" autoload :SSHRun, "vagrant/action/builtin/ssh_run"
autoload :WaitForCommunicator, "vagrant/action/builtin/wait_for_communicator"
end end
module General module General

@ -13,10 +13,10 @@ module Vagrant
@_provisioner_types = {} @_provisioner_types = {}
# Get all the configured provisioners # Get all the configured provisioners
@_provisioner_instances = env[:machine].config.vm.provisioners.map do |provisioner| @_provisioner_instances = @env[:machine].config.vm.provisioners.map do |provisioner|
# Instantiate the provisioner # Instantiate the provisioner
klass = Vagrant.plugin("2").manager.provisioners[provisioner.name] klass = Vagrant.plugin("2").manager.provisioners[provisioner.name]
result = klass.new(env[:machine], provisioner.config) result = klass.new(@env[:machine], provisioner.config)
# Store in the type map so that --provision-with works properly # Store in the type map so that --provision-with works properly
@_provisioner_types[result] = provisioner.name @_provisioner_types[result] = provisioner.name

@ -20,6 +20,8 @@ module Vagrant
end end
def call(env) def call(env)
@env = env
# Check if we're even provisioning things. # Check if we're even provisioning things.
enabled = true enabled = true
enabled = env[:provision_enabled] if env.has_key?(:provision_enabled) enabled = env[:provision_enabled] if env.has_key?(:provision_enabled)

@ -1,19 +1,25 @@
require "log4r" require "log4r"
require_relative "mixin_provisioners"
module Vagrant module Vagrant
module Action module Action
module Builtin module Builtin
# This action will run the cleanup methods on provisioners and should # This action will run the cleanup methods on provisioners and should
# be used as part of any Destroy action. # be used as part of any Destroy action.
class ProvisionerCleanup class ProvisionerCleanup
include MixinProvisioners
def initialize(app, env) def initialize(app, env)
@app = app @app = app
@logger = Log4r::Logger.new("vagrant::action::builtin::provision_cleanup") @logger = Log4r::Logger.new("vagrant::action::builtin::provision_cleanup")
end end
def call(env) def call(env)
@env = env
# Ask the provisioners to modify the configuration if needed # Ask the provisioners to modify the configuration if needed
provisioners.each do |p| provisioner_instances.each do |p|
env[:ui].info(I18n.t( env[:ui].info(I18n.t(
"vagrant.provisioner_cleanup", "vagrant.provisioner_cleanup",
name: provisioner_type_map[p].to_s)) name: provisioner_type_map[p].to_s))

@ -0,0 +1,76 @@
module Vagrant
module Action
module Builtin
# This waits for the communicator to be ready for a set amount of
# time.
class WaitForCommunicator
def initialize(app, env, states=nil)
@app = app
@states = states
end
def call(env)
# Wait for ready in a thread so that we can continually check
# for interrupts.
ready_thr = Thread.new do
Thread.current[:result] = env[:machine].communicate.wait_for_ready(
env[:machine].config.ssh.timeout)
end
# Start a thread that verifies the VM stays in a good state.
states_thr = Thread.new do
Thread.current[:result] = true
# If we aren't caring about states, just basically put this
# thread to sleep because it'll get killed later.
if !@states
while true
sleep 300
end
next
end
# Otherwise, periodically verify the VM isn't in a bad state.
while true
state = env[:machine].provider.state.id
if !@states.include?(state)
Thread.current[:result] = false
break
end
end
end
# Wait for a result or an interrupt
env[:ui].info I18n.t("vagrant.boot_waiting")
while ready_thr.alive? && states_thr.alive?
return if env[:interrupted]
end
# If it went into a bad state, then raise an error
if !states_thr[:result]
raise Errors::VMBootBadState,
valid: @states.join(", "),
invalid: env[:machine].provider.state.id
end
# If it didn't boot, raise an error
if !ready_thr[:result]
raise Errors::VMBootTimeout
end
env[:ui].info I18n.t("vagrant.boot_completed")
# Make sure our threads are all killed
ready_thr.kill
states_thr.kill
@app.call(env)
ensure
ready_thr.kill
states_thr.kill
end
end
end
end
end
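
The two-thread pattern used by WaitForCommunicator above can be tried outside Vagrant. The following is a minimal sketch under stated assumptions, not the middleware itself: `FakeCommunicator`, `wait_for_boot`, and the `:ready` / `:timeout` / `:bad_state` return values are hypothetical names invented for illustration, and short sleeps are added to the polling loops so the sketch does not spin.

```ruby
# Hypothetical stand-in for a communicator: reports ready after `delay` seconds.
class FakeCommunicator
  def initialize(delay)
    @ready_at = Time.now + delay
  end

  def ready?
    Time.now >= @ready_at
  end

  # Poll until ready or until `duration` seconds have passed.
  def wait_for_ready(duration)
    deadline = Time.now + duration
    until ready?
      return false if Time.now >= deadline
      sleep 0.01
    end
    true
  end
end

# Hypothetical helper: returns :ready, :timeout, or :bad_state.
# The optional block reports the current machine state.
def wait_for_boot(communicator, timeout, valid_states = nil, &current_state)
  # One thread waits for the communicator.
  ready_thr = Thread.new do
    Thread.current[:result] = communicator.wait_for_ready(timeout)
  end

  # A second thread watches the machine state.
  states_thr = Thread.new do
    Thread.current[:result] = true
    if valid_states.nil?
      sleep # nothing to watch; sleep until killed
    else
      loop do
        unless valid_states.include?(current_state.call)
          Thread.current[:result] = false
          break
        end
        sleep 0.01
      end
    end
  end

  # Wake up regularly so a caller could also check for interrupts here.
  sleep 0.01 while ready_thr.alive? && states_thr.alive?

  return :bad_state if states_thr[:result] == false
  ready_thr[:result] ? :ready : :timeout
ensure
  ready_thr.kill if ready_thr
  states_thr.kill if states_thr
end
```

For example, `wait_for_boot(FakeCommunicator.new(5), 1, [:starting, :running]) { :poweroff }` returns `:bad_state` without waiting for the timeout, because the state thread exits first.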

@ -543,6 +543,14 @@ module Vagrant
error_key(:no_base_mac, "vagrant.actions.vm.match_mac") error_key(:no_base_mac, "vagrant.actions.vm.match_mac")
end end
class VMBootBadState < VagrantError
error_key(:boot_bad_state)
end
class VMBootTimeout < VagrantError
error_key(:boot_timeout)
end
class VMCustomizationFailed < VagrantError class VMCustomizationFailed < VagrantError
error_key(:failure, "vagrant.actions.vm.customize") error_key(:failure, "vagrant.actions.vm.customize")
end end

@ -1,3 +1,5 @@
require "timeout"
module Vagrant module Vagrant
module Plugin module Plugin
module V2 module V2
@ -44,6 +46,25 @@ module Vagrant
false false
end end
# wait_for_ready waits until the communicator is ready, blocking
# until then. It will wait up to the given duration or raise an
# exception if something goes wrong.
def wait_for_ready(duration)
# By default, we implement a naive solution.
begin
Timeout.timeout(duration) do
while true
return true if ready?
sleep 0.2
end
end
rescue Timeout::Error
# We timed out, we failed.
end
return false
end
# Download a file from the remote machine to the local machine. # Download a file from the remote machine to the local machine.
# #
# @param [String] from Path of the file on the remote machine. # @param [String] from Path of the file on the remote machine.
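
The naive default for wait_for_ready shown above relies only on `ready?` and Ruby's standard `Timeout` module, so its behavior can be checked in isolation. This is a sketch under assumptions: `NaiveWait` and `CountingCommunicator` are hypothetical names, and the poll interval is shortened to keep the example fast.

```ruby
require "timeout"

# Hypothetical mixin that mirrors the naive default: poll ready? until it
# returns true or the timeout expires.
module NaiveWait
  def wait_for_ready(duration)
    Timeout.timeout(duration) do
      loop do
        return true if ready?
        sleep 0.02
      end
    end
  rescue Timeout::Error
    # Timed out, so report failure instead of raising.
    false
  end
end

# Hypothetical communicator that becomes ready after a number of polls.
class CountingCommunicator
  include NaiveWait

  attr_reader :polls

  def initialize(ready_after_polls)
    @ready_after_polls = ready_after_polls
    @polls = 0
  end

  def ready?
    @polls += 1
    @polls >= @ready_after_polls
  end
end

quick = CountingCommunicator.new(3)
quick_result = quick.wait_for_ready(1)       # ready on the third poll

never = CountingCommunicator.new(Float::INFINITY)
never_result = never.wait_for_ready(0.1)     # gives up after 0.1 seconds
```

A communicator that can block on its own connection attempt could override the method and skip the polling loop, which appears to be what the commit message anticipates for SSH.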

@ -9,7 +9,6 @@ module VagrantPlugins
attr_accessor :forward_x11 attr_accessor :forward_x11
attr_accessor :guest_port attr_accessor :guest_port
attr_accessor :keep_alive attr_accessor :keep_alive
attr_accessor :max_tries
attr_accessor :shell attr_accessor :shell
attr_accessor :timeout attr_accessor :timeout
@ -22,7 +21,6 @@ module VagrantPlugins
@forward_x11 = UNSET_VALUE @forward_x11 = UNSET_VALUE
@guest_port = UNSET_VALUE @guest_port = UNSET_VALUE
@keep_alive = UNSET_VALUE @keep_alive = UNSET_VALUE
@max_tries = UNSET_VALUE
@shell = UNSET_VALUE @shell = UNSET_VALUE
@timeout = UNSET_VALUE @timeout = UNSET_VALUE
@ -43,7 +41,6 @@ module VagrantPlugins
@forward_x11 = false if @forward_x11 == UNSET_VALUE @forward_x11 = false if @forward_x11 == UNSET_VALUE
@guest_port = nil if @guest_port == UNSET_VALUE @guest_port = nil if @guest_port == UNSET_VALUE
@keep_alive = false if @keep_alive == UNSET_VALUE @keep_alive = false if @keep_alive == UNSET_VALUE
@max_tries = nil if @max_tries == UNSET_VALUE
@shell = nil if @shell == UNSET_VALUE @shell = nil if @shell == UNSET_VALUE
@timeout = nil if @timeout == UNSET_VALUE @timeout = nil if @timeout == UNSET_VALUE
@ -57,7 +54,7 @@ module VagrantPlugins
def validate(machine) def validate(machine)
errors = super errors = super
[:max_tries, :timeout].each do |field| [:timeout].each do |field|
value = instance_variable_get("@#{field}".to_sym) value = instance_variable_get("@#{field}".to_sym)
errors << I18n.t("vagrant.config.common.error_empty", :field => field) if !value errors << I18n.t("vagrant.config.common.error_empty", :field => field) if !value
end end

@ -71,6 +71,7 @@ module VagrantPlugins
b.use SaneDefaults b.use SaneDefaults
b.use Customize, "pre-boot" b.use Customize, "pre-boot"
b.use Boot b.use Boot
b.use WaitForCommunicator, [:starting, :running]
b.use Customize, "post-boot" b.use Customize, "post-boot"
b.use CheckGuestAdditions b.use CheckGuestAdditions
end end

@ -14,35 +14,9 @@ module VagrantPlugins
# Start up the VM and wait for it to boot. # Start up the VM and wait for it to boot.
env[:ui].info I18n.t("vagrant.actions.vm.boot.booting") env[:ui].info I18n.t("vagrant.actions.vm.boot.booting")
env[:machine].provider.driver.start(boot_mode) env[:machine].provider.driver.start(boot_mode)
raise Vagrant::Errors::VMFailedToBoot if !wait_for_boot
@app.call(env) @app.call(env)
end end
def wait_for_boot
@env[:ui].info I18n.t("vagrant.actions.vm.boot.waiting")
@env[:machine].config.ssh.max_tries.to_i.times do |i|
if @env[:machine].communicate.ready?
@env[:ui].info I18n.t("vagrant.actions.vm.boot.ready")
return true
end
# Return true so that the vm_failed_to_boot error doesn't
# get shown
return true if @env[:interrupted]
# If the VM is not starting or running, something went wrong
# and we need to show a useful error.
state = @env[:machine].provider.state.id
raise Vagrant::Errors::VMFailedToRun if state != :starting && state != :running
sleep 2 if !@env["vagrant.test"]
end
@env[:ui].error I18n.t("vagrant.actions.vm.boot.failed")
false
end
end end
end end
end end

@ -1,5 +1,9 @@
en: en:
vagrant: vagrant:
boot_completed: |-
Machine booted and ready!
boot_waiting: |-
Waiting for machine to boot. This may take a few minutes...
cfengine_bootstrapping: |- cfengine_bootstrapping: |-
Bootstrapping CFEngine with policy server: %{policy_server}... Bootstrapping CFEngine with policy server: %{policy_server}...
cfengine_bootstrapping_policy_hub: |- cfengine_bootstrapping_policy_hub: |-
@ -144,6 +148,25 @@ en:
Any errors that occurred are shown below. Any errors that occurred are shown below.
%{message} %{message}
boot_bad_state: |-
The guest machine entered an invalid state while waiting for it
to boot. Valid states are '%{valid}'. The machine is in the
'%{invalid}' state. Please verify everything is configured
properly and try again.
boot_timeout: |-
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.ssh.timeout" value) time period. This can
mean a number of things.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.ssh.timeout") value.
box_config_changing_box: |- box_config_changing_box: |-
While loading the Vagrantfile, the provider override specified While loading the Vagrantfile, the provider override specified
a new box. This box, in turn, specified a different box. This isn't a new box. This box, in turn, specified a different box. This isn't
@ -815,9 +838,6 @@ en:
vm: vm:
boot: boot:
booting: Booting VM... booting: Booting VM...
waiting: Waiting for VM to boot. This can take a few minutes.
ready: VM booted and ready for use!
failed: Failed to connect to VM!
failed_to_boot: |- failed_to_boot: |-
Failed to connect to VM via SSH. Please verify the VM successfully booted Failed to connect to VM via SSH. Please verify the VM successfully booted
by looking at the VirtualBox GUI. by looking at the VirtualBox GUI.
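
The `%{valid}` and `%{invalid}` placeholders in the boot_bad_state message are filled from the `valid:` and `invalid:` values passed when VMBootBadState is raised. As a rough illustration, Ruby's own `format` accepts the same `%{name}` syntax, so the substitution can be previewed without the I18n library. This sketch assumes a shortened copy of the message and a made-up `:poweroff` state.

```ruby
# Shortened copy of the boot_bad_state text, for illustration only.
BOOT_BAD_STATE =
  "Valid states are '%{valid}'. The machine is in the '%{invalid}' state."

# The middleware joins the valid states with ", " before passing them on.
valid_states = [:starting, :running]

# :poweroff is an assumed example of a state outside the valid list.
message = format(BOOT_BAD_STATE,
                 valid: valid_states.join(", "),
                 invalid: :poweroff)

puts message
```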