
    This is part two of the OpenStack Liberty instance-migration source-code analysis: live migration (also called hot or dynamic migration). As with the earlier cold (offline) migration analysis, the live-migration workflow in Liberty is covered in two posts, divided as follows:

    • Part one: the processing in nova-api and nova-conductor
    • Part two: the processing in nova-compute

    Part one follows: how nova-api and nova-conductor handle a live migration.

    Initiating a live migration

    A user can initiate a live migration with nova live-migration, e.g.:

    #nova --debug live-migration  57fe59d1-2566-42c1-8b17-b2a1e50c889e

    Run nova help live-migration for usage details.

    The --debug option prints the client-side debug log:

    POST http://10.240.xxx.xxx:8774/v2.1/25520b29dce346d38bc4b055c5ffbfcb/servers/57fe59d1-2566-42c1-8b17-b2a1e50c889e/action -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-OpenStack-Nova-API-Version: 2.6" -H "X-Auth-Token: {SHA1}6b48595f60d527e2c55a182c65c32c6f9cc76e67" -d '{"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}}'

    From the log above and the routes nova-api registers at startup, it is easy to see that the request lands in nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate_live. Let's walk through the handling:
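    The POST shown in the debug log can be reproduced with any HTTP client. The sketch below (a minimal illustration; the endpoint and token in the log are environment-specific and not reused here) builds the same `os-migrateLive` action body:

```python
import json

def build_live_migration_action(host=None, block_migration=False,
                                disk_over_commit=False):
    """Build the os-migrateLive action body shown in the --debug log."""
    return {"os-migrateLive": {"host": host,
                               "block_migration": block_migration,
                               "disk_over_commit": disk_over_commit}}

# the client POSTs this to /v2.1/{tenant_id}/servers/{server_id}/action
body = json.dumps(build_live_migration_action())
print(body)
```

    With host left as None, nova-scheduler picks the target host, which matches the `"host": null` in the log above.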

    nova-api processing phase

    #decorator definitions omitted (just remember: decorators run before the
    #function, in reverse declaration order)
    def _migrate_live(self, req, id, body):
        """Permit admins to (live) migrate a server to a new
        host.
    
        Parameters:
        req  Request object with the request details
        id   uuid of the instance to migrate: 57fe59d1-2566-42c1-8b17-b2a1e50c889e
        body request body with the request parameters: {"os-migrateLive":
        {"disk_over_commit": false, "block_migration": false,
        "host": null}}
        """
        #get the request context
        context = req.environ["nova.context"]
    
        """Authorize the action in this context against the policies
        defined in CONF.policy_file=`/etc/nova/policy.json` and the policy
        files under CONF.policy_dirs. The applicable rule is:
        "os_compute_api:os-migrate-server:migrate_live":
        "rule:admin_api"; if no matching rule is defined, the default rule
        is tried (CONF.policy_default_rule, i.e.
        "default": "rule:admin_or_owner"). On failure an exception is
        raised.
        """
        authorize(context, action='migrate_live')
    
        #extract the parameters from the request body
        block_migration = body["os-migrateLive"]["block_migration"]
        disk_over_commit = body["os-migrateLive"]["disk_over_commit"]
        host = body["os-migrateLive"]["host"]
    
        #convert the parameter types
        block_migration = strutils.bool_from_string(block_migration,
                                                       strict=True)
        disk_over_commit = strutils.bool_from_string(disk_over_commit,
                                                       strict=True)
    
        """Exception handling omitted here.
        Common exceptions: no suitable target host found, the `compute`
        service on the target host unavailable, incompatible hypervisor,
        hypervisor version too old, incompatible CPU, instance locked,
        instance in the wrong state, etc. The concrete exception type and
        message appear in the `nova-api` log file.
    
        First fetch the instance identified by id from the nova.instances
        table (an InstanceV2 object is returned), then call
        `nova/compute/api.py/API.live_migrate` to carry on, analyzed
        below.
        """
        instance = common.get_instance(self.compute_api, context, id)
        self.compute_api.live_migrate(context, instance, block_migration,
                                              disk_over_commit, host)
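    With strict=True, strutils.bool_from_string rejects anything that is not a recognized boolean literal. A simplified re-implementation of that behavior (a sketch, not the oslo_utils code itself) shows why the API accepts "false"/"true" strings from the JSON body:

```python
TRUE_STRINGS = ('1', 't', 'true', 'on', 'y', 'yes')
FALSE_STRINGS = ('0', 'f', 'false', 'off', 'n', 'no')

def bool_from_string(subject, strict=False, default=False):
    """Simplified sketch of oslo_utils.strutils.bool_from_string."""
    if isinstance(subject, bool):
        return subject
    lowered = str(subject).strip().lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    if strict:
        # the real code raises ValueError listing the accepted literals
        raise ValueError("Unrecognized value '%s'" % subject)
    return default
```

    Without strict=True, an unrecognized value would silently become the default instead of failing the API request.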
    
    --------------------------------------------------------------
    #continuing: `nova/compute/api.py/API.live_migrate`
    #decorators omitted: they check that the instance is not locked, that
    #its state is valid (Active or Paused), etc.
    def live_migrate(self, context, instance, block_migration,
                         disk_over_commit, host_name):
        """Migrate a server lively to a new host.
        Parameters:
        context          request context
        instance         instance object
        block_migration  block-migration flag, False
        disk_over_commit False
        host_name        migration target host, None
        """
        #update the instance task state: migrating
        instance.task_state = task_states.MIGRATING
        instance.save(expected_task_state=[None])
    
        #record the `live-migration` action in the
        #`nova.instance_actions` table
        self._record_action_start(context, instance,
                               instance_actions.LIVE_MIGRATION)
        #hand the request over to
        #`nova/conductor/api.py/ComputeTaskAPI.live_migrate_instance`,
        #analyzed below
        self.compute_task_api.live_migrate_instance(context,
                     instance,
                     host_name, block_migration=block_migration,
                     disk_over_commit=disk_over_commit)
    
    --------------------------------------------------------------
    #continuing: `nova/conductor/api.py/ComputeTaskAPI.live_migrate_instance`
    def live_migrate_instance(self, context, instance, host_name,
                                  block_migration, 
                                  disk_over_commit):
        """Parameters:
        context          request context
        instance         instance object
        block_migration  block-migration flag, False
        disk_over_commit False
        host_name        migration target host, None
        """
        #filter attribute, used later by `nova-scheduler` to pick the
        #target host
        scheduler_hint = {'host': host_name}
        """Positional arguments in the call below:
        True            live migration
        False           not a resize
        the first None  no flavor specified
        the last None   no quota reservations
    
        A synchronous `migrate_server` message is put on the message
        queue; the consumer, `nova-conductor`, handles it.
        """
        self.conductor_compute_rpcapi.migrate_server(
                context, instance, scheduler_hint, True, False, 
                None,
                block_migration, disk_over_commit, None)

    Summary: the nova-api stage is straightforward: authorize the request and convert the input parameters, update the instance task state (migrating) and the nova.instance_actions table, then issue the synchronous rpc request migrate_server; nova-conductor takes it from there.
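    The `instance.save(expected_task_state=[None])` call in live_migrate above is effectively a compare-and-swap: the update only succeeds if the persisted task state still matches the expectation, which stops two concurrent migrations of the same instance. A toy model of that guard (an illustration, not nova's object code):

```python
class UnexpectedTaskStateError(Exception):
    pass

class FakeInstance:
    """Toy stand-in for nova's Instance: models only the save() guard."""
    def __init__(self):
        self._db_task_state = None   # value persisted in the database
        self.task_state = None       # in-memory value being set

    def save(self, expected_task_state=None):
        # refuse the update if the persisted state is not what we expect
        if expected_task_state is not None and \
                self._db_task_state not in expected_task_state:
            raise UnexpectedTaskStateError(self._db_task_state)
        self._db_task_state = self.task_state

inst = FakeInstance()
inst.task_state = 'migrating'
inst.save(expected_task_state=[None])   # succeeds: db state was None
```

    A second migration request racing on the same instance would find the persisted state already 'migrating' and fail the save instead of silently double-migrating.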

    nova-conductor processing phase

    Continuing: from the routes set up when the nova-conductor service starts, we know the migrate_server message is handled by the following entry point:

    #`nova/conductor/manager.py/ComputeTaskManager.migrate_server`
    #decorators omitted; a try/except wraps the body, and any exception
    #is returned to `nova-api`
    def migrate_server(self, context, instance, scheduler_hint, 
                live, rebuild,
                flavor, block_migration, disk_over_commit, 
                reservations=None,
                clean_shutdown=True):
        """Parameters:
        scheduler_hint   filter attributes, {'host': host_name}
        live             True, live migration
        rebuild          False, not a resize
        flavor           None
        block_migration  False, block-migration flag
        disk_over_commit for block migration, whether disk space is
        computed from the real or the virtual size (True: real size,
        False: virtual size)
        reservations     None
        clean_shutdown   live migration, so no shutdown is needed
        """
    
        #exception handling for the instance and flavor parameters
        #omitted
        ......
    
        #live migration, analyzed below
        if live and not rebuild and not flavor:
            self._live_migrate(context, instance, scheduler_hint,
                                   block_migration, 
                                   disk_over_commit)
        #cold migration, analyzed in an earlier post
        elif not live and not rebuild and flavor:
            instance_uuid = instance.uuid
            with compute_utils.EventReporter(context, 
                                                'cold_migrate',
                                                 instance_uuid):
                self._cold_migrate(context, instance, flavor,
                              scheduler_hint['filter_properties'],
                              reservations, clean_shutdown)
        else:
            raise NotImplementedError()
    
    ---------------------------------------------------------------
    #continuing: `nova/conductor/manager.py/ComputeTaskManager._live_migrate`
    def _live_migrate(self, context, instance, scheduler_hint,
                          block_migration, disk_over_commit):
        #target host of the live migration
        destination = scheduler_hint.get("host")
    
        """Define a helper that:
        1. updates the instance state,
        2. records the exception stack in `nova.instance_faults`,
        3. sends a notification to ceilometer
        """
        def _set_vm_state(context, instance, ex, vm_state=None,
                              task_state=None):
            request_spec = {'instance_properties': {
                    'uuid': instance.uuid, },
            }
            scheduler_utils.set_vm_state_and_notify(context,
                    instance.uuid,
                    'compute_task', 'migrate_server',
                    dict(vm_state=vm_state,
                         task_state=task_state,
                       expected_task_state=task_states.MIGRATING,),
                    ex, request_spec, self.db)
    
        #create a Migration object holding: source host, destination
        #host, instance uuid, migration type, migration status, flavor ids
        migration = objects.Migration(context=context.elevated())
        migration.dest_compute = destination
        migration.status = 'pre-migrating'
        migration.instance_uuid = instance.uuid
        migration.source_compute = instance.host
        migration.migration_type = 'live-migration'
        if instance.obj_attr_is_set('flavor'):
            migration.old_instance_type_id = instance.flavor.id
            migration.new_instance_type_id = instance.flavor.id
        else:
            migration.old_instance_type_id = instance.instance_type_id
            migration.new_instance_type_id = instance.instance_type_id
        #insert a migration record into the `nova.migrations` table with
        #the initial status: pre-migrating
        migration.create()
        #build the migration task object, LiveMigrationTask
        task = self._build_live_migrate_task(context, instance, 
                                                 destination,
                                                 block_migration, 
                                                 disk_over_commit,
                                                 migration)
    
        """Exception handling omitted here.
        Common exceptions: no suitable target host found, the `compute`
        service on the target host unavailable, incompatible hypervisor,
        hypervisor version too old, incompatible CPU, instance locked,
        instance in the wrong state, etc. For these, the _set_vm_state
        helper defined above is called to restore the instance state,
        update `nova.instance_faults`, send an error notification to
        ceilometer, and set the `nova.migrations` status to `error`.
    
        For any other, unanticipated exception, _set_vm_state is called
        to set the instance state to Error (task state `migrating`),
        update `nova.instance_faults`, send an error notification to
        ceilometer, and set the `nova.migrations` status to `failed`.
    
        Finally the exception is re-raised to `nova-api`.
    
        Start the migration task: TaskBase.execute ->
        LiveMigrationTask._execute, analyzed below.
        """
        task.execute()
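    The Migration record built above can be sketched as a plain dataclass; old_instance_type_id and new_instance_type_id stay identical because a live migration never changes the flavor. This is only an illustration mirroring the fields in the snippet, not nova's objects.Migration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MigrationRecord:
    source_compute: str
    instance_uuid: str
    dest_compute: Optional[str] = None
    migration_type: str = 'live-migration'
    status: str = 'pre-migrating'
    old_instance_type_id: int = 0
    new_instance_type_id: int = 0

def build_live_migration_record(instance, destination):
    rec = MigrationRecord(source_compute=instance['host'],
                          instance_uuid=instance['uuid'],
                          dest_compute=destination)
    # live migration keeps the flavor, so old == new
    flavor_id = instance['instance_type_id']
    rec.old_instance_type_id = flavor_id
    rec.new_instance_type_id = flavor_id
    return rec

rec = build_live_migration_record(
    {'host': 'node-1', 'uuid': '57fe59d1', 'instance_type_id': 7},
    destination=None)   # no target host chosen yet
```

    When no target host was requested, dest_compute is filled in later (see _execute below in the source), after nova-scheduler has picked one.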
    
    -------------------------------------------------------------
    #continuing: `nova/conductor/tasks/live_migrate.py/LiveMigrationTask._execute`
    def _execute(self):
        #check that the instance is running or paused,
        #otherwise raise InstanceInvalidState
        self._check_instance_is_active()
    
        #fetch the source host's `nova-compute` service record from the
        #`nova.services` table; on error raise ComputeServiceUnavailable,
        #and likewise if the `nova-compute` service is not up
        self._check_host_is_up(self.source)
    
        """If no target host was specified, loop over `nova-scheduler`
        to pick one. Each candidate host is then checked:
        1. Is its hypervisor compatible with the source host's?
            1. Fetch the source and destination node records from the
            `nova.compute_nodes` table; on error raise
            ComputeHostNotFound.
            2. Compare the hypervisor types of the two nodes; if they
            differ, raise InvalidHypervisorType.
            3. Compare the hypervisor versions; if the source's is newer
            than the destination's, raise DestinationHypervisorTooOld.
    
        2. Does the candidate host support live migration?
            1. Send a synchronous `check_can_live_migrate_destination`
            message to the queue; `nova-compute` consumes it.
            2. Fetch the source and destination node records from the
            `nova.compute_nodes` table.
            3. Check that the instance's vcpu model is compatible with
            the target host's cpu; if not, raise InvalidCPUInfo.
            4. Send a synchronous `check_can_live_migrate_source` message
            to the queue; `nova-compute` consumes it and checks whether
            the instance's disk configuration supports live migration.
            Two cases:
                1. Block migration (block_migration=True) requires all of:
                    1. no shared storage, else InvalidLocalStorage;
                    2. enough disk space on the target host, else
                                        MigrationPreCheckError;
                    3. no volume devices, else MigrationPreCheckError.
                2. Non-block migration (block_migration=False) requires
                one of:
                    1. booted from volume with no local disk,
                    2. booted from image on shared storage.
                Otherwise raise InvalidSharedStorage.
    
        If any step above times out, MigrationPreCheckError is raised.
    
        When picking a target host, the source host and previously
        attempted hosts are excluded. If the maximum number of retries
        (migrate_max_retries > 0) is exceeded without finding a suitable
        host, MaxRetriesExceeded is raised; if every host has been tried
        without success, NoValidHost is raised.
        """
        if not self.destination:
            self.destination = self._find_destination()
            #record the chosen target host in the `nova.migrations` table
            self.migration.dest_compute = self.destination
            self.migration.save()
        else:
            """A target host was specified, so check that:
            1. it differs from the source host, else UnableToMigrateToSelf;
            2. its `nova-compute` service exists and is up, else
                                    ComputeServiceUnavailable;
            3. per the `nova.compute_nodes` record, it has enough memory
            for this migration, else MigrationPreCheckError;
            4. as in the unspecified-host case, its hypervisor is
            compatible with the source host's (see above);
            5. as in the unspecified-host case, it supports live
            migration (see above).
            """
            self._check_requested_destination()
    
        #send an asynchronous `live_migration` message to the queue;
        #`nova-compute` consumes it
        return self.compute_rpcapi.live_migration(
                    self.context,
                    host=self.source,
                    instance=self.instance,
                    dest=self.destination,
                    block_migration=self.block_migration,
                    migration=self.migration,
                    migrate_data=self.migrate_data)

    Summary: the nova-conductor stage is more involved and interacts with nova-compute quite a bit, mostly to verify that the candidate target host picked by nova-scheduler can actually accept a live migration. It also creates a record in the nova.migrations table and updates nova.instance_faults when the migration raises an exception; finally it issues an asynchronous rpc request so that nova-compute performs the rest of the migration.
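    The host-selection loop described above boils down to: ask the scheduler for a candidate while excluding the source host and every previously attempted host, run the pre-checks, and give up after a bounded number of attempts. A simplified sketch (in nova the checks are rpc calls to nova-compute; here they are injected as callables, and the names are illustrative):

```python
class MaxRetriesExceeded(Exception):
    pass

class NoValidHost(Exception):
    pass

def find_destination(source, schedule, check_host, max_retries=5):
    """Pick a live-migration target, excluding failed candidates.

    schedule(excluded) -> a candidate host, or None when exhausted;
    check_host(host)   -> True if the host passes the pre-checks.
    """
    attempted = {source}          # never migrate to the source host
    for _ in range(max_retries):
        host = schedule(attempted)
        if host is None:
            raise NoValidHost()   # every host has been tried
        if check_host(host):      # hypervisor + live-migration pre-checks
            return host
        attempted.add(host)       # exclude it on the next attempt
    raise MaxRetriesExceeded()

hosts = ['node-1', 'node-2', 'node-3']
picker = lambda excluded: next(
    (h for h in hosts if h not in excluded), None)
dest = find_destination('node-1', picker,
                        check_host=lambda h: h == 'node-3')
```

    Here the first candidate, node-2, fails the pre-checks and is excluded, so the second round returns node-3, mirroring how _find_destination retries with a growing ignore list.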

    nova-compute processing phase

    After taking the live_migration message off the queue, nova-compute continues with the following method:

    #`nova/compute/manager.py/ComputeManager.live_migration`
    #decorators omitted
    def live_migration(self, context, dest, instance, 
                            block_migration,
                            migration, migrate_data):
        """Executing live migration.
    
        :param context: security context
        :param dest: destination host
        :param instance: a nova.objects.instance.Instance object
        :param block_migration: if true, prepare for block 
        migration
        :param migration: a nova.objects.Migration object
        :param migrate_data: implementation specific params
    
        """
    
        # NOTE(danms): Remove these guards in v5.0 of the RPC API
        #update the `nova.migrations` record: status becomes 'queued'
        if migration:
            migration.status = 'queued'
            migration.save()
    
        """Thread function, guarded by a semaphore. The
        CONF.max_concurrent_live_migrations option (default 1) caps the
        number of concurrent migrations on this node; when the semaphore
        is saturated, the thread waits.
        """
        def dispatch_live_migration(*args, **kwargs):
            with self._live_migration_semaphore:
                self._do_live_migration(*args, **kwargs)
    
        """ NOTE(danms): We spawn here to return the RPC worker 
        thread back to the pool. Since what follows could take a 
        really long time, we don't want to tie up RPC workers.
        """
        #spawn a thread to run the migration; the thread function is
        #`dispatch_live_migration` above: once it acquires the semaphore
        #it calls `_do_live_migration`, otherwise it waits
        utils.spawn_n(dispatch_live_migration,
                          context, dest, instance,
                          block_migration, migration,
                          migrate_data)

    Summary: on receiving the live_migration message, nova-compute updates the nova.migrations record and then spawns a thread to carry out the live migration, so the whole migration runs inside a new thread. The detailed migration process is analyzed in the next post - stay tuned!
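    The spawn-plus-semaphore pattern above returns the RPC worker immediately while capping how many migrations run at once. Nova uses eventlet green threads via utils.spawn_n; the sketch below illustrates the same idea with plain threading (the names mirror the snippet, nothing more):

```python
import threading
import time

MAX_CONCURRENT_LIVE_MIGRATIONS = 1   # mirrors CONF.max_concurrent_live_migrations
_live_migration_semaphore = threading.Semaphore(MAX_CONCURRENT_LIVE_MIGRATIONS)
running = []
peak = [0]
lock = threading.Lock()

def _do_live_migration(name):
    with lock:
        running.append(name)
        peak[0] = max(peak[0], len(running))
    time.sleep(0.05)                 # stand-in for the long migration
    with lock:
        running.remove(name)

def dispatch_live_migration(name):
    with _live_migration_semaphore:  # waits when saturated
        _do_live_migration(name)

# the RPC handler just spawns and returns; migrations queue up behind
# the semaphore
threads = [threading.Thread(target=dispatch_live_migration, args=(n,))
           for n in ('vm-a', 'vm-b', 'vm-c')]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

    With the semaphore at 1, the three requested migrations run strictly one after another, which is exactly the "if the semaphore is saturated, wait" behavior described above.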


    Continuing from the previous post, part 1 of the OpenStack Liberty cold-migration source-code analysis, which covered the processing in nova-api and nova-conductor, this post focuses on what nova-compute does during an instance migration, in three parts:

    • prepare phase
    • execute phase
    • complete phase

    Let's look at each in turn:

    prepare phase

    After taking the prep_resize request off the queue, nova-compute handles it with the following method:

    #/nova/compute/manager.py/ComputeManager.prep_resize
    def prep_resize(self, context, image, instance, instance_type,
                        reservations, request_spec, 
                        filter_properties, node,
                        clean_shutdown):
        """Initiates the process of moving a running instance to 
        another host. Possibly changes the RAM and disk size in the 
        process.
    
        Parameters, passed in by `nova-conductor`:
        image             image used by the instance
        instance          InstanceV2 object with the instance details
        instance_type     Flavor object, the instance's flavor
        reservations = [] reserved resource quotas
        request_spec      request parameters: image info, instance
        info, flavor
        filter_properties filter attributes
        node              target node name, here `devstack`
        """
    
        #if no node was specified, get one from the hypervisor (here
        #LibvirtDriver)
        if node is None:
            node = self.driver.get_available_nodes(refresh=True)[0]
            LOG.debug("No node specified, defaulting to %s", node,
                          instance=instance)
    
        """NOTE(melwitt): Remove this in version 5.0 of the RPC API
        Code downstream may expect extra_specs to be populated 
        since it is receiving an object, so lookup the flavor to 
        ensure this.
        """
        #if instance_type is not a proper Flavor object, fetch the
        #flavor from the nova.instance_types table
        if not isinstance(instance_type, objects.Flavor):
            instance_type = objects.Flavor.get_by_id(context,
                                           instance_type['id'])
        #build a Quotas object from the reservations
        quotas = objects.Quotas.from_reservations(context,
                                                      reservations,
                                                      instance=instance)
    
        #exception context: on failure, roll back the quotas and the
        #instance state
        with self._error_out_instance_on_exception(context, instance,
                                                       quotas=quotas):
            #send a `compute.instance.exists` notification to
            #ceilometer with the instance's full configuration; the
            #default audit period is one month, and current_period=True
            #adds the notification to the current audit period
            compute_utils.notify_usage_exists(self.notifier, 
                                                context, instance,
                                               current_period=True)
            #send a `compute.instance.resize.prep.start` notification
            #to ceilometer with the instance details
            self._notify_about_instance_usage(
                        context, instance, "resize.prep.start")
    
            try:
                #call _prep_resize to continue the migration, analyzed
                #below
                self._prep_resize(context, image, instance,
                                      instance_type, quotas,
                                      request_spec, 
                                      filter_properties,
                                      node, clean_shutdown)
            # NOTE(dgenin): This is thrown in LibvirtDriver when the
            #               instance to be migrated is backed by LVM.
            #               Remove when LVM migration is implemented.
            #with LVM as the nova backend, image-backed instances
            #cannot be migrated
            except exception.MigrationPreCheckError:
                raise
            except Exception:
                # try to re-schedule the resize elsewhere:
                #grab the concrete exception, e.g. UnableToMigrateToSelf
                exc_info = sys.exc_info()
                """Reschedule: if the filter properties contain `retry`,
                resubmit the migration request through
                `nova/conductor/rpcapi.py/ComputeTaskAPI`, which
                re-enters the `nova-conductor` stage from part 1 of the
                cold-migration analysis. Unlike the first attempt, the
                retry request carries the previous exception, and host
                selection will exclude the previously chosen target
                host.
                """
                self._reschedule_resize_or_reraise(context, image,
                            instance,
                            exc_info, instance_type, quotas, 
                            request_spec,
                            filter_properties)
            finally:
                #flavor info: name and id
                extra_usage_info = dict(
                            new_instance_type=instance_type.name,
                            new_instance_type_id=instance_type.id)
                #send a `compute.instance.resize.prep.end` notification
                #to ceilometer with the instance details plus the
                #flavor name and id
                self._notify_about_instance_usage(
                        context, instance, "resize.prep.end",
                        extra_usage_info=extra_usage_info)
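    The exception handling in prep_resize follows a common nova pattern: a hard pre-check failure propagates unchanged, anything else triggers a reschedule, and the audit notification fires in all cases. A stripped-down model of that control flow (the callables are illustrative stand-ins, not nova's API):

```python
class MigrationPreCheckError(Exception):
    pass

def prep_with_reschedule(do_prep, reschedule, notify):
    """Run do_prep; re-raise pre-check errors, reschedule anything else."""
    notify('resize.prep.start')
    try:
        do_prep()
    except MigrationPreCheckError:
        raise                      # hard failure: no retry possible
    except Exception as exc:
        reschedule(exc)            # retry on another host
    finally:
        notify('resize.prep.end')  # audit notification always sent

def failing_prep():
    raise RuntimeError('no space left')

events = []
rescheduled = []
prep_with_reschedule(failing_prep, rescheduled.append, events.append)
```

    Note that the `.end` notification is sent even when the prep fails, which matches the finally block in the real method.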
    
    ---------------------------------------------------------------
    #continuing:
    def _prep_resize(self, context, image, instance, instance_type,
                quotas, request_spec, filter_properties, node,
                clean_shutdown=True):
    
        if not filter_properties:
            filter_properties = {}
    
        if not instance.host:
            self._set_instance_obj_error_state(context, instance)
            msg = _('Instance has no source host')
            raise exception.MigrationError(reason=msg)
    
        #check whether the source host and the target host are the same
        same_host = instance.host == self.host
    
        # if the flavor IDs match, it's migrate; otherwise resize
        #migrating on the same host: check whether the hypervisor
        #supports same-host migration
        if same_host and instance_type.id == \
                                    instance['instance_type_id']:
            # check driver whether support migrate to same host
            #libvirt does not support same-host migration by default,
            #so this raises
            if not self.driver.\
                    capabilities['supports_migrate_to_same_host']:
                raise exception.UnableToMigrateToSelf(
                        instance_id=instance.uuid, host=self.host)
    
        # NOTE(danms): Stash the new instance_type to avoid having 
        # to look it up in the database later
        #attach a 'new_flavor' attribute to the instance, holding the
        #instance_type flavor
        instance.set_flavor(instance_type, 'new')
    
        # NOTE(mriedem): Stash the old vm_state so we can set the
        # resized/reverted instance back to the same state later.
        #stash the current vm_state, used for rollback or for setting
        #the state once the migration completes
        vm_state = instance.vm_state
        LOG.debug('Stashing vm_state: %s', vm_state, instance=instance)
        instance.system_metadata['old_vm_state'] = vm_state
        instance.save()
    
        #create the node's resource tracker, ResourceTracker
        limits = filter_properties.get('limits', {})
        rt = self._get_resource_tracker(node)
        #resize_claim stores a MigrationContext on the instance and adds
        #a record to the nova.migrations table;
        #it validates the target host's resources against the flavor
        #(instance_type) (the log shows lines like "Attempting claim",
        #and insufficient resources raise an exception)
        #and reserves the resources the migration needs on the target
        #host
        with rt.resize_claim(context, instance, instance_type,
                               image_meta=image, limits=limits) as claim:
            LOG.info(_LI('Migrating'), context=context, 
                                                instance=instance)
            #send an asynchronous `resize_instance` rpc request to
            #`nova-compute`, analyzed below
            self.compute_rpcapi.resize_instance(
                        context, instance, claim.migration, image,
                        instance_type, quotas.reservations,
                        clean_shutdown)

    Summary: this phase performs the up-front checks, parameter setup, resource validation and reservation, and sends the ceilometer audit notifications, preparing for the execute phase that follows.
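    The rt.resize_claim context manager above behaves like a transactional reservation: resources are claimed up front, kept on success, and automatically released if the block exits with an exception. A minimal model of that contract (memory only; an illustration, not the ResourceTracker code):

```python
from contextlib import contextmanager

class ComputeResourcesUnavailable(Exception):
    pass

class FakeTracker:
    """Toy resource tracker modelling the resize_claim contract."""
    def __init__(self, free_mb):
        self.free_mb = free_mb

    @contextmanager
    def resize_claim(self, requested_mb):
        if requested_mb > self.free_mb:
            raise ComputeResourcesUnavailable('insufficient memory')
        self.free_mb -= requested_mb          # reserve up front
        try:
            yield
        except Exception:
            self.free_mb += requested_mb      # roll back on failure
            raise

rt = FakeTracker(free_mb=4096)
with rt.resize_claim(1024):
    pass                                      # migration request sent
```

    On success the claim stays held (the resources remain reserved for the incoming instance); only an exception inside the with block returns them, which is what makes the "Attempting claim" step safe to abort.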

    execute phase

    After taking the resize_instance message described above off the queue, nova-compute executes the request with the following method:

    #nova/compute/manager.py/ComputeManager.resize_instance
    def resize_instance(self, context, instance, image,
                            reservations, migration, instance_type,
                            clean_shutdown):
        """Starts the migration of a running instance to another 
         host. 
    
        migration: the migration parameters delivered by the rpc call
        above, with everything the migration needs
        """  
        #build a Quotas object from the reservations
        quotas = objects.Quotas.from_reservations(context,
                                                      reservations,
                                                      instance=instance)       
        #exception context: on failure, roll back the quotas and set
        #the instance state
        with self._error_out_instance_on_exception(context, 
                                            instance,
                                             quotas=quotas):
            """ TODO(chaochin) Remove this until v5 RPC API
            Code downstream may expect extra_specs to be 
            populated since it is receiving an object, so lookup 
            """
            #if no flavor was passed in, extract it from the migration
            #parameter
            if (not instance_type or
                not isinstance(instance_type, objects.Flavor)):
                instance_type = objects.Flavor.get_by_id(
                        context, migration['new_instance_type_id'])
    
            #fetch all NICs (VIF info) associated with the instance
            #from the neutron database
            network_info = self.network_api.get_instance_nw_info(context,
                                                       instance)
            #update the migration status (nova.migrations table)
            migration.status = 'migrating'
            with migration.obj_as_admin():
                migration.save()
    
            #update the instance: vm_state resized/migrating,
            #task_state resize_migrating
            instance.task_state = task_states.RESIZE_MIGRATING
            instance.save(expected_task_state=
                                        task_states.RESIZE_PREP)
    
            #send a `compute.instance.resize.start` notification to
            #ceilometer with the instance details and NIC info
            self._notify_about_instance_usage(
                    context, instance, "resize.start", 
                    network_info=network_info)
            #fetch the block-device mappings associated with the
            #instance from the nova.block_device_mapping table
            bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
                        context, instance.uuid)
            #convert the block-device mappings into the format the
            #driver expects, e.g.:
            """
            {'swap': None,
             'root_device_name': u'/dev/vda',
             'ephemerals': [],
             'block_device_mapping': []
            }
            block_device_mapping lists the instance's volume devices
            """
            block_device_info = self._get_instance_block_device_info(
                                    context, instance, bdms=bdms)
    
            #read the shutdown timeout and retry interval from the
            #instance's system_metadata (when clean_shutdown = True),
            #otherwise use 0, 0
            timeout, retry_interval = self._get_power_off_values(context,
                                                instance, clean_shutdown)
            #power off the instance and migrate its disks via
            #LibvirtDriver, analyzed below
            disk_info = self.driver.migrate_disk_and_power_off(
                        context, instance, migration.dest_host,
                        instance_type, network_info,
                        block_device_info,
                        timeout, retry_interval)   
            #disconnect the volume devices through cinder
            self._terminate_volume_connections(context, instance, bdms)
            #migrate the NICs (a no-op; they are reconfigured on the
            #target host during the `complete` phase)
            migration_p = obj_base.obj_to_primitive(migration)
            self.network_api.migrate_instance_start(context,
                                                        instance,
                                                        migration_p)
            #update the migration status (nova.migrations table)
            migration.status = 'post-migrating'
            with migration.obj_as_admin():
                migration.save()
            #update the instance's host info and state: vm_state
            #resized/migrating, task_state resize_migrated
            instance.host = migration.dest_compute
            instance.node = migration.dest_node
            instance.task_state = task_states.RESIZE_MIGRATED
            instance.save(expected_task_state=
                                    task_states.RESIZE_MIGRATING)
            #send an asynchronous `finish_resize` rpc request, which
            #`nova-compute` will handle, analyzed below
            self.compute_rpcapi.finish_resize(context, 
                            instance,
                            migration, image, disk_info,
                            migration.dest_compute, 
                            reservations=quotas.reservations)
            #send a `compute.instance.resize.end` notification to
            #ceilometer with the instance details and NIC info
            self._notify_about_instance_usage(context, instance, 
                            "resize.end", network_info=network_info) 
            #clear all pending events for this instance
            self.instance_events.clear_events_for_instance(
                                                        instance)
    
    ---------------------------------------------------------------
    
    #接上文:nova/virt/libvirt/driver.py/LibvirtDriver
    def migrate_disk_and_power_off(self, context, instance, dest,
                                       flavor, network_info,
                                       block_device_info=None,
                                       timeout=0, retry_interval=0):  
        LOG.debug("Starting migrate_disk_and_power_off",
                       instance=instance)
        #获取外部设备
        ephemerals =  
         driver.block_device_info_get_ephemerals(block_device_info)  
        # get_bdm_ephemeral_disk_size() will return 0 if the new
        # instance's requested block device mapping contain no
        # ephemeral devices. However, we still want to check if
        # the original instance's ephemeral_gb property was set and
        # ensure that the new requested flavor ephemeral size is 
        #greater
        eph_size = (block_device.get_bdm_ephemeral_disk_size(ephemerals) 
                                          or instance.ephemeral_gb) 
        # Checks if the migration needs a disk resize down.
        root_down = flavor.root_gb < instance.root_gb
        ephemeral_down = flavor.ephemeral_gb < eph_size
    
        #从云主机的xml配置中获取块非volume设备信息,我的例子中为[]
        disk_info_text = self.get_instance_disk_info(
                instance, block_device_info=block_device_info)
    
        #检查云主机是否从卷启动(如果instance中不包含image_ref镜像属性或者
        #块设备信息中不包含'disk'信息,就是从卷启动的)        
        booted_from_volume = self._is_booted_from_volume(instance,
                                        disk_info_text)
        #如果从镜像启动则根磁盘不支持收缩;外部设备也不支持收缩                                
        if (root_down and not booted_from_volume) or ephemeral_down:
            reason = _("Unable to resize disk down.")
            raise exception.InstanceFaultRollback(
                    exception.ResizeError(reason=reason))
    
        disk_info = jsonutils.loads(disk_info_text)
        # NOTE(dgenin): Migration is not implemented for LVM backed 
        #instances.
        #如果nova使用lvm作为后端存储,从镜像启动的云主机将不支持迁移
        if CONF.libvirt.images_type == 'lvm' and not 
                                            booted_from_volume:
            reason = _("Migration is not supported for LVM backed 
                                                    instances")
            raise exception.InstanceFaultRollback(
                    exception.MigrationPreCheckError(reason=reason))    
    
        # copy disks to destination
        # rename instance dir to +_resize at first for using
        # shared storage for instance dir (eg. NFS).
        #获取云主机本地配置路径,
        #如:/opt/stack/data/nova/instances/{uuid}
        inst_base = libvirt_utils.get_instance_path(instance)
        inst_base_resize = inst_base + "_resize"    
    
        #判断目标主机和源主机是否共享存储
        shared_storage = self._is_storage_shared_with(dest, inst_base)  
    
        # try to create the directory on the remote compute node
        # if this fails we pass the exception up the stack so we 
        #can catch failures here earlier
        #如果是共享存储,则通过ssh在目标主机上创建云主机根目录,失败则抛异常
        if not shared_storage:
            try:
                self._remotefs.create_dir(dest, inst_base)
            except processutils.ProcessExecutionError as e:
                reason = _("not able to execute ssh command: %s") % e
                raise exception.InstanceFaultRollback(
                        exception.ResizeError(reason=reason))   
        #执行迁移前,通过libvirt关闭云主机
        self.power_off(instance, timeout, retry_interval)  
    
        #从块设备信息字典中获取卷设备映射字典
        block_device_mapping = driver.block_device_info_get_mapping(
                block_device_info)
        #迁移前,通过特定类型的卷驱动卸载卷设备。对于
        #rbd(LibvirtNetVolumeDriver),什么都没有做;
        #对于iscsi(LibvirtISCSIVolumeDriver),做了两个工作:
        #1. echo '1' > /sys/block/{dev_name}/device/delete
        #2. 通过iscsiadm工具删除相关的端点信息
        for vol in block_device_mapping:
            connection_info = vol['connection_info']
            disk_dev = vol['mount_device'].rpartition("/")[2]
            self._disconnect_volume(connection_info, disk_dev)
    
        try:
            # Rename the instance directory
            utils.execute('mv', inst_base, inst_base_resize)
            # if we are migrating the instance with shared storage
            # then create the directory.  If it is a remote node
            # the directory has already been created
            if shared_storage:
                dest = None
                utils.execute('mkdir', '-p', inst_base)

            # Create job trackers (add/remove) for the disk copies below
            on_execute = lambda process: \
                    self.job_tracker.add_job(instance, process.pid)
            on_completion = lambda process: \
                    self.job_tracker.remove_job(instance, process.pid)
            # Get the instance's current flavor
            active_flavor = instance.get_flavor()

            # Migrate the non-volume disks
            for info in disk_info:
                # assume inst_base == dirname(info['path'])
                img_path = info['path']
                fname = os.path.basename(img_path)
                from_path = os.path.join(inst_base_resize, fname)

                # To properly resize the swap partition, it must be
                # re-created with the proper size.  This is acceptable
                # because when an OS is shut down, the contents of the
                # swap space are just garbage, the OS doesn't bother
                # about what is in it.
                # We will not copy over the swap disk here, and rely
                # on finish_migration/_create_image to re-create it
                # for us.
                if not (fname == 'disk.swap' and
                        active_flavor.get('swap', 0) !=
                            flavor.get('swap', 0)):
                    compression = info['type'] not in \
                                        NO_COMPRESSION_TYPES
                    libvirt_utils.copy_image(from_path, img_path,
                                            host=dest,
                                            on_execute=on_execute,
                                       on_completion=on_completion,
                                         compression=compression)
            # Ensure disk.info is written to the new path to avoid
            # disks being reinspected and potentially changing format.
            # Copy disk.info to the destination host if it exists.
            # With ceph as the nova backend only console.log and
            # libvirt.xml live in the instance directory.
            src_disk_info_path = os.path.join(inst_base_resize,
                                                       'disk.info')
            if os.path.exists(src_disk_info_path):
                dst_disk_info_path = os.path.join(inst_base,
                                                       'disk.info')
                libvirt_utils.copy_image(src_disk_info_path,
                                             dst_disk_info_path,
                                             host=dest,
                                             on_execute=on_execute,
                                    on_completion=on_completion)

        except Exception:
            # On any failure roll back the steps above and re-raise
            with excutils.save_and_reraise_exception():
                self._cleanup_remote_migration(dest, inst_base,
                                                  inst_base_resize,
                                                   shared_storage)

        return disk_info_text

    Summary: this stage copies the non-volume disks to the destination, updates the instance state and emits the ceilometer audit notifications, then issues an asynchronous finish_resize RPC to enter the next stage.
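    The copy step above boils down to a choice: a plain local copy when storage is shared, or a push over the network (optionally compressed) when it is not. The helper below is an illustrative sketch of that decision, not nova's actual copy_image implementation; the rsync flags are assumptions:

```python
# Sketch of a copy_image-style decision: local copy for shared storage,
# rsync over ssh otherwise, with optional on-the-wire compression.
# The helper name and the exact rsync flags are illustrative only.

def build_copy_command(src, dst, host=None, compression=False):
    """Return the argv that would copy *src* to *dst*.

    host=None means the destination path is reachable locally
    (shared storage); otherwise the file is pushed over the network.
    """
    if host is None:
        # Shared storage: an ordinary local copy is enough.
        return ['cp', src, dst]
    # Remote copy: --sparse keeps sparse qcow2 images small on the
    # destination; compression helps on slow links but wastes CPU for
    # image types that are already compressed (see NO_COMPRESSION_TYPES).
    cmd = ['rsync', '--sparse', src, '%s:%s' % (host, dst)]
    if compression:
        cmd.insert(1, '--compress')
    return cmd

print(build_copy_command('/var/lib/nova/disk', '/var/lib/nova/disk'))
print(build_copy_command('/var/lib/nova/disk', '/var/lib/nova/disk',
                         host='dest-node', compression=True))
```

    This mirrors why the excerpt computes `compression` per disk: only formats outside NO_COMPRESSION_TYPES benefit from compressing the transfer.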

    The complete stage

    Once nova-compute picks the finish_resize message off the queue, it handles the request in:

    #nova/compute/manager.py/ComputeManager.finish_resize
    def finish_resize(self, context, disk_info, image, instance,
                          reservations, migration):
        """Completes the migration process.
        Sets up the newly transferred disk and turns on the 
        instance at its new host machine.
        """
        # Build a Quotas object from the reservations
        quotas = objects.Quotas.from_reservations(context,
                                                      reservations,
                                                      instance=instance)
        try:
            # Finish the migration; analyzed below
            self._finish_resize(context, instance, migration,
                                    disk_info, image)
            # Commit the quota reservations and update the database
            quotas.commit()
        except Exception:
            LOG.exception(_LE('Setting instance vm_state to ERROR'),
                              instance=instance)
            # On failure roll back the quota reservations
            with excutils.save_and_reraise_exception():
                try:
                    quotas.rollback()
                except Exception:
                    LOG.exception(_LE("Failed to rollback quota "
                                     "for failed finish_resize"),
                                      instance=instance)
                # Put the instance into the error state
                self._set_instance_obj_error_state(context, instance)
    
    ------------------------------------------------------------
    # Continued from above:
    def _finish_resize(self, context, instance, migration, disk_info,
                           image):
        # Assume a migration by default
        resize_instance = False
        # Extract the old and new flavor ids from the migration dict
        old_instance_type_id = migration['old_instance_type_id']
        new_instance_type_id = migration['new_instance_type_id']
        # Extract the flavor from the instance object
        old_instance_type = instance.get_flavor()
        # NOTE(mriedem): Get the old_vm_state so we know if we
        # should power on the instance. If old_vm_state is not set we
        # need to default to ACTIVE for backwards compatibility
        # Read the pre-migration vm state from the instance's system
        # metadata, defaulting to ACTIVE; it was set in the `prepare`
        # stage
        old_vm_state = instance.system_metadata.get('old_vm_state',
                                                       vm_states.ACTIVE)
        # Store the old flavor on the instance under the old_flavor key
        instance.set_flavor(old_instance_type, 'old')

        # If the flavor ids differ, apply the new flavor to the instance
        if old_instance_type_id != new_instance_type_id:
            # Read the new flavor from the new_flavor attribute,
            # which was set in the `prepare` stage
            instance_type = instance.get_flavor('new')
            self._set_instance_info(instance, instance_type)
            # If root disk, swap or ephemeral size differ between the
            # old and new flavors, treat this as a resize
            for key in ('root_gb', 'swap', 'ephemeral_gb'):
                if old_instance_type[key] != instance_type[key]:
                    resize_instance = True
                    break
        # Apply the MigrationContext set in the `prepare` stage,
        # updating the instance's NUMA topology
        instance.apply_migration_context()

        # NOTE(tr3buchet): setup networks on destination host
        # Configure the instance's NICs on the destination host (a
        # no-op here; migrate_instance_finish below does the real NIC
        # configuration)
        self.network_api.setup_networks_on_host(context, instance,
                                         migration['dest_compute'])
        # Migrate the NICs and update the database (setting the
        # 'binding:host_id' attribute to the destination host)
        migration_p = obj_base.obj_to_primitive(migration)
        self.network_api.migrate_instance_finish(context,
                                                     instance,
                                                     migration_p)
        # Get the instance's network info; raise on failure
        network_info = self.network_api.get_instance_nw_info(context, instance)

        # Update state: vm_state rebuild/migrate, task state
        # RESIZE_FINISH (completing the rebuild or migration)
        instance.task_state = task_states.RESIZE_FINISH
        instance.save(expected_task_state=
                                task_states.RESIZE_MIGRATED)

        # Send the `compute_instance.finish_resize.start` notification
        # (with instance and network details) to ceilometer
        self._notify_about_instance_usage(
                context, instance, "finish_resize.start",
                network_info=network_info)

        # Get the instance's block device info, refreshing the volume
        # devices' 'connection_info'
        block_device_info = self._get_instance_block_device_info(
                                context, instance,
                                refresh_conn_info=True)
    
        # NOTE(mriedem): If the original vm_state was STOPPED, we
        # don't automatically power on the instance after it's
        # migrated
        # Whether the instance needs to be powered on
        power_on = old_vm_state != vm_states.STOPPED
        try:
            # Configure the instance on the destination host and power
            # it on if needed. This code path is similar to the last
            # part of the earlier post 云主机启动过程源码分析3
            # (http://blog.csdn.net/lzw06061139/article/details/51505514):
            # the same interfaces configure the disks, the network and
            # the instance XML, then boot the instance.
            self.driver.finish_migration(context, migration,
                                            instance,
                                             disk_info,
                                             network_info,
                                             image,
                                             resize_instance,
                                             block_device_info,
                                             power_on)
        except Exception:
            # On failure restore the instance's old flavor
            with excutils.save_and_reraise_exception():
                if old_instance_type_id != new_instance_type_id:
                    self._set_instance_info(instance,
                                                old_instance_type)
        # Update the migration status (nova.migrations table)
        migration.status = 'finished'
        with migration.obj_as_admin():
            migration.save()

        # Update the instance state: task state back to None
        instance.vm_state = vm_states.RESIZED
        instance.task_state = None
        instance.launched_at = timeutils.utcnow()
        instance.save(expected_task_state=
                                task_states.RESIZE_FINISH)

        # Notify `nova-scheduler` so it refreshes its per-host
        # instance info
        self._update_scheduler_instance_info(context, instance)

        # Send the `compute_instance.finish_resize.end` notification
        # (with instance and network details) to ceilometer
        self._notify_about_instance_usage(
                context, instance, "finish_resize.end",
                network_info=network_info)

    Summary: this stage configures the instance (disks, NICs, the libvirt XML) and then boots it.
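    Notice how every instance.save(expected_task_state=...) in the flow above acts as a compare-and-swap: the write only succeeds if the record still holds the state the caller last saw, so two workers cannot advance the same instance concurrently. A toy illustration of the guard (the state names match nova's task_states; the class itself is made up for this sketch):

```python
# Toy version of nova's expected_task_state guard. The "database" is a
# plain attribute; nova raises UnexpectedTaskStateError in the same case.

RESIZE_PREP = 'resize_prep'
RESIZE_MIGRATED = 'resize_migrated'
RESIZE_FINISH = 'resize_finish'

class UnexpectedTaskStateError(Exception):
    pass

class ToyInstance:
    def __init__(self):
        self.task_state = None          # in-memory value
        self._db_task_state = None      # last value written to the "db"

    def save(self, expected_task_state=None):
        # Only write if the "db" still holds the state the caller expects.
        if expected_task_state is not None and \
                self._db_task_state != expected_task_state:
            raise UnexpectedTaskStateError(
                'expected %r, found %r'
                % (expected_task_state, self._db_task_state))
        self._db_task_state = self.task_state

inst = ToyInstance()
inst.task_state = RESIZE_PREP
inst.save()                             # initial transition
inst.task_state = RESIZE_FINISH
try:
    # The resize_migrated step was skipped, so this save must fail,
    # exactly as instance.save(expected_task_state=RESIZE_MIGRATED)
    # would fail in _finish_resize.
    inst.save(expected_task_state=RESIZE_MIGRATED)
except UnexpectedTaskStateError:
    print('conflict detected')
```

    This is why _finish_resize passes expected_task_state=RESIZE_MIGRATED before flipping the task state to RESIZE_FINISH.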

    That completes the source analysis of cold migration. Looking back over the whole flow, most of OpenStack's work goes into arranging the various resources; once everything is in place, it essentially just boots a new instance.


    VM migration makes resource placement more flexible, and live migration in particular improves instance availability and reliability. OpenStack Liberty provides two kinds of migration: cold migration and live migration. The next few posts analyze both implementations in detail, starting with cold migration.

    To keep the length manageable, the cold migration analysis is split into two posts:

    • Part 1: what nova-api and nova-conductor do during the migration
    • Part 2: the nova-compute processing

    Here is part 1:

    Initiating a migration

    A user can trigger an instance migration manually through the nova CLI:

    #nova --debug migrate 52e4d485-6ccf-47f3-a754-b62649e7b256

    This migrates the instance with id=52e4d485-6ccf-47f3-a754-b62649e7b256 to the best available nova-compute node; the --debug option prints the execution log:

    ......
    
    curl -g -i -X POST http://controller:8774/v2/eab72784b36040a186a6b88dac9ac0b2/servers/5a7d302f-f388-4ffb-af37-f1e6964b3a51/action -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}8e294a111a5deaa45f6cb0f3c58a600d2b1b0493" -d '{"migrate": null}'
    
    ......
    

    The excerpt above shows that novaclient sends the migrate action to nova-api over http. From the route map nova-api builds at startup, it is easy to see that the entry point for this action is
    nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate, analyzed below.

    Source analysis

    The nova-api part

    As noted above, the migration entry point is:

    #nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate, decorators omitted
    def _migrate(self, req, id, body):
        """Permit admins to migrate a server to a new host.
        req is the Request object holding the request details
        id is the uuid of the instance to migrate,
        e.g. 52e4d485-6ccf-47f3-a754-b62649e7b256
        body holds the request parameters: {"migrate": null}
        """

        # Extract the request context from the Request object
        context = req.environ['nova.context']
        # Authorize the action, by default against the rules in
        # /etc/nova/policy.json on the host; if no matching rule is
        # defined, authorization fails and an exception is raised.
        # The applicable rule here is:
        # "os_compute_api:os_migrate_server:migrate": "rule:admin_api"
        authorize(context, action='migrate')

        # Load the instance from the nova database; returns an
        # InstanceV2 object
        instance = common.get_instance(self.compute_api, context, id)
        # Exception handling omitted: an exception is raised if the
        # instance does not exist, no suitable destination host is
        # found, the instance is locked, resources are insufficient,
        # or the instance is in the wrong state (it must be active or
        # stopped).
        #
        # Just like the resize operation, migration calls
        # `/nova/compute/api.py/API.resize`; resize decides between
        # 'resize' and 'migrate' by whether a flavor_id argument was
        # given, as analyzed below
        self.compute_api.resize(req.environ['nova.context'], instance)
    
    ---------------------------------------------------------------
    # Continued from above: /nova/compute/api.py/API.resize, decorators omitted
    def resize(self, context, instance, flavor_id=None, 
                    clean_shutdown=True,
                   **extra_instance_updates):
        """Resize (ie, migrate) a running instance.

        If flavor_id is None, the process is considered a 
        migration, keeping the original flavor_id. If flavor_id is 
        not None, the instance should be migrated to a new host and 
        resized to the new flavor_id.

        context: the request context
        instance: an InstanceV2 object with the instance's full
        configuration
        flavor_id: the flavor id; None here because this is a
        migration
        clean_shutdown=True: cold migration retries the shutdown and
        raises if the instance cannot be shut down cleanly
        """

        # Check that 'auto disk config' is enabled for the system
        # disk, raising otherwise; after migrating, the instance must
        # be able to configure its system disk automatically
        self._check_auto_disk_config(instance, **extra_instance_updates)
        # Get the instance's flavor
        current_instance_type = instance.get_flavor()

        # If flavor_id is not provided, only migrate the instance.
        # flavor_id = None: this is a migration; log it and keep the
        # current flavor as the post-migration flavor
        if not flavor_id:
            LOG.debug("flavor_id is None. Assuming migration.",
                          instance=instance)
            new_instance_type = current_instance_type
        else:
            # Load the flavor from the nova.instance_types table;
            # read_deleted="no" filters out deleted flavors
            new_instance_type = flavors.get_flavor_by_flavor_id(
                        flavor_id, read_deleted="no")
            # If the instance was booted from an image, its current
            # flavor has a non-zero root_gb and the target flavor has
            # root_gb=0, resize is not supported: the system disk size
            # would be unknown, so raise
            if (new_instance_type.get('root_gb') == 0 and
                current_instance_type.get('root_gb') != 0 and
                not self.is_volume_backed_instance(context, instance)):
                reason = _('Resize to zero disk flavor is not '
                                                        'allowed.')
                raise exception.CannotResizeDisk(reason=reason)

        # Raise if the requested flavor was not found
        if not new_instance_type:
            raise exception.FlavorNotFound(flavor_id=flavor_id)

        # Debug logging
        current_instance_type_name = current_instance_type['name']
        new_instance_type_name = new_instance_type['name']
        LOG.debug("Old instance type %(current_instance_type_name)s, "
                      " new instance type %(new_instance_type_name)s",
                      {'current_instance_type_name': 
                                        current_instance_type_name,
                       'new_instance_type_name': new_instance_type_name},
                                          instance=instance)

        # Whether old and new flavors are the same; for a migration
        # they always are
        same_instance_type = (current_instance_type['id'] ==
                                  new_instance_type['id'])

        # NOTE(sirp): We don't want to force a customer to change 
        # their flavor when Ops is migrating off of a failed host.
        # For a resize, raise if the new flavor has been disabled
        if not same_instance_type and new_instance_type.get('disabled'):
                raise exception.FlavorNotFound(flavor_id=flavor_id)

        # Cells are disabled by default, so cell_type = None.
        # A resize to the same flavor is rejected as meaningless
        if same_instance_type and flavor_id and \
            self.cell_type != 'compute':
            raise exception.CannotResizeToSameFlavor()
    
        # ensure there is sufficient headroom for upsizes
        # For a resize, reserve quota up front
        if flavor_id:
            # Compute the vcpu and memory quota deltas, if any
            # (difference between the new and old flavors)
            deltas = compute_utils.upsize_quota_delta(context,
                                      new_instance_type,
                                       current_instance_type)
            try:
                # Reserve the (delta) quota for the current user and
                # project and update the database
                quotas = compute_utils.reserve_quota_delta(context, 
                                                            deltas,
                                                          instance)
            except exception.OverQuota as exc:
                # Collect the over-quota details and log them
                quotas = exc.kwargs['quotas']
                overs = exc.kwargs['overs']
                usages = exc.kwargs['usages']
                headroom = self._get_headroom(quotas, usages, 
                                                        deltas)
                (overs, reqs, total_alloweds,
                useds) = self._get_over_quota_detail(headroom, 
                                             overs, quotas, deltas)
                LOG.warning(_LW("%(overs)s quota exceeded for %"
                        "(pid)s, tried to resize instance."),
                       {'overs': overs, 'pid': context.project_id})
                raise exception.TooManyInstances(overs=overs,
                                                     req=reqs,
                                                     used=useds,
                                          allowed=total_alloweds)
        # A migration needs no extra reservations
        else:
            quotas = objects.Quotas(context=context)

        # Update state: vm_state rebuild/migrate, task state
        # RESIZE_PREP (preparing the rebuild or migration)
        instance.task_state = task_states.RESIZE_PREP
        instance.progress = 0
        instance.update(extra_instance_updates)
        instance.save(expected_task_state=[None])

        # Build the filter options for nova-scheduler.
        # CONF.allow_resize_to_same_host = true allows the
        # destination to be the same as the source; otherwise the
        # source host is filtered out
        filter_properties = {'ignore_hosts': []}
        if not CONF.allow_resize_to_same_host:
            filter_properties['ignore_hosts'].append(instance.host)

        # cell_type = None by default
        if self.cell_type == 'api':
            # Commit reservations early and create migration record.
            self._resize_cells_support(context, quotas, instance,
                                           current_instance_type,
                                           new_instance_type)

        # flavor_id = None means migrate, otherwise resize.
        # Record the instance action in the nova.instance_actions
        # table; the record is updated with the result once the
        # migration finishes
        if not flavor_id:
            self._record_action_start(context, instance,
                                          instance_actions.MIGRATE)
        else:
            self._record_action_start(context, instance,
                                          instance_actions.RESIZE)
        # Forward the migration request to
        # `/nova/conductor/api.py/ComputeTaskAPI.resize_instance`,
        # which calls straight into
        # `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server`;
        # analyzed below
        scheduler_hint = {'filter_properties': filter_properties}
        self.compute_task_api.resize_instance(context, instance,
                    extra_instance_updates, 
                    scheduler_hint=scheduler_hint,
                    flavor=new_instance_type,
                    reservations=quotas.reservations or [],
                    clean_shutdown=clean_shutdown)
    
    ------------------------------------------------------------
    # Continued from above: `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server`
    def migrate_server(self, context, instance, scheduler_hint, 
                      live, rebuild,
                      flavor, block_migration, disk_over_commit,
                      reservations=None, clean_shutdown=True):

        """The arguments are:
        live = False, cold migration
        rebuild = False, migrate rather than resize
        block_migration = None, not a block migration
        disk_over_commit = None
        reservations = [], a migration reserves no extra quota
        """
        # Build the request parameter dict
        kw = {'instance': instance, 'scheduler_hint': 
                                                scheduler_hint,
              'live': live, 'rebuild': rebuild, 'flavor': flavor,
              'block_migration': block_migration,
              'disk_over_commit': disk_over_commit,
              'reservations': reservations,
              'clean_shutdown': clean_shutdown}
        # Pick a client version the server side can accept; the
        # compatible versions are determined when rpc is initialized
        version = '1.11'
        if not self.client.can_send_version(version):
            del kw['clean_shutdown']
            version = '1.10'
        if not self.client.can_send_version(version):
            kw['flavor'] = objects_base.obj_to_primitive(flavor)
            version = '1.6'
        if not self.client.can_send_version(version):
            kw['instance'] = jsonutils.to_primitive(
                        objects_base.obj_to_primitive(instance))
            version = '1.4'
        # Send `migrate_server` to rabbitmq via a synchronous rpc
        # call; the consumer `nova-conductor` receives the message
        cctxt = self.client.prepare(version=version)
        return cctxt.call(context, 'migrate_server', **kw)

    Summary: nova-api checks the instance state and related preconditions, updates the instance state, adds a nova.instance_actions record, and finally forwards the request to nova-conductor via a synchronous rpc call.
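    The version-negotiation ladder in migrate_server above is a common oslo.messaging pattern: each time the client cannot speak the newest version, one argument is dropped or downgraded and an older version is tried. A standalone sketch of the idea (the version numbers follow the excerpt; the client is replaced by a plain predicate, and the downgraded flavor value is a stand-in for obj_to_primitive):

```python
# Sketch of RPC version negotiation: degrade the request step by step
# until a version both ends understand is found. can_send_version is a
# predicate standing in for oslo.messaging's RPCClient method.

def negotiate(kw, can_send_version):
    version = '1.11'
    if not can_send_version(version):
        # 1.10 predates clean_shutdown, so the argument is dropped
        kw.pop('clean_shutdown', None)
        version = '1.10'
    if not can_send_version(version):
        # 1.6 predates flavor objects; send a primitive instead
        kw['flavor'] = 'primitive-flavor'
        version = '1.6'
    return version, kw

version, kw = negotiate({'clean_shutdown': True, 'flavor': 'obj'},
                        lambda v: v != '1.11')
print(version, sorted(kw))
```

    The key point is that negotiation only ever removes or simplifies arguments, so an old server never receives a field it does not know.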

    The nova-conductor part

    From the analysis above, the entry point for the migration request in nova-conductor is:

    #/nova/conductor/manager.py/ComputeTaskManager.migrate_server
    def migrate_server(self, context, instance, scheduler_hint, 
                live, rebuild,
                flavor, block_migration, disk_over_commit, 
                reservations=None,
                clean_shutdown=True):
        """The arguments come from `nova-api`:
        scheduler_hint, the scheduling options: {u'filter_properties':
        {u'ignore_hosts': []}}
        live = False, cold migration
        rebuild = False, migrate rather than resize
        block_migration = None, not a block migration
        disk_over_commit = None
        reservations = [], a migration reserves no extra quota
        """
        # If the instance argument is not a proper NovaObject, load
        # the instance from the database and build an InstanceV2
        # object
        if instance and not isinstance(instance, nova_object.NovaObject):
            # NOTE(danms): Until v2 of the RPC API, we need to tolerate
            # old-world instance objects here
            attrs = ['metadata', 'system_metadata', 'info_cache',
                         'security_groups']
            instance = objects.Instance._from_db_object(
                    context, objects.Instance(), instance,
                    expected_attrs=attrs)
        # NOTE: Remove this when we drop support for v1 of the RPC API
        # If the flavor argument is not a proper Flavor object, load
        # the flavor with the given id from the database and build one
        if flavor and not isinstance(flavor, objects.Flavor):
            # Code downstream may expect extra_specs to be 
            # populated since it is receiving an object, so lookup 
            # the flavor to ensure this.
            flavor = objects.Flavor.get_by_id(context, flavor['id'])

        # Live migration, covered in a separate post
        if live and not rebuild and not flavor:
            self._live_migrate(context, instance, scheduler_hint,
                                   block_migration, disk_over_commit)
        # Cold migration via _cold_migrate, analyzed below
        elif not live and not rebuild and flavor:
            instance_uuid = instance.uuid
            # The with statement records the migration event in the
            # database (nova.instance_actions_events) before the
            # migration and updates the record afterwards
            with compute_utils.EventReporter(context, 'cold_migrate',
                                                 instance_uuid):
                self._cold_migrate(context, instance, flavor,
                                    scheduler_hint['filter_properties'],
                                           reservations, clean_shutdown)
        # Unknown combination
        else:
            raise NotImplementedError()
    -------------------------------------------------------------
    # Continued from above:
    def _cold_migrate(self, context, instance, flavor, 
                            filter_properties,
                          reservations, clean_shutdown):
        # Get the image info from the instance object, for example:
        """
        {u'min_disk': u'20', u'container_format': u'bare', 
        u'min_ram': u'0', u'disk_format': u'raw', 'properties': 
        {u'base_image_ref': u'e0cc468f-6501-4a85-9b19-
        70e782861387'}}
        """
        image = utils.get_image_from_system_metadata(
                instance.system_metadata)
        # Build the request spec dict from the image, instance and
        # flavor, in the form:
        """
        request_spec = {
                'image': image,
                'instance_properties': instance,
                'instance_type': flavor,
                'num_instances': 1}
        """
        request_spec = scheduler_utils.build_request_spec(
                context, image, [instance], instance_type=flavor)
        # Build the migration task object
        # `/nova/conductor/tasks/migrate.py/MigrationTask`
        task = self._build_cold_migrate_task(context, instance, 
                                                flavor,
                                                 filter_properties, 
                                                 request_spec,
                                                 reservations, 
                                                 clean_shutdown)

        # Exception handling omitted: if no suitable destination host
        # is found or the policy is invalid, the task bails out;
        # before exiting it updates the instance state in the
        # database, logs the error and sends a
        # `compute_task.migrate_server` notification.
        # Run the migration, analyzed below
        task.execute()
    
    ---------------------------------------------------------------
    # Continued from above: `nova/conductor/tasks/migrate.py/MigrationTask._execute`
    def _execute(self):
        # Get the image info from the request spec
        image = self.request_spec.get('image')
        # Build a quota object from self.reservations; a migration
        # has no reservations, so self.reservations = []
        self.quotas = objects.Quotas.from_reservations(self.context,
                                           self.reservations,
                                         instance=self.instance)
        # Add group (group_hosts) and group policy (group_polices)
        # info to the filter properties, if any
        scheduler_utils.setup_instance_group(self.context, 
                                                self.request_spec,
                                                 self.filter_properties)
        """Add retry parameters to the filter properties (when the
        configured retry count CONF.scheduler_max_attempts > 1); the
        resulting filter properties look like:
        {'retry': {'num_attempts': 1, 'hosts': []}, 
        u'ignore_hosts': []}

        If this is a retry sent by `nova-compute`, the retry dict in
        the incoming filter_properties carries the previous failure,
        and the hosts in `hosts` are excluded when the next
        destination is chosen; populate_retry logs that failure, and
        raises once the maximum number of attempts is exceeded
        """
        scheduler_utils.populate_retry(
                                        self.filter_properties,
                                           self.instance.uuid)
        # Ask `nova-scheduler` to pick suitable destination hosts
        # according to the filters, retrying on timeout with the
        # parameters above. On success a list of candidate hosts is
        # returned; if no host fits, an exception is raised
        hosts = self.scheduler_client.select_destinations(
                self.context, self.request_spec, self.filter_properties)
        # Take the first one
        host_state = hosts[0]
        # Add the chosen host to the retry list in the filter
        # properties (hosts in 'hosts' are skipped on retry), e.g.:
        """
        {'retry': {'num_attempts': 1, 'hosts': [[u'devstack', 
        u'devstack']]}, 'limits': {u'memory_mb': 11733.0, 
        u'disk_gb': 1182.0}, u'ignore_hosts': []}
        """
        scheduler_utils.populate_filter_properties(
                                            self.filter_properties,
                                                       host_state)
        # context is not serializable
        self.filter_properties.pop('context', None)

        # Send `prep_resize` to the queue via an asynchronous rpc
        # call; `nova-compute` handles it
        # (`nova/compute/rpcapi.py/ComputeAPI`)
        (host, node) = (host_state['host'], host_state['nodename'])
        self.compute_rpcapi.prep_resize(
                self.context, image, self.instance, self.flavor, host,
                self.reservations, request_spec=self.request_spec,
                filter_properties=self.filter_properties, node=node,
                clean_shutdown=self.clean_shutdown)

    Summary: nova-conductor relies on nova-scheduler to pick a suitable destination host, updates the nova.instance_actions_events table along the way, and finally hands the migration over to nova-compute via an asynchronous rpc call.
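    The retry bookkeeping described above is worth a closer look: every reschedule appends the failed host and bumps num_attempts until a configured maximum. A minimal re-implementation of the idea (the constant stands in for CONF.scheduler_max_attempts; nova raises MaxRetriesExceeded in the same case):

```python
# Sketch of scheduler_utils.populate_retry-style bookkeeping: track
# scheduling attempts per request and fail once the limit is hit.

MAX_ATTEMPTS = 3   # stand-in for CONF.scheduler_max_attempts

def populate_retry(filter_properties, instance_uuid):
    retry = filter_properties.setdefault(
        'retry', {'num_attempts': 0, 'hosts': []})
    retry['num_attempts'] += 1
    if retry['num_attempts'] > MAX_ATTEMPTS:
        raise RuntimeError(
            'Exceeded max scheduling attempts %d for instance %s'
            % (MAX_ATTEMPTS, instance_uuid))
    return retry

props = {'ignore_hosts': []}
populate_retry(props, 'fake-uuid')
# After scheduling, the chosen (host, node) pair is recorded so a
# retry will skip it, matching the example in the walkthrough:
props['retry']['hosts'].append(['devstack', 'devstack'])
print(props['retry'])
```

    This is how the `{'retry': {'num_attempts': 1, 'hosts': [[u'devstack', u'devstack']]}}` structure shown earlier comes about.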

    That concludes the first half of the cold migration analysis. The flow is fairly simple: perform some precondition checks, update the database records, pick a host through nova-scheduler, and pass the request on to nova-compute. Up next:
    Openstack liberty 云主机迁移源码分析之静态迁移2


    This is the third post on live migration. Openstack liberty 云主机迁移源码分析之在线迁移2 covered nova-compute's handling of the prepare stage; this post analyzes the execute stage:

    The execute stage

    #`nova/compute/manager.py/ComputeManager._do_live_migration`
    def _do_live_migration(self, context, dest, instance, 
                                block_migration,
                                migration, migrate_data):

        # The `prepare` stage code is omitted; see the previous post

        """After the `prepare` stage, pre_migration_data comes back
        as a dict like:
        {'graphics_listen_addrs': {}, 'volume': {},
                        'serial_listen_addr': {}}
        """
        migrate_data['pre_live_migration_result'] = \
                                            pre_migration_data

        # Update nova.instance_migrations, setting status to: running
        if migration:
            migration.status = 'running'
            migration.save()

        migrate_data['migration'] = migration
        try:
            """Run the migration through the virt driver
            (LibvirtDriver); see the analysis below
            """
            self.driver.live_migration(context, instance, dest,
                                   self._post_live_migration,
                                   self._rollback_live_migration,
                                   block_migration, migrate_data)
        except Exception:
            # Executing live migration
            # live_migration might raises exceptions, but
            # nothing must be recovered in this version.
            LOG.exception(_LE('Live migration failed.'), 
                                                instance=instance)
            # On failure set the nova.instance_migrations status to
            # failed and re-raise
            with excutils.save_and_reraise_exception():
                if migration:
                    migration.status = 'failed'
                    migration.save()
    
    ---------------------------------------------------------------
    # Continued from above: `nova/virt/libvirt/driver.py/LibvirtDriver.live_migration`
    def live_migration(self, context, instance, dest,
                           post_method, recover_method, 
                           block_migration=False,
                           migrate_data=None):

        """Spawning live_migration operation for distributing
         high-load.
        """

        # 'dest' will be substituted into 'migration_uri' so ensure
        # it doesn't contain any characters that could be used to
        # exploit the URI accepted by libvirt
        # Validate the destination hostname; only word characters,
        # '_', '-', '.' and ':' are allowed
        if not libvirt_utils.is_valid_hostname(dest):
            raise exception.InvalidHostname(hostname=dest)

        # Analyzed below
        self._live_migration(context, instance, dest,
                                 post_method, recover_method, 
                                 block_migration,
                                 migrate_data)
    
    ---------------------------------------------------------------
    # Continued from above:
    def _live_migration(self, context, instance, dest, post_method,
                            recover_method, block_migration,
                            migrate_data):
        """Do live migration.

        This fires off a new thread to run the blocking migration
        operation, and then this thread monitors the progress of
        migration and controls its operation
        """

        # Get the instance's virDomain through libvirt and return the
        # corresponding Guest object
        guest = self._host.get_guest(instance)

        # TODO(sahid): We are converting all calls from a
        # virDomain object to use nova.virt.libvirt.Guest.
        # We should be able to remove dom at the end.
        dom = guest._domain

        # Spawn a new thread to run the migration, analyzed below
        opthread = utils.spawn(self._live_migration_operation,
                                         context, instance, dest,
                                         block_migration,
                                         migrate_data, dom)

        # Create an event and link it to the migration thread; the
        # monitor learns about completion through this event
        finish_event = eventlet.event.Event()

        def thread_finished(thread, event):
            LOG.debug("Migration operation thread notification",
                          instance=instance)
            event.send()
        opthread.link(thread_finished, finish_event)

        # Let eventlet schedule the new thread right away
        time.sleep(0)

        # Exception handling omitted: exceptions propagate up; see
        # the analysis below
        self._live_migration_monitor(context, instance, guest, 
                                             dest,
                                             post_method, 
                                             recover_method,
                                             block_migration, 
                                             migrate_data,
                                             dom, finish_event)
        # Log completion
        LOG.debug("Live migration monitoring is all done",
                          instance=instance)

Summary: the process above is straightforward: update the migration status and validate the destination hostname, then spawn a thread to perform the (block) migration and monitor its state through an event.
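The spawn-then-link pattern used in `_live_migration` can be sketched in isolation. The following is an illustrative stand-in that uses the standard `threading` module in place of eventlet green threads; `run_with_monitor` is a made-up name, and in nova the same roles are played by `utils.spawn`, `GreenThread.link` and `eventlet.event.Event`:

```python
import threading

def run_with_monitor(operation):
    """Run `operation` in a worker thread and return an event that
    is set when the worker finishes. This is the role played by
    GreenThread.link() and Event.send() in the driver."""
    finish_event = threading.Event()

    def wrapper():
        try:
            operation()
        finally:
            # Corresponds to thread_finished() calling event.send()
            finish_event.set()

    worker = threading.Thread(target=wrapper)
    worker.start()
    return worker, finish_event

# The monitor loop polls the event to learn whether the migration
# thread is still running (finish_event.ready() in the eventlet
# version).
worker, done = run_with_monitor(lambda: None)
worker.join()
print(done.is_set())  # prints True
```

The key design point survives the translation: the blocking migration call runs in its own thread, while the monitor only ever observes a one-shot completion event, never the thread object's internals.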

Block migration process

As the analysis above shows, the migration thread function is `_live_migration_operation`; let's look at it in detail:

    def _live_migration_operation(self, context, instance, dest,
                                  block_migration,
                                  migrate_data, dom):
        """Invoke the live migration operation

        This method is intended to be run in a background thread
        and will block that thread until the migration is finished
        or failed.
        """

        guest = libvirt_guest.Guest(dom)

        # try/except omitted: on exception, log it and re-raise

        # Get the migration flags from the config; in our example
        # block_migration=False, and the option is
        # live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,
        #     VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,
        #     VIR_MIGRATE_PERSIST_DEST"
        if block_migration:
            flaglist = CONF.libvirt.block_migration_flag.split(',')
        else:
            flaglist = CONF.libvirt.live_migration_flag.split(',')
        # Resolve the names to the libvirt flag values and OR them
        # together
        flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
        logical_sum = six.moves.reduce(lambda x, y: x | y,
                                       flagvals)

        # pre_live_migrate_data was set during the `prepare` phase
        pre_live_migrate_data = (migrate_data or {}).get(
            'pre_live_migration_result', {})
        # VNC/SPICE listen addresses
        listen_addrs = pre_live_migrate_data.get(
            'graphics_listen_addrs')
        # Volume connection info
        volume = pre_live_migrate_data.get('volume')
        # Serial console listen address
        serial_listen_addr = pre_live_migrate_data.get(
            'serial_listen_addr')

        # Check whether VIR_DOMAIN_XML_MIGRATABLE is supported
        migratable_flag = getattr(libvirt,
                                  'VIR_DOMAIN_XML_MIGRATABLE', None)

        # If VIR_DOMAIN_XML_MIGRATABLE is not supported, or there
        # are no graphics listen addresses and no volumes
        if (migratable_flag is None or
                (listen_addrs is None and not volume)):
            # TODO(alexs-h): These checks could be moved to the
            # check_can_live_migrate_destination/source phase
            # Raise an exception if the configured VNC/SPICE
            # listen addresses are not one of:
            # ('0.0.0.0', '127.0.0.1', '::', '::1')
            self._check_graphics_addresses_can_live_migrate(
                listen_addrs)
            # Ensure CONF.serial_console.enabled=False
            self._verify_serial_console_is_disabled()
            # Let libvirt perform the migration
            dom.migrateToURI(
                CONF.libvirt.live_migration_uri % dest,
                logical_sum, None,
                CONF.libvirt.live_migration_bandwidth)
        else:
            # Dump the migratable XML config first, then update it
            # with the volume, VNC and serial information to build
            # the new migratable config
            old_xml_str = guest.get_xml_desc(dump_migratable=True)
            new_xml_str = self._update_xml(old_xml_str,
                                           volume,
                                           listen_addrs,
                                           serial_listen_addr)
            try:
                # Let libvirt perform the migration
                dom.migrateToURI2(
                    CONF.libvirt.live_migration_uri % dest,
                    None,
                    new_xml_str,
                    logical_sum,
                    None,
                    CONF.libvirt.live_migration_bandwidth)
            except libvirt.libvirtError as ex:
                # NOTE(mriedem): There is a bug in older versions
                # of libvirt where the VIR_DOMAIN_XML_MIGRATABLE
                # flag causes virDomainDefCheckABIStability to not
                # compare the source and target domain xml's
                # correctly for the CPU model. We try to handle
                # that error here and attempt the legacy
                # migrateToURI path, which could fail if the
                # console addresses are not correct, but in that
                # case we have the
                # _check_graphics_addresses_can_live_migrate
                # check in place to catch it.
                #
                # In short: older libvirt versions have a bug where
                # VIR_DOMAIN_XML_MIGRATABLE makes
                # virDomainDefCheckABIStability compare the CPU
                # model of the source and target domain XMLs
                # incorrectly, so we fall back to migrateToURI and
                # retry the migration here.
                # TODO(mriedem): Remove this workaround when
                # Red Hat BZ #1141838 is closed.
                # Retry with the legacy path on
                # VIR_ERR_CONFIG_UNSUPPORTED, otherwise re-raise
                error_code = ex.get_error_code()
                if error_code == libvirt.VIR_ERR_CONFIG_UNSUPPORTED:
                    LOG.warn(_LW('An error occurred trying to live '
                                 'migrate. Falling back to legacy '
                                 'live migrate flow. Error: %s'), ex,
                             instance=instance)

                    self._check_graphics_addresses_can_live_migrate(
                        listen_addrs)
                    self._verify_serial_console_is_disabled()
                    dom.migrateToURI(
                        CONF.libvirt.live_migration_uri % dest,
                        logical_sum,
                        None,
                        CONF.libvirt.live_migration_bandwidth)
                else:
                    raise

        # Migration finished, log it
        LOG.debug("Migration operation thread has finished",
                  instance=instance)

Summary: set up the migration parameters and run the precondition checks, then hand off to libvirt to carry out the actual migration.
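The flag handling at the top of `_live_migration_operation` is just a string split plus a bitwise OR over the resolved constants. A minimal standalone sketch, using hard-coded integer stand-ins for the libvirt module attributes that the driver resolves with `getattr(libvirt, name)`:

```python
from functools import reduce

# Integer stand-ins for the libvirt flag constants; the driver
# looks the real values up on the libvirt module instead.
FLAGS = {
    'VIR_MIGRATE_LIVE': 1,
    'VIR_MIGRATE_PEER2PEER': 2,
    'VIR_MIGRATE_PERSIST_DEST': 8,
    'VIR_MIGRATE_UNDEFINE_SOURCE': 16,
}

def parse_migration_flags(flag_string):
    """Split the comma-separated config value and OR the
    individual flag values together, as the driver does."""
    flagvals = [FLAGS[name.strip()]
                for name in flag_string.split(',')]
    return reduce(lambda x, y: x | y, flagvals)

conf = ("VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,"
        "VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST")
print(parse_migration_flags(conf))  # prints 27 (16|2|1|8)
```

Because the flags are a bitmask, the combined value can be handed to `migrateToURI`/`migrateToURI2` as a single integer regardless of how many options the operator configured.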

State monitoring

    def _live_migration_monitor(self, context, instance, guest,
                                dest, post_method,
                                recover_method,
                                block_migration,
                                migrate_data, dom,
                                finish_event):
        # Data to migrate = the memory size from the flavor + the
        # disk size that has to be copied from the instance.
        # Cinder volumes on shared storage (e.g. nfs, rbd) do not
        # need to be migrated; only local lvm block devices and
        # local raw/qcow2 files do
        data_gb = self._live_migration_data_gb(instance, guest,
                                               block_migration)
        # Steps at which the max allowed switchover downtime is
        # increased
        downtime_steps = list(
            self._migration_downtime_steps(data_gb))
        # Max time the migration is allowed to run (it is aborted
        # afterwards)
        completion_timeout = int(
            CONF.libvirt.live_migration_completion_timeout * data_gb)
        # Max time to wait for a progress update
        progress_timeout = (
            CONF.libvirt.live_migration_progress_timeout)

        # What follows is a long chain of if/elif branches that
        # take different actions depending on the migration state
        n = 0
        start = time.time()
        progress_time = start
        progress_watermark = None
        while True:
            # Get the instance's job info
            info = host.DomainJobInfo.for_domain(dom)
            if info.type == libvirt.VIR_DOMAIN_JOB_NONE:
                # This type can mean either of two things:
                # 1. the migration has not started yet, which we
                #    can tell by checking whether the migration
                #    thread is still running
                # 2. the migration ended (failed or completed),
                #    which we can tell by checking whether the
                #    instance is still running on the source host
                # The job has not started yet
                if not finish_event.ready():
                    LOG.debug("Operation thread is still"
                              " running", instance=instance)
                else:
                    # Raise if querying the instance state fails
                    try:
                        # If the instance is still running on the
                        # source host, the migration failed
                        if guest.is_active():
                            LOG.debug("VM running on src,"
                                      " migration failed",
                                      instance=instance)
                            info.type = (
                                libvirt.VIR_DOMAIN_JOB_FAILED)
                        # Otherwise the migration completed
                        else:
                            LOG.debug("VM is shutoff, migration"
                                      " finished",
                                      instance=instance)
                            info.type = (
                                libvirt.VIR_DOMAIN_JOB_COMPLETED)
                    except libvirt.libvirtError as ex:
                        LOG.debug("Error checking domain"
                                  " status %(ex)s", ex,
                                  instance=instance)
                        # If the error code says the domain no
                        # longer exists, the migration completed
                        if ex.get_error_code() == (
                                libvirt.VIR_ERR_NO_DOMAIN):
                            LOG.debug("VM is missing, migration"
                                      " finished",
                                      instance=instance)
                            info.type = (
                                libvirt.VIR_DOMAIN_JOB_COMPLETED)
                        # Otherwise the migration failed
                        else:
                            LOG.info(_LI("Error %(ex)s,"
                                         " migration failed"),
                                     instance=instance)
                            info.type = (
                                libvirt.VIR_DOMAIN_JOB_FAILED)
            # The migration has not started yet
            if info.type == libvirt.VIR_DOMAIN_JOB_NONE:
                LOG.debug("Migration not running yet",
                          instance=instance)
            # The migration is in progress
            elif info.type == libvirt.VIR_DOMAIN_JOB_UNBOUNDED:
                now = time.time()
                elapsed = now - start
                abort = False

                # Record the new watermark whenever progress is
                # made
                if ((progress_watermark is None) or
                        (progress_watermark > info.data_remaining)):
                    progress_watermark = info.data_remaining
                    progress_time = now
                # Abort the migration if no progress was made for
                # longer than the configured timeout
                if (progress_timeout != 0 and
                        (now - progress_time) > progress_timeout):
                    LOG.warn(_LW("Live migration stuck for %d"
                                 " sec"), (now - progress_time),
                             instance=instance)
                    abort = True

                # Abort the migration if it has been running
                # longer than the max allowed migration time
                if (completion_timeout != 0 and
                        elapsed > completion_timeout):
                    LOG.warn(_LW("Live migration not completed"
                                 " after %d sec"),
                             completion_timeout,
                             instance=instance)
                    abort = True

                # Abort the migration job
                if abort:
                    try:
                        dom.abortJob()
                    except libvirt.libvirtError as e:
                        LOG.warn(_LW("Failed to abort migration"
                                     " %s"), e, instance=instance)
                        raise

                # See if we need to increase the max downtime. We
                # ignore failures, since we'd rather continue
                # trying to migrate
                #
                # Increase the max switchover downtime of the
                # live migration
                if (len(downtime_steps) > 0 and
                        elapsed > downtime_steps[0][0]):
                    downtime = downtime_steps.pop(0)
                    LOG.info(_LI("Increasing downtime to"
                                 " %(downtime)dms after"
                                 " %(waittime)d sec elapsed time"),
                             {"downtime": downtime[1],
                              "waittime": downtime[0]},
                             instance=instance)

                    try:
                        dom.migrateSetMaxDowntime(downtime[1])
                    except libvirt.libvirtError as e:
                        LOG.warn(_LW("Unable to increase max"
                                     " downtime to %(time)d ms:"
                                     " %(e)s"),
                                 {"time": downtime[1], "e": e},
                                 instance=instance)
                # Log a debug record every 5 s (the loop sleeps
                # 0.5 s per iteration)
                if (n % 10) == 0:
                    # Update the progress
                    remaining = 100
                    if info.memory_total != 0:
                        remaining = round(info.memory_remaining *
                                          100 / info.memory_total)
                    instance.progress = 100 - remaining
                    instance.save()
                    # Promote to an info record every 30 s
                    lg = LOG.debug
                    if (n % 60) == 0:
                        lg = LOG.info

                    # The log statement itself is omitted here

                n = n + 1
            # The migration completed
            elif info.type == libvirt.VIR_DOMAIN_JOB_COMPLETED:
                # Call ComputeManager._post_live_migration to do
                # the cleanup work, analyzed in detail later
                post_method(context, instance, dest,
                            block_migration,
                            migrate_data)
                break
            # The migration failed
            elif info.type == libvirt.VIR_DOMAIN_JOB_FAILED:
                # Call ComputeManager._rollback_live_migration to
                # roll back
                recover_method(context, instance, dest,
                               block_migration,
                               migrate_data)
                break
            # The migration was cancelled
            elif info.type == libvirt.VIR_DOMAIN_JOB_CANCELLED:
                # Call ComputeManager._rollback_live_migration to
                # roll back
                recover_method(context, instance, dest,
                               block_migration,
                               migrate_data)
                break
            else:
                LOG.warn(_LW("Unexpected migration job type: %d"),
                         info.type, instance=instance)
            # Sleep 0.5 s, then loop again
            time.sleep(0.5)

Summary: one big loop continuously monitors the migration state and bails out on error. When the migration completes it calls _post_live_migration to do the cleanup work; when the migration fails or is cancelled it calls _rollback_live_migration to roll back.
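The progress figure the monitor saves every five seconds falls directly out of libvirt's job statistics. A small standalone sketch of that calculation, with parameter names mirroring the `DomainJobInfo` fields:

```python
def migration_progress(memory_remaining, memory_total):
    """Percentage of guest memory already transferred, computed
    as in the monitor loop; remaining defaults to 100% while
    libvirt has not reported any totals yet."""
    remaining = 100
    if memory_total != 0:
        remaining = round(memory_remaining * 100 / memory_total)
    return 100 - remaining

print(migration_progress(256, 1024))  # prints 75
print(migration_progress(0, 0))       # prints 0
```

Note that the zero-total guard means `instance.progress` stays at 0 until libvirt starts reporting memory statistics, rather than jumping straight to 100.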

The next post will analyze the completion (complete) phase. Stay tuned!
