xxx
Author : Chief Administrator Date : 2022-11-18 (Fri) 18:07 Views : 153
                                
https://www.ibm.com/docs/en/cic/1.1.5?topic=kvm-build-instance-aborts-due-failing-allocate-networks



Build of instance aborts due to failing to allocate the network(s)

Last Updated: 2022-09-21

Problem

A virtual machine deployment fails with the error message "Build of instance XXX aborted: Failed to allocate the network(s), not rescheduling."

On the compute node where the VM was scheduled to build, entries like the following appear in /var/log/nova/nova-compute.log:

2022-01-14 01:24:34.996 1053122 WARNING nova.virt.libvirt.driver [req-41ea611f-572c-4b25-b292-91a1e3a135d0 0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9 384094db0ce1496d8f02bc760d3724df - default default] [instance: 610f1c5b-9186-447b-8b8f-36f93cbdcefd] Timeout waiting for [('network-vif-plugged', '64d1bec0-a238-4115-84c4-de6bdca951fd')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds

In /var/log/neutron/openvswitch-agent.log:

2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-22eef473-17be-4c19-a2e6-a35bcd2d331f - - - - -] Failed to communicate with the switch: RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x88b4a7e4,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 92, in _send_msg
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     result = ofctl_api.send_msg(self._app, msg, reply_cls, reply_multi)
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/os_ken/app/ofctl/api.py", line 89, in send_msg
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     reply_multi=reply_multi))()
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/os_ken/base/app_manager.py", line 279, in send_request
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     return req.reply_q.get()
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     return waiter.wait()
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     return get_hub().switch()
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     return self.greenlet.switch()
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int eventlet.timeout.Timeout: 300 seconds
...
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     raise RuntimeError(m)
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x88b4a7e4,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2022-01-14 01:24:34.919 1053115 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int

And in /var/log/openvswitch/ovs-vswitchd.log:

2022-01-14T01:21:34.115Z|35809|rconn|INFO|default0<->tcp:127.0.0.1:6633: connecting...
2022-01-14T01:21:36.123Z|35810|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connected
2022-01-14T01:21:36.123Z|35811|rconn|INFO|default0<->tcp:127.0.0.1:6633: connected
2022-01-14T01:24:28.407Z|35812|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
2022-01-14T01:24:28.407Z|35813|rconn|ERR|default0<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
2022-01-14T01:24:29.612Z|35814|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connecting...
2022-01-14T01:24:29.612Z|35815|rconn|INFO|default0<->tcp:127.0.0.1:6633: connecting...
2022-01-14T01:24:30.612Z|35816|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connection timed out
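
To check whether a compute node shows the same symptom, the ovs-vswitchd log can be scanned for inactivity-probe disconnects. The following is a minimal sketch, assuming the log lines keep the format shown above; the function name count_probe_disconnects and the regular expression are illustrative and not part of any IBM or Open vSwitch tooling.

```python
import re
from collections import Counter

# Matches rconn error lines in the format shown in the excerpt above, e.g.
# <timestamp>|<seq>|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to
# inactivity probe after 10 seconds, disconnecting
PROBE_RE = re.compile(
    r"^(?P<ts>\S+)\|\d+\|rconn\|ERR\|(?P<bridge>[^<]+)<->(?P<target>\S+): "
    r"no response to inactivity probe after (?P<secs>\d+) seconds, disconnecting"
)


def count_probe_disconnects(lines):
    """Return a Counter mapping bridge name to number of probe disconnects."""
    counts = Counter()
    for line in lines:
        match = PROBE_RE.match(line.strip())
        if match:
            counts[match.group("bridge")] += 1
    return counts


# Sample lines copied from the log excerpt above.
SAMPLE = [
    "2022-01-14T01:21:36.123Z|35810|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connected",
    "2022-01-14T01:24:28.407Z|35812|rconn|ERR|br-int<->tcp:127.0.0.1:6633: "
    "no response to inactivity probe after 10 seconds, disconnecting",
    "2022-01-14T01:24:28.407Z|35813|rconn|ERR|default0<->tcp:127.0.0.1:6633: "
    "no response to inactivity probe after 10 seconds, disconnecting",
]

for bridge, n in sorted(count_probe_disconnects(SAMPLE).items()):
    print(f"{bridge}: {n} inactivity-probe disconnect(s)")
```

On a real node, the same function can be fed an open file handle for /var/log/openvswitch/ovs-vswitchd.log instead of SAMPLE. Repeated disconnects on br-int close to the time of the failed deployment would be consistent with the problem described here.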

Explanation

In /etc/neutron/plugins/ml2/openvswitch_agent.ini on the IBM Cloud Infrastructure Center compute nodes, the default value of of_inactivity_probe is 10.

# The inactivity_probe interval in seconds for the local switch connection to
# the controller. A value of 0 disables inactivity probes. (integer value)
#of_inactivity_probe = 10

When the Open vSwitch does not communicate with the client for 10 seconds, it sends a probe. If no response is received for an additional 10 seconds, the Open vSwitch assumes the connection is down and attempts to reconnect.

When the workload is high and there are many logical flows in the environment, the Open vSwitch main loop can take more than 10 seconds for logical flow computation. When that happens, the connection is dropped and re-established, and this cycle can repeat in a loop.
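
The timing described above can be sketched as a small model. This is a simplification based only on the explanation in this document (a probe after one interval of silence, a disconnect after one further interval without a response). The function stall_outcome and its parameter idle_before_stall_s are hypothetical names introduced for illustration.

```python
def stall_outcome(probe_interval_s, stall_s, idle_before_stall_s=0.0):
    """Classify a stall as 'kept' or 'dropped' under the simplified model.

    probe_interval_s:     value of of_inactivity_probe (0 disables probes)
    stall_s:              how long the peer stays unresponsive
    idle_before_stall_s:  silence already accumulated when the stall begins
                          (hypothetical parameter, for illustration only)
    """
    if probe_interval_s == 0:
        return "kept"  # probes disabled, so no probe-driven disconnect
    # Silence already counted toward the first interval, capped at one interval.
    idle = min(idle_before_stall_s, probe_interval_s)
    # Remaining time until the probe is sent, plus one interval waiting for a reply.
    time_until_drop = (probe_interval_s - idle) + probe_interval_s
    return "dropped" if stall_s > time_until_drop else "kept"


# A 15-second stall with the default interval of 10 seconds:
print(stall_outcome(10, 15))                         # connection was just active
print(stall_outcome(10, 15, idle_before_stall_s=8))  # connection already quiet for 8 s
# The same stall with the interval raised to 60 seconds:
print(stall_outcome(60, 15, idle_before_stall_s=8))
```

Under this model, a stall only slightly longer than one interval can already drop a connection that was quiet beforehand, while a stall longer than two intervals always does. That is why raising the interval gives the busy main loop more headroom.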

Resolution

This problem can be mitigated by increasing the probe interval. For example, increase it to 60 seconds, or to another value that is appropriate for your environment.

  1. Set of_inactivity_probe = 60 in the [ovs] section of /etc/neutron/plugins/ml2/openvswitch_agent.ini on all IBM Cloud Infrastructure Center compute nodes.

  2. Restart the neutron-openvswitch-agent service for the new setting to take effect.

systemctl restart neutron-openvswitch-agent
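
When many compute nodes need the change, step 1 could be scripted. The following is a minimal sketch, not an IBM-provided tool: set_ini_option is a hypothetical helper that edits the file content as text so that the existing comments in openvswitch_agent.ini are preserved. It assumes a plain INI layout with "[section]" headers and "key = value" lines.

```python
import re


def set_ini_option(text, section, key, value):
    """Return text with "key = value" set inside [section], keeping other lines.

    Preference order: replace an active setting, else replace the commented
    default (e.g. "#of_inactivity_probe = 10"), else add the line directly
    under the section header, else append a new section.
    """
    lines = text.splitlines()
    header = f"[{section}]"
    active_re = re.compile(rf"^\s*{re.escape(key)}\s*=")
    commented_re = re.compile(rf"^\s*#\s*{re.escape(key)}\s*=")
    new_line = f"{key} = {value}"

    header_idx = active_idx = commented_idx = None
    in_section = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_section = stripped == header
            if in_section and header_idx is None:
                header_idx = i
        elif in_section:
            if active_idx is None and active_re.match(line):
                active_idx = i
            elif commented_idx is None and commented_re.match(line):
                commented_idx = i

    if active_idx is not None:
        lines[active_idx] = new_line
    elif commented_idx is not None:
        lines[commented_idx] = new_line
    elif header_idx is not None:
        lines.insert(header_idx + 1, new_line)
    else:
        lines += [header, new_line]
    return "\n".join(lines) + "\n"


# Example on an in-memory fragment that mirrors the default shown earlier.
sample = "[ovs]\n# (integer value)\n#of_inactivity_probe = 10\n"
print(set_ini_option(sample, "ovs", "of_inactivity_probe", 60))
```

To apply it, read /etc/neutron/plugins/ml2/openvswitch_agent.ini, pass its content through set_ini_option, write the result back (keeping a backup copy is advisable), and then restart the agent as in step 2.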
