Research on container network of Kubernetes

Network provider research

| provider | k8s version | k8s network policy | pros | cons | throughput (% of direct) |
| --- | --- | --- | --- | --- | --- |
| flannel vxlan | >= 1.2 | no | 1) easy to configure; 2) easy to span VLANs/datacenters | 1) broadcast flood to 192.168.0.0/16 since there are no exact ip route settings; 2) performance downgrade; 3) network isolation needs extra subnet mgmt. effort | 45% |
| flannel host-gw | >= 1.2 | no | 1) easy to configure; 2) no obvious performance downgrade | 1) spanning multiple subnets in a VLAN needs extra steps to add routing rules; 2) cannot span multiple VLANs/datacenters; 3) does not support network policy; 4) network isolation needs extra subnet mgmt. effort | 93% |
| calico | >= 1.3 | yes | 1) BIRD agents configure routes with BGP on each node; 2) flexible subnet expansion with IP address pool mgmt; 3) supports k8s network policy; 4) enabling -ipip supports crossing L2 VLANs | 1) complex architecture: managing BIRD and Felix means a steeper learning curve for deployment, debugging, and operation; 2) enabling -ipip introduces additional packet encapsulation with a significant performance downgrade | BGP: 93%; BGP+ipip: 64% |
| canal (calico + flannel vxlan) | >= 1.3 | yes | 1) supports vxlan to cross L2; 2) network policy support extended from calico's Felix; 3) smooth migration from existing flannel to calico | 1) significant performance downgrade due to packet encapsulation and broadcast flood; 2) double the complexity | 45% |
| calico + IaaS IP addresses (SL portable IPs) + hostAffinity | >= 1.3 | yes | 1) consistent IP address space with the IaaS environment; 2) network policy support | 1) no L2 support to cross VLANs; 2) needs hostAffinity subnets enabled and IP address allocation integrated with the IaaS | 93% |

Flannel

Summary of flannel over “vxlan” and “host-gw”

:star: vxlan is the default backend type of the Ubuntu k8s deployment. The container-to-container test result below (1.37 Gbits/sec) is ~45% of the raw host-to-host result (3.02 Gbits/sec).
:star: “host-gw” leverages the kernel route table (“ip route”) to route traffic to the target host. The container-to-container test result below (2.84 Gbits/sec) is ~93% of the raw host-to-host result (3.02 Gbits/sec).
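
The backend type is picked in flannel’s network config. A minimal sketch of switching between the two, assuming an etcd-backed flannel of this generation and an illustrative 10.244.0.0/16 pod network (both assumptions, not taken from the test environment):

# flannel reads its config from /coreos.com/network/config at startup.
# Change "host-gw" to "vxlan" (or back) and restart the flannel daemons
# to switch backends.
$ etcdctl set /coreos.com/network/config \
  '{"Network": "10.244.0.0/16", "Backend": {"Type": "host-gw"}}'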

container to container over vxlan

$ docker run -it --rm networkstatic/iperf3 -c 172.31.71.4
Connecting to host 172.31.71.4, port 5201
[ 4] local 172.31.15.4 port 57807 connected to 172.31.71.4 port 5201
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 1.60 GBytes 1.37 Gbits/sec 398 sender
[ 4] 0.00-10.00 sec 1.60 GBytes 1.37 Gbits/sec receiver
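
For completeness, the server side of this run was presumably started the same way on the peer node (a sketch; the image’s entrypoint is iperf3, and 172.31.71.4 above is the flannel-assigned container IP):

# On the peer node: run iperf3 in server mode inside the same image.
$ docker run -it --rm networkstatic/iperf3 -s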

container to container over host-gw

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 3.31 GBytes 2.85 Gbits/sec 284 sender
[ 4] 0.00-10.00 sec 3.31 GBytes 2.84 Gbits/sec receiver
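
What “host-gw” actually programs can be seen in the kernel route table: one static route per remote node’s subnet, via that node’s host IP, with no encapsulation in between. The routes below are illustrative only (made-up subnets and host IPs, not output from the test environment):

# One route per remote node’s pod subnet, next hop = remote host IP.
$ ip route
10.244.1.0/24 via 192.168.0.11 dev eth0
10.244.2.0/24 via 192.168.0.12 dev eth0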

Calico

Summary of calico BGP and calico BGP+ipip

:star: “calico BGP (node-to-node mesh)” is similar to “host-gw”: about 93%+ of direct-connection performance.
:star: “calico BGP + -ipip (node-to-node mesh)” is similar to “vxlan”: about 64%+ of direct-connection performance due to packet encapsulation, but it does better than “vxlan” because proper ip route settings avoid the broadcast flood. The encapsulation still has a significant negative performance impact, even for connections within the same VLAN.
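
In the calicoctl of this generation, ipip is an attribute of the IP pool, which is what “-ipip” above refers to. A hedged sketch (the pool CIDR is illustrative):

# Create an IP pool with ipip encapsulation and outbound NAT;
# omit --ipip to stay on plain BGP routing within one L2 domain.
$ calicoctl pool add 192.168.100.0/24 --ipip --nat-outgoing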

BGP without ipip

host to host (in the same VLAN)

Connecting to host 10.177.83.70, port 5201
[ 4] local 10.177.83.83 port 39122 connected to 10.177.83.70 port 5201
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 3.71 GBytes 3.19 Gbits/sec 0 sender
[ 4] 0.00-10.00 sec 3.71 GBytes 3.18 Gbits/sec receiver

Container to container over BGP node to node mesh(same VLAN)

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 3.45 GBytes 2.97 Gbits/sec 608 sender
[ 4] 0.00-10.00 sec 3.45 GBytes 2.96 Gbits/sec receiver
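
Before taking numbers it is worth confirming that the node-to-node mesh is actually established; the command depends on the calicoctl release (both forms have existed, output formats differ):

# Older calicoctl releases:
$ calicoctl status
# Newer releases:
$ calicoctl node status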

Container to container over BGP + -ipip (same VLAN)

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 2.37 GBytes 2.03 Gbits/sec 3 sender
[ 4] 0.00-10.00 sec 2.36 GBytes 2.03 Gbits/sec receiver

Container to container over BGP + -ipip (cross VLAN)

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 2.47 GBytes 2.12 Gbits/sec 556 sender
[ 4] 0.00-10.00 sec 2.46 GBytes 2.12 Gbits/sec receiver

host to host (cross vlan)

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 4.43 GBytes 3.81 Gbits/sec 1742 sender
[ 4] 0.00-10.00 sec 4.43 GBytes 3.81 Gbits/sec receiver

AWS VPC network

Summary of aws-vpc

:star: “aws-vpc” is similar to “host-gw”: about 93%+ of direct-connection performance.
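
The aws-vpc backend is selected the same way as the other flannel backends; instead of encapsulating, it programs the VPC route table, which is why its numbers track “host-gw”. A sketch under the same assumptions as before (etcd-backed flannel, illustrative pod network); note that the nodes also need an IAM role permitted to modify the route table, and the EC2 source/dest check disabled:

# Same config location as vxlan/host-gw; only the backend type changes.
$ etcdctl set /coreos.com/network/config \
  '{"Network": "10.244.0.0/16", "Backend": {"Type": "aws-vpc"}}'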

host to host
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 853 MBytes 716 Mbits/sec 0 sender
[ 4] 0.00-10.00 sec 853 MBytes 715 Mbits/sec receiver
container to container
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 831 MBytes 697 Mbits/sec 7 sender
[ 4] 0.00-10.00 sec 830 MBytes 696 Mbits/sec receiver

NetworkPolicy for isolation (TBD)

First, turn on “DefaultDeny” ingress isolation for the namespace (the beta annotation mechanism of this k8s generation), so that only traffic explicitly whitelisted by a NetworkPolicy is allowed in:

kubectl annotate ns gamestop "net.beta.kubernetes.io/network-policy={\"ingress\": {\"isolation\": \"DefaultDeny\"}}"

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: access-gamestop
  namespace: gamestop
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          project: gamestop
    ports:
    - protocol: TCP
      port: 50000
    - protocol: TCP
      port: 50001
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 443
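
Assuming the manifest above is saved as access-gamestop.yaml (an illustrative filename), it can be applied and checked like this:

$ kubectl create -f access-gamestop.yaml
$ kubectl get networkpolicy -n gamestop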