An analysis of the Docker networking implementation and execution flow (interpreting the Libnetwork source code)

[Editor's Note] Libnetwork, the library that implements Docker's networking, officially left the experimental stage in Docker 1.9 and was merged into the main branch for production use. With the new networking features we can create a virtual network and then add containers to it, obtaining the network topology best suited to the deployed application. This article analyzes the source code of Docker's networking layer, giving a detailed description of Libnetwork together with usage and development examples.

1 Component standardization

To standardize the steps of network driver development and to support multiple network drivers, Docker implements the CNM (Container Network Model) in Libnetwork. CNM is built on the following three components.

  • Sandbox: A sandbox contains the configuration of a container's network stack, including the management of the container's interfaces, routing table, and DNS settings. A sandbox can be implemented as a Linux network namespace, a FreeBSD jail, or a similar mechanism. A sandbox may contain multiple endpoints belonging to multiple networks.
  • Endpoint: An endpoint joins a sandbox to a network. An endpoint can be implemented as a veth pair, an Open vSwitch internal port, or a similar device. An endpoint belongs to exactly one network and to at most one sandbox.
  • Network: A network is a group of endpoints that can communicate with each other directly. A network can be implemented as a Linux bridge, a VLAN, and so on. A network contains multiple endpoints.

2 CNM object details

Having introduced CNM's three main components, we now describe each of the objects in CNM in detail.

  • NetworkController: the NetworkController exposes the APIs with which users create and manage networks through libnetwork. Libnetwork supports a variety of drivers, including the built-in bridge, host, container, and overlay drivers, and also supports remote drivers (that is, user-supplied custom network drivers).
  • Driver: the Driver is not exposed to users directly, but it is the object that actually implements the network functionality. The NetworkController can supply driver-specific configuration options to a driver; those options are opaque to libnetwork. To meet both user and development needs, a driver can be built-in or a user-specified remote plugin, and each driver can create and manage its own type of network. In the future, the management functions of different plugins may be consolidated to make managing different kinds of networks easier.
  • Network: the Network object is an implementation of CNM's network component. NetworkController provides APIs for creating and managing Network objects; when a Network is created or updated, the corresponding Driver is notified. Through this abstraction, libnetwork provides communication between the endpoints that belong to the same network while isolating them from other endpoints. That communication can be within a single host or across hosts.
  • Endpoint: the Endpoint represents a service endpoint. It provides interconnection between containers in different networks. An Endpoint is created by the driver of the corresponding network.
  • Sandbox: the Sandbox represents a container's network configuration, such as its IP address, MAC address, routes, and DNS entries. A Sandbox is created, and the corresponding network resources are allocated, when the user asks a network to create an Endpoint. Libnetwork uses an operating-system-specific mechanism (for example, a Linux netns) to populate the container's network configuration held in the sandbox. A sandbox can contain multiple endpoints connected to different networks.

3 Introduction to Libnetwork Workflow

Here is a brief summary of libnetwork's workflow.

  • 1. Call libnetwork.New() with the desired network driver and related parameters to create a NetworkController instance. This instance exposes a variety of interfaces, through which Docker can create new Networks and Sandboxes.
  • 2. Create a network of the specified type and name via controller.NewNetwork(networkType, "network1").
  • 3. Create an endpoint via network.CreateEndpoint("Endpoint1"). In this call, Docker assigns an IP address and an interface to the endpoint, and applies the configuration held in the corresponding network instance to the endpoint, including its iptables rules and port information.
  • 4. Create the sandbox by calling controller.NewSandbox(). This call mainly uses namespaces and cgroups to create a relatively independent sandbox environment.
  • 5. Call ep.Join(sbx) to join the endpoint to the specified sandbox; the sandbox thereby also joins the network in which the endpoint was created.
    In general, an endpoint is created by a network and belongs to that network; when the endpoint joins a sandbox, this is equivalent to adding the sandbox to the network. The relationships among the three are summarized in Figure 1.
    Figure 1 Diagram of the relationships among the libnetwork concepts
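The five-step call sequence above can be sketched as a toy, in-memory model. To be clear, this is not libnetwork's real API: the actual functions take more arguments (IDs, option maps) and return errors. The types and signatures below are simplified stand-ins that only mirror the call sequence and the object relationships it produces.

```go
package main

import "fmt"

// Toy stand-ins for libnetwork's controller, network, endpoint and sandbox.
// The real types live in the libnetwork package and carry far more state.

type Endpoint struct {
	name    string
	network *Network // an endpoint belongs to exactly one network
	sandbox *Sandbox // ... and to at most one sandbox
}

type Network struct {
	driver    string
	name      string
	endpoints []*Endpoint
}

type Sandbox struct {
	containerID string
	endpoints   []*Endpoint
}

type Controller struct {
	networks  []*Network
	sandboxes []*Sandbox
}

// New plays the role of libnetwork.New() (step 1).
func New() *Controller { return &Controller{} }

// NewNetwork creates a network of the given type and name (step 2).
func (c *Controller) NewNetwork(driver, name string) *Network {
	n := &Network{driver: driver, name: name}
	c.networks = append(c.networks, n)
	return n
}

// CreateEndpoint creates an endpoint owned by this network (step 3).
func (n *Network) CreateEndpoint(name string) *Endpoint {
	ep := &Endpoint{name: name, network: n}
	n.endpoints = append(n.endpoints, ep)
	return ep
}

// NewSandbox creates a sandbox for a container (step 4).
func (c *Controller) NewSandbox(containerID string) *Sandbox {
	sb := &Sandbox{containerID: containerID}
	c.sandboxes = append(c.sandboxes, sb)
	return sb
}

// Join attaches the endpoint to a sandbox (step 5); through the endpoint,
// the sandbox is now effectively part of the endpoint's network.
func (ep *Endpoint) Join(sb *Sandbox) {
	ep.sandbox = sb
	sb.endpoints = append(sb.endpoints, ep)
}

func main() {
	controller := New()                                    // step 1
	network := controller.NewNetwork("bridge", "network1") // step 2
	ep := network.CreateEndpoint("Endpoint1")              // step 3
	sbx := controller.NewSandbox("container1")             // step 4
	ep.Join(sbx)                                           // step 5
	fmt.Printf("%s joined %s via %s\n", sbx.containerID, network.name, ep.name)
}
```

The model also encodes the constraints from section 1: an endpoint holds exactly one `network` pointer and at most one `sandbox` pointer, while networks and sandboxes each hold endpoint slices.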

4 The execution flow of Docker with libnetwork's built-in driver

Before walking through the execution flow of Docker with the built-in libnetwork driver, let's look at the relationships between the various devices and containers in Docker's default network mode (bridge mode). These relationships are shown briefly in Figure 2.
Figure 2 The relationship between each device and container in bridge mode

In the figure, Container 1 and Container 2 are two Docker containers; veth0 and veth1, and veth2 and veth3, are two veth pairs (a veth pair can be used for communication between different netns); docker0 is the bridge (a bridge can connect different network devices); and eth0 is the host's physical network card.

Container 1 and Container 2 in the figure each have their own netns. Taking Container 1 as an example: veth0 of one veth pair is added to Container 1's netns and configured with basic information such as an IP address, so that veth0 becomes one of Container 1's network cards, while veth1, the peer of veth0, is attached to the docker0 bridge. The docker0 bridge is also connected to the host's physical network card eth0, so Container 1 can use veth0 to communicate with the outside network. Container 2 is similarly attached to the docker0 bridge, so Container 1 and Container 2 can of course also communicate with each other.

Combining this with the CNM concepts above, we can see that the docker0 bridge is the network component of CNM: it represents a network. The netns owned by a container is a sandbox in CNM, an independent network stack. The veth pair is the endpoint in CNM: one of its ends joins a network by being attached to the bridge, while the other end joins a sandbox, thereby adding the container that owns the sandbox to the network.

Now that we understand the relationships between the devices and containers, and how those devices correspond to the components of CNM, let's look at how Docker interacts with libnetwork and libcontainer to create and configure the network for each container. The process by which Docker creates and configures a container's network in the default mode (bridge mode) is shown in Figure 3.
Figure 3 Network creation and configuration process

Combining Figure 3 with the Docker and libcontainer source code, let's walk through the network creation and configuration process in bridge mode.

  1. When the Docker daemon starts, it creates a NetworkController associated with the bridge driver. This controller creates the networks and sandboxes in the steps that follow.
  2. The daemon then creates a network of the specified type (bridge) and the specified name (that is, docker0) by calling controller.NewNetwork(). On receiving the create request, libnetwork uses Linux system calls to create a bridge named docker0; once the Docker daemon has started, we can see this bridge with the ifconfig command.
  3. After the Docker daemon has started successfully, we can use the Docker client to create containers. A container's network stack is created before the container itself actually starts. When creating the network stack for a container, Docker first obtains the daemon's netController, which is the NetworkController, the set of interfaces provided by libnetwork.

    Next, Docker calls BuildCreateEndpointOptions() to build the configuration for the endpoint in this container, and then calls CreateEndpoint() to create the endpoint with that configuration. In bridge mode, the device libnetwork creates is a veth pair: it calls netlink.LinkAdd(veth) to create the pair, preparing one veth device for the host and the other for the sandbox. The host-side veth is attached to the bridge (docker0), and netlink.LinkSetUp(host) then brings the host side up. Finally, the endpoint's port mappings are configured.

    In essence, the work done in this step is to create a veth pair via Linux system calls and attach one end of the pair to the docker0 bridge as one of the bridge's interfaces.

  4. Docker builds the SandboxOptions and then calls controller.NewSandbox() to create a new sandbox belonging to this container. On receiving the request from Docker, libnetwork uses system calls to create a new netns for the container and returns the path of that netns to Docker.
  5. Docker calls ep.Join(sb) to join the endpoint to the container's sandbox: the endpoint is first added to the sandbox, and then the endpoint's IP address, gateway, and other settings are configured.
  6. Docker calls libcontainer to start the container. While the container's init process initializes the container environment, it moves all of the container's processes into the netns obtained in step 4. The container then has its own independent network stack, which completes the creation and configuration of its network.

5 libnetwork support for plugins

As mentioned earlier, libnetwork not only provides the four built-in drivers (bridge, host, container, and overlay) but also supports third-party plugins. To provide network support for libnetwork, a third-party plugin needs to implement and expose the following APIs:

  • Driver.Config
  • Driver.CreateNetwork
  • Driver.DeleteNetwork
  • Driver.CreateEndpoint
  • Driver.DeleteEndpoint
  • Driver.Join
  • Driver.Leave

These driver APIs take unique IDs, such as a network ID or an endpoint ID, as their parameters rather than device names.
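As a rough illustration, the list above can be written as a Go interface. The method names follow the list, but the parameter and return types here are simplified assumptions (the real libnetwork driverapi package uses richer option and interface-info types), and memDriver is a hypothetical in-memory implementation used only to show that every call is keyed by IDs rather than device names.

```go
package main

import "fmt"

// Driver is a simplified sketch of the API a third-party plugin must
// implement. Every method is keyed by opaque IDs (networkID, endpointID),
// never by device names.
type Driver interface {
	Config(options map[string]interface{}) error
	CreateNetwork(networkID string) error
	DeleteNetwork(networkID string) error
	CreateEndpoint(networkID, endpointID string) error
	DeleteEndpoint(networkID, endpointID string) error
	Join(networkID, endpointID, sandboxKey string) error
	Leave(networkID, endpointID string) error
}

// memDriver is a trivial, hypothetical in-memory implementation.
type memDriver struct {
	networks map[string]map[string]bool // networkID -> set of endpointIDs
}

func newMemDriver() *memDriver {
	return &memDriver{networks: map[string]map[string]bool{}}
}

func (d *memDriver) Config(options map[string]interface{}) error { return nil }

func (d *memDriver) CreateNetwork(networkID string) error {
	d.networks[networkID] = map[string]bool{}
	return nil
}

func (d *memDriver) DeleteNetwork(networkID string) error {
	delete(d.networks, networkID)
	return nil
}

func (d *memDriver) CreateEndpoint(networkID, endpointID string) error {
	eps, ok := d.networks[networkID]
	if !ok {
		return fmt.Errorf("unknown network %s", networkID)
	}
	eps[endpointID] = true
	return nil
}

func (d *memDriver) DeleteEndpoint(networkID, endpointID string) error {
	if eps, ok := d.networks[networkID]; ok {
		delete(eps, endpointID)
	}
	return nil
}

func (d *memDriver) Join(networkID, endpointID, sandboxKey string) error { return nil }
func (d *memDriver) Leave(networkID, endpointID string) error            { return nil }

func main() {
	var d Driver = newMemDriver()
	d.CreateNetwork("netid-1")
	d.CreateEndpoint("netid-1", "epid-1")
	// Creating an endpoint on an unknown network ID fails.
	fmt.Println(d.CreateEndpoint("missing", "epid-2") != nil)
}
```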

When the user specifies a custom network driver, Docker sets the driver in libnetwork to remote. The remote driver does not implement any networking itself; instead, it completes the creation, configuration, and management of the network by calling out to the third-party plugin. The interaction between libnetwork and a third-party plugin is shown in Figure 4.
Figure 4 libnetwork interacts with third-party plug-ins

First, we need to start the third-party plugin manually so that libnetwork can work with it.

Then, when we use Docker to create a container, we specify the third-party plugin as the network driver. When the remote driver inside libnetwork is initialized, it registers the third-party plugin with libnetwork and records it in the NetworkController. Libnetwork also connects to the plugin over a socket and communicates with it, exchanging information according to the handshake protocol defined by libnetwork.

This allows Docker to invoke the various APIs of a third-party plugin just as it uses the built-in drivers. When libnetwork receives a call request from Docker, it calls the corresponding function of the remote driver. These functions wrap the request parameters, encode them as a JSON string, and send the call request over the previously established socket connection.

After receiving the request, the third-party plugin parses the JSON string, performs the operation the request describes, then wraps the execution result into a JSON string and returns it to libnetwork over the socket. Libnetwork parses the reply sent by the plugin and returns the result to Docker.


Author: Gao Xianglin, a graduate student in the SEL laboratory at Zhejiang University, currently engaged in research and development on the cloud platform team. The Zhejiang University team has in-depth research and secondary-development experience with PaaS, Docker, big data, and mainstream open-source cloud computing technologies, and is now contributing some of its technical articles to the community in the hope of helping readers.
