Talking About the Mesos Ecosystem


Apache Mesos is a distributed resource management framework under the Apache Foundation, often described as the kernel of distributed systems. Combined with containerization technology, Mesos provides efficient resource isolation and sharing across distributed applications and frameworks, and it can serve as the resource management platform for distributed applications such as Hadoop, MPI, Hypertable, Spark, and Elasticsearch. There are already many blog posts on the Mesos architecture and its allocation strategy, so I will set those basic principles aside and talk about the Mesos ecosystem from a practical point of view. Using Marathon, Chronos, Jenkins, Spark, and other tools as they are run on Mesos, I will describe how people build container application deployment and management, CI/CD testing, operations management, and big data platforms on top of Mesos, and I will try to weigh the pros and cons of Mesos in each of these practices.


Since last year I have been developing the Mesos / Docker based Shurenyun cloud products. The whole team follows and uses Mesos, Docker, and the tools around them, so we have accumulated a fair amount of practical experience. There are already many articles on the Mesos architecture and allocation strategy, and I have also shared the technology behind Mesos persistent storage before. Here I will not discuss the principles of Mesos again. Instead, starting from a practical point of view, I will talk about what we can do with Mesos and what pitfalls appear when building various platforms on it. Of course, I did not run into all of these pitfalls myself; some were summarized by other teams that use Mesos.

In addition, because there is a lot to cover, I plan to turn this into a series. This time I will first briefly cover which frameworks can run on Mesos, and then focus on the keywords Marathon, Mesos, Docker, and Chronos.
First, I will use a diagram to introduce the software that can run on Mesos, and then explore how to build a PaaS platform around Marathon, Mesos, and Docker, and the problems encountered along the way.

Frameworks On Mesos

This diagram, taken from the official Mesosphere website, shows the different frameworks that run on Mesos. The colors represent different types of applications.


• Blue: support for long-running applications; we build our PaaS platform on these tools.
• Green: big data processing tools, the basis for building a big data platform.
• Purple: batch tools; with them, especially Jenkins, a basic continuous integration platform can be built.
• Red: big data storage tools; Cassandra is a decentralized distributed database, and Es (Elasticsearch) is an open source search engine based on Apache Lucene(TM).

In addition, another commonly used tool is Singularity, which also supports running both long-running applications and batch jobs.

How to Deploy Applications to the Mesos Platform

Whether it is an open source application or one we developed ourselves in Python, Java, or another language, there are three main ways to let an application use the resources in the Mesos resource pool, as the following figure shows:


Frameworks-On-Mesos / Native
With this method, you develop a scheduling component for your application (the application's Scheduler) that calls the Mesos API, and you package your application as an Executor, which is sent to the Mesos-Slave to run. We generally deploy the application's Scheduler through Marathon so that Marathon's features keep the Scheduler highly available. For example, the Cisco-sponsored open source project Elasticsearch On Mesos runs Es as a framework on Mesos in this way, as does Myriad / Yarn-On-Mesos. The advantage of this approach is a high degree of customization: you can use the various features of Mesos, such as dynamic reservation, persistent storage, and oversubscription. The disadvantage is the relatively large workload, which adds to the development burden.

Split the application into modules and containerize them with Docker / Long-Running
These modules are then released to Marathon independently. Docker containers are not strictly necessary, but to reduce environment dependencies and avoid installing too many things on the Slave machines, it is better to containerize. Take Hdfs as an example: we deploy Hdfs-Namenode and Hdfs-Datanode to Mesos as two Marathon Apps and set up communication between them through Constraints and environment variables. Compared to Mesosphere's Hdfs-On-Mesos (Hdfs running on Mesos as a framework), this approach looks crude and is not as elegant. In practice, however, operations staff prefer Namenode and Datanode to be clearly separated and deployed individually, since they need to understand the software architecture after all. Mesosphere's framework approach hides the details of Hdfs, which makes deployment easy but also makes debugging harder. This is a trade-off that has to be weighed.

Batch
This method is fairly intuitive, so I will not cover it further.

Below is a brief introduction to the architecture of Es-On-Mesos, based on the slide. First, we deploy the Es scheduler to the Mesos-Master or to another machine and register it with the Mesos-Master. The scheduler then receives resource offers from the Mesos-Master and, after accepting resources, deploys the Es executor (and with it the Es node) to the corresponding Mesos-Slave machine. Next, the Es node instances register with the Zookeeper cluster to discover each other, and Es then forms a cluster. As we can see, the community has in fact done a lot of adaptation work to make Es run on Mesos.
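The offer cycle described above can be sketched in a few lines of Python. This is a toy simulation and not the Mesos scheduler API: the `Offer` class, the `schedule` function, the one-node-per-slave rule, and the resource numbers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A simplified resource offer from one slave (hypothetical shape)."""
    slave: str
    cpus: float
    mem: int  # megabytes


def schedule(offers, nodes_wanted, cpus_per_node=1.0, mem_per_node=2048):
    """Accept offers that can fit one Es node each and decline the rest.

    Returns (launched, declined) lists of slave names. Placing at most one
    node per slave is an assumption made to keep the sketch short.
    """
    launched, declined = [], []
    for offer in offers:
        fits = offer.cpus >= cpus_per_node and offer.mem >= mem_per_node
        if fits and len(launched) < nodes_wanted:
            launched.append(offer.slave)   # the executor would be started here
        else:
            declined.append(offer.slave)   # resources return to the master
    return launched, declined


offers = [
    Offer("slave-1", cpus=4, mem=8192),
    Offer("slave-2", cpus=0.5, mem=1024),   # too small for an Es node
    Offer("slave-3", cpus=2, mem=4096),
]
launched, declined = schedule(offers, nodes_wanted=2)
print("launched on:", launched)
print("declined:", declined)
```

A real framework does this inside the callbacks of a Mesos scheduler driver and also has to handle task status updates and failover, which is part of why the native approach carries the extra development burden mentioned earlier.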

Myriad / Yarn-On-Mesos uses a similar approach to deploy Yarn to a Mesos cluster. It is important to note that after deployment, Nm (NodeManager) and Rm (ResourceManager) can communicate directly with the Mesos-Master. In addition, if a company only uses Yarn, this approach offers little value, but if the goal is to have Yarn and other applications share resources on the same cluster, it makes sense.

The problem PaaS aims to solve is simple: let users focus on their own applications, while the PaaS provider takes care of high availability and restart on failure. As mentioned, we can build a PaaS platform using Mesos, Marathon, Docker, and related technologies. By developing some plug-ins and combining them with continuous integration, we can use this PaaS platform to solve many problems for users. We can think of the users of this PaaS platform as developers and of its maintainers as DevOps.
A reasonably complete Marathon-based PaaS platform needs the following:
• Marathon
• Mesos
• A private Docker image registry. Building your own private Docker registry is essential: first, the official overseas image source is too slow; second, companies have security concerns.
• Jenkins. As mentioned earlier, we use Jenkins to build a continuous integration environment; it is where development and operations meet. We can think of Jenkins' input as the code developers push to Github and its output as a Docker image, which is then deployed to Mesos through Marathon to provide external services. The output part of Jenkins requires some custom development.
• Github / Gitlab, or another source code repository. Development needs it for version control, and Jenkins often needs it to trigger builds.
• Metrics, Metrics, Metrics! Important things are worth saying three times. Both Marathon and Mesos are lacking in visual monitoring, so a lot of custom development is needed; this is also something Mesosphere is working on. We will talk about Metrics later.
• Log archiving and debugging. A big difference between distributed applications deployed through Marathon and applications deployed on a single machine is that debugging becomes harder: because distributed applications are asynchronous and have no global clock, application logs are out of order when viewed as a whole. This is a great challenge for log archiving, log-based auditing, and log-based debugging.

I believe many readers already know Marathon, since it has almost become standard equipment for Mesos, so I will only briefly introduce some of its key technical points. As a PaaS tool, Marathon does not have the kind of UI that commercial offerings such as Heroku and Tutum provide. When we use Marathon, we need to build some things ourselves, such as scaling based on Metrics, which we will talk about later.
• Marathon acts as the init.d of the cluster: it is what starts applications.
• Although Mesos has its own containerization technology, I still recommend Docker. Mesos's own containerization does not solve the environment dependency problem, since things still have to be installed on the host, whereas Docker solves it.
• Easy scaling, which also rides on the momentum of containers. Compared with virtual machines, container images are smaller and start faster. Note, however, that a Docker container starting quickly does not mean the service inside it is ready quickly. Take a Java program: although the Docker container is already running, the Java program inside still needs some time to load, and the Docker container cannot perceive the initialization state of the program.
• Marathon can be used for blue-green deployment. Marathon is only responsible for starting the blue and green environments; fault tolerance, traffic switching, and so on have to be supported by your own tooling.
• REST API. Marathon's RESTful interface is quite complete, and I usually perform operations through it.
• Haproxy can be used for service discovery and load balancing.

When using Marathon, the following points are worth noting:
• Health checks: when deploying applications through Marathon, we no longer need Supervisord; it is better to use a Health Check instead. Marathon supports health checks over the Tcp and Http protocols. The general logic is that Marathon periodically probes the application's health check path, and if the response time exceeds the threshold, Marathon restarts the application directly. This is what we call fail-fast: as long as the application responds too slowly, we should kill it, because a slow response drags down the response time of the whole service. We no longer need to restart the failed process inside the container; we should kill the container directly.
• Constraints: with Constraints, we can restrict an application to one or more Slave machines. These are typically stateful or storage applications whose data directory must be mounted, or applications that need specially configured Slave machines, for example an application that needs Gpu resources when only a few machines have Gpus.
• Resources: Resources differ from Constraints in that Resources are quantifiable. Their granularity is therefore finer than that of Constraints, which quantify a resource only as 0 or 1. Resources are thus more flexible, but the catch is that only CPU, memory, ports, bandwidth, and the like can be quantified; for other resources we may need to integrate a module into Mesos that supports quantifying them.
• Force Pull Image: you can force the Docker image to be pulled each time an application is deployed. Given the performance cost, I do not recommend this; a better approach is to make the pull behavior configurable.
• Docker Image pre-warming: to shorten deployment time, it is best to pull the image to the Slave machines in advance. This also avoids Marathon repeatedly redeploying an application because of image download problems. If network access to the registry is not a problem, this can be skipped. In addition, some people have suggested keeping images on network storage to solve the problem.
• Once the number of deployed applications grows, Zookeeper becomes a bottleneck. In practice, tens of thousands of application instances failing and restarting frequently in Marathon will block Zookeeper; the specific cause is related to Zookeeper's performance.
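To make the health check, Constraints, Resources, and Force Pull Image settings above more concrete, here is a minimal sketch that builds a Marathon app definition as a Python dict. The field names follow the Marathon v2 app JSON as I understand it and may differ between Marathon versions. The app id, the image name, the container port, and the `gpu` attribute are hypothetical, and the timing values are illustrative rather than recommendations.

```python
import json


def build_app(app_id, image, instances=2, cpus=0.5, mem=512,
              health_path="/health", gpu_only=False, force_pull=False):
    """Build a Marathon app definition as a plain dict (illustrative only)."""
    app = {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,            # Resources: quantifiable
        "mem": mem,              # in MB
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                "forcePullImage": force_pull,
                "portMappings": [{"containerPort": 8080, "hostPort": 0}],
            },
        },
        "healthChecks": [{
            "protocol": "HTTP",
            "path": health_path,
            "portIndex": 0,
            "gracePeriodSeconds": 60,      # give a slow starter (e.g. a JVM) time
            "intervalSeconds": 10,
            "timeoutSeconds": 5,           # a response slower than this is a failure
            "maxConsecutiveFailures": 3,   # then the task is killed and replaced
        }],
    }
    if gpu_only:
        # Constraints: only slaves started with the (hypothetical) attribute gpu:true
        app["constraints"] = [["gpu", "CLUSTER", "true"]]
    return app


app = build_app("/demo/web", "registry.example.com/demo/web:1.0", gpu_only=True)
print(json.dumps(app, indent=2))
```

The resulting JSON would be sent to Marathon's `/v2/apps` endpoint with a POST request. The grace period addresses the earlier point that a container being up does not mean the program inside it has finished initializing.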

Private Docker Registry
Building a private image registry is necessary. There are many guides online on how to set one up, so I will only discuss the key points to consider:
• The first is the storage strategy. In general, we need to prepare a data disk for the Registry and plan its size in advance. In particular, set the number of inodes generously, because Docker produces many small files. Some open source tools can help here; for example, Devicemapper has a corresponding formatting tool. The other aspect is the storage driver. The common choices today are Devicemapper and Aufs; for the pros and cons of each, see the Shurenyun FAQ or my blog. The general recommendation is Devicemapper on Centos and Aufs on Ubuntu.
• The second is permission management. To keep the Registry size under control, operations can configure the Registry so that Push requires permission while Pull does not. This lets Slaves pull during deployment without letting the Registry size get out of control.
• Finally, it is best to configure a Frontend for the Registry, which makes it easy to maintain, search, and delete useless Docker Images.

On Docker, there are two points to note:
• The first is how Docker images are built. There are broadly two ways: the first is to create a Docker Base Image and then dynamically pull the executable program at startup; the second is to build a Docker Image Tag for each released version of the program. Both work. The second is more intuitive but increases operations costs, since old Tags need to be cleaned up regularly.
• The second is reclaiming Docker containers that exited abnormally. Mesos can now be configured to garbage-collect its own containers on a schedule, but it cannot reclaim Docker containers that exited abnormally. Marathon's repeated retries of a failing application will therefore cause disk usage to balloon.
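One way to deal with the leftover containers described above is a small cleanup job on each Slave. The sketch below shows only the selection logic, under stated assumptions: the dict keys (`id`, `status`, `finished_at`) and the one-hour threshold are hypothetical. In practice the data could come from `docker ps -a` and `docker inspect`, and the selected containers would be removed with `docker rm`.

```python
from datetime import datetime, timedelta


def containers_to_remove(containers, now, max_age=timedelta(hours=1)):
    """Return ids of exited containers that finished more than max_age ago.

    `containers` is a list of dicts with hypothetical keys:
    id, status ("running" or "exited"), finished_at (datetime or None).
    """
    cutoff = now - max_age
    return [
        c["id"]
        for c in containers
        if c["status"] == "exited"
        and c["finished_at"] is not None
        and c["finished_at"] <= cutoff
    ]


now = datetime(2016, 1, 1, 12, 0, 0)
containers = [
    {"id": "a1", "status": "exited", "finished_at": datetime(2016, 1, 1, 9, 0, 0)},
    {"id": "b2", "status": "running", "finished_at": None},
    {"id": "c3", "status": "exited", "finished_at": datetime(2016, 1, 1, 11, 30, 0)},
]
print(containers_to_remove(containers, now))
```

Keeping recently exited containers for a while (here one hour) leaves time to inspect their logs before they disappear.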

Service discovery
The following three service discovery methods are currently common in the Mesos community:
• Mesos-DNS
Mesos-DNS is still under active development. It is stateless and easy to replicate, so we can deploy a Mesos-DNS instance on every Slave to ensure high availability; it also exposes a REST API, which makes it easy to call. As the following figure shows, Mesos-DNS generates DNS records by monitoring the Mesos-Master (and syncs them to an external DNS server if necessary), so internal service discovery can be done directly through internal DNS resolution.


• Bamboo / Haproxy
Service discovery based on Bamboo / Haproxy is widely used. Bamboo listens to Marathon and dynamically updates the service mappings inside Haproxy. Like Mesos-DNS, the Bamboo / Haproxy combination is stateless and supports a REST API. Note that Bamboo can only listen to Marathon for service discovery.


• IP-per-container
One of the design goals of the IP-per-container mechanism in Mesos is a pluggable architecture that lets users pick solutions from existing third-party network providers as the network foundation. I have not studied it in depth, so I quote the official architecture and components to explain it:
1. A framework/scheduler label that specifies the IP request for the container to be launched. This is an opt-in service that brings the IP-per-container capability to existing frameworks without any side effects.
2. A Mesos cluster consisting of a Mesos Master node and Mesos Agent nodes.
3. A set of third-party IP address management (IPAM) servers, responsible for allocating IP addresses and reclaiming them after use.
4. A third-party network isolation provider, responsible for isolating the different container networks and letting operators adjust reachability and routing through configuration.
5. A network isolation module, loaded into the Agent node as a lightweight Mesos module, which inspects the tasks passed down by the scheduler and uses the IP address management and network isolation services to provide IP addresses to the corresponding containers. It then reports the IP addresses back to the Master node and the framework. Although IP allocation and network isolation could be implemented entirely by a single entity, Mesos conceptually treats them as two separate services and envisions them being provided by two separate service providers. For example, one could use Ubuntu Fan for IP address allocation while another uses the Calico project for network isolation.


Inspired by this, I am trying to integrate Docker-Weave into our existing PaaS platform.
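To make the Bamboo / Haproxy approach above more concrete, the following sketch renders an Haproxy `listen` section from a list of task endpoints, which is roughly the kind of mapping that gets rewritten whenever Marathon reports a change. It is not Bamboo's actual template: the function name, the naming scheme, and the example addresses are hypothetical.

```python
def render_listen_section(app_id, service_port, tasks):
    """Render one Haproxy listen section for a Marathon app.

    `tasks` is a list of (host, port) tuples for the running instances.
    """
    name = app_id.strip("/").replace("/", "_")
    lines = [
        f"listen {name}",
        f"    bind *:{service_port}",
        "    mode tcp",
        "    balance roundrobin",
    ]
    for index, (host, port) in enumerate(tasks):
        lines.append(f"    server {name}-{index} {host}:{port} check")
    return "\n".join(lines)


config = render_listen_section(
    "/demo/web", 10000, [("10.0.0.1", 31001), ("10.0.0.2", 31005)]
)
print(config)
```

After writing the new configuration, Haproxy has to be reloaded so that traffic follows the tasks to whichever Slaves they currently run on.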

Load balancing
Here I mainly discuss load balancing with Mesos-DNS and Haproxy. Strictly speaking, Mesos-DNS probably should not be listed as a load balancer, since DNS is somewhat different from load balancing. We can think of Mesos-DNS as having only a round-robin (or random) strategy, whereas Haproxy offers more: besides round robin, it also supports least connections, IP hashing, and other strategies. It is also worth mentioning that both Mesos-DNS and Haproxy support dynamic configuration reloading.
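The three strategies named above can be illustrated with a short sketch. These are textbook versions of the ideas, not Haproxy's implementation; the function names and the use of CRC32 for hashing are my own assumptions.

```python
import itertools
import zlib


def round_robin(backends):
    """Yield backends in order, wrapping around forever."""
    return itertools.cycle(backends)


def least_connections(active):
    """Pick the backend with the fewest open connections.

    `active` maps backend name -> current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, backends):
    """Map a client IP to a fixed backend so repeat visits land together."""
    return backends[zlib.crc32(client_ip.encode("ascii")) % len(backends)]


backends = ["web-1", "web-2", "web-3"]
rr = round_robin(backends)
print([next(rr) for _ in range(4)])
print(least_connections({"web-1": 12, "web-2": 3, "web-3": 7}))
print(ip_hash("192.168.1.20", backends) == ip_hash("192.168.1.20", backends))
```

Round robin ignores backend state, least connections reacts to load, and IP hashing trades even distribution for session stickiness.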

Scheduling
By scheduling I mean the strategy the PaaS platform uses to decide which Slave an application is deployed to. I especially want to compare Docker-Swarm's scheduling strategy with Mesos's, because the two start from very different angles, or dimensions. The former works from the Docker point of view, while the latter, being more generic, works from the resource point of view. If you use Docker exclusively, Docker-Swarm's strategies are more powerful.
• Docker-Swarm's container scheduling is already quite rich:
◦ With the Constraint parameter, publish a container to machines with a specified label. For example, publish Mysql to machines with Storage == SSD to guarantee database IO performance;
◦ With the Affinity parameter, publish a container to a machine on which a given container is already running, or on which a given image has already been pulled;
◦ Automatically publish a container to the machine hosting the containers it depends on, based on parameters such as Volumes-From, Link, and Net;
◦ With the Strategy parameter, choose one of three node ranking policies: Spread, Binpack, and Random. Spread dispatches containers across as many machines as possible to reduce the loss when a machine goes down; Binpack, by contrast, packs containers onto as few machines as possible to avoid resource fragmentation; Random deploys containers randomly.
• Because Mesos is more generic, it is somewhat lacking in container scheduling. At present we can schedule a container to specific machines by setting host Attributes or Resources limits. The following are examples of Resources and Attributes.
◦ Resources: 'cpus:24;mem:24576;disk:409600;ports:[21000-24000,30000-34000];bugs(debug_role):{a,b,c}'
◦ Attributes: 'rack:abc;zone:west;os:centos5;level:10;keys:[1000-1500]'
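The difference between the Spread and Binpack policies described above is easy to see in a toy model. The sketch below counts container slots per node, which is a simplification I am assuming for illustration (Swarm itself ranks nodes by resources such as CPU and memory); the function name and node names are hypothetical.

```python
import random


def pick_node(nodes, strategy):
    """Choose a node for the next container.

    `nodes` maps node name -> (used_slots, total_slots).
    """
    free = {name: used for name, (used, total) in nodes.items() if used < total}
    if not free:
        raise RuntimeError("no node has free capacity")
    names = sorted(free)  # sorted so that ties resolve deterministically
    if strategy == "spread":
        return min(names, key=free.get)   # emptiest node first
    if strategy == "binpack":
        return max(names, key=free.get)   # fullest node that still has room
    if strategy == "random":
        return random.choice(names)
    raise ValueError(f"unknown strategy: {strategy}")


nodes = {"node-a": (3, 4), "node-b": (1, 4), "node-c": (4, 4)}
print(pick_node(nodes, "spread"))
print(pick_node(nodes, "binpack"))
```

With Spread the next container lands on the least loaded node, limiting the damage of a single machine failure; with Binpack it lands on the fullest node that still has room, leaving other machines free for large workloads.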

Metrics & Auto-Scaling
For a PaaS platform, monitoring is essential, and this is exactly what Marathon / Mesos lack, so a lot of custom development is required. In general, we need to monitor three levels: the physical host, Docker, and the application's business logic. Monitoring solutions for physical hosts are already very mature; Docker-level monitoring generally uses tools such as Cadvisor or Sysdig; monitoring of application-level business logic has to be tied to the business itself.
Another problem is auto-scaling. First, let us define it: auto-scaling is the dynamic adjustment of resources according to business load. It mainly involves the following three questions:
• The trigger for auto-scaling: since auto-scaling depends on the business, the trigger also depends on the specific scenario. For example, it can be based on CPU or memory Metrics, or on the App's request failure rate; or, if we have estimated the daily traffic peak, we can use a scheduled trigger.
• How to scale: in general, auto-scaling happens at two layers, the Iaas layer and the application layer. At the Iaas layer, we generally call the Iaas Api to add or remove machine resources; at the application layer, we call Marathon's Api to scale the number of application instances.
• Elasticity / scaling strategy: how many resources to add or remove is a problem that needs to be optimized. The simple approach is to set a fixed scaling step according to the actual business and adjust it gradually; a more advanced approach is to introduce machine learning and determine the optimal size dynamically from historical load.
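The trigger, the scaling call, and the sizing question above can be combined into a small sketch. It uses a simple proportional rule as an assumption (neither the fixed step nor the machine learning approach from the text), with hypothetical function names, thresholds, and app id. The only Marathon detail relied on is that an app's instance count can be changed with a PUT to `/v2/apps/<app id>`.

```python
def desired_instances(current, cpu_pct, target_pct=60, min_n=1, max_n=20):
    """Proportional rule: size the app so average CPU lands near target_pct.

    Integer percentages are used to avoid floating point rounding surprises.
    """
    wanted = -(-current * cpu_pct // target_pct)  # ceiling division
    return max(min_n, min(max_n, wanted))


def scale_request(app_id, instances):
    """Describe the Marathon call as (method, path, body) without sending it."""
    return "PUT", "/v2/apps/" + app_id.strip("/"), {"instances": instances}


current = 4
target = desired_instances(current, cpu_pct=90)
print(target)
if target != current:
    print(scale_request("/demo/web", target))
```

Clamping to `min_n` and `max_n` keeps a noisy metric from scaling the app to zero or exhausting the cluster; the same function can be called from a metrics-based trigger or from a scheduled one.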

Log & Debug
As mentioned, distributed applications, unlike ordinary applications, are spread across multiple machines that are not known in advance, so we need to collect logs in one place for centralized management. The most common logging solution today is Elk; here I mainly discuss logging for Docker. There are two main ways to collect logs from Docker:
• Write the application log to Stdout / Stderr. We can then collect logs with Logspout / Heka. With this approach, some application logs are lost when the Docker container exits abnormally.
• Write the application log to a fixed directory and mount it onto the host with the -v option. We can then use Logstash to collect logs from that fixed directory on the host.
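Once logs from several hosts are collected in one place, they still have to be put into a usable order. The sketch below merges per-container streams by timestamp; the record layout `(timestamp, host, message)` and the sample data are hypothetical. Because there is no global clock, the result is only as accurate as the clock synchronization between the hosts.

```python
import heapq


def merge_logs(*streams):
    """Merge log streams that are each already sorted by timestamp.

    Each record is a tuple (iso_timestamp, host, message). ISO 8601 strings
    in the same time zone and format compare correctly as plain strings.
    """
    return list(heapq.merge(*streams, key=lambda record: record[0]))


slave_1 = [
    ("2016-01-01T12:00:01Z", "slave-1", "request received"),
    ("2016-01-01T12:00:04Z", "slave-1", "response sent"),
]
slave_2 = [
    ("2016-01-01T12:00:02Z", "slave-2", "cache miss"),
    ("2016-01-01T12:00:03Z", "slave-2", "db query done"),
]
for record in merge_logs(slave_1, slave_2):
    print(*record)
```

For debugging across services, a request id carried in every log line is usually more reliable than timestamps alone, since it does not depend on clocks at all.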

Author: Zhou Weitao, Digital Technology Director / R & D Director. About the author: head of the Digital Technology Cloud Platform; formerly worked at Red Hat; Red Hat Certified Engineer, Mesos contributor, senior Python development engineer, international open source solution provider. One of the first developers in China to work with Docker, Mesos, and related technologies.

This article is from: CSDN
