Is the database really suitable for containerization? Maybe not

[Editor's Note] This article evaluates the feasibility and necessity of containerizing databases and concludes with recommendations and alternatives.

Containers (Docker in particular) are very popular right now. However, before you package your database into a brand new container, there are some things you need to think through.

This article evaluates the feasibility of Docker and other container solutions in a database environment.

A few weeks ago, I wrote a fairly general article about containers. It describes when you should consider using Docker, rkt, LXC and other container technologies. If you have time, it is worth reading first: it is a good way to understand some of the things to consider before migrating to a new technology architecture. It also sparked an internal discussion within our Solutions Engineering team. Your team may well be asking the same question: should customers run their databases in containers?

Before we start, we should acknowledge that Percona itself uses containers. Percona Monitoring and Management (PMM), with all of its charts and query analytics, is delivered as a Docker container. We made this choice because the integration between the components is where we can provide the most value to users, and Docker lets us distribute a ready-to-use unit easily. In short, containers have great potential in enterprise application environments.

However, for databases, we have a few recommendations.

Interim recommendation

Decision = do not containerize the database (keep the status quo)

This does not apply to every environment, but it is the default approach we recommend to most customers. Remember, this recommendation applies only to the database. If your application is already built on microservices today, containerizing the database may make more sense, depending on the database's load characteristics, your scaling requirements, and the existing skill set of your engineers.


Lack of synergy

Before you get angry, let's take a moment to recall what containers were originally designed for. Container solutions were designed to handle stateless applications with ephemeral data. A container quickly spins up a microservice and then destroys it, along with everything inside the container (including its cache and data). The ephemeral nature of containers means that all components and services of a container are treated as part of the container (essentially all or nothing). Giving a container a data volume that belongs to the underlying operating system means punching a hole through the container, which can be challenging. The existing solutions are not reliable enough for most database systems.

The vast majority of the development effort behind these solutions has one goal in mind: statelessness. There are many options that can help your data persist, but they are still iterating rapidly. Using them introduces a high degree of complexity, and the increased operational complexity (and risk) negates the efficiency gains that containers bring. Some "real world" feedback about the use of containers (especially Docker) supports this conclusion.

They are not stable enough

These container solutions are intended for the rapid development and deployment of applications that are broken down into many tiny components: microservices. Often, such applications iterate very quickly in software- and engineering-driven organizations. The same seems to be true of the container solutions themselves (again, especially Docker). New features are pushed out after only a small amount of testing and design. The main focus seems to be on having the latest feature set and being first to market. They no longer "ask for permission" from users; instead they "beg for forgiveness". In addition, backward compatibility ranks far down their list of priorities (as the feedback mentioned earlier shows). This means you will have to plan for a mature continuous delivery and testing environment, as well as a known, tested image repository for your containers.

There are some cool tools on the market that work well in the right use cases, but they require time, money, resources and experience. For most of our customers, that is not where their business should be focused. Their business is not built around software development, and they do not have the money to support the resources needed to keep these systems running. Instead, they just want a stable, high-performance service that keeps their users happy 24x7.

I know that if we take the database out of the container, we can give them a high-performance, highly available environment without having to worry about any of this.

Is there hope?

Of course. In fact, there is more than just hope. Many companies already run containers at large scale today (including databases)! These companies typically have very mature processes. Software development is at the core of their business plan and drives their values. You have probably heard of them: Uber, Google, Facebook (there are many more; these are only a few). There is also another good option: you can use Joyent to get persistent data in containers. But as I said before, the complexity required to guarantee data persistence and availability (the most basic purpose of the vast majority of databases) is too high. My personal view is that containers are only one step away from success here: a better and more stable persistent storage volume solution. Even then, most organizations probably do not need a containerized database unless they support large-scale deployments (more than 50 nodes) with widely varying workloads.

Don't keep us in suspense …

I know that "you may not be ready for a containerized database" is not a solution in itself. So here is what the Solutions Engineering team (SolEng for short) proposes. Dimitri Vanoverbeke is writing a series of blog posts about configuration management. Configuration management solutions can greatly improve the repeatability of your infrastructure and ensure that your IT/app development processes carry through to the physical configuration of your environment. Automating this process brings huge benefits. A mature development/testing process should be part of the entire application development lifecycle. The combination of process and technology produces stable applications that keep customers happy.
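To make "repeatability" concrete, here is a minimal sketch of the idempotent, desired-state approach that configuration management tools generally follow. It is not taken from any particular tool: the helper names `ensure_file` and `render_mysql_config` are hypothetical, and the settings shown are illustrative values only.

```python
from pathlib import Path


def render_mysql_config(settings):
    """Render a [mysqld] section from a dict of settings (hypothetical helper)."""
    lines = ["[mysqld]"]
    for key in sorted(settings):  # sorted so the output is the same on every run
        lines.append(f"{key} = {settings[key]}")
    return "\n".join(lines) + "\n"


def ensure_file(path, desired_content):
    """Bring the file at `path` to the desired state; return True if it changed."""
    target = Path(path)
    if target.exists() and target.read_text() == desired_content:
        return False  # already in the desired state, nothing to do
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(desired_content)
    return True


if __name__ == "__main__":
    # Illustrative values only, not tuning advice.
    config = render_mysql_config(
        {"max_connections": 500, "innodb_buffer_pool_size": "4G"}
    )
    # Hypothetical path; the first run writes the file, later runs change nothing.
    print("changed:", ensure_file("./demo/my.cnf", config))
    print("changed:", ensure_file("./demo/my.cnf", config))
```

Running the same step twice leaves the second run with nothing to do, which is the property that makes an environment reproducible from its description.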

Besides configuration management, there are other services that can make the operations team's day better. The first that come to mind are service discovery and health checking. My favorite is Consul, which we use extensively in PMM for configuration and service metadata management. Consul ensures that the front-end and back-end infrastructure always have a current snapshot of service state.
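As a rough illustration of service discovery with health checking, the sketch below builds a Consul-style service definition with a TCP check and shows what such a check amounts to. The service name, port and check intervals are assumptions chosen for the example, not PMM's actual configuration, and the functions `build_service_definition` and `tcp_health_check` are hypothetical helpers.

```python
import json
import socket


def build_service_definition(name, port, interval="10s", timeout="1s"):
    """Return a Consul-style service definition with a TCP health check."""
    return {
        "service": {
            "name": name,
            "port": port,
            "check": {
                "tcp": f"localhost:{port}",
                "interval": interval,
                "timeout": timeout,
            },
        }
    }


def tcp_health_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed example: a MySQL instance listening on its default port.
    definition = build_service_definition("mysql", 3306)
    print(json.dumps(definition, indent=2))
    print("mysql reachable:", tcp_health_check("127.0.0.1", 3306))
```

The JSON printed here could be saved as a file in the Consul agent's configuration directory; the agent would then run the TCP check on the given interval and publish the result alongside the service entry.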

In conclusion

There are a lot of things to consider when managing an environment, especially when applications iterate quickly. By using the alternative solutions described above, you can reduce the overhead of each release. In addition, you can improve the flexibility and availability of the application.

Original link: Is Docker Good for Your Database? (Probably Not) (translation: Wu Jiaxing)
