Docker & MongoDB in Practice (2): Performance and Fault Tolerance

[Editor's Note] Many people have considered running databases in Docker, but will I/O behave differently than it does on a physical machine? How should the data be stored: mapped to the host, or kept in a separate data container? Which approach is better, and how does it perform? This article answers these questions. The first article is here.

In the previous part, we learned how to create and run a simple CentOS-based MongoDB instance. That is fine for development or testing, but it does not address performance and fault tolerance. In this article, we look at Docker's disk storage options and the considerations for running a database such as MongoDB on them.

Layered file system

[Figure: Docker's multilayer file system]
Docker's most important feature (and my favorite) is its layered file system. Each base layer is read-only; the layers are stacked on top of one another to form the actual file system, and the topmost layer is read-write. Layers are easy to version, and they can be cached, so we do not need to start from scratch every time.

Compared with the traditional way of building images, this is a huge change. Previously, entire file system images or virtual machine templates were built by hand, and we did not know what they contained or why. More recently, configuration management tools such as Puppet, Chef and Ansible have appeared, but building a complex and flexible image from scratch with them is still time-consuming. Docker's layered approach speeds up builds because only the layers affected by a change need to be rebuilt.

However, this approach is not without drawbacks: the runtime performance of such a layered file system is quite poor. It also depends on the storage driver Docker uses: AUFS, which was used originally, and later OverlayFS, Btrfs and device mapper. For the best I/O performance, we need to use Docker's data volumes. A data volume lives outside the container's file system and therefore bypasses the layered file system. There are two main kinds of data volume: host directories and data-only containers.
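
Because performance depends on the storage driver, it can help to check which one a given host uses. The sketch below assumes that docker info prints a line of the form "Storage Driver: <name>". The helper name storage_driver is hypothetical; it only filters text from standard input, so it can be tried without a Docker daemon.

```shell
# Hypothetical helper: print the storage driver name found in
# `docker info` output read from standard input.
storage_driver() {
  # Accept optional leading whitespace before the "Storage Driver:" label
  # and keep only the first match.
  sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

# Usage (requires a running Docker daemon):
#   docker info 2>/dev/null | storage_driver
```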

Data volume: Host directory

A host directory data volume is simply a directory on the Docker host that is mounted into the container. As mentioned in the first part, create a directory on the Docker host and use it as the dbpath for MongoDB (to hold the data and journal files). For example:

  $ docker run -d -P -v ~/db:/data/db mongod --smallfiles

Verify that the MongoDB container was started by checking the log file:

  $ docker ps
  CONTAINER ID  IMAGE          COMMAND               CREATED        STATUS        PORTS                     NAMES
  efca3b637a75  mongod:latest  "mongod --smallfiles  9 minutes ago  Up 9 minutes  0.0.0.0:49160->27017/tcp  prickly_sammet
  $ docker logs efca3b637a75
  2015-02-01T18:35:02.279+0000 [initandlisten] MongoDB starting: pid=1 port=27017 dbpath=/data/db 64-bit host=efca3b637a75
  2015-02-01T18:35:02.279+0000 [initandlisten] db version v2.6.7
  2015-02-01T18:35:02.279+0000 [initandlisten] git version: a7d57ad27c382de82e9cb93bf983a80fd9ac9899
  2015-02-01T18:35:02.279+0000 [initandlisten] build info: Linux build7.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
  2015-02-01T18:35:02.279+0000 [initandlisten] allocator: tcmalloc
  2015-02-01T18:35:02.279+0000 [initandlisten] options: {storage: {smallFiles: true}}
  2015-02-01T18:35:02.282+0000 [initandlisten] journal dir=/data/db/journal
  2015-02-01T18:35:02.283+0000 [initandlisten] recover: no journal files present, no recovery needed
  2015-02-01T18:35:02.454+0000 [initandlisten] allocating new ns file /data/db/local.ns, filling with zeroes ...
  2015-02-01T18:35:02.510+0000 [FileAllocator] allocating new datafile /data/db/local.0, filling with zeroes ...
  2015-02-01T18:35:02.510+0000 [FileAllocator] creating directory /data/db/_tmp
  2015-02-01T18:35:02.513+0000 [FileAllocator] done allocating datafile /data/db/local.0, size: 16MB, took 0.001 secs
  2015-02-01T18:35:02.514+0000 [initandlisten] build index on: local.startup_log properties: {v: 1, key: {_id: 1}, name: "_id_", ns: "local.startup_log"}
  2015-02-01T18:35:02.514+0000 [initandlisten] added index to empty collection
  2015-02-01T18:35:02.514+0000 [initandlisten] waiting for connections on port 27017
  2015-02-01T18:36:02.481+0000 [clientcursormon] mem (MB) res: 36 virt: 246
  2015-02-01T18:36:02.481+0000 [clientcursormon] mapped (incl journal view): 64
  2015-02-01T18:36:02.481+0000 [clientcursormon] connections: 0
  2015-02-01T18:41:02.571+0000 [clientcursormon] mem (MB) res: 36 virt: 246
  2015-02-01T18:41:02.571+0000 [clientcursormon] mapped (incl journal view): 64
  2015-02-01T18:41:02.571+0000 [clientcursormon] connections: 0

Make sure that the data files have been created in the specified host directory ~/db:

  $ ls -l ~/db
  total 32776
  drwxr-xr-x. 2 root root       17 Feb  1 18:35 journal
  -rw-------. 1 root root 16777216 Feb  1 18:35 local.0
  -rw-------. 1 root root 16777216 Feb  1 18:35 local.ns
  -rwxr-xr-x. 1 root root        2 Feb  1 18:35 mongod.lock
  drwxr-xr-x. 2 root root        6 Feb  1 18:35 _tmp

Quick benchmarking

Is a host directory data volume faster than the default layered file system? That depends on your environment, and a thorough performance study is beyond the scope of this article. However, mongoperf provides a quick way to test it, using the following Dockerfile:

  # mongoperf process on latest CentOS
  # See https://docs.docker.com/articles/dockerfile_best-practices/
  FROM centos
  MAINTAINER James Tan <james.tan@mongodb.com>
  COPY mongodb.repo /etc/yum.repos.d/
  RUN yum install -y mongodb-org-tools
  WORKDIR /tmp
  ENTRYPOINT ["mongoperf"]

We use the same mongodb.repo as in the first part; it is reproduced here for convenience:

  [mongodb]
  name=MongoDB Repository
  baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
  gpgcheck=0
  enabled=1

Place the two files above in your current directory and build the image:

  $ docker build -t mongoperf .

Now measure the layered root file system by running:

  $ echo "{nThreads: 32, fileSizeMB: 1000, r: true, w: true}" | docker run -i --sig-proxy=false mongoperf

You should see output similar to the following:

  mongoperf
  use -h for help
  parsed options:
  {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  creating test file size: 1000MB ...
  testing
  options: {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  wthr 32
  new thread, total running: 1
  read: 1 write: 1
  877 ops/sec 3 MB/sec
  928 ops/sec 3 MB/sec
  920 ops/sec 3 MB/sec
  ...
  new thread, total running: 2
  read: 1 write: 1
  1211 ops/sec 4 MB/sec
  1158 ops/sec 4 MB/sec
  1172 ops/sec 4 MB/sec
  ...
  new thread, total running: 4
  read: 1 write: 1
  read: 1 write: 1
  1194 ops/sec 4 MB/sec
  1163 ops/sec 4 MB/sec
  1162 ops/sec 4 MB/sec
  ...
  new thread, total running: 8
  read: 1 write: 1
  ...
  1112 ops/sec 4 MB/sec
  1161 ops/sec 4 MB/sec
  1174 ops/sec 4 MB/sec
  ...
  new thread, total running: 16
  read: 1 write: 1
  ...
  1156 ops/sec 4 MB/sec
  1178 ops/sec 4 MB/sec
  1160 ops/sec 4 MB/sec
  ...
  new thread, total running: 32
  read: 1 write: 1
  ...
  1244 ops/sec 4 MB/sec
  1205 ops/sec 4 MB/sec
  1211 ops/sec 4 MB/sec
  ...

mongoperf runs indefinitely; you can return to the terminal with Ctrl+C. The container keeps running in the background, however, so let's remove it:

  $ docker ps
  CONTAINER ID  IMAGE             COMMAND      CREATED        STATUS        PORTS  NAMES
  c1366d08b543  mongoperf:latest  "mongoperf"  4 minutes ago  Up 3 minutes         boring_kirch
  $ docker rm -f c1366d08b543
  c1366d08b543

Now run the same test against a host directory data volume:

  $ mkdir ~/tmp
  $ echo "{nThreads: 32, fileSizeMB: 1000, r: true, w: true}" | docker run -i --sig-proxy=false -v ~/tmp:/tmp mongoperf

On our setup, this produced the following output:

  mongoperf
  use -h for help
  parsed options:
  {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  creating test file size: 1000MB ...
  testing
  options: {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  wthr 32
  new thread, total running: 1
  read: 1 write: 1
  1273 ops/sec 4 MB/sec
  1242 ops/sec 4 MB/sec
  1178 ops/sec 4 MB/sec
  ...
  new thread, total running: 2
  read: 1 write: 1
  2437 ops/sec 9 MB/sec
  2702 ops/sec 10 MB/sec
  2546 ops/sec 9 MB/sec
  ...
  new thread, total running: 4
  read: 1 write: 1
  read: 1 write: 1
  2575 ops/sec 10 MB/sec
  2465 ops/sec 9 MB/sec
  2558 ops/sec 9 MB/sec
  ...
  new thread, total running: 8
  read: 1 write: 1
  ...
  2471 ops/sec 9 MB/sec
  3081 ops/sec 12 MB/sec
  3027 ops/sec 11 MB/sec
  ...
  new thread, total running: 16
  read: 1 write: 1
  ...
  3031 ops/sec 11 MB/sec
  3376 ops/sec 13 MB/sec
  3384 ops/sec 13 MB/sec
  ...
  new thread, total running: 32
  read: 1 write: 1
  ...
  3272 ops/sec 12 MB/sec
  3196 ops/sec 12 MB/sec
  3385 ops/sec 13 MB/sec
  ...

Stop and remove the previous container.
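
The mongoperf output is long, so it can be easier to compare runs after condensing each one to an average per thread count. The following is a minimal sketch, assuming the output has been saved to a file and that each sample line starts with the ops/sec figure, as in the listings in this article. The function name summarize_mongoperf and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: read mongoperf output on standard input and print
# the average ops/sec observed for each thread count.
summarize_mongoperf() {
  awk '
    # A header such as "new thread, total running: 8" starts a new group.
    /total running/ {
      threads = $NF
      if (!(threads in seen)) { seen[threads] = 1; order[++k] = threads }
      next
    }
    # Sample lines look like "1211 ops/sec 4 MB/sec"; the first field is ops/sec.
    /ops *\/ *sec/ && threads != "" { sum[threads] += $1; n[threads]++ }
    END {
      for (i = 1; i <= k; i++) {
        t = order[i]
        if (n[t] > 0) printf "threads=%s avg_ops_per_sec=%.0f\n", t, sum[t] / n[t]
      }
    }
  '
}

# Usage, assuming the benchmark output was captured in a file:
#   summarize_mongoperf < layered-fs.log
```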

Comparing the final samples with 32 concurrent read/write threads, operations per second increased by 180%, from 1211 to 3385 ops/s. Throughput increased by 225%, from 4 to 13 MB/s.
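
The percentages follow from (after - before) / before * 100. As a quick check, the sketch below recomputes them from the figures quoted in this article; the function name pct_increase is hypothetical.

```shell
# Hypothetical helper: percentage increase from $1 to $2, rounded to a whole number.
pct_increase() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (after - before) / before * 100 }'
}

pct_increase 1211 3385   # ops/sec: prints 180
pct_increase 4 13        # MB/sec:  prints 225
```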

Container portability

Although the approach above improves performance, it hurts container portability: the container now depends on a directory on the Docker host that Docker does not manage, so we cannot simply run the container elsewhere or migrate it. A better solution is a data-only container, described next.

Data volume: Data-only container

A data-only container is the recommended pattern for storing data with Docker, because it removes the dependency on the host.

To create a data-only container for the benchmark, we reuse the existing mongoperf image:

  $ docker create -v /tmp --name mongoperf-data mongoperf
  7d476bb9d3ca0cf282e2d3b9cf54e18d7bbe9b561be5d34646947032b64b4b9c

Rerun the benchmark with the data container by passing the --volumes-from mongoperf-data parameter:

  $ echo "{nThreads: 32, fileSizeMB: 1000, r: true, w: true}" | docker run -i --sig-proxy=false --volumes-from mongoperf-data mongoperf

This produces the following output:

  mongoperf
  use -h for help
  parsed options:
  {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  creating test file size: 1000MB ...
  testing
  options: {nThreads: 32, fileSizeMB: 1000, r: true, w: true}
  wthr 32
  new thread, total running: 1
  read: 1 write: 1
  1153 ops/sec 4 MB/sec
  1146 ops/sec 4 MB/sec
  1151 ops/sec 4 MB/sec
  ...
  new thread, total running: 2
  read: 1 write: 1
  1857 ops/sec 7 MB/sec
  2489 ops/sec 9 MB/sec
  2459 ops/sec 9 MB/sec
  ...
  new thread, total running: 4
  read: 1 write: 1
  read: 1 write: 1
  2518 ops/sec 9 MB/sec
  2477 ops/sec 9 MB/sec
  2451 ops/sec 9 MB/sec
  ...
  new thread, total running: 8
  read: 1 write: 1
  ...
  2812 ops/sec 10 MB/sec
  2837 ops/sec 11 MB/sec
  2793 ops/sec 10 MB/sec
  ...
  new thread, total running: 16
  read: 1 write: 1
  ...
  3111 ops/sec 12 MB/sec
  3319 ops/sec 12 MB/sec
  3263 ops/sec 12 MB/sec
  ...
  new thread, total running: 32
  read: 1 write: 1
  ...
  2919 ops/sec 11 MB/sec
  3274 ops/sec 12 MB/sec
  3306 ops/sec 12 MB/sec
  ...

Performance is on par with the host directory data volume. The data-only container also persists after the container that referenced it is removed. We can confirm this by running:

  $ docker ps -a
  CONTAINER ID  IMAGE             COMMAND      CREATED        STATUS  PORTS  NAMES
  7d476bb9d3ca  mongoperf:latest  "mongoperf"  9 minutes ago                 mongoperf-data

Conclusion

Going back to our mongod container, we can now use a data-only container to store its data for better performance:

  $ docker create -v /data/db --name mongod-data mongod
  $ docker run -d -P --volumes-from mongod-data mongod --smallfiles

Remember that you can run docker ps to see the mapped local port. For example:

  $ docker ps
  CONTAINER ID  IMAGE          COMMAND               CREATED         STATUS         PORTS                     NAMES
  08245e631171  mongod:latest  "mongod --smallfiles  40 seconds ago  Up 39 seconds  0.0.0.0:49165->27017/tcp  gloomy_meitner
  $ mongo --port 49165
  MongoDB shell version: 2.6.7
  connecting to: 127.0.0.1:49165/test
  >
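
Reading the port from the docker ps table by eye works, but it can also be scripted. The sketch below assumes the mapping is available as an address of the form 0.0.0.0:49165, which is what docker port prints for a published container port. The function name host_port is hypothetical, and the container name in the usage comment is the one from the listing above.

```shell
# Hypothetical helper: extract the host port from a mapping such as "0.0.0.0:49165".
host_port() {
  # Strip everything up to and including the last colon.
  printf '%s\n' "${1##*:}"
}

# Usage (requires Docker and a running container):
#   mongo --port "$(host_port "$(docker port gloomy_meitner 27017/tcp)")"
```

Taking the text after the last colon keeps the helper independent of the bind address that appears before the port.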

Data volumes will eventually become first-class citizens in Docker. In the meantime, consider using community tools such as docker-volume to manage them more easily.

Next

In the next part, we will examine Docker's various networking options and discuss which is best suited to a multi-host MongoDB replica set. Stay tuned!

Original link: MONGODB & DOCKER – PART 2 (translated by Liu Hong; proofread by Li Yingjie)

===========================
About the translator
Liu Hong is a recent undergraduate who enjoys studying technologies and frameworks of personal interest in spare time. Liu Hong is currently learning Docker and has begun translating the official Docker documentation; anyone interested in translating the official Docker documentation is welcome to get in touch.
