How to create a Mongo Docker Image with default collections and data?


Solution 1

The problem was that data written to /data/db during the build was not saved in the image, so I worked around it by creating my own data directory.

# Parent Dockerfile https://github.com/docker-library/mongo/blob/982328582c74dd2f0a9c8c77b84006f291f974c3/3.0/Dockerfile
FROM mongo:latest

# Use /data/db2 as dbpath: /data/db is declared a VOLUME in the parent image, so data written there during the build won't persist
RUN mkdir -p /data/db2 \
    && echo "dbpath = /data/db2" > /etc/mongodb.conf \
    && chown -R mongodb:mongodb /data/db2

COPY . /data/db2

RUN mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db2 --smallfiles \
    && CREATE_FILES=/data/db2/scripts/*-create.js \
    && for f in $CREATE_FILES; do mongo 127.0.0.1:27017 $f; done \
    && INSERT_FILES=/data/db2/scripts/*-insert.js \
    && for f in $INSERT_FILES; do mongo 127.0.0.1:27017 $f; done \
    && mongod --dbpath /data/db2 --shutdown \
    && chown -R mongodb /data/db2

# Make the new dir a VOLUME to persist it
VOLUME /data/db2

CMD ["mongod", "--config", "/etc/mongodb.conf", "--smallfiles"]

Thanks to @yosifkit from the docker-library/mongo GitHub project for pointing out that the volume would store the data in the resulting image. I missed that in the documentation.

Solution 2

During a Docker image build, each build command such as RUN is launched in its own container, and when the command completes, the resulting filesystem is committed as an image. Processes started by one command do not carry over to the next. If you run dockviz images --tree while doing a build you will get the idea.

In your case mongod has started and stopped long before you need it. You need to start mongod and run your scripts all in a single RUN step. You can achieve that with a shell script that launches mongod and inserts your data.
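To make the isolation between RUN steps concrete, here is a minimal sketch that imitates two consecutive steps with two separate `sh -c` invocations. It is an analogy rather than what Docker literally does, and the `step1`/`step2` names are made up for the illustration. It shows why `RUN FILES=...` followed by `RUN for f in $FILES ...` cannot work: the second shell never sees the variable, just as it never sees a mongod forked in an earlier step.

```shell
# Each RUN instruction is executed roughly like a separate `sh -c "<command>"`.
# Make sure FILES is not inherited from the calling environment.
unset FILES

# Step 1: the variable exists only inside this shell.
step1=$(sh -c 'FILES="scripts/*-create.js"; echo "step 1 sees: [$FILES]"')

# Step 2: a new shell starts with no memory of step 1.
step2=$(sh -c 'echo "step 2 sees: [$FILES]"')

echo "$step1"
echo "$step2"
```

The second line printed has empty brackets, which is why everything that depends on shared state has to live in the same RUN step.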

Your Dockerfile will run:

RUN mongo_create_insert.sh

Then mongo_create_insert.sh contains all your mongo dependent steps:

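#!/usr/bin/env bash
# Abort the build step if any command fails.
set -e

# Start mongod in the background for the duration of this RUN step.
mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db/

# Run the create scripts first, then the insert scripts, against mydb.
for f in scripts/*-create.js; do mongo mydb "$f"; done
for f in scripts/*-insert.js; do mongo mydb "$f"; done

# Stop mongod cleanly before the step ends and the image is committed.
mongod --shutdown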

As a side note, I tend to install Ansible in my base image and use that to provision Docker images in a single RUN command rather than doing lots of shell RUN steps in a Dockerfile (which is just a glorified shell script in the end). You lose some of the build caching niceness, but we've moved on from provisioning with shell scripts for a reason.

Solution 3

According to the description of the image on Docker Hub, there is a much cleaner and simpler solution for this.

When a container is started for the first time it will execute files with extensions .sh and .js that are found in /docker-entrypoint-initdb.d. Files will be executed in alphabetical order. .js files will be executed by mongo using the database specified by the MONGO_INITDB_DATABASE variable, if it is present, or test otherwise. You may also switch databases within the .js script.
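Because the files run in alphabetical order, a numeric prefix is an easy way to make sure create scripts run before insert scripts. The sketch below only illustrates the ordering, assuming the files are picked up by a sorted shell glob. The file names and the scratch directory are hypothetical, and nothing here touches a real /docker-entrypoint-initdb.d.

```shell
# Hypothetical init files, created in a scratch directory for illustration.
initdir=$(mktemp -d)
touch "$initdir/02-insert.js" "$initdir/10-users.sh" "$initdir/01-create.js"

# A shell glob expands in sorted order, so zero-padded prefixes control the sequence.
order=""
for f in "$initdir"/*; do
    order="$order$(basename "$f") "
done

echo "$order"
rm -rf "$initdir"
```

This prints the names as 01-create.js, 02-insert.js, 10-users.sh regardless of the order in which they were created. Zero-padding matters: a file named 2-insert.js would sort after 10-users.sh.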

First, the Dockerfile is as simple as

FROM mongo:4
COPY setup.sh /docker-entrypoint-initdb.d/
COPY scripts /scripts/

Then, in the setup.sh, add your user/collection creation script, for example

mongo=( mongo --host 127.0.0.1 --port 27017 --quiet )
mongo+=(
    --username="$MONGO_INITDB_ROOT_USERNAME"
    --password="$MONGO_INITDB_ROOT_PASSWORD"
    --authenticationDatabase="$rootAuthDatabase"
)

# Run the create scripts before the insert scripts.
for f in /scripts/*-create.js; do "${mongo[@]}" "$MONGO_INITDB_DATABASE" "$f"; done
for f in /scripts/*-insert.js; do "${mongo[@]}" "$MONGO_INITDB_DATABASE" "$f"; done
Author by

Felipe Plets


Updated on July 09, 2022

Comments

  • Felipe Plets almost 2 years

    I need support here to build my own mongo docker image.

    I have a list of scripts that create collections and insert data into MongoDB. They should be run from my Dockerfile so that the resulting Docker image ships with default collections and data.

    Here is what my Dockerfile currently looks like:

    FROM mongo:latest
    
    RUN mkdir -p /data/scripts
    
    COPY . /data/scripts
    
    RUN mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db/
    
    RUN FILES=scripts/*-create.js
    RUN for f in $FILES; do mongo mydb $f; done
    
    RUN FILES=scripts/*-insert.js
    RUN for f in $FILES; do mongo mydb $f; done
    
    RUN mongod --shutdown
    

    I've tried different options to start and stop mongod, and one of the two always fails. The current script raises the following error:

    There doesn't seem to be a server running with dbpath: /data/db
    

    Update

    After @Matt's answer I could run the command chain successfully, but I still can't see my database (called my-db), collections and data there.

    The current Dockerfile:

    FROM mongo:latest
    
    RUN mkdir -p /data/db/scripts
    
    COPY . /data/db
    
    RUN mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db \
        && CREATE_FILES=/data/db/scripts/*-create.js \
        && for f in $CREATE_FILES; do mongo 127.0.0.1:27017 $f; done \
        && INSERT_FILES=/data/db/scripts/*-insert.js \
        && for f in $INSERT_FILES; do mongo 127.0.0.1:27017 $f; done \
        && mongod --shutdown 
    

    The output from the docker build command:

    Sending build context to Docker daemon 10.24 kB
    Step 1 : FROM mongo:latest
     ---> c08c92f4cb13
    Step 2 : RUN mkdir -p /data/db/scripts
     ---> Running in a7088943bb57
     ---> 373c7319927d
    Removing intermediate container a7088943bb57
    Step 3 : COPY . /data/db
     ---> 8fa84884edb7
    Removing intermediate container ae43e2c24fee
    Step 4 : RUN mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db     && CREATE_FILES=/data/db/scripts/*-create.js    && for f in $CREATE_FILES; do mongo 127.0.0.1:27017 $f; done    && INSERT_FILES=/data/db/scripts/*-insert.js    && for f in $INSERT_FILES; do mongo 127.0.0.1:27017 $f; done    && mongod --shutdown
     ---> Running in 33970b6865ee
    about to fork child process, waiting until server is ready for connections.
    forked process: 10
    child process started successfully, parent exiting
    MongoDB shell version: 3.0.7
    connecting to: 127.0.0.1:27017/test
    MongoDB shell version: 3.0.7
    connecting to: 127.0.0.1:27017/test
    killing process with pid: 10
     ---> 8451e43b7749
    Removing intermediate container 33970b6865ee
    Successfully built 8451e43b7749
    

    But as I said, I still can't see the database, collections and data when I connect with the mongo shell. I also connected to the running container and retrieved mongodb.log:

    2015-11-06T16:15:14.562+0000 I JOURNAL  [initandlisten] journal dir=/data/db/journal
    2015-11-06T16:15:14.562+0000 I JOURNAL  [initandlisten] recover : no journal files present, no recovery needed
    2015-11-06T16:15:14.698+0000 I JOURNAL  [initandlisten] preallocateIsFaster=true 2.36
    2015-11-06T16:15:14.746+0000 I JOURNAL  [durability] Durability thread started
    2015-11-06T16:15:14.746+0000 I JOURNAL  [journal writer] Journal writer thread started
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] MongoDB starting : pid=10 port=27017 dbpath=/data/db 64-bit host=9c05d483673a
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] 
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] 
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] 
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] 
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] db version v3.0.7
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] git version: 6ce7cbe8c6b899552dadd907604559806aa2e9bd
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] build info: Linux ip-10-183-78-195 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 BOOST_LIB_VERSION=1_49
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] allocator: tcmalloc
    2015-11-06T16:15:14.747+0000 I CONTROL  [initandlisten] options: { processManagement: { fork: true }, storage: { dbPath: "/data/db" }, systemLog: { destination: "file", path: "/var/log/mongodb.log" } }
    2015-11-06T16:15:14.748+0000 I INDEX    [initandlisten] allocating new ns file /data/db/local.ns, filling with zeroes...
    2015-11-06T16:15:14.802+0000 I STORAGE  [FileAllocator] allocating new datafile /data/db/local.0, filling with zeroes...
    2015-11-06T16:15:14.802+0000 I STORAGE  [FileAllocator] creating directory /data/db/_tmp
    2015-11-06T16:15:14.804+0000 I STORAGE  [FileAllocator] done allocating datafile /data/db/local.0, size: 64MB,  took 0 secs
    2015-11-06T16:15:14.807+0000 I NETWORK  [initandlisten] waiting for connections on port 27017
    2015-11-06T16:15:14.830+0000 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:49641 #1 (1 connection now open)
    2015-11-06T16:15:14.832+0000 I INDEX    [conn1] allocating new ns file /data/db/my-db.ns, filling with zeroes...
    2015-11-06T16:15:14.897+0000 I STORAGE  [FileAllocator] allocating new datafile /data/db/my-db.0, filling with zeroes...
    2015-11-06T16:15:14.898+0000 I STORAGE  [FileAllocator] done allocating datafile /data/db/my-db.0, size: 64MB,  took 0 secs
    2015-11-06T16:15:14.904+0000 I NETWORK  [conn1] end connection 127.0.0.1:49641 (0 connections now open)
    2015-11-06T16:15:14.945+0000 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:49642 #2 (1 connection now open)
    2015-11-06T16:15:14.958+0000 I NETWORK  [conn2] end connection 127.0.0.1:49642 (0 connections now open)
    2015-11-06T16:15:14.982+0000 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
    2015-11-06T16:15:14.982+0000 I CONTROL  [signalProcessingThread] now exiting
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] closing listening socket: 6
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] closing listening socket: 7
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] removing socket file: /tmp/mongodb-27017.sock
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] shutdown: going to flush diaglog...
    2015-11-06T16:15:14.982+0000 I NETWORK  [signalProcessingThread] shutdown: going to close sockets...
    2015-11-06T16:15:14.982+0000 I STORAGE  [signalProcessingThread] shutdown: waiting for fs preallocator...
    2015-11-06T16:15:14.982+0000 I STORAGE  [signalProcessingThread] shutdown: final commit...
    2015-11-06T16:15:15.008+0000 I JOURNAL  [signalProcessingThread] journalCleanup...
    2015-11-06T16:15:15.008+0000 I JOURNAL  [signalProcessingThread] removeJournalFiles
    2015-11-06T16:15:15.009+0000 I JOURNAL  [signalProcessingThread] Terminating durability thread ...
    2015-11-06T16:15:15.088+0000 I JOURNAL  [journal writer] Journal writer thread stopped
    2015-11-06T16:15:15.088+0000 I JOURNAL  [durability] Durability thread stopped
    2015-11-06T16:15:15.088+0000 I STORAGE  [signalProcessingThread] shutdown: closing all files...
    2015-11-06T16:15:15.090+0000 I STORAGE  [signalProcessingThread] closeAllFiles() finished
    2015-11-06T16:15:15.090+0000 I STORAGE  [signalProcessingThread] shutdown: removing fs lock...
    2015-11-06T16:15:15.090+0000 I CONTROL  [signalProcessingThread] dbexit:  rc: 0
    

    I also checked the contents of the /data/db folder:

    root@fbaf17233182:/data/db# ls -al
    total 16
    drwxr-xr-x 3 mongodb mongodb 4096 Nov  6 16:15 .
    drwxr-xr-x 4 root    root    4096 Nov  6 16:15 ..
    drwxr-xr-x 2 root    root    4096 Nov  5 18:55 scripts
    

    May help: