Apache NiFi is a powerful data processing and integration platform that provides a user-friendly interface for designing data flows. Docker is a popular tool for containerization, allowing you to package and run applications in lightweight, portable containers. Docker Compose, on the other hand, simplifies the management of multi-container Docker applications by defining and running them with a single YAML file.

In this guide, we’ll walk through the process of setting up Apache NiFi using Docker Compose.

Prerequisites:

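  • Docker installed on your system, including the Docker Compose plugin that provides the docker compose command (it ships with Docker Desktop). You can download and install Docker from the official website.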

Step 1: Create a Docker Compose file

Create a new directory for your NiFi project and navigate into it. Inside this directory, create a file named docker-compose.yml with the following content:

version: "3"
services:
    # configuration manager for NiFi
    zookeeper:
        hostname: myzookeeper
        container_name: zookeeper_container_persistent
        image: 'bitnami/zookeeper:3.7.0'
        restart: on-failure
        environment:
            - ALLOW_ANONYMOUS_LOGIN=yes
        networks:
            - my_persistent_network
    # version control for nifi flows
    registry:
        hostname: myregistry
        container_name: registry_container_persistent
        image: 'apache/nifi-registry:1.24.0'
        restart: on-failure
        ports:
            - "18080:18080"
        environment:
            - LOG_LEVEL=INFO
            - NIFI_REGISTRY_DB_DIR=/opt/nifi-registry/nifi-registry-current/database
            - NIFI_REGISTRY_FLOW_PROVIDER=file
            - NIFI_REGISTRY_FLOW_STORAGE_DIR=/opt/nifi-registry/nifi-registry-current/flow_storage
        volumes:
            # host paths use forward slashes, which Docker Compose accepts on Windows, Linux, and macOS
            - ./nifi_registry/database:/opt/nifi-registry/nifi-registry-current/database
            - ./nifi_registry/flow_storage:/opt/nifi-registry/nifi-registry-current/flow_storage
        networks:
            - my_persistent_network
    # data extraction, transformation and load service
    nifi:
        hostname: mynifi
        container_name: nifi_container_persistent
        image: 'apache/nifi:1.24.0'
        restart: on-failure
        ports:
            # host port 8091 -> container port 8080
            - '8091:8080'
        environment:
            - NIFI_WEB_HTTP_PORT=8080
            - NIFI_CLUSTER_IS_NODE=true
            - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
            - NIFI_ZK_CONNECT_STRING=myzookeeper:2181
            - NIFI_ELECTION_MAX_WAIT=30 sec
            # example value; no quotes here, because in list form they would become part of the key
            - NIFI_SENSITIVE_PROPS_KEY=12345678901234567890A
        healthcheck:
            # the check runs inside the container, so it targets container port 8080, not host port 8091
            test: "${DOCKER_HEALTHCHECK_TEST:-curl localhost:8080/nifi/}"
            interval: "60s"
            timeout: "3s"
            start_period: "5s"
            retries: 5
        volumes:
            - ./nifi/database_repository:/opt/nifi/nifi-current/database_repository
            - ./nifi/flowfile_repository:/opt/nifi/nifi-current/flowfile_repository
            - ./nifi/content_repository:/opt/nifi/nifi-current/content_repository
            - ./nifi/provenance_repository:/opt/nifi/nifi-current/provenance_repository
            - ./nifi/state:/opt/nifi/nifi-current/state
            - ./nifi/logs:/opt/nifi/nifi-current/logs
            - ./nifi/input_directory:/opt/nifi/nifi-current/input_directory
            - ./nifi/output_directory:/opt/nifi/nifi-current/output_directory
            # uncomment the next line after copying the /conf directory from the container to your local directory to persist NiFi flows
            #- ./nifi/conf:/opt/nifi/nifi-current/conf
        networks:
            - my_persistent_network
networks:
    my_persistent_network:
        driver: bridge

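This Docker Compose file defines three services that share the bridge network my_persistent_network: zookeeper, which coordinates the NiFi cluster; registry, which runs NiFi Registry and publishes it on port 18080; and nifi, which runs NiFi itself. The nifi service uses the official Apache NiFi Docker image, sets the environment variable NIFI_WEB_HTTP_PORT to 8080, and publishes that container port on host port 8091 for web access. The registry and nifi services both mount local directories so that their data persists on the host.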

Volumes: The volume entries of the nifi service map directories on the host machine to directories inside the Apache NiFi container. Each entry maps a subdirectory of the local nifi directory, next to docker-compose.yml, to the directory of the same name under /opt/nifi/nifi-current in the container:

  1. database_repository (container path /opt/nifi/nifi-current/database_repository)
    • Holds NiFi's internal database, so that its contents persist on the host machine.
  2. flowfile_repository (container path /opt/nifi/nifi-current/flowfile_repository)
    • Holds the attributes and state of the FlowFiles currently in the flow, so that in-flight data can be recovered after a restart.
  3. content_repository (container path /opt/nifi/nifi-current/content_repository)
    • Holds the actual content of the FlowFiles.
  4. provenance_repository (container path /opt/nifi/nifi-current/provenance_repository)
    • Holds the provenance events that record the history of each FlowFile.
  5. state (container path /opt/nifi/nifi-current/state)
    • Holds the local state that NiFi components save between runs.
  6. conf (container path /opt/nifi/nifi-current/conf)
    • Lets you provide custom NiFi configuration files on the host machine, which NiFi inside the container then uses. As the comment in the Compose file notes, copy the conf directory out of the container into this host directory before relying on this mount.
  7. logs (container path /opt/nifi/nifi-current/logs)
    • Holds NiFi's log files, so that you can read them directly on the host machine.

The Compose file also mounts input_directory and output_directory in the same way, presumably as places on the host for flows to read files from and write files to.

These volume mappings enable Apache NiFi to store its various data repositories, configuration files, and logs outside the container, ensuring persistence and easy access to these files even if the container is stopped or removed.
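If you prefer to create the host directories yourself before the first start, a minimal sketch follows. It assumes you run it from the directory that contains docker-compose.yml and that the directory names match the volume entries in the Compose file. Docker normally creates missing bind-mount directories on its own, but on Linux it typically creates them as root, which may keep the container's non-root user from writing to them. The conf directory is deliberately left out, because an empty host directory mounted over the container's configuration would hide the default files.

```shell
#!/bin/sh
set -eu

# Host directories mounted by the nifi service (conf is deliberately left out).
for dir in database_repository flowfile_repository content_repository \
           provenance_repository state logs input_directory output_directory; do
  mkdir -p "nifi/$dir"
done

# Host directories mounted by the registry service.
for dir in database flow_storage; do
  mkdir -p "nifi_registry/$dir"
done
```

Because mkdir -p does nothing when a directory already exists, the script is safe to run more than once.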

Apache NiFi Registry is a companion service that stores and versions Apache NiFi flows. Flows are organized into buckets, and each change you commit to a versioned process group becomes a new version. This gives you history and traceability, and a way to move the same flow between environments such as development and production. When the Registry runs in secured mode, access policies control who may read or change each bucket.

Once NiFi is configured with the Registry as a registry client, you can start version control on a process group, commit changes, import a specific flow version into another NiFi instance, or roll back to an earlier version, which also makes the Registry useful in continuous integration and deployment pipelines. In this setup, the registry service keeps its database and flow storage in the local nifi_registry directory, and the NiFi container can reach it at http://myregistry:18080.

Step 2: Start Apache NiFi

Save the docker-compose.yml file and run the following command in a terminal from the same directory to start Apache NiFi:

    docker compose up -d

This command downloads the ZooKeeper, NiFi Registry, and NiFi Docker images if they are not already available and starts three containers named zookeeper_container_persistent, registry_container_persistent, and nifi_container_persistent. The -d flag runs the containers in detached mode, meaning they run in the background.
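The comment in the Compose file mentions copying the conf directory from the container to your local directory. One way to do that, sketched here under the assumption that the NiFi container is running under the name nifi_container_persistent, is docker cp. The function name copy_nifi_conf and the DOCKER variable are hypothetical conveniences, not part of Docker or NiFi. Setting DOCKER=echo prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical override: defaults to the real docker CLI; DOCKER=echo gives a dry run.
DOCKER=${DOCKER:-docker}

# Copies NiFi's configuration files from the running container into ./nifi/conf.
copy_nifi_conf() {
  mkdir -p nifi/conf
  # The trailing "/." copies the directory's contents rather than the directory itself.
  $DOCKER cp nifi_container_persistent:/opt/nifi/nifi-current/conf/. nifi/conf/
}
```

After calling copy_nifi_conf, enable the conf volume entry in docker-compose.yml and run docker compose up -d again so that the container is recreated with the new mount.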

Step 3: Access Apache NiFi

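Once the containers are running, you can access Apache NiFi by opening your web browser and navigating to http://localhost:8091/nifi (the Compose file publishes NiFi's container port 8080 on host port 8091). NiFi may need a minute or two to start, so the page might not load right away. You should then see the NiFi user interface, where you can design and manage data flows. The NiFi Registry interface is available at http://localhost:18080/nifi-registry.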
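Because NiFi takes a while to come up, it can be handy to wait for the web interface from a script rather than reloading the browser. The following is a small sketch of a polling helper built on curl. The name wait_for_url and its arguments are made up for this example, and the URL in the usage note assumes the host port 8091 published by the Compose file.

```shell
#!/bin/sh
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL with curl until it answers with a successful HTTP status.
# Returns 0 once the URL responds, 1 if every attempt fails.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: no progress output, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    echo "attempt $i/$attempts: $url not ready yet"
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

For example, wait_for_url http://localhost:8091/nifi/ 30 10 checks up to 30 times with a 10-second pause after each failed attempt.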

Step 4: Stop and Remove Apache NiFi

To stop and remove the Apache NiFi containers, run the following command:

    docker compose down

This command stops and removes the containers and the network defined in the docker-compose.yml file. Any data stored in the bind-mounted nifi and nifi_registry directories stays on the host, so it is available again the next time you start the containers.

Conclusion

In this guide, you learned how to run Apache NiFi on Docker using Docker Compose. Docker Compose makes it easy to define and manage multi-container applications, allowing you to quickly spin up and tear down NiFi instances for your data processing needs. Experiment with different configurations and scale your NiFi deployment as needed to handle various data integration tasks.


Feel free to adjust the Docker Compose file according to your specific requirements, such as configuring additional environment variables or volumes.
