NIFI

INTRODUCTION

NiFi (short for “Niagara Files”) is an enterprise-grade dataflow management tool that can collect, route, enrich, transform, and process data in a reliable and scalable manner. NiFi was originally developed by the National Security Agency (NSA) and is now a top-level Apache project under an open-source license, strongly backed by Hortonworks. NiFi is based on the concepts of Flow-Based Programming.

Essentially, Apache NiFi is a comprehensive platform that provides:

  • Data acquisition, transportation, and guaranteed data delivery.
  • Data-based event processing with buffering and prioritized queuing.
  • Support for highly complex and diverse data flows.
  • A user-friendly visual interface for development, configuration, and control.

FEATURES OF NIFI

Some of the high-level features of NiFi include:

  • Web-based UI: seamless experience between design, control, feedback, and monitoring
  • Highly configurable: loss tolerance vs guaranteed delivery, low latency vs high throughput, dynamic prioritization, runtime flow modification, back pressure handling
  • Data provenance: dataflow lineage tracking from start to end
  • Designed for extension: build your own custom processors, rapid development and effective testing
  • Secure: SSL, SSH, HTTPS, encrypted content, etc., plus multi-tenant authorization and internal authorization / policy management

NIFI ARCHITECTURE

The primary components of NiFi running on the JVM are:

  • Web Server: Hosts NiFi's HTTP-based command and control APIs.

  • Flow Controller: The brain of the operation. It provides threads for extensions to run on and manages the schedule of when extensions receive resources to execute.

  • Extensions: NiFi supports various types of extensions. The critical point is that they operate and run within the JVM.

  • FlowFile Repository: Where NiFi keeps track of the state of each FlowFile that is currently active in the flow. The implementation of the repository is pluggable. The default approach is a persistent Write-Ahead Log located on a specified disk partition.

  • Content Repository: Where the actual content bytes of a given FlowFile reside. The default approach is relatively simple: it stores blocks of data in the file system.

  • Provenance Repository: Where all the provenance event data is stored. The repository construct is pluggable, with the default implementation using one or more physical disk volumes.

PROJECT STRUCTURE

Each NiFi processor consists of two main projects:

  • The NAR folder, which contains the NAR (NiFi Archive) of the processor.
  • The processor folder, which contains the code of the processor.

NIFI NAR PROJECT STRUCTURE

  • The NiFi NAR project contains the NAR needed to deploy the processor so that it is available in the NiFi UI.

PROCESSOR PROJECT STRUCTURE

  • All the business logic for the processor is written in this project. The processor folder contains the following files:

  • Processor file: contains a class that extends AbstractProcessor, along with the default code needed for a NiFi processor.
  • Service file: contains the business logic of the processor.
  • Test file: used to check and debug the service logic through a main method.

NIFI API FLOW

  • When a processor is called, the call first goes to the onTrigger method in the processor.
  • In the onTrigger method, the configured environment variables and the inputs are received and then passed to the service.
  • In the service, the business logic is performed on the inputs and the output is returned.
  • The output is then mapped to the appropriate relationship in the onTrigger method.
  • We use the test file to check the functionality of the service.

Sample API Flow For NIFI:

To add two numbers:

  • The input numbers are received in the onTrigger method of the processor.
  • The logic to add the numbers is executed in the service file.
  • The output is then returned to the onTrigger method.
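
The add-two-numbers flow above can be sketched in plain Java. This is a minimal illustration, not NiFi code: AddNumbersHandler is a hypothetical stand-in for the processor's onTrigger step so the example runs without the NiFi libraries, and the class names and the "success" / "failure" relationship names are assumptions made for the sketch.

```java
// Hypothetical service file: business logic only, no NiFi types.
class AddNumbersService {
    // Adds two integers; Math.addExact throws ArithmeticException on overflow.
    int add(int first, int second) {
        return Math.addExact(first, second);
    }
}

// Plain-Java stand-in for the processor's onTrigger step (NOT the NiFi API).
class AddNumbersHandler {
    static final String REL_SUCCESS = "success";
    static final String REL_FAILURE = "failure";

    private final AddNumbersService service = new AddNumbersService();

    // Parses the two inputs, calls the service, and returns
    // {relationship name, output text}.
    String[] handle(String firstInput, String secondInput) {
        if (firstInput == null || secondInput == null) {
            return new String[] {REL_FAILURE, "both inputs are required"};
        }
        try {
            int sum = service.add(
                    Integer.parseInt(firstInput.trim()),
                    Integer.parseInt(secondInput.trim()));
            return new String[] {REL_SUCCESS, String.valueOf(sum)};
        } catch (NumberFormatException | ArithmeticException e) {
            return new String[] {REL_FAILURE, "invalid input: " + e.getMessage()};
        }
    }
}

// Hypothetical test file: exercises the flow from a main method.
class AddNumbersTest {
    public static void main(String[] args) {
        AddNumbersHandler handler = new AddNumbersHandler();
        String[] result = handler.handle("2", "3");
        System.out.println(result[0] + " -> " + result[1]); // prints: success -> 5
    }
}
```

In a real processor, the handler's role would be played by onTrigger, which would read the inputs from the incoming flow file or the processor properties and route the result to the matching relationship. Keeping the arithmetic in the service means it can be checked from a main method without starting NiFi.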

LOGGERS

  • NiFi provides its own logging mechanism, so we do not use any external logging tool.
  • When we use the logger provided by NiFi, we can check the logs in the NiFi UI itself. We do not need to check the logs of the gateway or server.
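  • In a processor that extends AbstractProcessor, the default logger is the ComponentLog returned by getLogger(). It is used by calling its level methods, for example getLogger().info(...) or getLogger().error(...).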

CODING STANDARDS

Most coding standards for NiFi projects follow the Java Coding Standard document (please refer to it). The additional standards below are specific to NiFi projects and should be followed as well.

Method Name

  • Method names should follow lowerCamelCase, i.e. start with a lowercase letter and capitalize the first letter of each subsequent word.
  • Method names should be meaningful and briefly explain the functionality.
  • Method names should preferably start with a verb.

Ex: updateStatusForMultipleOrder, buildExchequerInput, callExchequer etc.

Variable Name

  • Variable names should follow lowerCamelCase, i.e. start with a lowercase letter and capitalize the first letter of each subsequent word.
  • Variable names should be meaningful and denote the value stored in them.
  • Avoid variable names like i, j, k, action.

Ex: orderLineArray, processorInputArray, isCompoundMedicine etc.

Method Definitions

  • Divide large methods into smaller methods.
  • Create a method for repeated code or logic.
  • Avoid throwing exceptions; handle them using try and catch blocks.
  • Every method should also handle negative scenarios and have an error handling mechanism.
  • Avoid passing many parameters to a method. If many parameters are required, use a model class or a constants file.
  • Each method should have comments explaining the functionality it implements, its inputs, and its outputs.
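
The guidelines above can be illustrated with a short sketch. All names here (OrderLineRequest, ServiceResult, OrderLineService, updateOrderLine) are hypothetical and chosen only for the example. It assumes a simple case where related inputs are grouped into a model class and failures are reported through a result object instead of a thrown exception.

```java
// Hypothetical model class: groups related inputs instead of a long parameter list.
final class OrderLineRequest {
    final String orderId;
    final String quantityText;
    final String requestedBy;

    OrderLineRequest(String orderId, String quantityText, String requestedBy) {
        this.orderId = orderId;
        this.quantityText = quantityText;
        this.requestedBy = requestedBy;
    }
}

// Hypothetical result type: callers receive an outcome rather than an exception.
final class ServiceResult {
    final boolean success;
    final String message;

    private ServiceResult(boolean success, String message) {
        this.success = success;
        this.message = message;
    }

    static ServiceResult success(String message) {
        return new ServiceResult(true, message);
    }

    static ServiceResult failure(String message) {
        return new ServiceResult(false, message);
    }
}

class OrderLineService {
    /**
     * Updates the quantity of an order line.
     * Input: an OrderLineRequest carrying the order id, the quantity as text,
     * and the requesting user.
     * Output: a ServiceResult describing success or the reason for failure.
     */
    ServiceResult updateOrderLine(OrderLineRequest request) {
        // Negative scenarios are handled up front.
        if (request == null || isBlank(request.orderId)) {
            return ServiceResult.failure("orderId is required");
        }
        if (isBlank(request.quantityText)) {
            return ServiceResult.failure("quantity is required");
        }
        try {
            int quantity = parseQuantity(request.quantityText);
            return ServiceResult.success(
                    "Order " + request.orderId + " updated to quantity " + quantity);
        } catch (NumberFormatException e) {
            // The exception is handled here instead of being thrown to the caller.
            return ServiceResult.failure("quantity is not a number: " + request.quantityText);
        }
    }

    // Small helper extracted so the parsing logic is not repeated.
    private int parseQuantity(String text) {
        return Integer.parseInt(text.trim());
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }
}
```

A caller such as the processor's onTrigger method could then inspect result.success to choose the relationship, without needing its own try and catch around the service call.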

Class Name

  • Class names should follow UpperCamelCase, i.e. every word, including the first, starts with a capital letter.
  • The file name and the class name should be the same.
  • If a class grows too large, divide it into smaller classes.

Ex: MultiSelectActionsUpdateService.java, ATPPerformUserService.java etc.

Package Name

  • Package names should be in lowercase.
  • Prefer package names under com.ainqa.nifi to distinguish custom processors from the default processors.

Ex: com.ainqa.nifi.processors

Processor File:

  • Avoid writing business logic in the processor. Use a separate service file for the business logic.
  • Provide proper tags and a description for the processor.
  • Provide a proper name and description for the property descriptors and relationships.
  • Provide a meaningful name for the processor.

Ex: ATPPerformUserAction, MultiSelectActionsUpdateProcessor

SOFTWARE REQUIREMENTS

Technology     Version
Java           JDK 8 or above
Maven          Maven 2.1 or above
Spring Boot    2.3.4