HyperAI

ArkFlow: High-Performance Rust Engine for Data Stream Processing with Modular Design and Low Latency


ArkFlow is a high-performance stream processing engine developed in Rust, designed to handle complex data streams with ease. It leverages the Tokio asynchronous runtime, ensuring excellent performance and low latency. The engine supports a variety of input and output sources, making it versatile for different applications, and it offers a range of powerful processing capabilities while remaining highly extensible through its modular design.

## Features

- **High Performance**: Built on Rust and the Tokio async runtime, ensuring top-tier performance and minimal latency.
- **Multiple Data Sources**: Supports various input and output sources, including Kafka, MQTT, HTTP, and files.
- **Powerful Processing Capabilities**: Equipped with built-in SQL queries, JSON processing, Protobuf encoding/decoding, and batch processing.
- **Extensibility**: Modular architecture allows for easy addition of new input, output, and processor components.

## Installation

### Building from Source

To get started with ArkFlow, follow these steps:

1. **Create a configuration file**: Create a `config.yaml` file to define your input, output, and processing components.
2. **Run ArkFlow**: Execute the engine using the configuration file.

## Configuration Guide

ArkFlow uses YAML-formatted configuration files to manage its settings. The primary top-level configuration items are:

- **Input components** define the sources from which ArkFlow will ingest data. Example:

```yaml
inputs:
  - type: kafka
    topic: my-topic
    bootstrap_servers: localhost:9092
```

- **Processors** specify the data processing tasks you want to perform. Example:

```yaml
processors:
  - type: sql
    query: "SELECT * FROM input WHERE condition"
  - type: json
    transform: "path.to.field"
```

- **Output components** determine where the processed data will be sent.
Example:

```yaml
outputs:
  - type: kafka
    topic: processed-topic
    bootstrap_servers: localhost:9092
```

- **Error output components** configure how errors should be handled and logged. Example:

```yaml
error_outputs:
  - type: file
    path: /var/log/arkflow-errors.log
```

- **Buffer components** set up buffers to manage backpressure and temporarily store messages. Example:

```yaml
buffers:
  - type: memory
    capacity: 10000
```

## Examples

### Kafka to Kafka Data Processing

One common use case for ArkFlow is processing data from one Kafka topic to another:

1. **Generate test data**: First, create some test data to be ingested by ArkFlow.
2. **Configure ArkFlow**: Set up the `config.yaml` file to read from the source Kafka topic, process the data, and send it to the target Kafka topic.

Example configuration:

```yaml
inputs:
  - type: kafka
    topic: source-topic
    bootstrap_servers: localhost:9092

processors:
  - type: sql
    query: "SELECT * FROM input WHERE condition"

outputs:
  - type: kafka
    topic: target-topic
    bootstrap_servers: localhost:9092

error_outputs:
  - type: file
    path: /var/log/arkflow-errors.log

buffers:
  - type: memory
    capacity: 10000
```

## Community

Join the ArkFlow community on Discord to connect with developers, share ideas, and get support: Discord Link

If you find this project useful or are using it to develop your own solutions, please consider giving it a star on GitHub. Your support helps us continue improving and expanding the functionality of ArkFlow.

## License

ArkFlow is released under the Apache License 2.0. For more details, see the LICENSE file in the repository.

## ArkFlow Plugin Examples

Check out the documentation for examples of how to extend ArkFlow with custom plugins and components. It includes detailed instructions on creating and integrating new input, output, and processor modules, making the engine even more adaptable to your specific needs.
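The "Building from Source" steps above can be sketched as a short shell session. The repository URL comes from the project page; the binary name, output path, and `--config` flag are assumptions for illustration, not confirmed by this README:

```shell
# Clone and build ArkFlow from source (requires a Rust toolchain with cargo)
git clone https://github.com/arkflow-rs/arkflow.git
cd arkflow
cargo build --release

# Run the engine against a YAML configuration file
# (binary name and flag are assumptions; check the repository docs)
./target/release/arkflow --config config.yaml
```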
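For the "Generate test data" step in the Kafka-to-Kafka example, one simple approach is a small shell helper that emits JSON records and pipes them into the source topic. `kafka-console-producer.sh` ships with Apache Kafka; the record shape here is purely illustrative:

```shell
# gen_records N: print N JSON test records, one per line
gen_records() {
  for i in $(seq 1 "$1"); do
    printf '{"id": %d, "value": "test-%d"}\n' "$i" "$i"
  done
}

# Feed records into the source topic from the example configuration, e.g.:
#   gen_records 100 | kafka-console-producer.sh \
#     --bootstrap-server localhost:9092 --topic source-topic
gen_records 3
```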
