FLOSS Weekly with Doc Searls

Nov 8th 2017

FLOSS Weekly 458

Crail

Open source user-level I/O architecture for the Apache data processing ecosystem

Records live every Wednesday at 12:30pm Eastern / 9:30am Pacific / 16:30 UTC.
Category: News

Crail is a storage platform for sharing performance-critical data in distributed data processing jobs at very high speed. Crail is built entirely upon principles of user-level I/O and specifically targets data center deployments with fast network and storage hardware (e.g., 100 Gbps RDMA, plenty of DRAM, NVMe flash) as well as new modes of operation such as resource disaggregation or serverless computing. Crail is written in Java and integrates seamlessly with the Apache data processing ecosystem (e.g., Spark, Hadoop, Flink). It can be used as (i) a backbone to accelerate high-level data operations such as shuffle, reduce, or broadcast; (ii) a cache to store hot data that is queried repeatedly; or (iii) a storage platform for sharing inter-job data in complex multi-job pipelines. Last week, Crail was voted in as an Apache Incubator project.

Links