Spark Cyclone is an Apache Spark plugin that accelerates Spark workloads using the SX-Aurora TSUBASA “Vector Engine” (VE). The plugin lets Spark users accelerate their existing jobs with little or no effort by generating optimized C++ code and executing it on the VE.
Spark Cyclone offers three pathways to accelerate Spark on the VE:
- Spark SQL: The plugin leverages Spark SQL’s extensibility to rewrite SQL queries on the fly and execute dynamically generated C++ code on the VE, with no changes to user code.
- RDD: For more direct control, the plugin’s VERDD API provides Scala macros that transpile normal Scala code into C++, so that common RDD operations such as `map()` execute on the VE.
- MLlib: CycloneML is a fork of MLlib that uses Spark Cyclone to accelerate many of the ML algorithms using either the VE or CPU.
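
As a rough illustration of the Spark SQL pathway, the sketch below registers a plugin through Spark's standard `spark.plugins` setting (available since Spark 3.0) and then runs an ordinary SQL query with no other changes. The plugin class name `com.nec.spark.AuroraSqlPlugin` is an assumption and should be checked against the Spark Cyclone release in use. The application name, view name, and data are hypothetical. The sketch also assumes the plugin JAR is already on the driver and executor classpath and that a VE is available.

```scala
import org.apache.spark.sql.SparkSession

object CycloneSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cyclone-sql-sketch") // hypothetical application name
      // spark.plugins is Spark's own plugin mechanism (Spark 3.0+).
      // The class name below is an ASSUMPTION; verify it for your release.
      .config("spark.plugins", "com.nec.spark.AuroraSqlPlugin")
      .getOrCreate()

    import spark.implicits._

    // Ordinary Spark SQL code: nothing here is specific to the VE.
    Seq(1.0, 2.0, 3.0).toDF("value").createOrReplaceTempView("nums")
    spark.sql("SELECT SUM(value) AS total FROM nums").show()

    spark.stop()
  }
}
```

The query itself is plain Spark SQL, which is the point of this pathway: any rewriting happens inside the plugin, so the same job should also run unmodified on a cluster without the plugin configured.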