Dec 2021

Spark Cyclone

Apache Spark plug-in that enables Spark execution on the SX-Aurora TSUBASA Vector Engine (VE)

Spark Cyclone is an Apache Spark plug-in that accelerates Spark by offloading work to the SX-Aurora TSUBASA “Vector Engine” (VE). It generates optimized C++ code and executes it on the VE, so Spark users can accelerate their existing jobs with minimal or no effort.

Spark Cyclone offers three pathways to accelerate Spark on the VE:

  • Spark SQL: The plug-in uses Spark SQL’s extensibility to rewrite SQL queries on the fly and execute dynamically generated C++ code on the VE, with no changes to user code.
  • RDD: For more direct control, the plug-in’s VERDD API provides Scala macros that transpile ordinary Scala code into C++, so that common RDD operations such as map() execute on the VE.
  • MLlib: CycloneML is a fork of MLlib that uses Spark Cyclone to accelerate many of the ML algorithms on either the VE or the CPU.
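
To give a rough idea of what "transpiling Scala into C++" can mean for an RDD operation such as map(), the sketch below shows the kind of loop a Scala lambda like `x => x * 2.0 + 1.0` might be lowered to. This is an illustration only, not actual Spark Cyclone output: the function names `map_times_two_plus_one` and `apply_map`, the choice of lambda, and the flat-array layout are all assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for generated code; NOT real Spark Cyclone output.
// Illustrates a Scala call such as: rdd.map(x => x * 2.0 + 1.0)
//
// The body is a plain counted loop over contiguous arrays with no
// dependency between iterations, which is the shape a vectorizing
// compiler can typically turn into vector instructions.
void map_times_two_plus_one(const double* in, double* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * 2.0 + 1.0;
    }
}

// Convenience wrapper (also hypothetical) that applies the kernel to a
// std::vector and returns a new vector of the same length.
std::vector<double> apply_map(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    map_times_two_plus_one(in.data(), out.data(), in.size());
    return out;
}
```

Calling `apply_map` on a vector holding 1.0, 2.0, 3.0 yields 3.0, 5.0, 7.0. The point of the sketch is the shape of the kernel: one element in, one element out, over contiguous memory, which is why element-wise operations like map() are natural candidates for a vector processor.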