The RAPIDS Accelerator for Apache Spark is a plugin that transparently intercepts Apache Spark SQL queries and executes them on NVIDIA GPUs using the cuDF library. The plugin integrates into Spark's Catalyst optimizer through the org.apache.spark.sql.SparkSessionExtensions mechanism, replacing CPU-based physical plan operators with GPU-accelerated equivalents when possible.
This page provides a high-level introduction to the accelerator's architecture, capabilities, and integration points. Detailed coverage of individual subsystems appears in the dedicated pages referenced throughout this overview.
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:1-100, docs/download.md:1-112
The accelerator transforms Spark SQL operations to run on GPUs by:

- Intercepting physical plans through the `columnarRules` extension point
- Validating operator and data-type support with the `TypeSig` and `TypeChecks` systems
- Moving data between CPU and GPU with the `HostColumnarToGpu` and `GpuColumnarToRowExec` transition operators

The plugin does not modify Spark's logical plan or optimizer. It operates exclusively on physical plans after standard Spark optimizations complete.
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:461-750, docs/supported_ops.md:1-70
Figure 1: Plugin Integration and Transformation Flow
The diagram shows how user queries flow through Spark's standard pipeline, then get intercepted by the RAPIDS plugin via SQLPlugin.columnar(). The GpuOverrides class orchestrates transformation by wrapping plan nodes in RapidsMeta, validating them with TypeChecks, and converting compatible nodes to GPU operators that execute via cuDF.
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:461-567, sql-plugin/src/main/scala/com/nvidia/spark/rapids/TypeChecks.scala:1-60
Figure 2: Plan Transformation Components
The transformation uses a three-phase approach:
1. **Wrap**: `GpuOverrides.wrapAndTagPlan()` traverses the Spark plan and wraps each node in a `RapidsMeta` subclass (`SparkPlanMeta`, `BaseExprMeta`, `ScanMeta`, etc.) based on registered `ReplacementRule` instances
2. **Tag**: each `RapidsMeta` calls `tagForGpu()` to determine GPU compatibility using `TypeChecks` validation and `RapidsConf` settings
3. **Convert**: `convertIfNeeded()` generates GPU operators for tagged nodes, or inserts CPU/GPU transition operators where needed

Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:461-523, sql-plugin/src/main/scala/com/nvidia/spark/rapids/TypeChecks.scala:120-260
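The wrap/tag/convert flow can be illustrated with a toy plan tree. This is only a sketch: the class and method names below are simplified Python stand-ins for the real Scala `RapidsMeta` machinery, and the replacement registry here is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    name: str
    children: list = field(default_factory=list)

# Hypothetical registry mapping CPU operator names to GPU equivalents,
# standing in for the ReplacementRule instances registered in GpuOverrides.
GPU_REPLACEMENTS = {"FilterExec": "GpuFilterExec",
                    "HashAggregateExec": "GpuHashAggregateExec"}

@dataclass
class Meta:
    """Simplified stand-in for a RapidsMeta wrapper around one plan node."""
    node: PlanNode
    children: list
    reasons: list = field(default_factory=list)  # why the node cannot run on GPU

    def tag_for_gpu(self):
        # Phase 2: record a reason if no GPU implementation is registered.
        if self.node.name not in GPU_REPLACEMENTS:
            self.reasons.append(f"{self.node.name} has no GPU implementation")
        for child in self.children:
            child.tag_for_gpu()

    def convert_if_needed(self):
        # Phase 3: emit the GPU operator, or keep the CPU node as a fallback
        # (the real plugin also inserts transition operators here).
        kids = [c.convert_if_needed() for c in self.children]
        if self.reasons:
            return PlanNode(self.node.name, kids)
        return PlanNode(GPU_REPLACEMENTS[self.node.name], kids)

def wrap_and_tag_plan(node):
    # Phase 1: wrap every plan node in a Meta object.
    return Meta(node, [wrap_and_tag_plan(c) for c in node.children])

plan = PlanNode("HashAggregateExec", [PlanNode("FilterExec", [PlanNode("SortExec")])])
meta = wrap_and_tag_plan(plan)
meta.tag_for_gpu()
converted = meta.convert_if_needed()
# HashAggregateExec and FilterExec get GPU equivalents; SortExec
# (unregistered in this toy registry) stays on the CPU.
```

The per-node `reasons` list mirrors how the real plugin accumulates human-readable explanations that surface in `spark.rapids.sql.explain` output.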
| Component | Class/Package | Purpose |
|---|---|---|
| Plugin Entry | com.nvidia.spark.SQLPlugin | Implements SparkPlugin interface to register columnar rules |
| Plan Replacement | GpuOverrides | Orchestrates physical plan transformation and operator replacement |
| Metadata Wrappers | RapidsMeta hierarchy | Wraps Spark nodes for GPU compatibility analysis |
| Type Validation | TypeSig, TypeChecks | Validates data type support for operations |
| Configuration | RapidsConf | Manages 200+ configuration parameters |
| GPU Operators | GpuExec subclasses | GPU implementations of Spark physical operators |
| Transitions | HostColumnarToGpu, GpuColumnarToRowExec | Handle CPU↔GPU data movement |
| Execution Library | ai.rapids.cudf | NVIDIA cuDF library for GPU DataFrame operations |
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:78-318, sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala:124-167
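The transition operators in the table above exist because Spark's CPU operators are row-oriented while cuDF operates on columnar batches. A minimal, library-free sketch of the two reshaping directions (the real transitions also move data between host and device memory, which this toy omits):

```python
def rows_to_columns(rows):
    """Host-side sketch of the reshaping HostColumnarToGpu performs:
    a list of row tuples becomes one list per column."""
    return [list(col) for col in zip(*rows)]

def columns_to_rows(columns):
    """Sketch of GpuColumnarToRowExec: a columnar batch back to rows."""
    return [tuple(vals) for vals in zip(*columns)]

rows = [("WEST", 10.0), ("EAST", 5.5), ("WEST", 2.5)]
cols = rows_to_columns(rows)   # [['WEST', 'EAST', 'WEST'], [10.0, 5.5, 2.5]]
assert columns_to_rows(cols) == rows  # round-trip preserves the data
```

Because these conversions are pure overhead, the plugin tries to replace contiguous subtrees of the plan so data stays columnar on the GPU as long as possible.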
The plugin defines GPU implementations through ReplacementRule instances registered in GpuOverrides:
Figure 3: Replacement Rule Type Hierarchy
Each ReplacementRule contains:
- `doWrap`: Function to create the appropriate `RapidsMeta` wrapper
- `desc`: Human-readable description of the operation
- `checks`: Optional `TypeChecks` defining supported type signatures
- `tag`: `ClassTag` identifying the Spark class to replace
- `confKey`: Configuration key such as `spark.rapids.sql.expression.Add`

Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:78-318, sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:794-861
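The shape of a rule and its lookup by Spark class can be sketched as follows. This is a Python stand-in for illustration only; the real rules are Scala classes registered inside `GpuOverrides`, and the `Add` class here is a placeholder for the Catalyst expression.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class ReplacementRule:
    """Sketch of the fields described above, not the plugin's actual API."""
    tag: type                        # Spark class this rule replaces
    desc: str                        # human-readable description
    conf_key: str                    # e.g. spark.rapids.sql.expression.Add
    do_wrap: Callable                # builds the RapidsMeta wrapper
    checks: Optional[object] = None  # supported type signatures, if any

class Add:
    """Placeholder for the Catalyst Add expression class."""

rule = ReplacementRule(
    tag=Add,
    desc="addition of two values",
    conf_key="spark.rapids.sql.expression.Add",
    do_wrap=lambda expr, conf: ("BaseExprMeta stand-in", expr, conf),
)

# Rules are looked up by the Spark class they replace:
registry = {rule.tag: rule}
matched = registry[Add]
```

Keying the registry by class makes the wrap phase a single dictionary lookup per plan node, which is essentially what the `ClassTag`-based matching achieves in Scala.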
The accelerator supports most common SQL operations across multiple categories:
| Category | Examples | Supported Types |
|---|---|---|
| Type Casting | GpuCast | Most primitives, decimals, strings (see limitations) |
| Math Operations | GpuAdd, GpuSubtract, GpuMultiply, GpuUnaryMath | Numeric types, with ANSI mode support |
| String Operations | GpuUpper, GpuSubstring, GpuRegExpReplace | String operations with regex transpilation |
| Aggregations | GpuSum, GpuAvg, GpuCount, GpuMin, GpuMax | Numeric types, with overflow detection |
| Joins | GpuBroadcastHashJoinExec, GpuShuffledHashJoinExec | All join types, AST condition support |
| File I/O | GpuParquetScan, GpuOrcScan, GpuCsvScan | Parquet, ORC, CSV, Avro formats |
| Collections | GpuArrayTransform, GpuMapKeys, GpuGetStructField | Arrays, maps, structs with lambdas |
Compatibility Notes:
- Timestamp operations require the session time zone to be UTC (`spark.sql.session.timeZone=UTC`)

Sources: docs/supported_ops.md:1-110, tools/generated_files/supportedExprs.csv:1-100, tools/generated_files/operatorsScore.csv:1-50
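One practical source of CPU/GPU behavior differences is floating-point aggregation: GPU reductions may accumulate values in a different order than Spark's CPU operators, and IEEE-754 addition is not associative. A standalone illustration (plain Python, no RAPIDS code) of why reordering alone can change a result:

```python
# IEEE-754 addition is not associative: summing the same values in a
# different order can change the result. This is why CPU and GPU
# aggregations over floats may differ slightly, and why the integration
# tests allow a floating-point tolerance.
values = [1e16, 1.0, -1e16]

left_to_right = (values[0] + values[1]) + values[2]  # 1.0 is absorbed by 1e16
reordered     = (values[0] + values[2]) + values[1]  # cancellation happens first

print(left_to_right, reordered)  # 0.0 1.0
```

The two orderings disagree by exactly `1.0` here, which is why exact equality between CPU and GPU float aggregates cannot be guaranteed in general.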
The RapidsConf class manages configuration through a builder pattern:
```
spark.rapids.sql.enabled=true                      # Master enable/disable
spark.rapids.sql.explain=NOT_ON_GPU                # Explain why operations did not run on the GPU
spark.rapids.sql.expression.Add=true               # Per-operation control
spark.rapids.sql.castFloatToIntegral.enabled=true  # Opt in to a potentially unsafe cast
spark.rapids.memory.gpu.allocFraction=0.9          # Fraction of GPU memory to allocate
```
Common configuration patterns:
- Per-operation enablement: `spark.rapids.sql.{expression|exec|input}.ClassName`
- `spark.rapids.sql.castFloatToIntegral.enabled` for potentially unsafe casts
- `spark.rapids.memory.gpu.*` for GPU memory allocation
- `spark.rapids.sql.incompatibleOps.enabled` for operations with behavior differences

See Configuration System for detailed parameter documentation.
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala:322-656, docs/configs.md:1-57
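Because the per-operation keys follow a mechanical naming scheme, checking whether an operation is enabled reduces to building a key and looking it up. A hedged sketch of that pattern (the helper functions below are illustrative, not the plugin's API, and the defaults are simplified: some incompatible operations default to disabled in the real plugin):

```python
def op_conf_key(kind: str, class_name: str) -> str:
    """Build a per-operation key such as spark.rapids.sql.expression.Add.
    `kind` is one of expression/exec/input, per the pattern above."""
    return f"spark.rapids.sql.{kind}.{class_name}"

def op_enabled(conf: dict, kind: str, class_name: str) -> bool:
    # Nothing runs on the GPU if the master switch is off.
    if conf.get("spark.rapids.sql.enabled", "true") == "false":
        return False
    # An explicit 'false' for one operation forces it back to the CPU.
    return conf.get(op_conf_key(kind, class_name), "true") != "false"

conf = {"spark.rapids.sql.enabled": "true",
        "spark.rapids.sql.expression.Add": "false"}

print(op_enabled(conf, "expression", "Add"))       # False: disabled per-op
print(op_enabled(conf, "expression", "Multiply"))  # True: default applies
```

This is also why `spark.rapids.sql.explain=NOT_ON_GPU` output can point directly at a config key: each fallback reason maps back to one of these switches.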
The TypeSig class represents sets of supported data types:
```scala
TypeSig.BOOLEAN + TypeSig.integral + TypeSig.fp     // Primitives
TypeSig.DECIMAL_128                                 // Decimal with precision <= 38
TypeSig.ARRAY.nested(TypeSig.STRING + TypeSig.INT)  // Array<String|Int>
TypeSig.STRUCT.nested()                             // Struct with any nested types
```
Type checking occurs in tagForGpu() via TypeChecks objects:
- `ExprChecks`: Validates expression parameter types
- `ExecChecks`: Validates executor input/output types
- `PartChecks`: Validates partitioning key types
- `ScanChecks`: Validates file format types

When types don't match, `willNotWorkOnGpu()` marks the operation for CPU fallback.
Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/TypeChecks.scala:95-260, docs/supported_ops.md:73-104
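The signature algebra can be modeled as sets of type names, with `+` as union and `nested()` constraining element types. A simplified sketch (the real `TypeSig` also tracks literal-only parameters, notes, and deeper nesting rules):

```python
class TypeSig:
    """Toy model of TypeSig: a set of supported type names, plus an
    optional set of allowed nested types for container types."""
    def __init__(self, types, nested_types=frozenset()):
        self.types = frozenset(types)
        self.nested_types = frozenset(nested_types)

    def __add__(self, other):
        # Mirrors TypeSig's `+` operator: union of supported types.
        return TypeSig(self.types | other.types,
                       self.nested_types | other.nested_types)

    def nested(self, inner):
        # Mirrors `nested(...)`: restrict which element types are allowed.
        return TypeSig(self.types, inner.types)

    def supports(self, type_name):
        return type_name in self.types

BOOLEAN  = TypeSig({"boolean"})
integral = TypeSig({"byte", "short", "int", "long"})
STRING   = TypeSig({"string"})
ARRAY    = TypeSig({"array"})

sig = BOOLEAN + integral                      # primitives this op accepts
array_sig = ARRAY.nested(STRING + integral)   # Array<String|integral>

# Mirrors willNotWorkOnGpu(): collect a reason and fall back to CPU.
reasons = []
if not sig.supports("string"):
    reasons.append("expression produces string, which this op does not support")
```

In the real plugin these reasons are attached to the `RapidsMeta` node and surface verbatim in `spark.rapids.sql.explain` output.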
For a simple query like SELECT sum(amount) FROM sales WHERE region = 'WEST':
1. Spark produces the physical plan `HashAggregateExec → FilterExec → FileSourceScanExec`
2. `SQLPlugin.columnar()` calls `GpuOverrides.apply()`
3. `wrapAndTagPlan()` creates:
   - `HashAggregateExecMeta` wrapping the aggregate
   - `FilterExecMeta` wrapping the filter
   - `FileSourceScanExecMeta` wrapping the scan
4. `tagForGpu()` validates each node against its `TypeChecks` and the active `RapidsConf` settings
5. `convertIfNeeded()` generates:
   - `GpuHashAggregateExec` for aggregation
   - `GpuFilterExec` for filtering
   - `GpuBatchScanExec` for file reading
6. `GpuTransitionOverrides` inserts any needed `HostColumnarToGpu` transitions
7. The query executes with `GpuExec` operators calling cuDF APIs

Sources: sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:528-750
The accelerator includes comprehensive testing:
| Test Type | Location | Purpose |
|---|---|---|
| Integration Tests | integration_tests/ | Python pytest comparing CPU vs GPU results |
| Unit Tests | tests/src/test/scala/ | Scala tests for individual components |
| Data Generation | data_gen.py | Deterministic test data with seeds |
| Assertion Utilities | asserts.py | CPU/GPU result comparison with tolerance |
| Compatibility Docs | docs/supported_ops.md | Known behavioral differences |
Integration tests use markers:
- `@approximate_float`: Allows floating-point tolerance
- `@allow_non_gpu`: Permits partial CPU fallback
- `@ignore_order`: Handles non-deterministic result ordering

Sources: integration_tests/src/main/python/hash_aggregate_test.py:1-50, integration_tests/src/main/python/data_gen.py:1-100, integration_tests/README.md:1-50
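The core of the harness is "run the same query on CPU and GPU, collect both result sets, and compare them", with `@approximate_float` relaxing equality for floats. A self-contained sketch of that comparison step (the function name and signature below are illustrative, not the actual helpers in asserts.py):

```python
import math

def assert_results_equal(cpu_rows, gpu_rows, approximate_float=False, rel_tol=1e-9):
    """Compare two collected result sets row by row, mirroring the idea
    behind the integration tests' CPU-vs-GPU assertions."""
    assert len(cpu_rows) == len(gpu_rows), "row count mismatch"
    for cpu_row, gpu_row in zip(cpu_rows, gpu_rows):
        for a, b in zip(cpu_row, gpu_row):
            if approximate_float and isinstance(a, float):
                # Tolerate small float differences (e.g. from reduction order).
                assert math.isclose(a, b, rel_tol=rel_tol), f"{a} != {b}"
            else:
                assert a == b, f"{a} != {b}"

cpu = [("WEST", 12.500000000001)]
gpu = [("WEST", 12.5)]
assert_results_equal(cpu, gpu, approximate_float=True, rel_tol=1e-6)
```

`@ignore_order` corresponds to sorting both result sets before this comparison, since many GPU operators make no ordering guarantees.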
The plugin uses a multi-version build system supporting Spark 3.2.x through 4.0.x:
- Version-specific source lives in `src/main/spark{330,340,350}/` directories
- Builds pass `-Dbuildver=330` (or similar) to select the target Spark version
- Version-independent code is shared through the `spark-shared/` package

Distribution artifacts:

- `rapids-4-spark_2.12-VERSION.jar` (Scala 2.12)
- `rapids-4-spark_2.13-VERSION.jar` (Scala 2.13)
- Maven coordinates: `com.nvidia:rapids-4-spark_2.12:VERSION`

Sources: docs/download.md:1-112, jenkins/printJarVersion.sh:1-38
To use the accelerator:
**Install requirements**: a supported NVIDIA GPU with a compatible CUDA driver and a supported Apache Spark version (docs/download.md lists the exact hardware and software support matrix)
**Add the JAR to Spark**:

```shell
spark-submit --jars rapids-4-spark_2.12-VERSION.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true
```
**Configure for the environment**:

- Set `spark.rapids.memory.gpu.allocFraction` based on available GPU memory
- Enable `spark.rapids.sql.incompatibleOps.enabled=true` if needed
- Review `spark.rapids.sql.explain=ALL` output for optimization opportunities

**Monitor execution**:

- Use `explain()` to see physical plans
- Set `spark.rapids.sql.explain=NOT_ON_GPU` to identify CPU fallbacks

For detailed configuration, see Configuration System. For operation support details, see Supported Operations Matrix.
Sources: docs/download.md:1-112, docs/configs.md:1-57