sparklyr

R interface to Apache Spark ™

  • Interact with Spark using familiar R interfaces, such as dplyr, broom, and DBI.

  • Gain access to Spark’s distributed Machine Learning libraries, Structured Streaming, and ML Pipelines from R.

  • Extend your toolbox by adding XGBoost, MLeap, H2O, and GraphFrames to your Spark plus R analysis.

  • Connect R wherever Spark runs: Databricks, Snowflake, YARN (Hadoop), Kubernetes, Standalone, and Spark Connect.

  • Run distributed R code inside Spark.

Get Started

Welcome new users! Start here to learn how to install and use sparklyr.

Guides

“How-to” articles that show you how to connect to AWS S3 buckets, handle streaming data, create ML Pipelines, and more.

Deployment

Articles on Spark environments, including AWS EMR, Databricks, and Standalone clusters.

Source Code
---
format:
  html:
    toc: false
    page-layout: custom
---

::: whitebox
::: {style="padding-left: 100px; padding-right: 100px; display: inline-block;"}

::: {layout-ncol="2"}

::: {style="text-align: left;"}

# R interface to Apache Spark ™

- Interact with Spark using familiar R interfaces, such as [`dplyr`](/guides/dplyr.qmd),
`broom`, and [`DBI`](/get-started/prepare-data.qmd#using-sql).  

- Gain access to Spark's distributed [Machine Learning](/guides/mlib.qmd) libraries, [Structured Streaming](/guides/streaming.qmd), and [ML Pipelines](/guides/pipelines.qmd) from R.

- Extend your toolbox by adding [XGBoost](/packages/sparkxgb/index.md), [MLeap](/packages/mleap/index.md), [H2O](/guides/h2o.qmd), and [GraphFrames](/packages/graphframes/index.md) to your
Spark plus R analysis.

- Connect R wherever Spark runs: [Databricks, Snowflake, YARN (Hadoop),
Kubernetes, Standalone, and Spark Connect](/get-started/index.qmd#clusters).

- Run [distributed R code](/guides/distributed-r.qmd) inside Spark.

:::

::: {style="text-align: center;"}
![](/images/homepage/sparklyr-diagram.svg){style="align: right;" width="500"}
:::

:::
:::
:::

::: mainbox
::: {style="padding-left: 100px; padding-right: 100px; display: inline-block;"}
::: {layout-ncol="3"}
::: {style="text-align: center;"}
### [Get Started](/get-started/index.qmd)

**Welcome new users!** Start here to learn how to install and use `sparklyr`.
:::

::: {style="text-align: center;"}
### [Guides](/guides/index.qmd)

"How-to" articles that show you how to connect to AWS S3 buckets, handle streaming data, create ML Pipelines, and more.
:::

::: {style="text-align: center;"}
### [Deployment](/deployment/index.qmd)

Articles on Spark environments, including AWS EMR, Databricks, and Standalone clusters.\
\
:::
:::
:::
:::