Dongjoon Hyun authored
## What changes were proposed in this pull request?

We can build Python API docs with `cd ./python/docs && make html` and R API docs with `cd ./R && sh create-docs.sh` separately. However, `jekyll` fails in some environments.

This PR adds `SKIP_PYTHONDOC` and `SKIP_RDOC` options for the documentation build in the `docs` folder. Currently, only `SKIP_SCALADOC` and `SKIP_API` are available. The additional options are needed because the Spark documentation build uses a number of tools to build HTML docs and API docs in Scala, Python, and R. Specifically, for Python and R:

- Python API docs require `sphinx`.
- R API docs require an `R` installation and `knitr` (plus several other libraries).

In other words, the `jekyll` build currently cannot generate Python API docs without an R installation, nor R API docs without a Python `sphinx` installation. Providing `SKIP_PYTHONDOC` and `SKIP_RDOC` alongside `SKIP_SCALADOC` makes the build more convenient.
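
The effect of these flags can be sketched in shell. This is an illustration only, not the code in this patch: the function name `api_docs_to_build` is hypothetical, and the sketch assumes that `SKIP_API=1` skips every API doc set and that a flag takes effect only when set to `1`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): print the API doc sets that a
# docs build would attempt, given the SKIP_* environment variables.
api_docs_to_build() {
  # Assumption: SKIP_API=1 disables all API doc builds at once.
  if [ "${SKIP_API:-0}" = "1" ]; then
    return 0
  fi
  # Each remaining flag disables exactly one doc set.
  if [ "${SKIP_SCALADOC:-0}" != "1" ]; then echo "scala/java"; fi
  if [ "${SKIP_PYTHONDOC:-0}" != "1" ]; then echo "python"; fi
  if [ "${SKIP_RDOC:-0}" != "1" ]; then echo "R"; fi
  return 0
}
```

Under these assumptions, calling `SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 api_docs_to_build` reports only the R docs, so a machine without `sphinx` would need just the R toolchain.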

## How was this patch tested?

Manual.

**Skipping Scala/Java/Python API Doc Build**
```bash
$ cd docs
$ SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 jekyll build
$ ls api
DESCRIPTION R
```

**Skipping Scala/Java/R API Doc Build**
```bash
$ cd docs
$ SKIP_SCALADOC=1 SKIP_RDOC=1 jekyll build
$ ls api
python
```

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #16336 from dongjoon-hyun/SPARK-18923.
ba4468bb
_data
_includes
_layouts
_plugins
css
img
js
README.md
_config.yml
api.md
building-spark.md
cluster-overview.md
configuration.md
contributing-to-spark.md
ec2-scripts.md
graphx-programming-guide.md
hadoop-provided.md
hardware-provisioning.md
index.md
java-programming-guide.md
job-scheduling.md
ml-advanced.md
ml-ann.md
ml-classification-regression.md
ml-clustering.md
ml-collaborative-filtering.md
ml-decision-tree.md
ml-ensembles.md
ml-features.md
ml-guide.md
ml-linear-methods.md
ml-migration-guides.md
ml-pipeline.md
ml-survival-regression.md
ml-tuning.md
mllib-classification-regression.md
mllib-clustering.md
mllib-collaborative-filtering.md
mllib-data-types.md
mllib-decision-tree.md
mllib-dimensionality-reduction.md
mllib-ensembles.md
mllib-evaluation-metrics.md
mllib-feature-extraction.md
mllib-frequent-pattern-mining.md
mllib-guide.md
mllib-isotonic-regression.md
mllib-linear-methods.md
mllib-migration-guides.md
mllib-naive-bayes.md
mllib-optimization.md
mllib-pmml-model-export.md
mllib-statistics.md
monitoring.md
programming-guide.md
python-programming-guide.md
quick-start.md
running-on-mesos.md
running-on-yarn.md
scala-programming-guide.md
security.md
spark-standalone.md
sparkr.md
sql-programming-guide.md
storage-openstack-swift.md
streaming-custom-receivers.md
streaming-flume-integration.md
streaming-kafka-0-10-integration.md
streaming-kafka-0-8-integration.md
streaming-kafka-integration.md
streaming-kinesis-integration.md
streaming-programming-guide.md
structured-streaming-kafka-integration.md
structured-streaming-programming-guide.md
submitting-applications.md
tuning.md