Skip to content
Snippets Groups Projects
user avatar
Bijay Pathak authored
[SPARK-14761][SQL] Reject invalid join methods when join columns are not specified in PySpark DataFrame join.

## What changes were proposed in this pull request?

In PySpark, the invalid join type will not throw error for the following join:
```df1.join(df2, how='not-a-valid-join-type')```

The signature of the join is:
```def join(self, other, on=None, how=None):```
The existing code completely ignores the `how` parameter when `on` is `None`. This patch will process the arguments passed to join and pass in to JVM Spark SQL Analyzer, which will validate the join type passed.

## How was this patch tested?
Used manual and existing test suites.

Author: Bijay Pathak <bkpathak@mtu.edu>

Closes #15409 from bkpathak/SPARK-14761.
8880fd13
History
Name Last commit Last update