- Jun 02, 2017
-
Wenchen Fan authored
## What changes were proposed in this pull request? REPL module depends on SQL module, so we should run REPL tests if SQL module has code changes. ## How was this patch tested? N/A Author: Wenchen Fan <wenchen@databricks.com> Closes #18191 from cloud-fan/test.
-
Zhenhua Wang authored
## What changes were proposed in this pull request? Usually when using the explain cost command, users want to see the stats of the plan. Since stats are only shown in the optimized plan, it is more direct and convenient to include only the optimized plan and the physical plan in the output. ## How was this patch tested? Enhanced existing test. Author: Zhenhua Wang <wzh_zju@163.com> Closes #18190 from wzhfy/simplifyExplainCost.
-
Xiao Li authored
[MINOR][SQL] Update the description of spark.sql.files.ignoreCorruptFiles and spark.sql.columnNameOfCorruptRecord ### What changes were proposed in this pull request? 1. The description of `spark.sql.files.ignoreCorruptFiles` is not accurate. When the file does not exist, we will issue the error message. ``` org.apache.spark.sql.AnalysisException: Path does not exist: file:/nonexist/path; ``` 2. `spark.sql.columnNameOfCorruptRecord` also affects the CSV format. The current description only mentions JSON format. ### How was this patch tested? N/A Author: Xiao Li <gatorsmile@gmail.com> Closes #18184 from gatorsmile/updateMessage.
-
Shixiong Zhu authored
## What changes were proposed in this pull request? In [this line](https://github.com/apache/spark/blob/f7cf2096fdecb8edab61c8973c07c6fc877ee32d/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L128), it uses the `executorId` string received from executors and finally it will go into `TaskUIData`. As deserializing the `executorId` string will always create a new instance, we have a lot of duplicated string instances. This PR interns strings for TaskUIData to reduce memory usage. ## How was this patch tested? Manually tested using `bin/spark-shell --master local-cluster[6,1,1024]`. Test code: ``` for (_ <- 1 to 10) { sc.makeRDD(1 to 1000, 1000).count() } Thread.sleep(2000) val l = sc.getClass.getMethod("jobProgressListener").invoke(sc).asInstanceOf[org.apache.spark.ui.jobs.JobProgressListener] org.apache.spark.util.SizeEstimator.estimate(l.stageIdToData) ``` This PR reduces the size of `stageIdToData` from 3487280 to 3009744 (86.3%) in the above case. Author: Shixiong Zhu <shixiong@databricks.com> Closes #18177 from zsxwing/SPARK-20955.
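A minimal sketch of the interning idea (illustrative only, not the actual `TaskUIData` code): every deserialized message yields a fresh `String`, and interning collapses equal ids to one shared instance.

```scala
object InternSketch {
  // Hypothetical helper mirroring the idea of interning executor ids.
  private def intern(s: String): String = if (s == null) null else s.intern()

  def main(args: Array[String]): Unit = {
    val a = new String("executor-1") // simulates an id freshly deserialized from the wire
    val b = new String("executor-1")
    println(a eq b)                  // false: two distinct heap objects with equal content
    println(intern(a) eq intern(b))  // true: both references now point at one shared copy
  }
}
```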
-
Wenchen Fan authored
## What changes were proposed in this pull request? The current conf setting logic is a little complex and has duplication, this PR simplifies it. ## How was this patch tested? existing tests. Author: Wenchen Fan <wenchen@databricks.com> Closes #18172 from cloud-fan/session.
-
Wenchen Fan authored
## What changes were proposed in this pull request? `SharedState.externalCatalog` is marked as a `lazy val`, but in practice it is not lazy: we access `externalCatalog` while initializing `SharedState`, which defeats the purpose of the `lazy val`. When creating `ExternalCatalog` we will try to connect to the metastore and may throw an error, so it makes sense to make it a real `lazy val` in `SharedState`. ## How was this patch tested? existing tests. Author: Wenchen Fan <wenchen@databricks.com> Closes #18187 from cloud-fan/minor.
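A small sketch of the pitfall (hypothetical class, not the real `SharedState`): a `lazy val` that the constructor touches is forced eagerly, so any metastore error would surface at construction time.

```scala
class SharedStateLike {
  lazy val externalCatalog: String = {
    println("connecting to metastore...") // expensive, may throw
    "externalCatalog"
  }
  // If initialization code here referenced `externalCatalog`, the lazy val
  // would be evaluated immediately, defeating its purpose.
}

object LazyValDemo extends App {
  val state = new SharedStateLike() // prints nothing: the catalog is not created yet
  state.externalCatalog             // "connecting to metastore..." printed on first access
}
```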
-
guoxiaolong authored
## What changes were proposed in this pull request? 1. The title style of the field is wrong (before/after and executor-page screenshots were attached to the original PR). 2. In the title text, 'the application' should be changed to 'this application'. 3. Code analysis: `$('#history-summary [data-toggle="tooltip"]').tooltip();` refers to the id 'history-summary', which does not exist; the page only contains the id 'history-summary-table'. ## How was this patch tested? manual tests Please review http://spark.apache.org/contributing.html before opening a pull request. Author: guoxiaolong <guo.xiaolong1@zte.com.cn> Author: 郭小龙 10207633 <guo.xiaolong1@zte.com.cn> Author: guoxiaolongzte <guo.xiaolong1@zte.com.cn> Closes #18170 from guoxiaolongzte/SPARK-20942.
-
hyukjinkwon authored
## What changes were proposed in this pull request? Currently, if we run `./python/run-tests.py` and they are aborted without cleaning up this directory, it fails the pep8 check due to some generated Python scripts. For example, https://github.com/apache/spark/blob/7387126f83dc0489eb1df734bfeba705709b7861/python/pyspark/tests.py#L1955-L1968 ``` PEP8 checks failed. ./work/app-20170531190857-0000/0/test.py:5:55: W292 no newline at end of file ./work/app-20170531190909-0000/0/test.py:5:55: W292 no newline at end of file ./work/app-20170531190924-0000/0/test.py:3:1: E302 expected 2 blank lines, found 1 ./work/app-20170531190924-0000/0/test.py:7:52: W292 no newline at end of file ./work/app-20170531191016-0000/0/test.py:5:55: W292 no newline at end of file ./work/app-20170531191030-0000/0/test.py:5:55: W292 no newline at end of file ./work/app-20170531191045-0000/0/test.py:3:1: E302 expected 2 blank lines, found 1 ./work/app-20170531191045-0000/0/test.py:7:52: W292 no newline at end of file ``` For me, it is sometimes a bit annoying. This PR proposes to exclude these (assuming we want to skip per https://github.com/apache/spark/blob/master/.gitignore#L73). Also, it moves the other pep8 configurations from the script into pep8's ini configuration file. ## How was this patch tested? Manually tested via `./dev/lint-python`. Author: hyukjinkwon <gurwls223@gmail.com> Closes #18161 from HyukjinKwon/work-exclude-pep8.
-
- Jun 01, 2017
-
Bogdan Raducanu authored
## What changes were proposed in this pull request? SQL hint syntax: * support expressions such as strings, numbers, etc. instead of only identifiers as it is currently. * support multiple hints, which was missing compared to the DataFrame syntax. DataFrame API: * support any parameters in DataFrame.hint instead of just strings ## How was this patch tested? Existing tests. New tests in PlanParserSuite. New suite DataFrameHintSuite. Author: Bogdan Raducanu <bogdan@databricks.com> Closes #18086 from bogdanrdc/SPARK-20854.
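A hedged usage sketch of the two sides of the change (table and column names are made up; the broadcast hint is used because it already exists):

```scala
import org.apache.spark.sql.SparkSession

object HintSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("hints").getOrCreate()
    import spark.implicits._

    val small = Seq((1, "a"), (2, "b")).toDF("id", "v")
    val large = Seq((1, 10), (2, 20), (3, 30)).toDF("id", "n")
    small.createOrReplaceTempView("small")
    large.createOrReplaceTempView("large")

    // SQL side: multiple hints in one hint comment are now parsed.
    spark.sql(
      "SELECT /*+ BROADCAST(small), BROADCAST(large) */ * " +
      "FROM large JOIN small ON large.id = small.id").show()

    // DataFrame side: Dataset.hint now has an Any* parameter list;
    // the no-argument broadcast hint is shown here.
    large.join(small.hint("broadcast"), "id").show()

    spark.stop()
  }
}
```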
-
Marcelo Vanzin authored
Blindly deserializing classes using Java serialization opens the code up to issues in other libraries, since just deserializing data from a stream may end up executing code (think readObject()). Since the launcher protocol is pretty self-contained, there's just a handful of classes it legitimately needs to deserialize, and they're in just two packages, so add a filter that throws errors if classes from any other package show up in the stream. This also maintains backwards compatibility (the updated launcher code can still communicate with the backend code in older Spark releases). Tested with new and existing unit tests. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #18166 from vanzin/SPARK-20922.
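A generic illustration of the filtering technique (not the launcher's actual class): override `resolveClass` so only whitelisted packages can be deserialized.

```scala
import java.io.{InputStream, InvalidClassException, ObjectInputStream, ObjectStreamClass}

class FilteredObjectInputStream(in: InputStream, allowedPackages: Seq[String])
  extends ObjectInputStream(in) {

  override protected def resolveClass(desc: ObjectStreamClass): Class[_] = {
    val name = desc.getName
    val allowed = allowedPackages.exists(p => name.startsWith(p)) ||
      name.startsWith("java.lang.") // boxed primitives etc. are commonly needed
    if (!allowed) {
      throw new InvalidClassException(name, "class not allowed in launcher protocol stream")
    }
    super.resolveClass(desc)
  }
}
```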
-
Li Yichao authored
In Spark on YARN, when configuring "spark.yarn.jars" with local jars (jars using the "local" scheme), we get an inaccurate classpath for the AM and containers. This is because we don't remove the "local" scheme when concatenating the classpath. It still runs, because the classpath is separated with ":" and Java treats "local" as a separate jar, but we can improve it by removing the scheme. Updated `ClientSuite` to check "local" is not in the classpath. cc jerryshao Author: Li Yichao <lyc@zhihu.com> Author: Li Yichao <liyichao.good@gmail.com> Closes #18129 from liyichao/SPARK-20365.
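A sketch of the scheme-stripping idea (method and object names are illustrative, not the actual `Client` code):

```scala
import java.net.URI

object YarnClasspathSketch {
  // For a "local" jar, only the path should be appended to the classpath.
  def classpathEntry(jar: String): String = {
    val uri = new URI(jar)
    if (uri.getScheme == "local") uri.getPath else jar
  }

  def main(args: Array[String]): Unit = {
    println(classpathEntry("local:/opt/libs/dep.jar")) // /opt/libs/dep.jar
    println(classpathEntry("hdfs:///libs/dep.jar"))    // unchanged
  }
}
```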
-
Xiao Li authored
### What changes were proposed in this pull request? Before this PR, Subquery reuse does not work. Below are three issues: - Subquery reuse does not work. - It is sharing the same `SQLConf` (`spark.sql.exchange.reuse`) with the one for Exchange Reuse. - No test case covers the rule Subquery reuse. This PR is to fix the above three issues. - Ignored the physical operator `SubqueryExec` when comparing two plans. - Added a dedicated conf `spark.sql.subqueries.reuse` for controlling Subquery Reuse - Added a test case for verifying the behavior ### How was this patch tested? N/A Author: Xiao Li <gatorsmile@gmail.com> Closes #18169 from gatorsmile/subqueryReuse.
-
John Compitello authored
## What changes were proposed in this pull request? - ~~I added the method `toBlockMatrixDense` to the IndexedRowMatrix class. The current implementation of `toBlockMatrix` is insufficient for users with relatively dense IndexedRowMatrix objects, since it assumes sparsity.~~ EDIT: Ended up deciding that there should be just a single `toBlockMatrix` method, which creates a BlockMatrix whose blocks may be dense or sparse depending on the sparsity of the rows. This method will work better on any current use case of `toBlockMatrix` and doesn't go through `CoordinateMatrix` like the old method. ## How was this patch tested? ~~I used the same tests already written for `toBlockMatrix()` to test this method. I also added a new additional unit test for an edge case that was not adequately tested by current test suite.~~ I ran the original `IndexedRowMatrix` tests, plus wrote more to better handle edge cases ignored by original tests. Author: John Compitello <johnc@broadinstitute.org> Closes #17459 from johnc1231/johnc-fix-ir-to-block.
-
Yuming Wang authored
## What changes were proposed in this pull request? Add built-in SQL function - UUID. ## How was this patch tested? unit tests Author: Yuming Wang <wgyumg@gmail.com> Closes #18136 from wangyum/SPARK-20910.
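For reference, a hedged sketch of what such a function yields per row: a canonically formatted random UUID string, equivalent in spirit to `java.util.UUID`.

```scala
object UuidSketch extends App {
  // e.g. 46707d92-02f4-4817-8116-a4c3b23e6266; SQL usage would be SELECT uuid()
  println(java.util.UUID.randomUUID().toString)
}
```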
-
Yuming Wang authored
## What changes were proposed in this pull request? Fix a few function description errors. ## How was this patch tested? manual tests Author: Yuming Wang <wgyumg@gmail.com> Closes #18157 from wangyum/DescIssues.
-
Dongjoon Hyun authored
## What changes were proposed in this pull request? Since [SPARK-9263](https://issues.apache.org/jira/browse/SPARK-9263), `resolveMavenCoordinates` ignores Spark and Spark's dependencies by using `addExclusionRules`. This PR aims to bring [addExclusionRules](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L956-L974) up-to-date so that it excludes these correctly, because it currently fails to exclude some components, like the following. **mllib (correct)** ``` $ bin/spark-shell --packages org.apache.spark:spark-mllib_2.11:2.1.1 ... --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 0 | 0 | 0 | 0 || 0 | 0 | --------------------------------------------------------------------- ``` **mllib-local (wrong)** ``` $ bin/spark-shell --packages org.apache.spark:spark-mllib-local_2.11:2.1.1 ... --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 15 | 2 | 2 | 0 || 15 | 2 | --------------------------------------------------------------------- ``` ## How was this patch tested? Pass Jenkins with an updated test case. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #17947 from dongjoon-hyun/SPARK-20708.
-
jerryshao authored
## What changes were proposed in this pull request? Hadoop FileSystem's statistics are based on thread-local variables; this is OK if the RDD computation chain runs in a single thread. But if a child RDD creates another thread to consume the iterator obtained from Hadoop RDDs, the bytesRead computation will be wrong, because the iterator's `next()` and `close()` may then run in different threads. This can happen when using PySpark with PythonRDD. So this patch builds a map to track the `bytesRead` per thread and adds them together. This approach applies to three RDDs, `HadoopRDD`, `NewHadoopRDD` and `FileScanRDD`. I assume `FileScanRDD` cannot be called directly, so I only fixed `HadoopRDD` and `NewHadoopRDD`. ## How was this patch tested? Unit test and local cluster verification. Author: jerryshao <sshao@hortonworks.com> Closes #17617 from jerryshao/SPARK-20244.
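A conceptual sketch of the per-thread tracking (names are made up, not the actual callback code):

```scala
import scala.collection.mutable

object BytesReadTracker {
  private val bytesReadByThread = mutable.HashMap.empty[Long, Long]

  // Called from whichever thread actually reads the input split.
  def add(bytes: Long): Unit = bytesReadByThread.synchronized {
    val tid = Thread.currentThread().getId
    bytesReadByThread(tid) = bytesReadByThread.getOrElse(tid, 0L) + bytes
  }

  // Correct total even when next() and close() run on different threads.
  def total: Long = bytesReadByThread.synchronized { bytesReadByThread.values.sum }
}
```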
-
- May 31, 2017
-
Shixiong Zhu authored
## What changes were proposed in this pull request? `IllegalAccessError` is a fatal error (a subclass of LinkageError) and its meaning is `Thrown if an application attempts to access or modify a field, or to call a method that it does not have access to`. Throwing a fatal error for AccumulatorV2 is not necessary and is pretty bad because it usually will just kill executors or SparkContext ([SPARK-20666](https://issues.apache.org/jira/browse/SPARK-20666) is an example of killing SparkContext due to `IllegalAccessError`). I think the correct type of exception in AccumulatorV2 should be `IllegalStateException`. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixiong@databricks.com> Closes #18168 from zsxwing/SPARK-20940.
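A toy illustration of the exception change (a hypothetical accumulator, not the real AccumulatorV2):

```scala
class SimpleAccumulator {
  private var registered = false
  private var sum = 0L

  def register(): Unit = { registered = true }

  def add(v: Long): Unit = {
    if (!registered) {
      // Before: throw new IllegalAccessError(...) -- a fatal LinkageError that can kill the executor.
      throw new IllegalStateException("Accumulator must be registered before use")
    }
    sum += v
  }
}
```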
-
Shixiong Zhu authored
[SPARK-20894][SS] Resolve the checkpoint location in driver and use the resolved path in state store ## What changes were proposed in this pull request? When the user runs a Structured Streaming query in a cluster, if the driver uses the local file system, StateStore running in executors will throw a file-not-found exception. However, the current error is not obvious. This PR makes StreamExecution resolve the path in driver and uses the full path including the scheme part (such as `hdfs:/`, `file:/`) in StateStore. Then if the above error happens, StateStore will throw an error with this full path which starts with `file:/`, and it makes this error obvious: the checkpoint location is on the local file system. One potential minor issue is that the user cannot use different default file system settings in driver and executors (e.g., use a public HDFS address in driver and a private HDFS address in executors) after this change. However, since the batch query also has this issue (See https://github.com/apache/spark/blob/4bb6a53ebd06de3de97139a2dbc7c85fc3aa3e66/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L402), it doesn't make things worse. ## How was this patch tested? The new added test. Author: Shixiong Zhu <shixiong@databricks.com> Closes #18149 from zsxwing/SPARK-20894.
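A sketch of the driver-side resolution step (an assumed helper, not the exact StreamExecution code):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object CheckpointPathSketch {
  // Qualify the user-supplied location so executors see an explicit scheme
  // such as file:/ or hdfs://, making misconfiguration obvious in errors.
  def resolve(checkpointLocation: String, hadoopConf: Configuration): String = {
    val path = new Path(checkpointLocation)
    val fs = path.getFileSystem(hadoopConf) // uses the driver's default file system
    fs.makeQualified(path).toUri.toString
  }
}
```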
-
gatorsmile authored
### What changes were proposed in this pull request? This PR does the following tasks: - Added since - Added the Python API - Added test cases ### How was this patch tested? Added test cases to both Scala and Python Author: gatorsmile <gatorsmile@gmail.com> Closes #18147 from gatorsmile/createOrReplaceGlobalTempView.
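A brief Scala usage sketch of the API this PR documents and exposes to Python (data is made up):

```scala
import org.apache.spark.sql.SparkSession

object GlobalTempViewSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("gtv").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "v")
    df.createOrReplaceGlobalTempView("people")

    // Global temp views live in the reserved global_temp database and are
    // visible to other sessions of the same application.
    spark.newSession().sql("SELECT * FROM global_temp.people").show()

    spark.stop()
  }
}
```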
-
Liu Shaohui authored
## What changes were proposed in this pull request? Explicitly handle the FetchFailedException in FileFormatWriter, so it does not get wrapped. Note that this is no longer strictly necessary after SPARK-19276, but it improves error messages and also will help avoid others stumbling across this in the future. ## How was this patch tested? Existing unit tests. Closes https://github.com/apache/spark/pull/17893 Author: Liu Shaohui <liushaohui@xiaomi.com> Closes #18145 from squito/SPARK-20633.
-
jinxing authored
## What changes were proposed in this pull request? ShuffleId is determined before the job is submitted, but it's hard to predict stageId from shuffleId. Stages are created in DAGScheduler( https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L381), but the creation order is not deterministic because they are kept in a `HashSet`. I added a log (println(s"Creating ShufflMapStage-$id on shuffle-${shuffleDep.shuffleId}")) after (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L331) when testing BasicSchedulerIntegrationSuite: "multi-stage job". It will print: Creating ShufflMapStage-0 on shuffle-0 Creating ShufflMapStage-1 on shuffle-2 Creating ShufflMapStage-2 on shuffle-1 Creating ShufflMapStage-3 on shuffle-3 or Creating ShufflMapStage-0 on shuffle-1 Creating ShufflMapStage-1 on shuffle-3 Creating ShufflMapStage-2 on shuffle-0 Creating ShufflMapStage-3 on shuffle-2 It might be better to avoid generating the MapStatus by stageId. Author: jinxing <jinxing6042@126.com> Closes #17603 from jinxing64/SPARK-20288.
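A minimal demonstration of the ordering nondeterminism mentioned above (not DAGScheduler code):

```scala
import scala.collection.mutable

object OrderingSketch extends App {
  val hashed = mutable.HashSet.empty[String]
  val linked = mutable.LinkedHashSet.empty[String]
  Seq("shuffle-0", "shuffle-1", "shuffle-2", "shuffle-3").foreach { s =>
    hashed += s
    linked += s
  }
  println(hashed.toList) // order depends on hashing, not on insertion
  println(linked.toList) // List(shuffle-0, shuffle-1, shuffle-2, shuffle-3)
}
```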
-
David Eis authored
## What changes were proposed in this pull request? Revert the handling of negative values in ALS with implicit feedback, so that the confidence is the absolute value of the rating and the preference is 0 for negative ratings. This was the original behavior. ## How was this patch tested? This patch was tested with the existing unit tests and an added unit test to ensure that negative ratings are not ignored. mengxr Author: David Eis <deis@bloomberg.net> Closes #18022 from davideis/bugfix/negative-rating.
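A sketch of the restored mapping, following the standard implicit-feedback ALS formulation (alpha is the confidence scale; this is not the actual solver code):

```scala
object ImplicitAlsSketch extends App {
  def confidenceAndPreference(rating: Double, alpha: Double): (Double, Double) = {
    val confidence = 1.0 + alpha * math.abs(rating) // confidence grows with |rating|
    val preference = if (rating > 0) 1.0 else 0.0   // negative ratings map to preference 0
    (confidence, preference)
  }

  println(confidenceAndPreference(4.0, 1.0))  // (5.0, 1.0)
  println(confidenceAndPreference(-4.0, 1.0)) // (5.0, 0.0): not ignored, just not preferred
}
```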
-
Jacek Laskowski authored
## What changes were proposed in this pull request? Minor changes to scaladoc ## How was this patch tested? Local build Author: Jacek Laskowski <jacek@japila.pl> Closes #18074 from jaceklaskowski/scaladoc-fixes.
-
Felix Cheung authored
## What changes were proposed in this pull request? to investigate how long they run ## How was this patch tested? Jenkins, AppVeyor Author: Felix Cheung <felixcheung_m@hotmail.com> Closes #18104 from felixcheung/rtimetest.
-
- May 30, 2017
-
Wenchen Fan authored
This reverts commit 8ce0d8ff.
-
jerryshao authored
## What changes were proposed in this pull request? Current HistoryServer will display completed date of in-progress application as `1969-12-31 23:59:59`, which is not so meaningful. Instead of unnecessarily showing this incorrect completed date, here propose to make this column invisible for in-progress applications. The purpose of only making this column invisible rather than deleting this field is that: this data is fetched through REST API, and in the REST API the format is like below shows, in which `endTime` matches `endTimeEpoch`. So instead of changing REST API to break backward compatibility, here choosing a simple solution to only make this column invisible. ``` [ { "id" : "local-1491805439678", "name" : "Spark shell", "attempts" : [ { "startTime" : "2017-04-10T06:23:57.574GMT", "endTime" : "1969-12-31T23:59:59.999GMT", "lastUpdated" : "2017-04-10T06:23:57.574GMT", "duration" : 0, "sparkUser" : "", "completed" : false, "startTimeEpoch" : 1491805437574, "endTimeEpoch" : -1, "lastUpdatedEpoch" : 1491805437574 } ] } ]% ``` Here is UI before changed: <img width="1317" alt="screen shot 2017-04-10 at 3 45 57 pm" src="https://cloud.githubusercontent.com/assets/850797/24851938/17d46cc0-1e08-11e7-84c7-90120e171b41.png"> And after: <img width="1281" alt="screen shot 2017-04-10 at 4 02 35 pm" src="https://cloud.githubusercontent.com/assets/850797/24851945/1fe9da58-1e08-11e7-8d0d-9262324f9074.png"> ## How was this patch tested? Manual verification. Author: jerryshao <sshao@hortonworks.com> Closes #17588 from jerryshao/SPARK-20275.
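A simplified sketch of the rendering rule (attempt fields are reduced to the ones that matter here):

```scala
case class Attempt(endTime: String, endTimeEpoch: Long, completed: Boolean)

object CompletedColumnSketch extends App {
  // Show a completed date only for finished attempts; otherwise leave the cell empty,
  // hiding the meaningless 1969-12-31 sentinel that comes from endTimeEpoch = -1.
  def completedDateCell(a: Attempt): Option[String] =
    if (a.completed && a.endTimeEpoch >= 0) Some(a.endTime) else None

  println(completedDateCell(Attempt("1969-12-31T23:59:59.999GMT", -1L, completed = false))) // None
  println(completedDateCell(Attempt("2017-04-10T06:30:00.000GMT", 1491805800000L, completed = true)))
}
```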
-
Wenchen Fan authored
## What changes were proposed in this pull request? Currently the `DataFrameWriter` operations have several problems: 1. non-file-format data source writing action doesn't show up in the SQL tab in Spark UI 2. file-format data source writing action shows a scan node in the SQL tab, without saying anything about writing. (streaming also have this issue, but not fixed in this PR) 3. Spark SQL CLI actions don't show up in the SQL tab. This PR fixes all of them, by refactoring the `ExecuteCommandExec` to make it have children. close https://github.com/apache/spark/pull/17540 ## How was this patch tested? existing tests. Also test the UI manually. For a simple command: `Seq(1 -> "a").toDF("i", "j").write.parquet("/tmp/qwe")` before this PR: <img width="266" alt="qq20170523-035840 2x" src="https://cloud.githubusercontent.com/assets/3182036/26326050/24e18ba2-3f6c-11e7-8817-6dd275bf6ac5.png"> after this PR: <img width="287" alt="qq20170523-035708 2x" src="https://cloud.githubusercontent.com/assets/3182036/26326054/2ad7f460-3f6c-11e7-8053-d68325beb28f.png"> Author: Wenchen Fan <wenchen@databricks.com> Closes #18064 from cloud-fan/execution.
-
Tathagata Das authored
## What changes were proposed in this pull request? A bunch of changes to the StateStore APIs and implementation. The current state store API has a bunch of problems that cause too many transient objects, creating memory pressure. - `StateStore.get(): Option` forces creation of Some/None objects for every get. Changed this to return the row or null. - `StateStore.iterator(): (UnsafeRow, UnsafeRow)` forces creation of a new tuple for each record returned. Changed this to return a UnsafeRowTuple which can be reused across records. - `StateStore.updates()` requires the implementation to keep track of updates, while this is used minimally (only by Append mode in streaming aggregations). Removed updates() and updated StateStoreSaveExec accordingly. - `StateStore.filter(condition)` and `StateStore.remove(condition)` have been merged into a single API `getRange(start, end)` which allows a state store to do optimized range queries (i.e. avoid full scans). Stateful operators have been updated accordingly. - Removed a lot of unnecessary row copies. Each operator copied rows before calling StateStore.put() even if the implementation did not require the row to be copied. It is now left up to the implementation whether to copy the row or not. Additionally, - Added a name to the StateStoreId so that each operator+partition can use multiple state stores (different names) - Added a configuration that allows the user to specify which implementation to use. - Added new metrics to understand the time taken to update keys, remove keys and commit all changes to the state store. These metrics will be visible on the plan diagram in the SQL tab of the UI. - Refactored unit tests such that they can be reused to test any implementation of StateStore. ## How was this patch tested? Old and new unit tests Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #18107 from tdas/SPARK-20376.
-
Xiao Li authored
### What changes were proposed in this pull request? We are unable to call the function registered in the not-current database. ```Scala sql("CREATE DATABASE dAtABaSe1") sql(s"CREATE FUNCTION dAtABaSe1.test_avg AS '${classOf[GenericUDAFAverage].getName}'") sql("SELECT dAtABaSe1.test_avg(1)") ``` The above code returns an error: ``` Undefined function: 'dAtABaSe1.test_avg'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 7 ``` This PR is to fix the above issue. ### How was this patch tested? Added test cases. Author: Xiao Li <gatorsmile@gmail.com> Closes #18146 from gatorsmile/qualifiedFunction.
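An illustrative sketch of honoring a database qualifier in a function reference (`FunctionName` here is a made-up stand-in type, not Spark's own identifier class):

```scala
case class FunctionName(database: Option[String], funcName: String)

object QualifiedFunctionSketch extends App {
  def parse(raw: String): FunctionName = raw.split('.') match {
    case Array(db, fn) => FunctionName(Some(db), fn) // look up fn in db, not in the current database
    case Array(fn)     => FunctionName(None, fn)     // fall back to the current database / registry
    case _             => throw new IllegalArgumentException(s"Unsupported function name: $raw")
  }

  println(parse("dAtABaSe1.test_avg")) // FunctionName(Some(dAtABaSe1),test_avg)
  println(parse("test_avg"))           // FunctionName(None,test_avg)
}
```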
-
Josh Rosen authored
-
jinxing authored
## What changes were proposed in this pull request? Fix test "don't submit stage until its dependencies map outputs are registered (SPARK-5259)" , "run trivial shuffle with out-of-band executor failure and retry", "reduce tasks should be placed locally with map output" in DAGSchedulerSuite. Author: jinxing <jinxing6042@126.com> Closes #17634 from jinxing64/SPARK-20333.
-
Arman authored
## What changes were proposed in this pull request? Added the createOrReplaceGlobalTempView method for dataset Author: Arman <arman.yazdani.10@gmail.com> Closes #16598 from arman1371/patch-1.
-
actuaryzhang authored
## What changes were proposed in this pull request? PySpark supports stringIndexerOrderType in RFormula as in #17967. ## How was this patch tested? docstring test Author: actuaryzhang <actuaryzhang10@gmail.com> Closes #18122 from actuaryzhang/PythonRFormula.
-
Liang-Chi Hsieh authored
## What changes were proposed in this pull request? We changed the parser to reject unaliased subqueries in the FROM clause in SPARK-20690. However, the error message that we now give isn't very helpful: scala> sql("""SELECT x FROM (SELECT 1 AS x)""") org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'FROM' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER', 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 1, pos 9) We should modify the parser to throw a more clear error for such queries: scala> sql("""SELECT x FROM (SELECT 1 AS x)""") org.apache.spark.sql.catalyst.parser.ParseException: The unaliased subqueries in the FROM clause are not supported.(line 1, pos 14) ## How was this patch tested? Modified existing tests to reflect this change. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes #18141 from viirya/SPARK-20916.
-
Yuming Wang authored
## What changes were proposed in this pull request? Fix some indent issues. ## How was this patch tested? existing tests. Author: Yuming Wang <wgyumg@gmail.com> Closes #18133 from wangyum/IndentIssues.
-
Yuming Wang authored
## What changes were proposed in this pull request? Add built-in SQL function - DAYOFWEEK. ## How was this patch tested? unit tests Author: Yuming Wang <wgyumg@gmail.com> Closes #18134 from wangyum/SPARK-20909.
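For orientation, a hedged sketch of the usual DAYOFWEEK numbering (1 = Sunday … 7 = Saturday) contrasted with java.time's ISO numbering:

```scala
import java.time.LocalDate

object DayOfWeekSketch extends App {
  val d = LocalDate.parse("2017-05-30") // a Tuesday
  val isoDow = d.getDayOfWeek.getValue  // 2 (ISO: Monday = 1 ... Sunday = 7)
  val sqlDow = (isoDow % 7) + 1         // 3 (SQL-style: Sunday = 1 ... Saturday = 7)
  println(s"iso=$isoDow sql=$sqlDow")
}
```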
-
- May 29, 2017
-
Prashant Sharma authored
## What changes were proposed in this pull request? In summary, the cost of recreating a KafkaProducer for every batch write is high, as it starts a lot of threads, makes connections, and then closes them. A KafkaProducer instance is promised to be thread safe in the Kafka docs, and reuse of a KafkaProducer instance while writing via multiple threads is encouraged. Furthermore, I measured a 10x latency improvement with this patch. (Charts of addBatch times in ms, with and without the patch, were attached to the original PR.) ## How was this patch tested? Running distributed benchmarks comparing runs with this patch and without it. Added relevant unit tests. Author: Prashant Sharma <prashsh1@in.ibm.com> Closes #17308 from ScrapCodes/cached-kafka-producer.
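A hedged sketch of the caching idea (not the actual CachedKafkaProducer class): keep one producer per distinct configuration instead of creating and closing one per batch, which is safe to share because KafkaProducer is documented as thread safe.

```scala
import java.util.Properties

import scala.collection.mutable

import org.apache.kafka.clients.producer.KafkaProducer

object ProducerCacheSketch {
  private val cache =
    mutable.HashMap.empty[Map[String, Object], KafkaProducer[Array[Byte], Array[Byte]]]

  // Returns a shared producer for the given config, creating it on first use.
  def getOrCreate(params: Map[String, Object]): KafkaProducer[Array[Byte], Array[Byte]] =
    cache.synchronized {
      cache.getOrElseUpdate(params, {
        val props = new Properties()
        params.foreach { case (k, v) => props.put(k, v) }
        new KafkaProducer[Array[Byte], Array[Byte]](props)
      })
    }
}
```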
-
Yuming Wang authored
## What changes were proposed in this pull request? Add additional function description for weekofyear. ## How was this patch tested? manual tests  Author: Yuming Wang <wgyumg@gmail.com> Closes #18132 from wangyum/SPARK-8184.
-