  1. Nov 25, 2015
    • [SPARK-11880][WINDOWS][SPARK SUBMIT] bin/load-spark-env.cmd loads... · 9f3e59a1
      wangt authored
      [SPARK-11880][WINDOWS][SPARK SUBMIT] bin/load-spark-env.cmd loads spark-env.cmd from wrong directory
      
      * On Windows, `bin/load-spark-env.cmd` tries to load `spark-env.cmd` from `%~dp0..\..\conf`, but `%~dp0` points to `bin`, so `conf` is only one level up.
      * Updated `bin/load-spark-env.cmd` to load `spark-env.cmd` from `%~dp0..\conf` instead of `%~dp0..\..\conf`.
      
      Author: wangt <wangtao.upc@gmail.com>
      
      Closes #9863 from toddwan/master.
    • [SPARK-10864][WEB UI] app name is hidden if window is resized · 83653ac5
      Alex Bozarth authored
      Currently the Web UI navbar has a minimum width of 1200px, so if a window is resized narrower than that, the app name goes off screen. The 1200px width seems to have been chosen because it fits the longest example app name without wrapping.
      
      To work with smaller window widths, I made the tabs wrap, since that looked better than wrapping the app name. This is a distinct change in how the navbar looks, and I'm not sure it's what we actually want to do.
      
      Other notes:
      - min-width set to 600px to keep the tabs from wrapping individually (will need to be adjusted if tabs are added)
      - app name will also wrap (making three levels) if a really really long app name is used
      
      Author: Alex Bozarth <ajbozart@us.ibm.com>
      
      Closes #9874 from ajbozarth/spark10864.
    • [DOCUMENTATION] Fix minor doc error · 67b67320
      Jeff Zhang authored
      Author: Jeff Zhang <zjffdu@apache.org>
      
      Closes #9956 from zjffdu/dev_typo.
    • [MINOR] Remove unnecessary spaces in `include_example.rb` · 0dee44a6
      Yu ISHIKAWA authored
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #9960 from yu-iskw/minor-remove-spaces.
    • [SPARK-11969] [SQL] [PYSPARK] visualization of SQL query for pyspark · dc1d324f
      Davies Liu authored
      Currently we do not have visualization for SQL queries run from Python; this PR fixes that.
      
      cc zsxwing
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #9949 from davies/pyspark_sql_ui.
    • [SPARK-11974][CORE] Not all the temp dirs had been deleted when the JVM exits · 6b781576
      Zhongshuai Pei authored
      The temp dirs were being deleted by removing entries from a collection while iterating over it, like this:
      
      ```
      scala> import scala.collection.mutable
      import scala.collection.mutable
      
      scala> val a = mutable.Set(1,2,3,4,7,0,8,98,9)
      a: scala.collection.mutable.Set[Int] = Set(0, 9, 1, 2, 3, 7, 4, 8, 98)
      
      scala> a.foreach(x => {a.remove(x) })
      
      scala> a.foreach(println(_))
      98
      ```
      
      You may not modify a collection while traversing or iterating over it; doing so cannot delete all elements of the collection.
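
      A minimal sketch of a safe alternative, assuming the goal is to drain or empty the set (illustrative, not the PR's exact change): iterate over a snapshot, or clear the collection directly.

      ```
      import scala.collection.mutable

      val a = mutable.Set(1, 2, 3, 4, 7, 0, 8, 98, 9)

      // Iterate over an immutable snapshot, so removals cannot disturb the
      // traversal:
      a.toList.foreach(x => a.remove(x))
      assert(a.isEmpty)

      // Or, when the goal is simply to empty the collection:
      a ++= Seq(1, 2, 3)
      a.clear()
      assert(a.isEmpty)
      ```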
      
      Author: Zhongshuai Pei <peizhongshuai@huawei.com>
      
      Closes #9951 from DoingDone9/Bug_RemainDir.
    • [SPARK-11984][SQL][PYTHON] Fix typos in doc for pivot for scala and python · faabdfa2
      felixcheung authored
      Author: felixcheung <felixcheung_m@hotmail.com>
      
      Closes #9967 from felixcheung/pypivotdoc.
    • [SPARK-11956][CORE] Fix a few bugs in network lib-based file transfer. · c1f85fc7
      Marcelo Vanzin authored
      - NettyRpcEnv::openStream() now correctly propagates errors to
        the read side of the pipe.
      - NettyStreamManager now throws if the file being transferred does
        not exist (see the sketch after this list).
      - The network library now correctly handles zero-sized streams.
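
      A minimal, self-contained sketch of that missing-file check, assuming a plain file lookup (names and shapes are illustrative, not the network library's actual API):

      ```
      import java.io.{File, FileInputStream, FileNotFoundException, InputStream}

      // Fail fast when the requested file is missing, instead of handing the
      // caller a stream that only breaks later on the read side.
      def openFileStream(file: File): InputStream = {
        if (!file.isFile) {
          throw new FileNotFoundException(s"File not found: ${file.getAbsolutePath}")
        }
        new FileInputStream(file)
      }
      ```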
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #9941 from vanzin/SPARK-11956.
    • [SPARK-10666][SPARK-6880][CORE] Use properties from ActiveJob associated with a Stage · 0a5aef75
      Mark Hamstra authored
      This issue was addressed in https://github.com/apache/spark/pull/5494, but the fix in that PR, while safe in the sense that it will prevent the SparkContext from shutting down, misses the actual bug.  The intent of `submitMissingTasks` should be understood as "submit the Tasks that are missing for the Stage, and run them as part of the ActiveJob identified by jobId".  Because of a long-standing bug, the `jobId` parameter was never being used.  Instead, we were trying to use the jobId with which the Stage was created -- which may no longer exist as an ActiveJob, hence the crash reported in SPARK-6880.
      
      The correct fix is to use the ActiveJob specified by the supplied jobId parameter, which is guaranteed to exist at the call sites of submitMissingTasks.
      
      This fix should be applied to all maintenance branches, since it has existed since 1.0.
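
      A minimal sketch of the idea, with simplified, hypothetical stand-ins for the DAGScheduler state (not the actual code):

      ```
      import scala.collection.mutable

      case class ActiveJob(jobId: Int, properties: java.util.Properties)

      // Hypothetical registry of live jobs, keyed by jobId.
      val jobIdToActiveJob = mutable.Map.empty[Int, ActiveJob]

      // Before the fix: properties were taken from the job that created the
      // stage, which may no longer be active. After the fix: use the supplied
      // jobId, which is guaranteed to identify a live ActiveJob at the call
      // sites of submitMissingTasks.
      def schedulingPropertiesFor(jobId: Int): java.util.Properties =
        jobIdToActiveJob(jobId).properties
      ```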
      
      kayousterhout pankajarora12
      
      Author: Mark Hamstra <markhamstra@gmail.com>
      Author: Imran Rashid <irashid@cloudera.com>
      
      Closes #6291 from markhamstra/SPARK-6880.
    • [SPARK-11860][PYSPARK][DOCUMENTATION] Invalid argument specification … · b9b6fbe8
      Jeff Zhang authored
      …for registerFunction [Python]
      
      Straightforward change to the Python doc.
      
      Author: Jeff Zhang <zjffdu@apache.org>
      
      Closes #9901 from zjffdu/SPARK-11860.
    • [SPARK-11686][CORE] Issue WARN when dynamic allocation is disabled due to... · 63850026
      Ashwin Swaroop authored
      [SPARK-11686][CORE] Issue WARN when dynamic allocation is disabled due to spark.dynamicAllocation.enabled and spark.executor.instances both set
      
      Changed the log type to a 'warning' instead of 'info' as required.
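
      A minimal sketch of the check being described, assuming plain string config lookups (illustrative names; the real check lives in Spark core):

      ```
      // Warn when both settings are present: an explicit executor count
      // disables dynamic allocation even though it was requested.
      def warnIfDynamicAllocationDisabled(conf: Map[String, String],
                                          logWarning: String => Unit): Unit = {
        val dynamicRequested =
          conf.get("spark.dynamicAllocation.enabled").exists(_.toBoolean)
        if (dynamicRequested && conf.contains("spark.executor.instances")) {
          logWarning("Dynamic allocation is disabled because " +
            "spark.executor.instances is set; unset it to enable dynamic allocation.")
        }
      }
      ```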
      
      Author: Ashwin Swaroop <Ashwin Swaroop>
      
      Closes #9926 from ashwinswaroop/master.
    • [SPARK-11981][SQL] Move implementations of methods back to DataFrame from Queryable · a0f1a118
      Reynold Xin authored
      Also added show methods to Dataset.
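
      A hedged usage sketch of those show methods, assuming a Dataset `ds` is in scope:

      ```
      ds.show()   // prints the first 20 rows in tabular form
      ds.show(5)  // prints only the first 5 rows
      ```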
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #9964 from rxin/SPARK-11981.
    • [SPARK-11970][SQL] Adding JoinType into JoinWith and support Sample in Dataset API · 2610e061
      gatorsmile authored
      Besides inner join, the other join types may also be useful when users call the joinWith function. Thus, this adds a joinType parameter to the existing joinWith call in the Dataset API.
      
      It also provides another joinWith interface for cartesian-join-like functionality.
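
      A hedged usage sketch of the extended API, assuming a spark-shell with `sqlContext.implicits._` available (dataset contents and aliases are illustrative):

      ```
      import sqlContext.implicits._

      val ds1 = Seq(1, 1, 2).toDS().as("a")
      val ds2 = Seq(2, 3).toDS().as("b")

      // joinWith now accepts an explicit join type besides the default inner:
      val joined = ds1.joinWith(ds2, $"a.value" === $"b.value", "left_outer")
      ```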
      
      Please provide your opinions. marmbrus rxin cloud-fan Thank you!
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #9921 from gatorsmile/joinWith.
    • [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and... · 21698868
      Tathagata Das authored
      [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file
      
      This solves the following exception, caused when an empty state RDD is checkpointed and recovered. The root cause is that an empty OpenHashMapBasedStateMap cannot be deserialized because its initialCapacity is set to zero (a sketch of the fix's idea follows the trace below).
      ```
      Job aborted due to stage failure: Task 0 in stage 6.0 failed 1 times, most recent failure: Lost task 0.0 in stage 6.0 (TID 20, localhost): java.lang.IllegalArgumentException: requirement failed: Invalid initial capacity
      	at scala.Predef$.require(Predef.scala:233)
      	at org.apache.spark.streaming.util.OpenHashMapBasedStateMap.<init>(StateMap.scala:96)
      	at org.apache.spark.streaming.util.OpenHashMapBasedStateMap.<init>(StateMap.scala:86)
      	at org.apache.spark.streaming.util.OpenHashMapBasedStateMap.readObject(StateMap.scala:291)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
      	at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:181)
      	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
      	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
      	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
      	at scala.collection.AbstractIterator.to(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
      	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
      	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
      	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:921)
      	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:921)
      	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
      	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
      	at org.apache.spark.scheduler.Task.run(Task.scala:88)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      ```
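
      A minimal sketch of the failure mode and the style of fix, with simplified names (not the actual StateMap code):

      ```
      // The constructor demands a positive capacity, so recovering an empty
      // map that recorded a size of 0 used to fail:
      class MapSketch(initialCapacity: Int) {
        require(initialCapacity >= 1, "Invalid initial capacity")
      }

      // Clamp the recovered capacity so deserializing an empty map never
      // passes 0 to the constructor:
      def capacityForRecovery(savedSize: Int): Int = math.max(savedSize, 1)

      new MapSketch(capacityForRecovery(0))  // succeeds after the fix
      ```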
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #9958 from tdas/SPARK-11979.
  2. Nov 24, 2015
  3. Nov 23, 2015
    • Updated SQL programming guide to include JDBC fetch size · 026ea2ea
      Stephen Samuel authored
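
      A hedged example of the documented option, assuming a SQLContext and a reachable JDBC source (URL and table names are illustrative):

      ```
      // fetchsize controls how many rows the JDBC driver fetches per round
      // trip, which can matter a lot for drivers with low defaults.
      val df = sqlContext.read
        .format("jdbc")
        .option("url", "jdbc:postgresql://localhost/testdb")
        .option("dbtable", "public.people")
        .option("fetchsize", "100")
        .load()
      ```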
      Author: Stephen Samuel <sam@sksamuel.com>
      
      Closes #9377 from sksamuel/master.
    • [SPARK-10560][PYSPARK][MLLIB][DOCS] Make StreamingLogisticRegressionWithSGD... · 10574564
      Bryan Cutler authored
      [SPARK-10560][PYSPARK][MLLIB][DOCS] Make StreamingLogisticRegressionWithSGD Python API equal to Scala one
      
      This brings the API documentation of StreamingLogisticRegressionWithSGD and StreamingLinearRegressionWithSGD in line with the Scala versions.
      
      - Fixed the algorithm descriptions
      - Added default values to parameter descriptions
      - Changed StreamingLogisticRegressionWithSGD regParam to default to 0, as in the Scala version
      
      Author: Bryan Cutler <bjcutler@us.ibm.com>
      
      Closes #9141 from BryanCutler/StreamingLogisticRegressionWithSGD-python-api-sync.