    [SPARK-4562] [MLlib] speedup vector · b660de7a
    Davies Liu authored
    This PR changes the underlying array of DenseVector to numpy.ndarray to avoid the conversion, because most users will be using numpy.array.
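
A minimal sketch of the idea, not the MLlib code: `SketchDenseVector` is a hypothetical stand-in class, and the assumption is that the vector simply keeps a float64 `numpy.ndarray` as its backing store. When the caller already passes such an array, `np.asarray` returns it unchanged, so no conversion or copy happens.

```python
import numpy as np


class SketchDenseVector:
    """Hypothetical stand-in for DenseVector; not the actual MLlib class."""

    def __init__(self, values):
        # np.asarray returns the input object itself when it is already
        # a float64 ndarray; lists and other dtypes are converted once.
        self.array = np.asarray(values, dtype=np.float64)

    def toArray(self):
        # The backing store is already an ndarray, so nothing is converted.
        return self.array


data = np.array([1.0, 2.0, 3.0])      # float64 by default
vec = SketchDenseVector(data)
same_object = vec.array is data        # no copy was made

from_list = SketchDenseVector([1, 2])  # non-ndarray input is converted once
```

The trade-off in this sketch is that the vector and the caller's array share memory, so later in-place changes to `data` are visible through `vec`.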
    
    It also improves the serialization of DenseVector.
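
One plausible way to make serialization cheaper, shown here as an assumption rather than as what this PR necessarily does, is to ship the vector as the raw bytes of its float64 buffer instead of as a sequence of Python floats. The sketch below uses a hypothetical `SketchDenseVector` class and calls `__reduce__` by hand to show the round trip that `pickle` would perform.

```python
import numpy as np


class SketchDenseVector:
    """Hypothetical stand-in for DenseVector; not the actual MLlib class."""

    def __init__(self, values):
        if isinstance(values, bytes):
            # Rebuild from a raw float64 buffer (the result is read-only).
            values = np.frombuffer(values, dtype=np.float64)
        self.array = np.asarray(values, dtype=np.float64)

    def __reduce__(self):
        # pickle calls this hook; the state is one compact bytes object.
        return SketchDenseVector, (self.array.tobytes(),)


original = SketchDenseVector([1.0, 2.0, 3.0])

# Perform by hand the reconstruction pickle would do from __reduce__.
constructor, args = original.__reduce__()
restored = constructor(*args)

payload_size = len(args[0])  # 3 values * 8 bytes each
```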
    
    Before this change:
    
    trial | trainingTime | testTime
    ------|--------------|---------
    0     | 5.126        | 1.786
    1     | 2.698        | 1.693
    
    After the change:
    
    trial | trainingTime | testTime
    ------|--------------|---------
    0     | 4.692        | 0.554
    1     | 2.307        | 0.525
    
    This could partially fix the performance regression during test.
    
    Author: Davies Liu <davies@databricks.com>
    
    Closes #3420 from davies/ser2 and squashes the following commits:
    
    0e1e6f3 [Davies Liu] fix tests
    426f5db [Davies Liu] impove toArray()
    44707ec [Davies Liu] add name for ISO-8859-1
    fa7d791 [Davies Liu] address comments
    1cfb137 [Davies Liu] handle zero sparse vector
    2548ee2 [Davies Liu] fix tests
    9e6389d [Davies Liu] bugfix
    470f702 [Davies Liu] speed up DenseMatrix
    f0d3c40 [Davies Liu] speedup SparseVector
    ef6ce70 [Davies Liu] speed up dense vector