    f4f39981
    [SPARK-6827] [MLLIB] Wrap FPGrowthModel.freqItemsets and make it consistent with Java API · f4f39981
    Yanbo Liang authored
    Make PySpark ```FPGrowthModel.freqItemsets``` consistent with the Java/Scala API, like ```MatrixFactorizationModel.userFeatures```.
    It returns an RDD in which each tuple is composed of an array and a long value.
    I think it's difficult to implement namedtuples to wrap the output, because the items of freqItemsets can be of any type and arbitrary length, which makes the corresponding SerDe function tedious to implement.
    
    Author: Yanbo Liang <ybliang8@gmail.com>
    
    Closes #5614 from yanboliang/spark-6827 and squashes the following commits:
    
    da8c404 [Yanbo Liang] use namedtuple
    5532e78 [Yanbo Liang] Wrap FPGrowthModel.freqItemsets and make it consistent with Java API
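
The message describes each result as an array of items paired with a long count, and the later squashed commit mentions a namedtuple. A minimal sketch of that wrapping in plain Python follows. It is an illustration, not the contents of fpm.py: the `FreqItemset` name, its `items` and `freq` fields, and the `wrap_freq_itemsets` helper are assumptions, and a plain list stands in for the RDD.

```python
from collections import namedtuple

# Hypothetical record type for illustration; the name and field names are assumptions.
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])


def wrap_freq_itemsets(raw_pairs):
    """Convert (items, freq) pairs into FreqItemset records.

    Items may be of any type and any length, so they are kept as a plain list.
    """
    return [FreqItemset(list(items), int(freq)) for items, freq in raw_pairs]


# Stand-in for the (array, long) tuples described in the commit message.
raw = [(["a"], 4), (["a", "b"], 3), ([1, 2, 3], 2)]
wrapped = wrap_freq_itemsets(raw)

for fi in wrapped:
    # Fields are reachable by name as well as by position.
    print(fi.items, fi.freq)
```

On a real RDD, the same conversion could presumably be applied per element with `rdd.map(lambda pair: FreqItemset(*pair))`. Because a namedtuple is still a tuple, code that unpacks `(items, freq)` positionally keeps working.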
fpm.py 3.01 KiB