    [SPARK-10417] [SQL] Iterating through Column results in infinite loop · 6cd98c18
    0x0FFF authored
    The `pyspark.sql.column.Column` object has a `__getitem__` method, which makes it iterable as far as Python is concerned. The method exists so that, when a column holds a list or dict, you can access a specific element of it through the DataFrame API. Iterability is only a side effect, and because `__getitem__` never raises `IndexError`, a loop over a Column never terminates. This can confuse people getting familiar with Spark DataFrames, since a Pandas DataFrame, for instance, can be iterated this way.
    
    Issue reproduction:
    ```
    df = sqlContext.jsonRDD(sc.parallelize(['{"name": "El Magnifico"}']))
    for i in df["name"]: print i
    ```
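
The mechanism can be sketched without Spark. Python falls back to calling `__getitem__` with 0, 1, 2, ... when a class defines no `__iter__`, and only stops on `IndexError`. The classes `FakeColumn` and `GuardedColumn` and the helper `first_items` below are hypothetical stand-ins written for illustration, not PySpark code; the guard shown (an `__iter__` that raises `TypeError`) is one way to close the loophole, assumed here rather than quoted from the patch.

```python
class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.column.Column (no Spark needed)."""

    def __init__(self, expr):
        self.expr = expr

    def __getitem__(self, key):
        # Like Column, this never raises IndexError: every key simply
        # produces a new expression, so legacy iteration never stops.
        return FakeColumn("%s[%r]" % (self.expr, key))


class GuardedColumn(FakeColumn):
    """Same element access, but iteration is rejected explicitly."""

    def __iter__(self):
        # Defining __iter__ takes precedence over the __getitem__ fallback.
        raise TypeError("Column is not iterable")


def first_items(col, limit=3):
    """Iterate over col, stopping after `limit` items to avoid looping forever."""
    out = []
    for item in col:
        out.append(item.expr)
        if len(out) >= limit:
            break
    return out


if __name__ == "__main__":
    print(first_items(FakeColumn("name")))        # unbounded without the limit
    print(GuardedColumn("name")["first"].expr)    # element access still works
    try:
        first_items(GuardedColumn("name"))
    except TypeError as exc:
        print("rejected:", exc)
```

With the guard in place, `col[key]` keeps working for list and dict columns, while `for i in col` fails immediately instead of hanging.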
    
    Author: 0x0FFF <programmerag@gmail.com>
    
    Closes #8574 from 0x0FFF/SPARK-10417.