spooq.transformer.mapper_transformations.as_is

as_is(source_column: Optional[Union[str, Column]] = None, name: Optional[str] = None, **kwargs) → Union[partial, Column]

Returns a renamed column without any casting. This is especially useful when you need to keep a complex data type (e.g. array, list or struct).

Parameters
  • source_column (str or Column) – Input column. Can be a column name, a PySpark column, or a PySpark function.

  • name (str, default -> derived from input column) – Name of the output column. (.alias(name))

Keyword Arguments
  • alt_src_cols (str, default -> no coalescing, only source_column) – Coalesces source_column with the columns given in this parameter.

  • cast (T.DataType(), default -> no casting, same return data type as input data type) – Casts the output column to the provided data type (.cast(cast)).

Examples

>>> input_df = spark.createDataFrame([
...     Row(friends=[Row(first_name="Gianni", id=3993, last_name="Weber"),
...                  Row(first_name="Arielle", id=17484, last_name="Greaves")]),
... ])
>>>
>>> input_df.select(spq.as_is("friends.first_name")).show(truncate=False)
+-----------------+
|[Gianni, Arielle]|
+-----------------+
>>>
>>> mapping = [("my_friends", "friends", spq.as_is)]
>>> output_df = Mapper(mapping).transform(input_df)
>>> output_df.show(truncate=False)
+--------------------------------------------------+
|my_friends                                        |
+--------------------------------------------------+
|[[Gianni, 3993, Weber], [Arielle, 17484, Greaves]]|
+--------------------------------------------------+

Returns

This method returns a suitable type depending on how it was called. This ensures compatibility with Spooq’s Mapper transformer, with or without explicit parameters, as well as direct calls via select, withColumn, where, …

Return type

partial or Column
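
The "partial or Column" behaviour described under Returns can be pictured with a small, pure-Python sketch. This is not Spooq's implementation: `as_is_sketch` and `_build` are hypothetical names, and a plain string stands in for a PySpark Column so the snippet runs without Spark. It only assumes that a call with a source column yields the finished result, while a call without one yields a `functools.partial` to be completed later with `source_column` and `name`.

```python
from functools import partial


def as_is_sketch(source_column=None, name=None, **kwargs):
    """Hypothetical stand-in for the dual return type; NOT Spooq's code."""

    def _build(source_column, name, alt_src_cols=None, cast=None):
        # A string stands in for the PySpark Column expression.
        expr = source_column
        if alt_src_cols is not None:
            # Assumed meaning of alt_src_cols: coalesce with the source column.
            expr = f"coalesce({source_column}, {alt_src_cols})"
        if cast is not None:
            # Assumed meaning of cast: apply .cast(cast) to the result.
            expr = f"{expr}.cast({cast})"
        # The output name is applied last, mirroring .alias(name).
        return f"{expr}.alias({name})" if name else expr

    if source_column is not None:
        # Direct call (e.g. inside select): everything is known, build now.
        return _build(source_column, name, **kwargs)
    # No source column yet: defer, keeping only the keyword arguments.
    return partial(_build, **kwargs)


# Direct call returns the finished "column".
direct = as_is_sketch("friends", name="my_friends")

# Call with keyword arguments only returns a partial that is completed later.
deferred = as_is_sketch(cast="string")
completed = deferred(source_column="friends", name="my_friends")
```

Under these assumptions, `direct` is built immediately, whereas `deferred` is a partial that only becomes a result once the source column and output name are supplied.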