spooq.transformer.mapper_transformations.to_bool
- to_bool(source_column=None, name=None, **kwargs: Any) -> partial
A more robust conversion to BooleanType. Compared to Spark's implicit conversion, this method additionally handles:
Preceding and/or trailing whitespace
Additional strings for true/false values ("on"/"off" and "enabled"/"disabled" are added by default)
- Parameters
- Keyword Arguments
case_sensitive (Bool, default -> False) – Defines whether the case for the additional true/false lookup values is considered
true_values (list, default -> ["on", "enabled"]) – A list of values that should result in a True value if they are found in the source column
false_values (list, default -> ["off", "disabled"]) – A list of values that should result in a False value if they are found in the source column
replace_default_values (Bool, default -> False) – Defines whether additionally provided true/false values replace or extend the default list
alt_src_cols (str, default -> no coalescing, only source_column) – Coalesce with source_column and columns from this parameter.
cast (T.DataType(), default -> T.BooleanType()) – Applies provided datatype on output column (.cast(cast))
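The keyword arguments above can be illustrated with a small pure-Python model. This is only a sketch of the documented behavior, not Spooq's implementation: `to_bool_model` and `_lookup_set` are hypothetical names, the model assumes the input is stringified and trimmed before any lookup, and it assumes `case_sensitive` affects only the additional values while Spark's own literals stay case-insensitive.

```python
# Literals accepted by Spark's built-in string-to-boolean cast (case-insensitive).
SPARK_TRUE = {"t", "true", "y", "yes", "1"}
SPARK_FALSE = {"f", "false", "n", "no", "0"}

# Default additional values, as listed in the keyword arguments above.
DEFAULT_TRUE = ("on", "enabled")
DEFAULT_FALSE = ("off", "disabled")


def _lookup_set(provided, defaults, replace_default_values, case_sensitive):
    """Build the set of additional lookup values (hypothetical helper)."""
    values = list(provided or [])
    if not replace_default_values:
        # Assumption: provided values extend the defaults unless told otherwise.
        values.extend(defaults)
    return {v if case_sensitive else v.lower() for v in values}


def to_bool_model(value, true_values=None, false_values=None,
                  replace_default_values=False, case_sensitive=False):
    """Model of the documented lookup; returns True, False or None (NULL)."""
    if value is None:
        return None
    text = str(value).strip()  # assumption: stringify, then trim whitespace
    key = text if case_sensitive else text.lower()
    if key in _lookup_set(true_values, DEFAULT_TRUE,
                          replace_default_values, case_sensitive):
        return True
    if key in _lookup_set(false_values, DEFAULT_FALSE,
                          replace_default_values, case_sensitive):
        return False
    lowered = text.lower()
    if lowered in SPARK_TRUE:
        return True
    if lowered in SPARK_FALSE:
        return False
    return None  # anything else becomes NULL


inputs = [" false ", "123", "1", "Enabled", "?", "n"]
results = [to_bool_model(v, false_values=["?"]) for v in inputs]
```

Under these assumptions, `results` holds one entry per input string, with `None` standing in for NULL. Passing `replace_default_values=True` together with a custom `true_values` list makes the model stop recognizing "on" and "enabled".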
Warning
Spark and Spooq handle number-to-boolean conversions differently depending on the input datatype. See the following table for clarification:
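+-------+----------+-----------------+-------------+
| Input            | Result                        |
+-------+----------+-----------------+-------------+
| Value | Datatype | Cast to Boolean | spq.to_bool |
+=======+==========+=================+=============+
| -1    | int      | True            | NULL        |
| -1    | str      | NULL            | NULL        |
| 0     | int      | False           | False       |
| 0     | str      | False           | False       |
| 1     | int      | True            | True        |
| 1     | str      | True            | True        |
| 100   | int      | True            | NULL        |
| 100   | str      | NULL            | NULL        |
+-------+----------+-----------------+-------------+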
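The "Cast to Boolean" results for these inputs can be reproduced with a short pure-Python model of Spark's plain cast. This is a sketch rather than Spark code: `spark_cast_model` is a hypothetical name, and the model covers only integers and strings. It assumes that any non-zero integer casts to True, while a string casts to a boolean only if it matches one of a few literals.

```python
# Literals accepted by Spark's string-to-boolean cast (case-insensitive).
SPARK_TRUE = {"t", "true", "y", "yes", "1"}
SPARK_FALSE = {"f", "false", "n", "no", "0"}


def spark_cast_model(value):
    """Model of a plain cast to boolean for int and str inputs."""
    if value is None:
        return None
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        # Numeric input: zero is False, everything else is True.
        return value != 0
    if isinstance(value, str):
        lowered = value.lower()
        if lowered in SPARK_TRUE:
            return True
        if lowered in SPARK_FALSE:
            return False
    return None  # unrecognized input becomes NULL


# Same value, different datatype, different result.
as_int = [spark_cast_model(v) for v in (-1, 0, 1, 100)]
as_str = [spark_cast_model(v) for v in ("-1", "0", "1", "100")]
```

In this model the integers -1 and 100 become True, while the strings "-1" and "100" become None (NULL), which matches the datatype dependence shown in the table.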
Examples
>>> input_df = spark.createDataFrame(
...     [
...         Row(input_string="  false "),
...         Row(input_string="123"),
...         Row(input_string="1"),
...         Row(input_string="Enabled"),
...         Row(input_string="?"),
...         Row(input_string="n")
...     ], schema="input_key string"
... )
>>>
>>> input_df.select(spq.to_bool("input_key", false_values=["?"])).show(truncate=False)
+---------+
|false    |
|null     |
|true     |
|true     |
|false    |
|false    |
+---------+
>>>
>>> mapping = [
...     ("original_value",    "input_key", spq.as_is),
...     ("transformed_value", "input_key", spq.to_bool(false_values=["?"]))
... ]
>>> output_df = Mapper(mapping).transform(input_df)
>>> output_df.show(truncate=False)
+--------------+-----------------+
|original_value|transformed_value|
+--------------+-----------------+
|  false       |false            |
|123           |null             |
|1             |true             |
|Enabled       |true             |
|?             |false            |
|n             |false            |
+--------------+-----------------+
- Returns
This method returns a suitable type depending on how it was called. This ensures compatibility with Spooq's mapper transformer, with or without explicit parameters, as well as direct calls via select, withColumn, where, …
- Return type
partial or Column
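One way such a dual return type could work is sketched below. This is a guess at the general pattern, not Spooq's actual code: `to_bool_sketch` and `_build_expression` are hypothetical names, and a plain dict stands in for the Spark Column. The assumption is that a call without `source_column` returns a `functools.partial` that the mapper later completes, while a call with `source_column` produces the result immediately.

```python
from functools import partial


def _build_expression(source_column, name, **kwargs):
    # Stand-in for building a Spark Column: returns a dict describing the call.
    return {"source": source_column, "name": name, "options": kwargs}


def to_bool_sketch(source_column=None, name=None, **kwargs):
    """Return a partial when no column is given, otherwise the result itself."""
    if source_column is None:
        # Mapping-style call: keep the options and wait for the mapper
        # to supply source_column and name.
        return partial(_build_expression, **kwargs)
    # Direct call (e.g. inside select): build the result right away.
    return _build_expression(source_column, name or source_column, **kwargs)


# Mapping-style usage: options first, column and name supplied later.
deferred = to_bool_sketch(false_values=["?"])
resolved = deferred(source_column="input_key", name="transformed_value")

# Direct usage: the column is known at call time.
direct = to_bool_sketch("input_key", false_values=["?"])
```

In this sketch `deferred` is a `partial` and `direct` is the finished result, mirroring the "partial or Column" return type stated above.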