Transformers
Transformers take a pyspark.sql.DataFrame as input, transform it accordingly, and return a PySpark DataFrame.
Each Transformer class has to have a transform method which takes the input DataFrame as its argument and returns a PySpark DataFrame.
Possible transformations include selecting the most up-to-date record by id, exploding an array, filtering (on an exploded array), applying basic threshold cleansing, and mapping the incoming DataFrame to a provided structure.
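The contract described above (one DataFrame in, one DataFrame out) can be illustrated with a minimal sketch. It assumes that transform receives the input DataFrame as its only argument. The names Transformer (as defined here), RowFilter and condition are hypothetical and chosen for illustration only; they are not taken from this package's actual class hierarchy.

```python
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only, so the sketch itself has no runtime
    # dependency on Spark.
    from pyspark.sql import DataFrame


class Transformer(ABC):
    """Hypothetical base class: takes one DataFrame, returns one DataFrame."""

    @abstractmethod
    def transform(self, input_df: "DataFrame") -> "DataFrame":
        """Return a transformed version of input_df."""


class RowFilter(Transformer):
    """Hypothetical transformer that keeps only rows matching a SQL condition."""

    def __init__(self, condition: str):
        # Assumed parameter: a SQL expression string such as "age >= 18".
        self.condition = condition

    def transform(self, input_df: "DataFrame") -> "DataFrame":
        # DataFrame.filter accepts a SQL expression string and returns
        # a new DataFrame; the input is left unchanged.
        return input_df.filter(self.condition)
```

With a real DataFrame, usage would look like RowFilter("age >= 18").transform(input_df). Because every transformer accepts and returns a DataFrame, several of them can be applied one after another.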
Class Diagram of Transformer Subpackage
Create your own Transformer
Please see the Create your own Transformer section for further details.