Cleaner Spark UDF definitions with a little decorator
Posted on Thu 16 November 2017 • Tagged with spark, python, data, snippets • 3 min read
One handy feature that makes (Py)Spark more flexible than database tools like Hive, even for just transforming tabular data, is how easy it is to create User Defined Functions (UDFs). One thing that remains a little annoying, though, is that you have to define a function and then separately declare it as a UDF. With four lines of code you can clean those definitions right up.