Here is the function I used to compute the baseline statistical features for my model:
It takes about 5 seconds to compute the features for the training set and generates 141 features, though you can select these down to far fewer. The CV score with untuned tree models is ~84%.
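For anyone wondering what selecting the features down could look like, here is a minimal sketch of one simple option: greedily dropping features that are almost perfectly correlated with one already kept. This is an assumption for illustration only, not necessarily how I did my own selection; the function name `drop_correlated_features` and the 0.95 threshold are hypothetical.

```python
import numpy as np


def drop_correlated_features(X, threshold=0.95):
    """Return the column indices to keep.

    Hypothetical helper: walks the columns left to right and drops any
    column whose absolute Pearson correlation with an already-kept
    column exceeds `threshold`. Constant columns are dropped as well.
    """
    X = np.asarray(X, dtype=float)

    # Constant columns have undefined correlation, so remove them first.
    stds = X.std(axis=0)
    candidates = [j for j in range(X.shape[1]) if stds[j] > 0]
    if not candidates:
        return []

    # Absolute correlation matrix between the remaining columns.
    # atleast_2d covers the single-column case, where corrcoef returns a scalar.
    corr = np.atleast_2d(np.abs(np.corrcoef(X[:, candidates], rowvar=False)))

    kept = []  # positions within `candidates`
    for i in range(len(candidates)):
        if all(corr[i, k] <= threshold for k in kept):
            kept.append(i)

    return [candidates[i] for i in kept]
```

You would call it on the feature matrix, e.g. `keep = drop_correlated_features(X_train)` followed by `X_train[:, keep]`, where `X_train` is a placeholder for your own array. A model-based ranking (for example tree feature importances) is another common way to cut the list down further.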
I haven’t added much commentary on why these work or how they were derived because the private leaderboard is still pending, and I expect some people ranked higher than me will present their full solutions after that anyway.