Last updated
Creating new features to get the most out of the data can be a complex topic.
Example: length of the title
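A minimal sketch of deriving a new feature, assuming the note above means the character length of a title field. The `titles` list is made up for illustration and is not from the original data.

```python
# Hypothetical sample records; the real data set is not shown in these notes.
titles = [
    "Intro to pandas",
    "A",
    "Feature engineering for tabular data",
]

# New feature: number of characters in each title.
title_lengths = [len(t) for t in titles]

for title, length in zip(titles, title_lengths):
    print(f"{length:3d}  {title}")
```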
Let's make some histograms
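A rough sketch of a histogram using only the standard library, so it runs anywhere. In practice a plotting library would usually do this. The `text_histogram` helper and the synthetic `feature` values are assumptions for illustration.

```python
import random


def text_histogram(values, bins=10, width=40):
    """Print a simple text histogram and return the count per bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    for i, c in enumerate(counts):
        left_edge = lo + i * step
        bar = "#" * round(width * c / peak)
        print(f"{left_edge:10.2f} | {bar} {c}")
    return counts


random.seed(0)
# Synthetic feature with a long right tail (log-normal), purely illustrative.
feature = [random.lognormvariate(0, 1) for _ in range(1000)]
counts = text_histogram(feature)
```

Most of the mass should pile up in the first bins, with a thin tail stretching to the right.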
Err on the side of leaving a feature in the model to see if it's any good
Why?
If the data is bunched on the left with a long right tail (right-skewed), a log transform pulls it toward the middle. Otherwise the model might dig too much into the tail instead of exploring the differences within the majority.
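A small sketch of the effect, assuming strictly positive, long-tailed data. The sample is synthetic (log-normal), and the `skewness` helper is a plain moment-based estimate written here for illustration.

```python
import math
import random


def skewness(values):
    """Moment-based sample skewness: third central moment / variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


random.seed(0)
# Synthetic long-tailed feature; every value is strictly positive.
raw = [random.lognormvariate(0, 1) for _ in range(5000)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"skew before log: {skew_raw:.2f}")
print(f"skew after log:  {skew_logged:.2f}")
```

The skew after the transform should sit close to zero, meaning the tail no longer dominates the spread.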
Prime candidates: dramatic skew with a long tail, or a few outliers
A bimodal feature that isn't heavily skewed and has no clear outliers is not a prime candidate
Usually use exponents: y^x, where y is the value and x is the exponent.
Aim for a normal distribution; don't worry about 0
Test a range of exponents and score each against a measurement criterion
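One way to test a range of exponents, sketched under assumptions: the measurement criterion here is absolute skewness (closer to 0 is treated as closer to normal), which is only one possible choice. The `best_exponent` helper, the candidate list, and the synthetic gamma-distributed data are all hypothetical.

```python
import random


def skewness(values):
    """Moment-based sample skewness: third central moment / variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def best_exponent(values, exponents):
    """Score y**x for each exponent x and return the one with least |skew|."""
    scores = {x: skewness([y ** x for y in values]) for x in exponents}
    best = min(scores, key=lambda x: abs(scores[x]))
    return best, scores


random.seed(0)
# Synthetic right-skewed, positive data (gamma distribution), for illustration.
data = [random.gammavariate(2.0, 1.0) for _ in range(5000)]
candidates = [1.0, 0.75, 0.5, 0.33, 0.25, 0.1]  # 1.0 means no transform

best, scores = best_exponent(data, candidates)
for x in candidates:
    print(f"exponent {x:<5} skew {scores[x]:+.3f}")
print("least skewed exponent:", best)
```

Skewness alone ignores other departures from normality, so looking at the histogram of the winning transform is still worthwhile.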