Shortlisting a large number of labeling functions
January 14, 2021 at 4:23am
I am trying to use Snorkel in the context of credit risk modelling (binary classification). I have written many labeling functions, and I have some ground truth available, so I want to use it to inform the labeling functions somehow. The question is: which ground-truth metrics are recommended here (balanced accuracy, F-score)? Secondly, should I strictly remove any function that does not pass a threshold? I should also mention that in my dataset the positive class is rare (prevalence < 4%). Thanks.
January 18, 2021 at 11:42pm
When writing labeling functions in the binary setting, two metrics are often helpful:
- Coverage (# non-abstain votes / # data points)
- Class-specific precision on your ground-truth set (assuming unipolar LFs)
It's hard to say exactly how to balance these metrics without a closer look at your application! Pruning based on empirical precision is one option, or you might accept lower coverage in exchange for higher precision, which can be especially relevant with a rare positive class.
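To make those two metrics concrete, here is a minimal numpy sketch (not the Snorkel API itself) that computes coverage and class-specific precision for a single LF's votes, assuming Snorkel's usual convention of `-1` for abstain; the function name `lf_metrics` and the label constants are illustrative only:

```python
import numpy as np

ABSTAIN, NEG, POS = -1, 0, 1  # illustrative label convention; -1 = abstain

def lf_metrics(votes, y_true, target=POS):
    """Coverage and precision-on-`target` for one LF's votes.

    votes:  array of LF outputs, with ABSTAIN (-1) where the LF did not fire
    y_true: ground-truth labels for the same data points
    """
    votes = np.asarray(votes)
    y_true = np.asarray(y_true)
    # coverage: fraction of data points on which the LF emitted a label
    coverage = float(np.mean(votes != ABSTAIN))
    # class-specific precision: of the points the LF labeled `target`,
    # how many are truly `target`?
    fired = votes == target
    precision = float((y_true[fired] == target).mean()) if fired.any() else float("nan")
    return coverage, precision

# Toy example: a unipolar LF that votes only POS or abstains
votes = np.array([POS, ABSTAIN, POS, ABSTAIN, POS])
y_true = np.array([POS, NEG, NEG, POS, POS])
cov, prec = lf_metrics(votes, y_true)  # cov = 0.6, prec = 2/3
```

If you are using Snorkel's labeling package, `LFAnalysis(L, lfs).lf_summary(Y=...)` reports per-LF coverage and empirical accuracy in one table, which is convenient for pruning decisions across many LFs at once.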