snorkel

Building and managing training datasets for machine learning

What is the best redundant labeling strategy to maximize the performance of…

I have 50,000 sentences that I need labeled for a custom NER model, with triple redundancy. I can have as many workers as I want do the labeling. I will combine their labels using Snorkel's LabelModel. What is the best strategy to employ to give the LabelModel the best performance…

2 likes, 2 replies

Repository that lists the successful applications that use Snorkel

Is there a repository that lists the successful applications that use Snorkel?

1 like, 0 replies

Strategies for finding the best n-grams for keyword functions?

I'm working like hell to find enough LFs for the GENERAL label (as in general-purpose open source library), compared to the API label (as in API-specific open source library), using Amazon's GitHub repositories as my dataset, to create labels for a discriminative classifier for Amazon Github…

1 like, 2 replies

Learning to denoise rules to obtain more accurate classifiers!

Hi everyone, if you have an application where the rules are very noisy and labeled data is very limited, please consider checking out the ICLR 2020 spotlight paper "Learning from Rules Generalizing Labeled Exemplars" …

0 likes, 1 reply

What are best practices to combine noisy and "golden" labels?

Hello, I went through the tutorials and am amazed by Snorkel. I train the model on Snorkel labels and then run another round of training (just a few epochs) with hand-labelled data. Are there any better ways to combine gold and noisy labels? Since I already have gold labels, I think they…

0 likes, 5 replies

Has Snorkel been applied to any type of cybersecurity data use case?

0 likes, 6 replies