snorkel

Building and managing training datasets for machine learning


Announcing Snorkel v0.9!

We’re excited to announce the release of Snorkel v0.9 today! Snorkel v0.9 integrates our recent research advances and Snorkel-based open source projects into one modern Python library for building and managing training datasets. Alongside the release is a new homepage at…

10 likes, 0 replies

weighting labeling functions

Is it possible to assign weights to labeling functions? Consider a scenario in which labeling functions L1 to L10 generate noisy results. If L1 to L9 are written by a novice domain SME and L10 is written by an experienced domain expert, can we use…

0 likes, 0 replies

Understanding buckets key in error_analysis.py

In error_analysis.py, the documentation for the get_label_buckets method says that "The returned buckets[(i, j)] is a NumPy array of data point indices with predicted label i and true label j." Don't you think that i and j are true labels and predicted labels,…
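To make the indexing question concrete, here is a minimal sketch of how bucket keys relate to argument order. The helper `label_buckets` is a hypothetical stand-in written in plain Python, not Snorkel's implementation; it assumes each bucket is keyed by one label per input array, in the order the arrays are passed.

```python
from collections import defaultdict


def label_buckets(*label_arrays):
    """Group data point indices by their tuple of labels.

    Hypothetical stand-in for illustration: position i of each key holds
    the label taken from the i-th array that was passed in.
    """
    buckets = defaultdict(list)
    for idx, labels in enumerate(zip(*label_arrays)):
        buckets[tuple(labels)].append(idx)
    return dict(buckets)


y_true = [1, 1, 1, 0]
y_pred = [1, 1, 0, 0]

# Passing the true labels first makes each key (true label, predicted label).
buckets = label_buckets(y_true, y_pred)
print(buckets[(1, 0)])  # indices with true label 1 and predicted label 0

# Swapping the arguments swaps the meaning of the key positions.
swapped = label_buckets(y_pred, y_true)
print(swapped[(0, 1)])  # the same data points, keyed as (predicted, true)
```

Under this assumption, whether `(i, j)` reads as (true, predicted) or (predicted, true) is decided by the call site, so the docstring has to match the order used in the example call.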

0 likes, 0 replies

How could Snorkel be applied to data like the KDD Cup 1999 dataset?

Hello, everyone. As a beginner, I have worked through some of the Snorkel tutorials, but the data I care about is KDD99. How could I label this data with Snorkel? Can you give me some ideas?

1 like, 2 replies

# Help

I'm having trouble loading the unlabeled comments on macOS. I'm getting this error message: "from: can't read /var/mail/snorkel.labeling". What did I do wrong?

0 likes, 6 replies

Where can I find the code for older papers?

The LabelModel in the latest version of Snorkel code is based on the new AAAI 2019 paper. I wanted to ask where I can find the code for the following papers: 1. Snorkel: Rapid Training Data Creation with Weak Supervision 2. Learning the structure of Generative Models without…

0 likes, 1 reply

LabelModel cardinality limit?

Is there a practical limit to the number of classes a model can use? I'm trying to fit a LabelModel with 16 classes and it never seems to return. There is a 'permutations' call in 'label_model.py' that seems to be where it gets stuck.
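A rough back-of-the-envelope check may help here. The sketch below assumes that the 'permutations' call enumerates every ordering of the k classes; the post does not show the code, so that is an assumption. The helper name `num_class_orderings` is hypothetical.

```python
import math
from itertools import permutations


def num_class_orderings(k):
    """Number of orderings that permutations(range(k)) would yield."""
    return math.factorial(k)


# Sanity check against itertools for a small k.
assert len(list(permutations(range(4)))) == num_class_orderings(4)

# The count grows factorially with the number of classes.
for k in (2, 4, 8, 12, 16):
    print(f"{k:>2} classes -> {num_class_orderings(k):,} orderings")
```

If the assumption holds, 16 classes would mean roughly 2 x 10^13 orderings, which would explain a fit that never seems to return.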

0 likes, 2 replies

How to use the `PandasParallelLFApplier`

Is there any trick to using the PandasParallelLFApplier? The results from the normal PandasLFApplier (commented out above) are great. However, the results from the PandasParallelLFApplier seem a bit shuffled... Am I doing it right? Are the rows in L returned from apply in…

1 like, 3 replies

Snorkel w/ Dask using Large KBs

I'm using the Snorkel Dask interface w/ a Local Cluster. My heuristic labelers run very fast (parallelism). However, when I add my labeler, which uses a large (~50MB) knowledge base (dict), the job doesn't even seem to start and just sits w/ 1 CPU at 100%. I suspect it's trying to…

0 likes, 3 replies

sklearn Model for probabilistic labels

Does anybody know of an sklearn model that supports LabelModel's probabilistic labels?
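One common workaround, offered here only as a sketch and not as something from this thread, is to expand each data point into one weighted row per class and train on hard labels with per-sample weights. The helper `expand_soft_labels` is hypothetical; the approach assumes an estimator whose `fit` accepts a `sample_weight` argument, as `sklearn.linear_model.LogisticRegression` does.

```python
def expand_soft_labels(features, probs):
    """Turn per-class probabilities into weighted hard-label rows.

    features: list of feature rows, one per data point.
    probs: list of per-class probability rows, aligned with features.
    Returns (X, y, w): repeated feature rows, hard class labels, and weights.
    """
    X, y, w = [], [], []
    for row, prob_row in zip(features, probs):
        for cls, p in enumerate(prob_row):
            if p > 0:  # zero-weight rows contribute nothing, so skip them
                X.append(row)
                y.append(cls)
                w.append(p)
    return X, y, w


features = [[0.1, 0.2], [0.9, 0.8]]
probs = [[0.7, 0.3], [0.0, 1.0]]  # e.g. probabilistic labels for two classes

X, y, w = expand_soft_labels(features, probs)
print(y, w)

# Assumed usage with scikit-learn (not executed here):
#   from sklearn.linear_model import LogisticRegression
#   LogisticRegression().fit(X, y, sample_weight=w)
```

The expansion multiplies the number of rows by up to the number of classes, so it suits small label spaces better than large ones.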

0 likes, 0 replies

Snorkel for NER

Hi everyone, glad to be part of the Snorkel community! I wanted to ask whether there are any guides on how Snorkel can best be used for NER modeling scenarios.

0 likes, 1 reply