Hello, I’m very new to spaCy, Ray, and negspacy, so any help would be appreciated. I’m unable to add negspacy as a pipeline component in spaCy when the code runs inside a Ray cluster.
deps:
python3.10
spacy==3.6.1
spacy-alignments==0.9.1
spacy-legacy==3.0.12
spacy-loggers==1.0.5
spacy-transformers==1.1.9
ray==2.8.0
negspacy==1.0.4
import spacy
from negspacy.termsets import termset
# packages requested for the Ray workers
runtime_env = {"pip": ["spacy", "scispacy", "negspacy", "dask", "en_core_sci_scibert", "s3fs"]}
# enable spacy gpu
spacy.prefer_gpu()
# load the spacy model
nlp = spacy.load("en_core_med7_trf")
ts = termset("en")
nlp.add_pipe('negex', config={ 'neg_termset': ts.get_patterns() })
The add_pipe call fails with this error message:
ValueError: [E002] Can't find factory for 'negex' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. If you're using a Transformer, make sure to install 'spacy-transformers'. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).
From what I can tell, negspacy is already a registered component. With spaCy 3.6, all that should be needed to add negex to the pipeline is add_pipe('negex'), but for some reason I can’t get that to work.
I’ve tried:
- reverting to python3.10
- following installation instructions for spacy
- installing the trf model from the kormilitzin/med7 GitHub repository (using a different model in the code gives the same error)
- switching spaCy between versions 3.4, 3.6, and 3.7
The error message suggests that the spacy-transformers library is missing, but it has been installed and imported in the script. Maybe I’m not using it correctly with Ray?
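For reference, here is a minimal sketch of one arrangement that may avoid the E002 error. It rests on two assumptions: that the 'negex' factory is registered as a side effect of importing negspacy.negation, and that on Ray this import has to happen in the worker process that calls add_pipe, not only on the driver. The helper names build_nlp, pipe_names, and run_on_ray are hypothetical and not part of any of the libraries.

```python
def build_nlp(model_name: str = "en_core_med7_trf"):
    """Load the model and add negex; meant to run inside the worker process."""
    import spacy
    # Assumption: importing negspacy.negation is what registers the "negex"
    # factory, so it must be imported in the process that calls add_pipe.
    from negspacy.negation import Negex  # noqa: F401
    from negspacy.termsets import termset

    spacy.prefer_gpu()
    nlp = spacy.load(model_name)
    ts = termset("en")
    nlp.add_pipe("negex", config={"neg_termset": ts.get_patterns()})
    return nlp


def pipe_names(model_name: str = "en_core_med7_trf"):
    """Return the pipeline's component names, which should include 'negex'."""
    return build_nlp(model_name).pipe_names


def run_on_ray(model_name: str = "en_core_med7_trf"):
    """Run pipe_names as a Ray task. Assumes the workers already have spaCy,
    negspacy, and the model installed (for example through runtime_env)."""
    import ray

    ray.init(ignore_reinit_error=True)
    remote_pipe_names = ray.remote(pipe_names)
    return ray.get(remote_pipe_names.remote(model_name))
```

If the assumptions hold, calling run_on_ray() on the driver should return a list of component names that contains 'negex'. Keeping the imports inside the function is a deliberate choice in this sketch, so that they execute wherever the function runs.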
Once again, any help with my lack of understanding is greatly appreciated.