AllenNLP will be unmaintained in December (github.com/allenai)
62 points by lgessler on July 11, 2022 | 18 comments


It's one of those overly abstracted libraries. Too hard to tweak anything. HuggingFace Transformers did a better job by keeping things simpler.


AllenNLP started before transformers, and so it provided high-level abstractions for experimenting with model architectures, which is where much of NLP research was happening at the time. Transformers definitely changed the playing field, as they became the basis for most models!


I think you are missing the point.

The hackability quotient of AllenNLP is way low.

I'll give you specific examples where AllenNLP overdid it, while HuggingFace was better just by keeping it simple.

Vocabulary class. HuggingFace just used a Python dictionary. I can't think of one person who said they needed a higher-level abstraction. Turns out a Python dictionary is picklable and saving it to a text file is one line of code, while the AbstractSingletonProxyVocabulary is neither, and no one wants to care in the first place.
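To make the contrast concrete, here's a minimal sketch in plain Python of why a bare dict vocabulary is so convenient. The dict contents and the one-token-per-line file layout are illustrative assumptions, not any library's actual format:

```python
import os
import pickle
import tempfile

# A plain-dict vocabulary: token -> integer id (ids are line numbers).
vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "world": 3}

# A plain dict pickles out of the box -- no custom __reduce__ needed.
assert pickle.loads(pickle.dumps(vocab)) == vocab

# Saving to a text file really is one line.
path = os.path.join(tempfile.mkdtemp(), "vocab.txt")
open(path, "w").write("\n".join(vocab))

# Reloading is just as direct: line number becomes the id again.
restored = {tok: i for i, tok in enumerate(open(path).read().splitlines())}
assert restored == vocab
```

Because the object is just a dict, every standard tool (pickle, json, pprint, a debugger) works on it with zero ceremony.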

Tokenizer class. HuggingFace just returned a Python dictionary of strings and integers. I can't think of one person frustrated by it. It's printable, picklable, and everything in between that people can fiddle with. And boy, where do I start with AllenNLP's overengineering of Tokenizers.
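For illustration, a toy whitespace tokenizer that returns the same plain-dict shape HuggingFace tokenizers hand back (`input_ids` and `attention_mask` are real keys from their output; the vocabulary and `tokenize` function here are invented for the sketch):

```python
import pickle

# Invented toy vocabulary for the sketch.
VOCAB = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3}

def tokenize(text):
    """Whitespace-split and map tokens to ids, unknowns to [UNK]."""
    ids = [VOCAB.get(tok, VOCAB["[UNK]"]) for tok in text.lower().split()]
    return {"input_ids": ids, "attention_mask": [1] * len(ids)}

enc = tokenize("The cat sat")
print(enc)             # just a dict: printable...
pickle.dumps(enc)      # ...picklable, and trivially fiddled with
```

The point is the interface style: the output is an ordinary dict of lists, so there is nothing to unwrap before you can inspect or modify it.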

Trainer class vs. HuggingFace example scripts. The scripts are just much more readable, tweakable, debuggable, etc. HF didn't bother with any AbstractBaseTrainer class BS.

It just shows they never understood the playing field.

- First, I don't think anyone thought AllenNLP was a good choice for high-performance production systems. Again, HuggingFace clearly understood the problem and built a fast tokenizer in Rust.

- A math, physics, linguistics, or even CS PhD student who knows the basics of coding would prefer bare-bones scripts. They just want to hack on them and focus on research. Writing good code is not their objective.

Just my opinion.


AllenNLP was written for research, not for production. Many of the design choices reflect that.

As far as the vocabulary goes, a lot of AllenNLP components are about experimenting with ways to turn text into vectors. Constructing the vocabulary is part of that. When pre-trained transformers became a thing, this wasn't needed anymore. That's part of why we decided to deprecate the library: very few people experiment with how to construct vocabularies anymore, so we didn't want to keep carrying the complexity.


Hugging Face's APIs really aren't that great; I hear lots of people complain about them. All HF did was make transformers very accessible and shareable with a neat UI.


Last night I was running the run_translator.py script and found that their scripts don't actually allow training models from scratch.

But hey, I was able to read the code, fix the small thing that needed to work for my case, and run my experiment.

I could never do that in AllenNLP. Go figure.



which is a build script that only integrates with torch. So they've switched to plain torch.


When we started AllenNLP, PyTorch was just starting to emerge as a competitor to Tensorflow and we made the difficult decision to support PyTorch. In hindsight this was a great decision as the majority of top research is done in PyTorch today.

Tango primarily supports PyTorch, but unlike AllenNLP, is flexible enough to support other deep learning libraries as well. For example, we're adding support for JAX so we can easily leverage TPUs.


From what I've seen, Tango is a general DAG/pipeline tool that happens to have some facilities for PyTorch. I don't see anything deep-learning specific. You could execute sklearn or whatever.


Maybe we need to rework the docs if the DAG aspects stick out to you so much. The main functionality is the cache. If you have a complex experiment, you can still write the code as if all the steps were fast, and let them be slow only the first time you run it. The DAG stuff is also nice, but less important.
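The caching idea can be sketched with a hypothetical in-memory memoizing decorator. This is not Tango's actual API (Tango caches to disk and tracks code versions); it only illustrates the experience of writing every step as if it were fast:

```python
import functools
import time

def step(fn):
    """Hypothetical step decorator: compute each result once, reuse it after."""
    cache = {}
    @functools.wraps(fn)
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)   # slow only the first time
        return cache[args]
    return wrapper

@step
def preprocess(path):
    time.sleep(0.1)                   # stand-in for expensive work
    return f"clean:{path}"

@step
def train(data):
    time.sleep(0.1)
    return f"model({data})"

# The first run pays the cost; every later call is a cache hit.
model = train(preprocess("corpus.txt"))
model_again = train(preprocess("corpus.txt"))
assert model == model_again == "model(clean:corpus.txt)"
```

The experiment code reads as a straight-line pipeline either way; the decorator decides whether a step actually re-executes.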

That said, you could execute sklearn. If that's what your experiment needs, it's the right thing to do. That generality is also what gives us the flexibility to support Jax: https://github.com/allenai/tango/pull/313

The DL-specific stuff is in the components we supply. Like the trainer, dataset handling stuff, file formats, and increasingly, https://github.com/allenai/catwalk.


AllenNLP has only ever supported Torch. At the moment, Tango only supports Torch as well, but Jax support is well underway.

And yeah, Tango is a lot like a build script. In fact, I used to manage my experiments with Makefiles. Tango is better, though. Results don't have to be single files, and they don't have to live in one filesystem either, so I can run the GPU-heavy parts of my experiments on one machine and the CPU-heavy parts on another. The way Tango versions your code is better than what Makefiles can do: you have actual control beyond file modification time. And of course, there's the whole Python integration.


Did AllenNLP ever support any other engine? For the past couple years at least I think they've only supported PyTorch


Worth adding a notice to your website [1] as well.

[1] https://allenai.org/allennlp


What's a good alternative?


By far, most of the work in NLP now uses pretrained models, so people use HuggingFace Transformers now. https://huggingface.co/docs/transformers/main/en/index

HuggingFace Transformers is a huge, high-quality open source repo of pre-trained models and associated code. People combine it with PyTorch Lightning or Fairseq most of the time, afaik.


I think spaCy (https://spacy.io/) is a great library for NLP


It depends on what you use AllenNLP for. AllenNLP has a ton of functionality for vectorizing text. Most of the tokenizer/indexer/embedder stuff is about that. But these days we all use transformers for that, so there isn't much of a need to experiment with ways to vectorize.

If you like the trainer, or the configuration language, or some of the other components you should check out Tango (https://github.com/allenai/tango). One of Tango's origins is the question "What if AllenNLP supported workflow steps other than read -> train -> evaluate?". We noticed that a lot of work in NLP no longer fit that simple pattern, so we needed a new tool that can support more complex experiments.

If you like the metrics, try torchmetrics. Torchmetrics has almost exactly the same API as AllenNLP metrics.
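The shared pattern is accumulate-with-`update()`, read-with-`compute()`, which is how torchmetrics metrics work. A dependency-free sketch of that call shape, using a hand-rolled `Accuracy` class rather than the real torchmetrics one (which operates on tensors):

```python
class Accuracy:
    """Minimal illustration of the update()/compute() metric pattern."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, targets):
        # Accumulate statistics batch by batch.
        self.correct += sum(p == t for p, t in zip(preds, targets))
        self.total += len(targets)

    def compute(self):
        # Final value over everything seen so far.
        return self.correct / self.total if self.total else 0.0

acc = Accuracy()
acc.update([1, 0, 1], [1, 1, 1])   # batch 1: 2/3 correct
acc.update([0, 0], [0, 1])         # batch 2: 1/2 correct
assert acc.compute() == 3 / 5
```

Because the state lives in the metric object and is updated incrementally, the same code works whether you feed it one batch or a whole epoch.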

If you like any of the nn components, please get in touch with the Tango team (on GitHub). We recently had some discussion around rescuing a few of those, since there seems to be some excitement.



