sts

semantic-text-similarity

An easy-to-use interface to fine-tuned BERT models for computing semantic similarity in clinical and web text. That's it.

License

MIT

This project contains an interface to fine-tuned, BERT-based semantic text similarity models. It modifies pytorch-transformers by abstracting away all the research benchmarking code for ease of real-world applicability.

Model              Dataset  Dev. Correlation
Web STS BERT       STS-B    0.893
Clinical STS BERT  MED-STS  0.854
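
For readers unfamiliar with the metric, a development-set correlation compares model scores against human-annotated gold scores. The source does not say which correlation is reported; the sketch below assumes Pearson, which is a common choice for STS benchmarks. The function name and the sample numbers are made up for illustration and are not results from this project.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative numbers, NOT results from this project.
gold = [0.0, 1.5, 3.0, 4.5]
predicted = [0.4, 1.2, 3.3, 4.1]
dev_correlation = pearson(predicted, gold)
```

A value close to 1 means the predicted scores rise and fall with the gold scores.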

Installation

Install with pip:

pip install semantic-text-similarity

or directly:

pip install git+https://github.com/AndriyMulyar/semantic-text-similarity

Use

Each model maps batches of sentence pairs to real-valued similarity scores in the range [0, 5].

from semantic_text_similarity.models import WebBertSimilarity
from semantic_text_similarity.models import ClinicalBertSimilarity

# Prediction defaults to the GPU; pass device='cpu' to run without one.
web_model = WebBertSimilarity(device='cpu', batch_size=10)

# Explicitly request GPU prediction.
clinical_model = ClinicalBertSimilarity(device='cuda', batch_size=10)

# predict takes a list of (sentence, sentence) pairs.
web_model.predict([("She won an olympic gold medal", "The women is an olympic champion")])
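
Because predict works on batches of pairs, one natural use is ranking several candidate sentences against a single query. The sketch below is a minimal illustration, not part of the package: rank_by_similarity and FakeModel are hypothetical names, and it assumes predict returns one score per pair in input order. FakeModel is a word-overlap stand-in so the example runs without downloading a model.

```python
from typing import List, Sequence, Tuple


def rank_by_similarity(model, query: str, candidates: Sequence[str]) -> List[Tuple[str, float]]:
    """Score every candidate against the query and sort, most similar first."""
    pairs = [(query, candidate) for candidate in candidates]
    # Assumption: predict() returns one score per pair, in input order.
    scores = model.predict(pairs)
    ranked = [(c, float(s)) for c, s in zip(candidates, scores)]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


class FakeModel:
    """Hypothetical stand-in exposing predict(); scores word overlap on a 0-5 scale."""

    def predict(self, pairs):
        scores = []
        for a, b in pairs:
            words_a, words_b = set(a.lower().split()), set(b.lower().split())
            union = words_a | words_b
            scores.append(5.0 * len(words_a & words_b) / len(union) if union else 0.0)
        return scores


ranking = rank_by_similarity(
    FakeModel(),
    "She won an olympic gold medal",
    ["The women is an olympic champion", "It rained all day"],
)
```

With the package installed, passing a WebBertSimilarity or ClinicalBertSimilarity instance in place of FakeModel() should work the same way, provided the assumption about the return value holds.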

More examples.

Notes

  • You will need a GPU to apply these models if you would like any hint of speed in your predictions.
  • Model downloads are cached in ~/.cache/torch/semantic_text_similarity/. Try clearing this folder if you have issues.
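
The two notes above can be sketched as small helpers: one that picks a device based on what PyTorch reports, and one that clears the download cache. Both function names are hypothetical and not part of the package; the cache path is the one stated in the note.

```python
import shutil
from pathlib import Path

# Cache location as stated in the note above.
DEFAULT_CACHE = Path.home() / ".cache" / "torch" / "semantic_text_similarity"


def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU path to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def clear_model_cache(cache_dir: Path = DEFAULT_CACHE) -> bool:
    """Delete the cache directory. Returns True if something was removed."""
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

A model could then be created with WebBertSimilarity(device=pick_device(), batch_size=10). After clear_model_cache() removes the folder, the models would presumably be downloaded again on next use.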

Acknowledgement

Clinical models in this project were submitted to the 2019 N2C2 Shared Task Track 1. Implementation and model training in this project were supported by funding from the Mark Dredze Lab at Johns Hopkins University.
