---
library_name: span-marker
tags:
- span-marker
- token-classification
- ner
- named-entity-recognition
- generated_from_span_marker_trainer
metrics:
- precision
- recall
- f1
widget:
- text: 'In the 2017 publication "The Routledge Handbook of Collective Intentionality", edited by Kirk Ludwig and Marija Jankovic, and released by Routledge, leading scholars explored the complex concept of collective intentionality and its implications for various disciplines, including philosophy, cognitive science, and social theory . A thought-provoking 2015 article titled "The Uncivilization Thesis: A Critique" by Megan Gittins, published in Environmental Ethics, offered a critical examination of the controversial "uncivilization thesis" and its implications for our understanding of the relationship between civilization and environmental sustainability.'
- text: In the "The Selfish Gene", renowned biologist Richard Dawkins introduced the revolutionary concept of the "selfish gene" in 1976, published by Oxford University Press . This influential work challenged traditional views of evolution and sparked widespread discussions about the nature of altruism and cooperation . Fans of science writing might appreciate "A Short History of Nearly Everything" by Bill Bryson, a captivating exploration of the vast realms of scientific knowledge published by Broadway Books in 2003.
- text: '"The Pragmatic Turn" (2020, University of Pennsylvania Press) provides key insights into pragmatist philosophy, edited by John J. Stuhr . For provocative science, try "Introducing Consciousness", Alex Westrin and Vidyut Lokhande''s 2018 work published via Icon Books, challenging dominant models of self-awareness.'
- text: Have you read "The Selfish Gene" by Richard Dawkins? Published in 1976 by Oxford University Press, this seminal work introduced the gene-centric view of evolution and proposed the controversial concept of the "extended phenotype ." Dawkins' ideas sparked intense debates and influenced diverse fields like evolutionary biology, psychology, and memetics . Daniel C. Dennett's "Darwin's Dangerous Idea" (1995, Simon & Schuster) is another must-read that explores the far-reaching implications of evolutionary theory, from the origins of life to the nature of human consciousness and free will.
- text: '"The Sociology of Philosophies", an insightful book penned by Randall Collins and published in 1998 by Harvard University Press, examined the social factors that influence the development and trajectory of philosophical thought throughout history . Collins'' analysis shed light on how philosophical ideas are shaped by the broader cultural, political, and intellectual contexts in which they emerge . In a 2012 article from Philosophy of the Social Sciences, titled "The Relevance of the Sociology of Philosophy", Isaac Reed further expounded on the importance of this interdisciplinary approach, highlighting its potential to deepen our understanding of the dynamics that shape human knowledge and inquiry.'
pipeline_tag: token-classification
model-index:
- name: SpanMarker
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      name: Unknown
      type: unknown
      split: eval
    metrics:
    - type: f1
      value: 0.0
      name: F1
    - type: precision
      value: 0.0
      name: Precision
    - type: recall
      value: 0.0
      name: Recall
---

# SpanMarker

This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition.

## Model Details

### Model Description

- **Model Type:** SpanMarker
- **Maximum Sequence Length:** 512 tokens
- **Maximum Entity Length:** 16 words

### Model Sources

- **Repository:** [SpanMarker on GitHub](https://github.com/tomaarsen/SpanMarkerNER)
- **Thesis:** [SpanMarker For Named Entity Recognition](https://raw.githubusercontent.com/tomaarsen/SpanMarkerNER/main/thesis.pdf)

### Model Labels

| Label            | Examples |
|:-----------------|:---------|
| person           | "Barney Glaser", "Malcolm Gladwell", "Charles Duhigg" |
| publication_date | "2000", "1967", "2018" |
| publisher        | "Little , Brown and Company", "Sociology Press", "Avery" |
| work_of_art      | "`` The Tipping Point : How Little Things Can Make a Big Difference ''", "`` The Power of Habit ''", "`` The Discovery of Grounded Theory ''" |

## Evaluation

### Metrics

| Label            | Precision | Recall | F1  |
|:-----------------|:----------|:-------|:----|
| **all**          | 0.0       | 0.0    | 0.0 |
| person           | 0.0       | 0.0    | 0.0 |
| publication_date | 0.0       | 0.0    | 0.0 |
| publisher        | 0.0       | 0.0    | 0.0 |
| work_of_art      | 0.0       | 0.0    | 0.0 |

## Uses

### Direct Use for Inference

```python
from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("span_marker_model_id")
# Run inference
entities = model.predict("\"The Pragmatic Turn\" (2020, University of Pennsylvania Press) provides key insights into pragmatist philosophy, edited by John J. Stuhr . For provocative science, try \"Introducing Consciousness\", Alex Westrin and Vidyut Lokhande's 2018 work published via Icon Books, challenging dominant models of self-awareness.")
```

### Downstream Use

You can finetune this model on your own dataset.

```python
from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("span_marker_model_id")

# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("conll2003")  # For example CoNLL2003

# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("span_marker_model_id-finetuned")
```
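
Whether predictions come from this model or from a fine-tuned copy, `model.predict` returns a list of dictionaries, one per detected span. The sketch below shows one way to group those spans by label and drop low-scoring ones. It assumes each dictionary has `span`, `label`, and `score` keys, as SpanMarker documents for string inputs. The helper name `group_entities_by_label`, the `min_score` threshold, and the sample predictions are hypothetical and hand-written for illustration; they are not outputs of this model.

```python
from collections import defaultdict


def group_entities_by_label(entities, min_score=0.0):
    """Group predicted spans by label, skipping spans scored below min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["span"])
    return dict(grouped)


# Hand-written stand-in for the output of model.predict(...); NOT real model output.
sample_entities = [
    {"span": '"The Pragmatic Turn"', "label": "work_of_art", "score": 0.97},
    {"span": "2020", "label": "publication_date", "score": 0.97},
    {"span": "University of Pennsylvania Press", "label": "publisher", "score": 0.95},
    {"span": "John J. Stuhr", "label": "person", "score": 0.40},
]

# Print one line per label, keeping only spans with a score of at least 0.5.
for label, spans in group_entities_by_label(sample_entities, min_score=0.5).items():
    print(f"{label}: {spans}")
```

In practice, `sample_entities` would be replaced by the `entities` list returned from `model.predict`, and the threshold would be tuned on held-out data rather than fixed at an arbitrary value.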

## Training Details

### Training Set Metrics

| Training set          | Min | Median   | Max |
|:----------------------|:----|:---------|:----|
| Sentence length       | 47  | 104.6034 | 200 |
| Entities per sentence | 3   | 4.0036   | 5   |

### Training Hyperparameters

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

### Training Results

| Epoch | Step | Validation Loss | Validation Precision | Validation Recall | Validation F1 | Validation Accuracy |
|:-----:|:----:|:---------------:|:--------------------:|:-----------------:|:-------------:|:-------------------:|
| 1.0   | 563  | 0.0206          | 0.0                  | 0.0               | 0.0           | 0.8513              |
| 2.0   | 1126 | 0.0173          | 0.0                  | 0.0               | 0.0           | 0.8513              |
| 3.0   | 1689 | 0.0162          | 0.0                  | 0.0               | 0.0           | 0.8513              |

### Framework Versions

- Python: 3.10.13
- SpanMarker: 1.5.1.dev
- Transformers: 4.39.3
- PyTorch: 2.1.2
- Datasets: 2.16.0
- Tokenizers: 0.15.0

## Citation

### BibTeX

```
@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}
```