This is Placing the Holocaust's fine-tuned GLiNER small model. GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entities, and to Large Language Models (LLMs), which, despite their flexibility, are too costly and large for resource-constrained scenarios.
Category | Definition | Examples |
---|---|---|
building | Includes references to physical structures and places of labor or employment like factories. Institutions such as the "Judenrat" or "Red Cross" are also included. | school, home, house, hospital, factory, station, office, store, synagogue, barracks |
country | Mostly country names, also includes "earth," "country," and "world." Distinguished from Region and Environmental feature based on context. | germany, poland, states, israel, united, country, america, england, france, russia |
dlf (distinct landscape feature) | Places not large enough to be a geographic or populated region but too large to be an Object, includes parts of buildings like "roof" or "chimney." | street, door, border, line, farm, window, streets, road, wall, field |
env feature (environmental feature) | Any named or unnamed environmental feature, including bodies of water and landforms. General references like "nature" and "water" are included. | woods, forest, river, mountains, ground, trees, water, tree, mountain, sea |
interior space | References to distinct rooms within a building, or large place features of a building like a "factory floor." | room, apartment, floor, kitchen, rooms, gas, basement, bathroom, chambers, bunker |
imaginary | Difficult terms that are context-dependent like "inside," "outside," or "side." Also includes unspecified locations like "community," and conceptual places like "hell" or "heaven." | place, outside, places, side, inside, hiding, hell, heaven, part, spot |
populated place | Includes cities, towns, villages, and hamlets or crossroads settlements. Names of places can be the same as a ghetto, camp, city, or district. | camp, ghetto, town, city, auschwitz, camps, new, york, concentration, village |
region | Sub-national regions, states, provinces, or islands. Includes references to sides of a geopolitical border or military zone. | area, side, land, siberia, new, zone, jersey, california, russian, eastern |
spatial object | Objects of conveyance and movable objects like furniture. In specific contexts, refers to transportation vehicles or items like "ovens," where the common use case of the term prevails. | train, car, ship, boat, bed, truck, trains, cars, trucks |
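Two of the categories above use abbreviated names, `dlf` and `env feature`, and these short forms are the label strings used in the usage example. As a minimal sketch, a small lookup can expand them when showing results to readers who have not seen the table. The names `LABEL_DISPLAY_NAMES` and `display_name` are hypothetical helpers introduced here for illustration; they are not part of GLiNER or of this model.

```python
# Expansions taken from the parentheses in the Category column of the table.
# Hypothetical helper names; not part of the GLiNER library.
LABEL_DISPLAY_NAMES = {
    "dlf": "distinct landscape feature",
    "env feature": "environmental feature",
}


def display_name(label: str) -> str:
    """Return a readable name for a label, falling back to the label itself."""
    return LABEL_DISPLAY_NAMES.get(label, label)


print(display_name("dlf"))
print(display_name("building"))
```

Labels without an abbreviation, such as `building`, pass through unchanged.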
To use this model, you must install the GLiNER Python library:
!pip install gliner
Once you've installed the GLiNER library, you can import the GLiNER class. You can then load this model using `GLiNER.from_pretrained` and predict entities with `predict_entities`.
from gliner import GLiNER
model = GLiNER.from_pretrained("placingholocaust/gliner_small-v2.1-holocaust")
text = """
Okay. So now it's spring of '44? A: ‘4, And she says, You're going to go to Brzezinka. I said, What is Brzezinka? She said, It's a crematorium and the gas chamber. They have a half a million Hungarian Jews are coming in. That's when the time they -- and they need people to select. We do not select the people to -- who die or not. The women fold the clothes and look for jewelry and make packages to send it to Germany.
"""
labels = ["dlf", "populated place", "country", "region", "interior space", "env feature", "building", "spatial object"]
entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
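For longer testimonies it can help to collect the predictions by category rather than printing them one by one. The following is a minimal sketch, assuming each returned entity is a dict with `text`, `label`, and `score` keys (the loop above uses the first two; `score` is an assumption about GLiNER's output format). The function `group_by_label` and the `sample` list are hypothetical: the sample is hand-written to show the shape of the data and is not actual output from this model.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity texts by label, keeping spans with score >= min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        # Entities without a score are kept (treated as score 1.0).
        if entity.get("score", 1.0) >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed shape of predict_entities output.
# Illustrative values only, NOT real predictions from this model.
sample = [
    {"text": "Brzezinka", "label": "populated place", "score": 0.91},
    {"text": "crematorium", "label": "building", "score": 0.62},
    {"text": "Germany", "label": "country", "score": 0.97},
    {"text": "gas chamber", "label": "interior space", "score": 0.40},
]

for label, texts in group_by_label(sample, min_score=0.5).items():
    print(label, "=>", texts)
```

With real predictions, you would pass the `entities` list from the example above in place of `sample`. Filtering after prediction keeps the raw output available for inspection; if you would rather filter at prediction time, `predict_entities` should also accept a `threshold` argument, though it is worth checking this against the GLiNER version you have installed.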