---
metrics:
- accuracy
- bleu
widget:
  - text: 19, asbury place,mason city, iowa, 50401, us
    example_title: Address 1
  - text: 1429, birch drive, mason city, iowa, 50401, us
    example_title: Address 2
---

# Address Standardization and Correction Model

This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) that transforms misspelled and non-standard addresses into standardized addresses.


## How to use the model

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = AutoModelForSeq2SeqLM.from_pretrained("Hnabil/t5-address-standardizer")
tokenizer = AutoTokenizer.from_pretrained("Hnabil/t5-address-standardizer")

# Tokenize a misspelled, non-standard address as PyTorch tensors.
inputs = tokenizer(
    "220, soyth rhodeisland aveune, mason city, iowa, 50401, us",
    return_tensors="pt",
)

# Generate the standardized address and decode it back to text.
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

# ['220, s rhode island ave, mason city, ia, 50401, us']
```
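
To standardize several addresses at once, the same calls can be wrapped in a small helper. This is a minimal sketch, not part of the model's API: the function name `standardize_addresses` is hypothetical, and it assumes `model` and `tokenizer` were loaded as in the example above. `padding=True` pads the shorter inputs so the batch can be stacked into a single tensor.

```python
def standardize_addresses(addresses, model, tokenizer, max_length=100):
    """Standardize a list of addresses in a single generate() call.

    Hypothetical helper: `model` and `tokenizer` are assumed to be the
    objects loaded with from_pretrained() in the example above.
    """
    addresses = list(addresses)
    if not addresses:
        # Nothing to tokenize, so skip the model call entirely.
        return []

    # Pad to the longest address so all inputs share one tensor shape.
    inputs = tokenizer(addresses, return_tensors="pt", padding=True)

    # One generated sequence per input address.
    outputs = model.generate(**inputs, max_length=max_length)

    # Decode token ids back to text, dropping padding and end markers.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `standardize_addresses(["19, asbury place,mason city, iowa, 50401, us", "1429, birch drive, mason city, iowa, 50401, us"], model, tokenizer)` returns one string per input address, in the same order.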

## Training data

The model was trained on data from [openaddresses.io](https://openaddresses.io/).