mms_benchmark/pages/3_Language_Typology.py
Szymon Woźniak
import streamlit as st
import time
import numpy as np
import pandas as pd
from filter_dataframe import filter_dataframe
# Cache the parsed TSV so it is not re-read from disk on every Streamlit rerun.
@st.cache_data
def get_typology_df():
    return pd.read_csv("data/language_typology.tsv", sep="\t")
st.set_page_config(page_title="Language Typology", page_icon="📈")
st.markdown("# Language Typology")
st.write("""\
Languages can be described using hundreds of linguistic features. The [World Atlas of Language Structures](https://doi.org/10.5281/zenodo.7385533) lists almost 200 of them. Since our work focuses on sentiment classification, we selected the 10 features that seem most relevant to how sentiment is expressed.
The table below lists the languages included in the MMS corpus together with their family, genus, and values for the selected linguistic features.
You can use the **'Add filters'** checkbox to filter the table by any of its columns.""")
df = get_typology_df()
st.dataframe(filter_dataframe(df))