emb-rep / my-app /src /Components /Header /index.jsx

abadesalex

bar graphs

1ad4e76 about 1 year ago

	import { Grid, ThemeProvider, Typography } from "@mui/material";
	import { buildTheme } from "../../infrastructure/theme/theme";

	export default function Definition() {
	return (
	<>
	<ThemeProvider theme={buildTheme()}>
	<Grid textAlign={"justify"} pl={2} pr={2}>
	<Typography variant="h3" color="#000" mb={1}>
	This is a simple web application that creates word embeddings using
	the gensim library. Word embeddings are numerical representations of
	words that capture their semantic meanings. In a 50-dimensional GloVe
	model, each dimension represents a latent semantic attribute.
	Although these attributes are not directly interpretable, they can
	be inferred by reverse engineering the embedding space. Two main
	approaches can be used to achieve this:
	</Typography>
	<Typography variant="h3" color="#000" mb={5} component="div">
	<ol>
	<li>
	<Typography
	variant="h3"
	color="#000"
	sx={{ fontWeight: "600" }}
	>
	Investigating Common Category Attributes:
	</Typography>
	<Typography variant="h3" color="#000" mb={2} component="div">
	By analyzing the dimensions with the lowest variance among a
	group of semantically similar words. The premise is that
	dimensions with minimal variance may capture attributes that
	are common across the set. For instance:
	<ul>
	<li>
	<Typography variant="h3" color="#000">
	<b>Words:</b> 'Spain', 'France', 'Germany', 'Japan' (all
	countries)
	</Typography>
	</li>
	<li>
	<Typography variant="h3" color="#000">
	<b>Low Variance Dimensions:</b> These would
	theoretically indicate attributes common to all
	countries, potentially abstract notions like
	'sovereignty', 'nationhood', or just the general
	category of being a 'country'.
	</Typography>
	</li>
	</ul>
	</Typography>
	</li>
	<li>
	<Typography
	variant="h3"
	color="#000"
	sx={{ fontWeight: "600" }}
	>
	Investigating Specific Semantic Differences:
	</Typography>
	<Typography variant="h3" color="#000">
	By analyzing the high variance dimensions resulting from
	subtracting one vector from another. This subtraction aims to
	capture the core semantic differences between two entities. It
	can become more robust if more than one pair of words is
	selected to compare. For instance:
	</Typography>
	<ul>
	<li>
	<Typography variant="h3" color="#000">
	<b>Word pairs:</b> 'Man' and 'Woman', 'Uncle' and 'Aunt',
	'Father' and 'Mother'.
	</Typography>
	</li>
	<li>
	<Typography variant="h3" color="#000">
	<b>Difference Vector (High Variance Dimensions):</b> These
	dimensions likely highlight aspects related to gender
	differences. The largest values (in magnitude) in this
	vector suggest dimensions where the concepts of 'man' and
	'woman' differ most significantly, potentially capturing
	gender-specific traits or roles.
	</Typography>
	</li>
	</ul>
	</li>
	</ol>
	</Typography>
	</Grid>
	</ThemeProvider>
	</>
	);
	}
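
The two approaches described in the component text can be sketched in plain JavaScript. This is a minimal illustration, not code from this repository: the function names (`dimensionVariances`, `lowestVarianceDimensions`, `topDifferenceDimensions`) are hypothetical, the 4-dimensional vectors are made-up toy values rather than real GloVe embeddings, and averaging the difference vectors across pairs is only one possible reading of "more than one pair of words".

```javascript
// Population variance of each dimension across a set of equal-length vectors.
function dimensionVariances(vectors) {
  const dims = vectors[0].length;
  const variances = new Array(dims).fill(0);
  for (let d = 0; d < dims; d++) {
    let mean = 0;
    for (const v of vectors) mean += v[d];
    mean /= vectors.length;
    let sumSquares = 0;
    for (const v of vectors) sumSquares += (v[d] - mean) ** 2;
    variances[d] = sumSquares / vectors.length;
  }
  return variances;
}

// Approach 1: indices of the k dimensions that vary least within a group
// of semantically similar words (candidate "common category" attributes).
function lowestVarianceDimensions(vectors, k) {
  return dimensionVariances(vectors)
    .map((variance, index) => ({ index, variance }))
    .sort((a, b) => a.variance - b.variance)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Approach 2: subtract the second vector of each pair from the first,
// average the differences over all pairs (an assumption of this sketch),
// and return the k dimensions with the largest absolute mean difference.
function topDifferenceDimensions(pairs, k) {
  const dims = pairs[0][0].length;
  const meanDiff = new Array(dims).fill(0);
  for (const [a, b] of pairs) {
    for (let d = 0; d < dims; d++) {
      meanDiff[d] += (a[d] - b[d]) / pairs.length;
    }
  }
  return meanDiff
    .map((value, index) => ({ index, magnitude: Math.abs(value) }))
    .sort((a, b) => b.magnitude - a.magnitude)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Toy 4-dimensional vectors (illustrative values only).
const countries = [
  [1.0, 0.1, 0.5, -0.3], // "Spain"
  [1.0, 0.8, 0.4, 0.2], // "France"
  [1.0, -0.5, 0.6, 0.7], // "Germany"
  [1.0, 0.3, 0.5, -0.6], // "Japan"
];
const genderPairs = [
  [[0.2, 0.5, 1.0, 0.1], [0.3, 0.4, -1.0, 0.1]], // "man", "woman"
  [[0.6, -0.2, 0.8, 0.4], [0.5, -0.2, -0.9, 0.6]], // "uncle", "aunt"
];

console.log(lowestVarianceDimensions(countries, 2)); // [ 0, 2 ]
console.log(topDifferenceDimensions(genderPairs, 1)); // [ 2 ]
```

In the toy data, dimensions 0 and 2 barely change across the country vectors, so they are the candidates for a shared "country" attribute, while dimension 2 dominates the averaged difference between the gendered pairs. With real 50-dimensional vectors the same functions would apply unchanged, but the interpretation of any selected dimension remains a hypothesis.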