Commit a70ef44 by zabir-nabil
Parent: a2b22c2

Create app.py

app.py ADDED
@@ -0,0 +1,68 @@
+import streamlit as st
+from image_search import load_model, process_image, process_text, search_images
+
+st.set_page_config(
+    page_title="Bangla CLIP Search",
+    page_icon=":chart_with_upwards_trend:",  # emoji shortcodes need surrounding colons
+)
+st.markdown(
+    """
+<style>
+#introduction {
+    padding: 10px 20px 10px 20px;
+    background-color: #aad9fe;
+    border-radius: 10px;
+
+}
+
+#introduction p {
+    font-size: 1.1rem;
+    color: #050e14;
+
+}
+
+img {
+    padding: 5px;
+}
+</style>
+
+
+    """,
+    unsafe_allow_html=True,
+)
+hide_streamlit_style = """
+<style>
+#MainMenu {visibility: hidden;}
+footer {visibility: hidden;}
+</style>
+"""
+st.markdown(hide_streamlit_style, unsafe_allow_html=True)
+
+
+st.markdown("# বাংলা CLIP সার্চ ইঞ্জিন ")  # "Bangla CLIP Search Engine"
+st.markdown("""---""")
+st.markdown(
+    """
+<div id="introduction">
+
+<p>
+Contrastive Language-Image Pre-training (CLIP), a simplified version of ConVIRT trained from scratch, is an efficient method of learning image representations from natural language supervision. CLIP jointly trains an image encoder and a text encoder to predict the correct pairings of a batch of (image, text) training examples. At test time, the learned text encoder synthesizes a zero-shot linear classifier by embedding the names or descriptions of the target dataset’s classes.
+
+This model consists of an EfficientNet image encoder and a BERT text encoder, and was trained on multiple datasets from the Bangla image-text domain.
+
+</p>
+</div>
+    """,
+    unsafe_allow_html=True,
+)
+st.markdown("""---""")
+text_query = st.text_input(":mag_right: Search Images / ছবি খুজুন", "সুন্দরবনের নদীর পাশে একটি বাঘ")  # default query: "a tiger beside a river in the Sundarbans"
+st.markdown("""---""")
+number_of_results = st.slider("Number of results", 1, 100, 10)
+st.markdown("""---""")
+
+ret_imgs, ret_scores, _, _ = search_images(text_query, "demo_images/", k=number_of_results)
+
+st.markdown("<div style='align: center; display: flex'>", unsafe_allow_html=True)
+st.image([str(result) for result in ret_imgs], caption=["Score: " + str(r_s) for r_s in ret_scores], width=230)
+st.markdown("</div>", unsafe_allow_html=True)
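
The `image_search` module is not part of this commit, so how `search_images` ranks results is not shown here. A CLIP-style search typically embeds the text query, compares it against precomputed image embeddings, and returns the `k` best matches with their scores. The following is a minimal sketch of that ranking step only, assuming cosine similarity as the score; `cosine_similarity`, `top_k_by_cosine`, the toy 2-D embeddings, and the file names under `demo_images/` are all hypothetical and are not taken from the actual module.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_cosine(
    query_embedding: Sequence[float],
    image_embeddings: Dict[str, Sequence[float]],
    k: int,
) -> Tuple[List[str], List[float]]:
    """Return the k image paths most similar to the query, with their scores.

    Hypothetical helper: in a real app the embeddings would come from the
    text and image encoders rather than being written by hand.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), path)
        for path, embedding in image_embeddings.items()
    ]
    # Highest similarity first; the sort is stable, so ties keep insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[: max(k, 0)]
    return [path for _, path in top], [score for score, _ in top]


# Toy index with made-up 2-D "embeddings" (illustrative values only).
toy_index = {
    "demo_images/tiger.jpg": [1.0, 0.0],
    "demo_images/river.jpg": [1.0, 1.0],
    "demo_images/city.jpg": [0.0, 1.0],
}

paths, scores = top_k_by_cosine([1.0, 0.0], toy_index, k=2)
for path, score in zip(paths, scores):
    print(f"{path}: {score:.4f}")
```

The two returned lists line up with how `app.py` uses `ret_imgs` and `ret_scores`: one list of image paths for `st.image` and one list of scores for the captions.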