Spaces: pain/Arabic_story_generator


pain committed on Jul 3, 2024

Commit b0359f9 (verified) · 1 Parent(s): 87727f0

Upload 6 files

Files changed (6)

LICENSE +238 -0
README.md +51 -13
app.py +59 -24
image_generator.py +58 -53
llm_models.py +144 -113
requirements.txt +1 -0

LICENSE ADDED

	@@ -0,0 +1,238 @@

+                         ACADEMIC PUBLIC LICENSE
+                                version 1.1
+               Copyright (c) 2024 MOHAMMAD ALBARHAM
+                                Preamble
+  This license contains the terms and conditions of using Arabic Story Generator in
+noncommercial settings: at academic institutions for teaching and research
+use and for personal or educational purposes. You will find that this
+license provides noncommercial users of Arabic Story Generator with rights that are
+similar to the well-known GNU General Public License, yet it retains the
+possibility for Arabic Story Generator authors to financially support the development by
+selling commercial licenses. In fact, if you intend to use Arabic Story Generator in a
+"for-profit" environment, where research is conducted to develop or enhance
+a product, is used in a commercial service offering, or when an entity uses
+Arabic Story Generator to participate in government-funded, EU-funded, military or similar
+research projects, then you need to obtain a commercial license (OMNEST).
+In that case, or if you are unsure, please contact the Author or visit
+www.omnest.com to inquire about commercial licenses.
+  What are the rights given to noncommercial users? Similarly to GPL, you
+have the right to use the software, to distribute copies, to receive source
+code, to change the software and distribute your modifications or the
+modified software. Also similarly to the GPL, if you distribute verbatim or
+modified copies of this software, they must be distributed under this
+license.
+  By modeling the GPL, this license guarantees that you're safe when using
+Arabic Story Generator in your work, for teaching or research. This license guarantees
+that Arabic Story Generator will remain available free of charge for nonprofit use. You
+can modify Arabic Story Generator to your purposes, and you can also share your modifications.
+Even in the unlikely case of the authors abandoning Arabic Story Generator entirely, this
+license permits anyone to continue developing it from the last release, and
+to create further releases under this license.
+  We believe that the combination of noncommercial open-source and commercial
+licensing will be beneficial for the whole user community, because income from
+commercial licenses will enable faster development and a higher level of
+software quality, while further enjoying the informal, open communication
+and collaboration channels of open source development.
+  The precise terms and conditions for using, copying, distribution and
+modification follow.
+                         ACADEMIC PUBLIC LICENSE
+    TERMS AND CONDITIONS FOR USE, COPYING, DISTRIBUTION AND MODIFICATION
+  0. Definitions
+  "Program" means a copy of Arabic Story Generator, which is said to be distributed under
+this Academic Public License.
+  "Work based on the Program" means either the Program or any derivative work
+under copyright law: that is to say, a work containing the Program or a
+portion of it, either verbatim or with modifications and/or translated into
+another language.  (Hereinafter, translation is included without limitation
+in the term "modification".)
+  "Using the Program" means any act of creating executables that contain or
+directly use libraries that are part of the Program, running any of the
+tools that are part of the Program, or creating works based on the Program.
+Each licensee is addressed as "you".
+  1. Permission is hereby granted to use the Program free of charge for
+noncommercial purposes, including teaching and academic research at
+universities, colleges and other educational institutions and personal
+non-profit purposes. For using the Program for commercial purposes,
+including but not restricted to consulting activities, design of commercial
+hardware or software networking products, and joint research with a
+commercial entity, government-funded, EU-funded, military or similar
+research projects, you have to contact the Author or visit www.omnest.com
+for an appropriate license. Permission is also granted to use the Program
+for a reasonably limited period of time for the purpose of evaluating its
+usefulness for a particular purpose.
+  2. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+  3. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 2
+above, provided that you also meet all of these conditions:
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose regulations for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+(If the same, independent sections are distributed as part of a package
+that is otherwise reliant on, or is based on the Program, then the
+distribution of the whole package, including but not restricted to the
+independent section, must be on the unmodified terms of this License,
+regardless of who the author of the included sections was.)
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based or reliant on the Program.
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+storage or distribution medium does not bring the other work under
+the scope of this License.
+  4. You may copy and distribute the Program (or a work based on it,
+under Section 3) in object code or executable form under the terms of
+Sections 2 and 3 above provided that you also do one of the following:
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    2 and 3 above on a medium customarily used for software interchange; or,
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 2 and 3 above on a medium
+    customarily used for software interchange; or,
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you received
+    the program in object code or executable form with such an offer,
+    in accord with Subsection b) above.)
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+  5. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+  6. You are not required to accept this License, since you have not
+signed it.  Nothing else grants you permission to modify or distribute
+the Program or its derivative works; law prohibits these actions
+if you do not accept this License.  Therefore, by modifying or distributing
+the Program (or any work based on the Program), you indicate your
+acceptance of this License and all its terms and conditions for copying,
+distributing or modifying the Program or works based on it, to do so.
+  7. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+  8. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+  9. If the distribution and/or use of the Program are restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+                            NO WARRANTY
+  10. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+  11. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED ON IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+                     END OF TERMS AND CONDITIONS

README.md CHANGED

@@ -1,13 +1,51 @@
----
-title: Arabic Story Generator
-emoji: 🦀
-colorFrom: purple
-colorTo: blue
-sdk: gradio
-sdk_version: 4.32.2
-app_file: app.py
-pinned: false
-license: mit
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Gradio Application
+<img src="image_logo.png" style="width:50%; height:auto;">
+This is a Gradio application that allows you to generate an Arabic story using generative AI models.
+## Installation
+1. Clone this repository:
+    ```shell
+    git clone https://github.com/mohammad-albarham/Arabic_story_generator.git
+    ```
+2. Install the required dependencies:
+    ```shell
+    pip install -r requirements.txt
+    ```
+## Usage
+0. Add your OpenAI API key and your Stability AI API key in [llm_models.py](https://github.com/mohammad-albarham/Arabic_story_generator/blob/3702d6cad85fe38ff5944d7f99f43a37d7dec151/llm_models.py#L16) and [image_generator.py](https://github.com/mohammad-albarham/Arabic_story_generator/blob/3702d6cad85fe38ff5944d7f99f43a37d7dec151/image_generator.py#L22).
+1. Run the application:
+    ```shell
+    gradio app.py
+    ```
+2. Open your web browser and navigate to [http://localhost:7860](http://localhost:7860).
+3. Enter a description and the number of pages you want, then click the generate story button.
+## Contributing
+Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
+### Instructions for contributing:
+1. Install the Black formatter:
+`pip install black`
+2. Format every Python file you change by running this command in the terminal:
+`black .`
+For more information about the formatter, see this [tutorial](https://www.freecodecamp.org/news/auto-format-your-python-code-with-black/).
+## License
+[ACADEMIC PUBLIC LICENSE](https://github.com/mohammad-albarham/Arabic_story_generator/tree/main?tab=License-1-ov-file)
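
Usage step 0 asks for the keys to be placed in the source files, while the app.py changes in this commit collect them from two password fields. The sketch below shows one way the two sources could be reconciled: prefer the value typed into the form and fall back to an environment variable. The helper name `resolve_api_key` and the environment variable names are assumptions for illustration and do not appear in the repository.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    ui_value: str, env_var: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Return the key from the form if present, else from the environment.

    `env` defaults to os.environ; it is a parameter only so the function
    can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key = (ui_value or "").strip() or env.get(env_var, "").strip()
    if not key:
        raise ValueError(f"Missing API key: fill in the field or set {env_var}")
    return key


if __name__ == "__main__":
    # Hypothetical variable name; the form value wins over the environment.
    fake_env = {"OPENAI_API_KEY": "from-env"}
    print(resolve_api_key("", "OPENAI_API_KEY", fake_env))
    print(resolve_api_key("typed-in-form", "OPENAI_API_KEY", fake_env))
```

With a helper like this, neither key needs to be committed to the repository.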

app.py CHANGED

@@ -3,7 +3,7 @@ from llm_models import get_text_image_pairs
 import time
 from tqdm import tqdm
-title_markdown = ("""
 <div style="display: flex; justify-content: center; align-items: center; text-align: center; direction: rtl;">
   <img src="https://s11.ax1x.com/2023/12/28/piqvDMV.png" alt="MoE-LLaVA🚀" style="max-width: 120px; height: auto; margin-right: 20px;">
   <div style="display: flex; flex-direction: column; justify-content: center; align-items: center;">
@@ -12,15 +12,19 @@ title_markdown = ("""
     <h2 style="margin: 0; font-size: 1.5em;">صانع القصص بالذكاء الاصطناعي التوليدي</h2>
   </div>
 </div>
-""")
-def get_text_images_values(k, input_prompt):
     pages = int(k)
-    segments_list, images_names =  get_text_image_pairs(pages,input_prompt)
     return segments_list, images_names
 css = """
 .gradio-container {direction: rtl}
 .gradio-container-4-18-0 .prose h1 {direction: rtl};
@@ -31,30 +35,61 @@ with gr.Blocks(css=css) as demo:
     gr.Markdown(title_markdown)
-    prompt = gr.Textbox(label="معلومات بسيطة عن القصة",
-                info="أدخل بعض المعلومات عن القصة، مثلاً: خالد صبي في الرابعة من عمره، ويحب أن يصبح طياراً في المستقبل",
-                placeholder="خالد صبي في الرابعة من عمره، ويحب أن يصبح طياراً في المستقبل",
-                text_align="right",
-                rtl=True,
-                elem_classes="rtl-textbox",
-                elem_id="rtl-textbox")
     with gr.Row():
-        max_textboxes = 10 # Define the max number of textboxes, so we will add the max number of textboxes and images to the layout
         def variable_outputs(k, segments_list):
             k = int(k)
-            return [gr.Textbox(label= f"الصفحة رقم {i+1}", value=item, text_align="right", visible=True) for i, item in enumerate(segments_list)] + [gr.Textbox(visible=False, text_align="right", rtl=True)]*(max_textboxes-k)
-        def variable_outputs_image(k,images_names):
             k = int(k)
-            return [gr.Image(value=item, scale=1, visible=True) for item in images_names] + [gr.Image(scale=1,visible=False)]*(max_textboxes-k)
         with gr.Column():
-            s = gr.Slider(1, max_textboxes, value=1, step=1, info="أقصى عدد صفحات يمكن توليده هو 10 صفحات",label="كم عدد صفحات القصة التي تريدها؟")
             textboxes = []
             imageboxes = []
             for i in tqdm(range(max_textboxes)):
@@ -64,15 +99,15 @@ with gr.Blocks(css=css) as demo:
                     imageboxes.append(i_t)
                     textboxes.append(t)
-            segment_list = gr.JSON(value=[],visible=False)
             images_list = gr.JSON(value=[], visible=False)
     submit = gr.Button(value="أنشئ القصة الآن")
     submit.click(
         fn=get_text_images_values,
-        inputs=[s,prompt],
-        outputs=[segment_list, images_list]
     ).then(
         fn=variable_outputs,
         inputs=[s, segment_list],
@@ -83,4 +118,4 @@ with gr.Blocks(css=css) as demo:
         outputs=imageboxes,
     )
-demo.launch()

 import time
 from tqdm import tqdm
+title_markdown = """
 <div style="display: flex; justify-content: center; align-items: center; text-align: center; direction: rtl;">
   <img src="https://s11.ax1x.com/2023/12/28/piqvDMV.png" alt="MoE-LLaVA🚀" style="max-width: 120px; height: auto; margin-right: 20px;">
   <div style="display: flex; flex-direction: column; justify-content: center; align-items: center;">
     <h2 style="margin: 0; font-size: 1.5em;">صانع القصص بالذكاء الاصطناعي التوليدي</h2>
   </div>
 </div>
+"""
+def get_text_images_values(k, input_prompt, api_key_openai, api_key_stability_ai):
     pages = int(k)
+    segments_list, images_names = get_text_image_pairs(
+        pages, input_prompt, api_key_openai, api_key_stability_ai
+    )
     return segments_list, images_names
 css = """
 .gradio-container {direction: rtl}
 .gradio-container-4-18-0 .prose h1 {direction: rtl};
     gr.Markdown(title_markdown)
+    with gr.Row():
+        api_key_openai = gr.Textbox(
+            label="Open AI API Key",
+            placeholder="أدخل مفتاح API الخاص بك هنا",
+            type="password",
+        )
+        api_key_stability_ai = gr.Textbox(
+            label="Stability AI API Key",
+            placeholder="أدخل مفتاح API الخاص بك هنا",
+            type="password",
+        )
+    prompt = gr.Textbox(
+        label="معلومات بسيطة عن القصة",
+        info="أدخل بعض المعلومات عن القصة، مثلاً: خالد صبي في الرابعة من عمره، ويحب أن يصبح طياراً في المستقبل",
+        placeholder="خالد صبي في الرابعة من عمره، ويحب أن يصبح طياراً في المستقبل",
+        text_align="right",
+        rtl=True,
+        elem_classes="rtl-textbox",
+        elem_id="rtl-textbox",
+    )
     with gr.Row():
+        max_textboxes = 10  # Define the max number of textboxes, so we will add the max number of textboxes and images to the layout
         def variable_outputs(k, segments_list):
             k = int(k)
+            return [
+                gr.Textbox(
+                    label=f"الصفحة رقم {i+1}",
+                    value=item,
+                    text_align="right",
+                    visible=True,
+                )
+                for i, item in enumerate(segments_list)
+            ] + [gr.Textbox(visible=False, text_align="right", rtl=True)] * (
+                max_textboxes - k
+            )
+        def variable_outputs_image(k, images_names):
             k = int(k)
+            return [
+                gr.Image(value=item, scale=1, visible=True) for item in images_names
+            ] + [gr.Image(scale=1, visible=False)] * (max_textboxes - k)
         with gr.Column():
+            s = gr.Slider(
+                1,
+                max_textboxes,
+                value=1,
+                step=1,
+                info="أقصى عدد صفحات يمكن توليده هو 10 صفحات",
+                label="كم عدد صفحات القصة التي تريدها؟",
+            )
             textboxes = []
             imageboxes = []
             for i in tqdm(range(max_textboxes)):
                     imageboxes.append(i_t)
                     textboxes.append(t)
+            segment_list = gr.JSON(value=[], visible=False)
             images_list = gr.JSON(value=[], visible=False)
     submit = gr.Button(value="أنشئ القصة الآن")
     submit.click(
         fn=get_text_images_values,
+        inputs=[s, prompt, api_key_openai, api_key_stability_ai],
+        outputs=[segment_list, images_list],
     ).then(
         fn=variable_outputs,
         inputs=[s, segment_list],
         outputs=imageboxes,
     )
+demo.launch()
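
In `variable_outputs` and `variable_outputs_image`, the visible part of the returned list is sized by the length of the generated list, while the hidden padding is sized by `max_textboxes - k`. If the model returns a different number of segments than the slider value, the total may no longer match the number of pre-built output components. The sketch below isolates the padding rule in plain Python and sizes the padding from the actual list length instead. The name `pad_slots` and the `(value, visible)` tuple representation are assumptions standing in for the Gradio component updates.

```python
from typing import List, Tuple

MAX_SLOTS = 10  # mirrors max_textboxes in app.py


def pad_slots(values: List[str], max_slots: int = MAX_SLOTS) -> List[Tuple[str, bool]]:
    """Return exactly max_slots (value, visible) pairs.

    Extra values beyond max_slots are dropped; missing ones are filled
    with hidden, empty slots.
    """
    shown = values[:max_slots]
    visible = [(value, True) for value in shown]
    hidden = [("", False)] * (max_slots - len(shown))
    return visible + hidden


if __name__ == "__main__":
    for value, is_visible in pad_slots(["page one", "page two"]):
        print(repr(value), is_visible)
```

In the app, each pair would map to a `gr.Textbox` or `gr.Image` update with the corresponding `value` and `visible` arguments.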

image_generator.py CHANGED

@@ -5,68 +5,73 @@ from PIL import Image
 from stability_sdk import client
 import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
 import uuid
 # Our Host URL should not be prepended with "https" nor should it have a trailing slash.
-os.environ['STABILITY_HOST'] = 'grpc.stability.ai:443'
-# Sign up for an account at the following link to get an API Key.
-# https://platform.stability.ai/
-# Click on the following link once you have created an account to be taken to your API Key.
-# https://platform.stability.ai/account/keys
-# Paste your API Key below.
-os.environ['STABILITY_KEY'] = 'key_here'
-# Set up our connection to the API.
-stability_api = client.StabilityInference(
-    key=os.environ['STABILITY_KEY'], # API Key reference.
-    verbose=True, # Print debug messages.
-    engine="stable-diffusion-xl-1024-v1-0", # Set the engine to use for generation.
-    # Check out the following link for a list of available engines: https://platform.stability.ai/docs/features/api-parameters#engine
-)
-def get_image(prompt):
-    # Set up our initial generation parameters.
-    answers = stability_api.generate(
-        prompt=prompt, # The prompt we want to generate an image from.
-        seed=4253978046, # If a seed is provided, the resulting generated image will be deterministic.
-                        # What this means is that as long as all generation parameters remain the same, you can always recall the same image simply by generating it again.
-                        # Note: This isn't quite the case for Clip Guided generations, which we'll tackle in a future example notebook.
-        steps=30, # Amount of inference steps performed on image generation. Defaults to 30.
-        cfg_scale=8.0, # Influences how strongly your generation is guided to match your prompt.
-                    # Setting this value higher increases the strength in which it tries to match your prompt.
-                    # Defaults to 7.0 if not specified.
-        width=512, # Generation width, defaults to 512 if not included.
-        height=512, # Generation height, defaults to 512 if not included.
-        samples=1, # Number of images to generate, defaults to 1 if not included.
-        sampler=generation.SAMPLER_K_DPMPP_2M # Choose which sampler we want to denoise our generation with.
-                                                    # Defaults to k_dpmpp_2m if not specified. Clip Guidance only supports ancestral samplers.
-                                                    # (Available Samplers: ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2, k_dpm_2_ancestral, k_dpmpp_2s_ancestral, k_lms, k_dpmpp_2m, k_dpmpp_sde)
-    )
-    # print("Finish the prompt")
-    # Set up our warning to print to the console if the adult content classifier is tripped.
-    # If adult content classifier is not tripped, save generated images.
-    for resp in answers:
-        for artifact in resp.artifacts:
-            # if artifact.finish_reason == generation.FILTER:
-            #     print(artifact.finish_reason)
-            #     print("Warning")
-            #     warnings.warn(
-            #         "Your request activated the API's safety filters and could not be processed."
-            #         "Please modify the prompt and try again.")
-            if artifact.type == generation.ARTIFACT_IMAGE:
-                img = Image.open(io.BytesIO(artifact.binary))
-                unique_filename = str(uuid.uuid4())
-                img.save(str(unique_filename)+ ".png") # Save our generated images with their seed number as the filename.
-    return unique_filename + ".png"

 from stability_sdk import client
 import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
 import uuid
+import gradio as gr
 # Our Host URL should not be prepended with "https" nor should it have a trailing slash.
+os.environ["STABILITY_HOST"] = "grpc.stability.ai:443"
+def get_image(prompt, api_key_stability_ai):
+    # Sign up for an account at the following link to get an API Key.
+    # https://platform.stability.ai/
+    # Click on the following link once you have created an account to be taken to your API Key.
+    # https://platform.stability.ai/account/keys
+    # Set up our connection to the API.
+    if api_key_stability_ai == "":
+        raise gr.Error("Please add your Stability AI API key.")
+    else:
+        try:
+            stability_api = client.StabilityInference(
+                key=api_key_stability_ai,  # API Key reference.
+                verbose=True,  # Print debug messages.
+                engine="stable-diffusion-xl-1024-v1-0",  # Set the engine to use for generation.
+                # Check out the following link for a list of available engines: https://platform.stability.ai/docs/features/api-parameters#engine
+            )
+            # Set up our initial generation parameters.
+            answers = stability_api.generate(
+                prompt=prompt,  # The prompt we want to generate an image from.
+                seed=4253978046,  # If a seed is provided, the resulting generated image will be deterministic.
+                # What this means is that as long as all generation parameters remain the same, you can always recall the same image simply by generating it again.
+                # Note: This isn't quite the case for Clip Guided generations, which we'll tackle in a future example notebook.
+                steps=30,  # Amount of inference steps performed on image generation. Defaults to 30.
+                cfg_scale=8.0,  # Influences how strongly your generation is guided to match your prompt.
+                # Setting this value higher increases the strength in which it tries to match your prompt.
+                # Defaults to 7.0 if not specified.
+                width=512,  # Generation width, defaults to 512 if not included.
+                height=512,  # Generation height, defaults to 512 if not included.
+                samples=1,  # Number of images to generate, defaults to 1 if not included.
+                sampler=generation.SAMPLER_K_DPMPP_2M,  # Choose which sampler we want to denoise our generation with.
+                # Defaults to k_dpmpp_2m if not specified. Clip Guidance only supports ancestral samplers.
+                # (Available Samplers: ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2, k_dpm_2_ancestral, k_dpmpp_2s_ancestral, k_lms, k_dpmpp_2m, k_dpmpp_sde)
+            )
+            # print("Finish the prompt")
+            # Set up our warning to print to the console if the adult content classifier is tripped.
+            # If adult content classifier is not tripped, save generated images.
+            for resp in answers:
+                for artifact in resp.artifacts:
+                    # if artifact.finish_reason == generation.FILTER:
+                    #     print(artifact.finish_reason)
+                    #     print("Warning")
+                    #     warnings.warn(
+                    #         "Your request activated the API's safety filters and could not be processed."
+                    #         "Please modify the prompt and try again.")
+                    if artifact.type == generation.ARTIFACT_IMAGE:
+                        img = Image.open(io.BytesIO(artifact.binary))
+                        unique_filename = str(uuid.uuid4())
+                        img.save(
+                            str(unique_filename) + ".png"
+                        )  # Save the generated image under a unique UUID-based filename.
+            return unique_filename + ".png"
+        except Exception as error:
+            print(str(error))
+            raise gr.Error(
+                "An error occurred while generating the image. Please try again."
+            )
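
In `get_image`, `unique_filename` is assigned only inside the `ARTIFACT_IMAGE` branch, so a response that carries no image artifact reaches the `return` with the name unset and surfaces as the generic error message. The sketch below shows an explicit check that reports this case directly. The `Artifact` and `Response` dataclasses and the `ARTIFACT_IMAGE` constant are stand-ins, assumed here so the example runs without the Stability SDK; they are not the SDK's own types.

```python
import uuid
from dataclasses import dataclass, field
from typing import Iterable, List

ARTIFACT_IMAGE = 1  # hypothetical stand-in for generation.ARTIFACT_IMAGE


@dataclass
class Artifact:
    type: int
    binary: bytes = b""


@dataclass
class Response:
    artifacts: List[Artifact] = field(default_factory=list)


def first_image_bytes(responses: Iterable[Response]) -> bytes:
    """Return the bytes of the first image artifact, or raise if there is none."""
    for resp in responses:
        for artifact in resp.artifacts:
            if artifact.type == ARTIFACT_IMAGE:
                return artifact.binary
    raise ValueError("The response contained no image artifact")


def unique_png_name() -> str:
    """Build a collision-resistant filename from a random UUID."""
    return f"{uuid.uuid4()}.png"


if __name__ == "__main__":
    answers = [Response([Artifact(0)]), Response([Artifact(ARTIFACT_IMAGE, b"png-bytes")])]
    print(len(first_image_bytes(answers)), unique_png_name())
```

Returning on the first image also matches `samples=1` in the diff, where only one image is expected per call.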

llm_models.py CHANGED

@@ -3,126 +3,157 @@ from openai import OpenAI
 from pydantic import BaseModel
 from typing import List
 from image_generator import get_image
 class StepByStepAIResponse(BaseModel):
     title: str
     story_segments: List[str]
     image_prompts: List[str]
 class GetTranslation(BaseModel):
     translated_text: List[str]
-client = OpenAI(api_key="key_here")
-def generate_story(k, prompt):
-        """ Generate a story with k segments and initial prompt"""
-        response = client.chat.completions.create(
-        model="gpt-4-turbo-preview",
-        messages=[
-            {
-            "role": "system",
-            "content": f"""
-            Your expertise lies in weaving captivating narratives for children, complemented by images that vividly bring each tale to life. Embark on a creative endeavor to construct a story segmented into {k} distinct chapters, each a cornerstone of an enchanting journey for a young audience.
-            The input prompt will be on Arabic, but the output must be in English.
-            **Task Overview**:
-            1. **Story Development**:
-            - Craft a narrative divided into {k} parts, with a strict 50-word limit for each.
-            - Start with an engaging introduction that lays the foundation for the adventure.
-            - Ensure each part naturally progresses from the previous, crafting a fluid story that escalates to an exhilarating climax.
-            - Wrap up the narrative with a gratifying conclusion that ties all story threads together.
-            - Keep character continuity intact across the story, with consistent presence from beginning to end.
-            - You must describe the characters in details in every image prompt.
-            - Use language and themes that are child-friendly, imbued with wonder, and easy to visualize.
-            - The story will talk about {prompt}
-            2. **Image Generation Instructions for Image Models**:
-            - For every story part, create a comprehensive prompt for generating an image that encapsulates the scene's essence. Each prompt should:
-                - Offer a detailed description of the scene, characters, and critical elements, providing enough specificity for the image model to create a consistent and coherent visual.
-                - Request the images be in an anime style to ensure visual consistency throughout.
-                - Given the image model's isolated processing, reintroduce characters, settings, and pivotal details in each prompt to maintain narrative and visual continuity.
-                - Focus on visual storytelling components that enhance the story segments, steering clear of direct text inclusion in the images.
-            **Key Points**:
-            - Due to the image model's lack of recall, stress the need for self-contained prompts that reintroduce crucial elements each time. This strategy guarantees that, although generated independently, each image mirrors a continuous and cohesive visual story.
-            Through your skill in melding textual and visual storytelling, you will breathe life into this magical tale, offering young readers a journey to remember through both prose and illustration.
-            """
-            },
-        ],
-        functions=[
-            {
-                "name": "get_story_segments_and_image_prompts",
-                "description": "Get user answer in series of segment and image prompts",
-                "parameters": StepByStepAIResponse.model_json_schema(),
-            }
-        ],
-        function_call={"name": "get_story_segments_and_image_prompts"},  # Corrected to match the defined function name
-        temperature=1,
-        max_tokens=1000,
-        top_p=1,
-        frequency_penalty=0,
-        presence_penalty=0
-        )
-        output = json.loads(response.choices[0].message.function_call.arguments)
-        sbs = StepByStepAIResponse(**output)
-        return sbs
-def get_Arabic_translation(story_segments):
-        response = client.chat.completions.create(
-        model="gpt-4-turbo-preview",
-        messages=[
-            {
-            "role": "system",
-            "content":
-            f"""
-            You are an expert translator of text from English to Arabic.
-            On the following, you can find the input text that you need to translate to Arabic:
-            {story_segments}
-            Translate it from English to Arabic.
-            """
-            },
-        ],
-        functions=[
-            {
-                "name": "translate_text_from_english_to_arabic",
-                "description": "Translate the text from English to Arabic.",
-                "parameters": GetTranslation.model_json_schema(),
-            }
-        ],
-        function_call={"name": "translate_text_from_english_to_arabic"},  # Corrected to match the defined function name
-        temperature=1,
-        max_tokens=1000,
-        top_p=1,
-        frequency_penalty=0,
-        presence_penalty=0
-        )
-        output = json.loads(response.choices[0].message.function_call.arguments)
-        sbs = GetTranslation(**output)
-        return sbs
-def get_text_image_pairs(k, prompt):
-    describtion = generate_story(k, prompt)
-    segements_translation = get_Arabic_translation(describtion.story_segments)
-    images_names = [get_image(itm) for itm in describtion.image_prompts]
-    return (segements_translation.translated_text, images_names)

 from pydantic import BaseModel
 from typing import List
 from image_generator import get_image
+import gradio as gr
 class StepByStepAIResponse(BaseModel):
     title: str
     story_segments: List[str]
     image_prompts: List[str]
 class GetTranslation(BaseModel):
     translated_text: List[str]
+def generate_story(k, prompt, api_key):
+    """Generate a story with k segments and initial prompt"""
+    if api_key == "":
+        raise gr.Error("Please add your OpenAI API key.")
+    else:
+        try:
+            client = OpenAI(api_key=api_key)
+            response = client.chat.completions.create(
+                model="gpt-4-turbo-preview",
+                messages=[
+                    {
+                        "role": "system",
+                        "content": f"""
+                    Your expertise lies in weaving captivating narratives for children, complemented by images that vividly bring each tale to life. Embark on a creative endeavor to construct a story segmented into {k} distinct chapters, each a cornerstone of an enchanting journey for a young audience.
+                    The input prompt will be in Arabic, but the output must be in English.
+                    **Task Overview**:
+                    1. **Story Development**:
+                    - Craft a narrative divided into {k} parts, with a strict 50-word limit for each.
+                    - Start with an engaging introduction that lays the foundation for the adventure.
+                    - Ensure each part naturally progresses from the previous, crafting a fluid story that escalates to an exhilarating climax.
+                    - Wrap up the narrative with a gratifying conclusion that ties all story threads together.
+                    - Keep character continuity intact across the story, with consistent presence from beginning to end.
+                    - You must describe the characters in detail in every image prompt.
+                    - Use language and themes that are child-friendly, imbued with wonder, and easy to visualize.
+                    - The story will talk about {prompt}
+                    2. **Image Generation Instructions for Image Models**:
+                    - For every story part, create a comprehensive prompt for generating an image that encapsulates the scene's essence. Each prompt should:
+                        - Offer a detailed description of the scene, characters, and critical elements, providing enough specificity for the image model to create a consistent and coherent visual.
+                        - Request the images be in an anime style to ensure visual consistency throughout.
+                        - Given the image model's isolated processing, reintroduce characters, settings, and pivotal details in each prompt to maintain narrative and visual continuity.
+                        - Focus on visual storytelling components that enhance the story segments, steering clear of direct text inclusion in the images.
+                    **Key Points**:
+                    - Due to the image model's lack of recall, stress the need for self-contained prompts that reintroduce crucial elements each time. This strategy guarantees that, although generated independently, each image mirrors a continuous and cohesive visual story.
+                    Through your skill in melding textual and visual storytelling, you will breathe life into this magical tale, offering young readers a journey to remember through both prose and illustration.
+                    """,
+                    },
+                ],
+                functions=[
+                    {
+                        "name": "get_story_segments_and_image_prompts",
+                        "description": "Get user answer in series of segment and image prompts",
+                        "parameters": StepByStepAIResponse.model_json_schema(),
+                    }
+                ],
+                function_call={
+                    "name": "get_story_segments_and_image_prompts"
+                },  # Corrected to match the defined function name
+                temperature=1,
+                max_tokens=1000,
+                top_p=1,
+                frequency_penalty=0,
+                presence_penalty=0,
+            )
+            output = json.loads(response.choices[0].message.function_call.arguments)
+            sbs = StepByStepAIResponse(**output)
+            return sbs
+        except Exception as error:
+            print(str(error))
+            raise gr.Error(
+                "An error occurred while generating the story. Please try again."
+            )
+def get_Arabic_translation(story_segments, api_key):
+    if api_key == "":
+        raise gr.Error("Please add your OpenAI API key.")
+    else:
+        try:
+            client = OpenAI(api_key=api_key)
+            response = client.chat.completions.create(
+                model="gpt-4-turbo-preview",
+                messages=[
+                    {
+                        "role": "system",
+                        "content": f"""
+                        You are an expert translator of text from English to Arabic.
+                        Below is the input text that you need to translate to Arabic:
+                        {story_segments}
+                        Translate it from English to Arabic.
+                        """,
+                    },
+                ],
+                functions=[
+                    {
+                        "name": "translate_text_from_english_to_arabic",
+                        "description": "Translate the text from English to Arabic.",
+                        "parameters": GetTranslation.model_json_schema(),
+                    }
+                ],
+                function_call={
+                    "name": "translate_text_from_english_to_arabic"
+                },  # Corrected to match the defined function name
+                temperature=1,
+                max_tokens=1000,
+                top_p=1,
+                frequency_penalty=0,
+                presence_penalty=0,
+            )
+            output = json.loads(response.choices[0].message.function_call.arguments)
+            sbs = GetTranslation(**output)
+            return sbs
+        except Exception as error:
+            print(str(error))
+            raise gr.Error(
+                "An error occurred while translating the text. Please try again."
+            )
+def get_text_image_pairs(k, prompt, api_key_openai, api_key_stability_ai):
+    description = generate_story(k, prompt, api_key_openai)
+    segments_translation = get_Arabic_translation(
+        description.story_segments, api_key_openai
+    )
+    images_names = [
+        get_image(itm, api_key_stability_ai) for itm in description.image_prompts
+    ]
+    return (segments_translation.translated_text, images_names)
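
Note on the function above: `get_text_image_pairs` returns the translated segments and the image names as two separate lists, so it relies on the model returning exactly one image prompt per story segment. Nothing in this diff checks that. The following is a minimal sketch of such a check, using only the standard library. `StoryPayload` and `parse_story_arguments` are hypothetical names that are not part of this commit, and the dataclass is only a stand-in for the `StepByStepAIResponse` Pydantic model.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class StoryPayload:
    """Stand-in for StepByStepAIResponse (assumption: same three fields)."""

    title: str
    story_segments: List[str]
    image_prompts: List[str]


def parse_story_arguments(arguments: str, k: int) -> StoryPayload:
    """Parse a function-call arguments string and check it holds k segments and k prompts."""
    data = json.loads(arguments)
    payload = StoryPayload(
        title=data["title"],
        story_segments=list(data["story_segments"]),
        image_prompts=list(data["image_prompts"]),
    )
    # The app pairs segment i with image i, so both lists must have length k.
    if len(payload.story_segments) != k or len(payload.image_prompts) != k:
        raise ValueError(
            f"expected {k} segments and {k} image prompts, got "
            f"{len(payload.story_segments)} and {len(payload.image_prompts)}"
        )
    return payload


# Hypothetical arguments string, shaped like the JSON the function call is asked to return.
sample = json.dumps(
    {
        "title": "The Lost Kite",
        "story_segments": ["Part one.", "Part two."],
        "image_prompts": ["Prompt one.", "Prompt two."],
    }
)
story = parse_story_arguments(sample, k=2)
print(story.title)
```

In the app, a check like this could run on `response.choices[0].message.function_call.arguments` before any image is requested, so a mismatched response fails early instead of producing misaligned text and image pairs.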

requirements.txt CHANGED Viewed

@@ -4,3 +4,4 @@ openai==1.12.0
 pydantic==2.6.1
 rich==13.7.0
 stability-sdk==0.8.5
+black==24.4.2