lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024

mjbuehler committed on Oct 25

Commit f9768ed • 1 Parent(s): ba01de2

Update README.md

Files changed (1)

README.md +3 -0

@@ -19,6 +19,9 @@ Figure 1: Illustration of the workflow and design principles behind generative m
 Figure 2: PRefLexOR Recursive Reasoning Algorithm: An iterative approach leveraging a fine-tuned Reasoning Model and a general-purpose Critic Model to generate, refine, and optionally integrate responses. The process involves generating initial responses, extracting reflections, improving thinking processes, and creating new responses based on refined thinking, with an optional final integration step. The algorithm relies on extracting thinking processes (indicated via ```<|thinking|>...<|/thinking|>```) and reflection processes  (indicated via ```<|reflect|>...<|/reflect|>```). The use of special tokens allows us to easily construct such agentic modeling as it facilitates pausing inference, improving the strategy, and re-generating improved answers. The sampled responses can either be used in their final state or integrated into an amalgamated response that shows very rich facets in the scientific process.
+PRefLexOR Inference: Thinking and Agentic Reflection
+[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lamm-mit/PRefLexOR/blob/main/PRefLexOR_inference_thinking.ipynb)
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer