Update README.md

c68a757 verified about 2 months ago

7.71 kB

	---
	license: apache-2.0
	library_name: transformers
	inference: false
	base_model: AIDC-AI/Marco-o1
	tags:
	- llama-cpp
	- gguf-my-repo
	---

	# Triangle104/Marco-o1-Q5_K_S-GGUF
	This model was converted to GGUF format from [`AIDC-AI/Marco-o1`](https://huggingface.co/AIDC-AI/Marco-o1) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/AIDC-AI/Marco-o1) for more details on the model.

	---
	Model details:
	-
	Marco-o1 not only focuses on disciplines with
	standard answers, such as mathematics, physics, and coding—which are
	well-suited for reinforcement learning (RL)—but also places greater
	emphasis on open-ended resolutions. We aim to address the question: "Can
	the o1 model effectively generalize to broader domains where clear
	standards are absent and rewards are challenging to quantify?"

	Currently, Marco-o1 Large Language Model (LLM) is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and _innovative reasoning strategies_—optimized for complex real-world problem-solving tasks.


	⚠️ Limitations: We would like to emphasize that
	this research work is inspired by OpenAI's o1 (from which the name is
	also derived). This work aims to explore potential approaches to shed
	light on the currently unclear technical roadmap for large reasoning
	models. Besides, our focus is on open-ended questions, and we have
	observed interesting phenomena in multilingual applications. However, we
	must acknowledge that the current model primarily exhibits o1-like
	reasoning characteristics and its performance still fall short of a
	fully realized "o1" model. This is not a one-time effort, and we remain
	committed to continuous optimization and ongoing improvement.










	🚀 Highlights




	Currently, our work is distinguished by the following highlights:


	🍀 Fine-Tuning with CoT Data: We develop Marco-o1-CoT by performing
	full-parameter fine-tuning on the base model using open-source CoT
	dataset combined with our self-developed synthetic data.
	🍀 Solution Space Expansion via MCTS: We integrate LLMs with MCTS
	(Marco-o1-MCTS), using the model's output confidence to guide the search
	and expand the solution space.
	🍀 Reasoning Action Strategy: We implement novel reasoning action
	strategies and a reflection mechanism (Marco-o1-MCTS Mini-Step),
	including exploring different action granularities within the MCTS
	framework and prompting the model to self-reflect, thereby significantly
	enhancing the model's ability to solve complex problems.
	🍀 Application in Translation Tasks: We are the first to apply Large
	Reasoning Models (LRM) to Machine Translation task, exploring inference
	time scaling laws in the multilingual and translation domain.


	OpenAI recently introduced the groundbreaking o1 model, renowned for
	its exceptional reasoning capabilities. This model has demonstrated
	outstanding performance on platforms such as AIME, CodeForces,
	surpassing other leading models. Inspired by this success, we aimed to
	push the boundaries of LLMs even further, enhancing their reasoning
	abilities to tackle complex, real-world challenges.


	🌍 Marco-o1 leverages advanced techniques like CoT fine-tuning, MCTS,
	and Reasoning Action Strategies to enhance its reasoning power. As
	shown in Figure 2, by fine-tuning Qwen2-7B-Instruct with a combination
	of the filtered Open-O1 CoT dataset, Marco-o1 CoT dataset, and Marco-o1
	Instruction dataset, Marco-o1 improved its handling of complex tasks.
	MCTS allows exploration of multiple reasoning paths using confidence
	scores derived from softmax-applied log probabilities of the top-k
	alternative tokens, guiding the model to optimal solutions. Moreover,
	our reasoning action strategy involves varying the granularity of
	actions within steps and mini-steps to optimize search efficiency and
	accuracy.





	Figure 2: The overview of Marco-o1.





	🌏 As shown in Figure 3, Marco-o1 achieved accuracy improvements of
	+6.17% on the MGSM (English) dataset and +5.60% on the MGSM (Chinese)
	dataset, showcasing enhanced reasoning capabilities.





	Figure 3: The main results of Marco-o1.





	🌎 Additionally, in translation tasks, we demonstrate that Marco-o1
	excels in translating slang expressions, such as translating "这个鞋拥有踩屎感"
	(literal translation: "This shoe offers a stepping-on-poop sensation.")
	to "This shoe has a comfortable sole," demonstrating its superior grasp
	of colloquial nuances.





	Figure 4: The demostration of translation task using Marco-o1.





	For more information,please visit our Github.







	Usage




	Load Marco-o1-CoT model:


	# Load model directly
	from transformers import AutoTokenizer, AutoModelForCausalLM

	tokenizer = AutoTokenizer.from_pretrained("AIDC-AI/Marco-o1")
	model = AutoModelForCausalLM.from_pretrained("AIDC-AI/Marco-o1")






	Inference:


	Execute the inference script (you can give any customized inputs inside):


	./src/talk_with_model.py

	# Use vLLM
	./src/talk_with_model_vllm.py










	👨🏻‍💻 Acknowledgement









	Main Contributors




	From MarcoPolo Team, AI Business, Alibaba International Digital Commerce:


	Yu Zhao
	Huifeng Yin
	Hao Wang
	Longyue Wang







	Citation




	If you find Marco-o1 useful for your research and applications, please cite:


	@misc{zhao2024marcoo1openreasoningmodels,
	title={Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions},
	author={Yu Zhao and Huifeng Yin and Bo Zeng and Hao Wang and Tianqi Shi and Chenyang Lyu and Longyue Wang and Weihua Luo and Kaifu Zhang},
	year={2024},
	eprint={2411.14405},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2411.14405},
	}








	LICENSE




	This project is licensed under Apache License Version 2 (SPDX-License-identifier: Apache-2.0).







	DISCLAIMER




	We used compliance checking algorithms during the training process,
	to ensure the compliance of the trained model and dataset to the best of
	our ability. Due to complex data and the diversity of language model
	usage scenarios, we cannot guarantee that the model is completely free
	of copyright issues or improper content. If you believe anything
	infringes on your rights or generates improper content, please contact
	us, and we will promptly address the matter.

	---
	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux)

	```bash
	brew install llama.cpp

	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo Triangle104/Marco-o1-Q5_K_S-GGUF --hf-file marco-o1-q5_k_s.gguf -p "The meaning to life and the universe is"
	```

	### Server:
	```bash
	llama-server --hf-repo Triangle104/Marco-o1-Q5_K_S-GGUF --hf-file marco-o1-q5_k_s.gguf -c 2048
	```

	Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.

	Step 1: Clone llama.cpp from GitHub.
	```
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
	```
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the main binary.
	```
	./llama-cli --hf-repo Triangle104/Marco-o1-Q5_K_S-GGUF --hf-file marco-o1-q5_k_s.gguf -p "The meaning to life and the universe is"
	```
	or
	```
	./llama-server --hf-repo Triangle104/Marco-o1-Q5_K_S-GGUF --hf-file marco-o1-q5_k_s.gguf -c 2048
	```