PRefLexOR

by MIT
PRefLexOR is a self-learning AI framework by MIT that combines preference optimization and reinforcement learning to improve reasoning through iterative inference.

What is PRefLexOR?

PRefLexOR (Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning) is a self-learning AI framework developed at MIT. It combines preference optimization with concepts from reinforcement learning so that a language model improves its reasoning over repeated rounds of inference. Its recursive reasoning algorithm lets the model work through a problem in multiple steps, then review and revise the intermediate steps during both training and inference, with the aim of producing more accurate final answers.
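The review-and-revise cycle described above can be sketched as a simple loop. This is a minimal illustration, not PRefLexOR's actual code: the function name recursive_refine and the three callables (generate, critique, revise) are hypothetical stand-ins for calls to a language model, and the toy functions at the bottom exist only so the sketch runs on its own.

```python
from typing import Callable, List

def recursive_refine(
    question: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> List[str]:
    """Return every draft produced, from the first answer to the last revision."""
    drafts = [generate(question)]
    for _ in range(max_rounds):
        feedback = critique(question, drafts[-1])
        if not feedback:  # empty feedback: the critic has nothing left to fix
            break
        drafts.append(revise(question, drafts[-1], feedback))
    return drafts

# Toy stand-ins (assumptions for illustration; a real system would call a model).
def toy_generate(question: str) -> str:
    return "draft"

def toy_critique(question: str, answer: str) -> str:
    return "" if answer.endswith("!!") else "add emphasis"

def toy_revise(question: str, answer: str, feedback: str) -> str:
    return answer + "!"

drafts = recursive_refine("q", toy_generate, toy_critique, toy_revise, max_rounds=5)
print(drafts)
```

Keeping every draft, rather than only the last one, makes it possible to compare earlier and later reasoning, which is the kind of comparison preference-based training relies on.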

Key Features of PRefLexOR

  • Dynamic Knowledge Graph Construction: The framework generates tasks and reasoning steps on the fly and uses them to build knowledge graphs in real time, so it can keep adapting as new tasks arrive.
  • Cross-Domain Reasoning Capability: PRefLexOR can integrate and reason across different domains, such as materials science, to generate new design principles.
  • Self-Learning and Evolution: Through recursive optimization and real-time feedback, PRefLexOR can self-teach during training, continuously improving its reasoning strategies.
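As a rough picture of what building a knowledge graph incrementally can look like, the sketch below stores (subject, relation, object) triples in an adjacency map as they arrive. The KnowledgeGraph class, its method names, and the example triples are all assumptions made for illustration; the source does not describe PRefLexOR's actual graph representation or how triples are extracted from reasoning steps.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Hypothetical minimal graph: node -> set of (relation, target) pairs."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_triple(self, subject: str, relation: str, obj: str) -> None:
        self.edges[subject].add((relation, obj))
        self.edges[obj]  # touch the key so the target is registered as a node

    def neighbors(self, node: str) -> list:
        # Sorted, de-duplicated targets reachable from `node` in one hop.
        return sorted({target for _, target in self.edges.get(node, ())})

# Illustrative triples (made up for the example, not taken from the framework).
kg = KnowledgeGraph()
kg.add_triple("silk", "has_property", "high toughness")
kg.add_triple("silk", "composed_of", "beta-sheet crystals")
kg.add_triple("beta-sheet crystals", "contribute_to", "high toughness")

print(kg.neighbors("silk"))
print(len(kg.edges), "nodes")
```

Because triples are added one at a time, the graph can grow while reasoning is still in progress, with no need to rebuild it from scratch.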

Technical Principles of PRefLexOR

  • Recursive Reasoning and Reflection: PRefLexOR introduces "thought tokens" and "reflection tokens" to mark intermediate steps and reflection phases during reasoning, improving response accuracy.
  • Preference Optimization: The framework uses Odds Ratio Preference Optimization (ORPO) and Direct Preference Optimization (DPO) to align reasoning paths with human preferences and enhance reasoning quality.
  • Multi-Stage Training: PRefLexOR's training is divided into multiple stages, first aligning reasoning paths through ORPO and then optimizing reasoning quality through DPO.
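For readers unfamiliar with the two objectives named above, here is a minimal sketch of their standard published forms for a single preference pair, written in plain Python. It assumes sequence log-probabilities have already been computed (length-averaged in the ORPO case), and the function names and the beta default are illustrative choices. PRefLexOR may use variants of these losses; the sketch should not be read as its exact training objective.

```python
import math

def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def dpo_loss(pol_chosen: float, pol_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one pair, given sequence log-probs under the
    policy (pol_*) and a frozen reference model (ref_*)."""
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    return -log_sigmoid(beta * margin)

def orpo_odds_ratio_loss(avg_logp_chosen: float, avg_logp_rejected: float) -> float:
    """Odds-ratio term of ORPO for one pair; inputs are length-averaged
    log-probs and must be negative. The full ORPO loss adds this term,
    scaled by a weight, to the usual supervised loss on the chosen response."""
    def log_odds(logp: float) -> float:
        return logp - math.log1p(-math.exp(logp))  # log(p / (1 - p))
    return -log_sigmoid(log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected))

# With no preference margin, both losses equal log(2); they fall as the
# chosen response becomes more likely relative to the rejected one.
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
print(orpo_odds_ratio_loss(-0.2, -1.5))
```

One practical difference is visible in the signatures: the DPO term needs log-probabilities from a reference model, while the ORPO term does not.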

Application Scenarios of PRefLexOR

  • Materials Science and Design: PRefLexOR demonstrates strong reasoning capabilities in materials science, using dynamic problem generation and Retrieval-Augmented Generation (RAG) techniques.
  • Cross-Domain Reasoning: The framework can integrate knowledge from different domains for cross-domain reasoning and decision-making.
  • Open-Domain Problem Solving: As a reinforcement learning-based self-learning system, PRefLexOR can solve open-domain problems through iterative optimization and feedback-driven learning.
  • Generative Materials Informatics: PRefLexOR can generate materials informatics workflows, transforming information into knowledge and actionable results.
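The Retrieval-Augmented Generation step mentioned above can be illustrated with a toy retriever that ranks passages by word overlap with the question and prepends the best matches to a prompt. Everything here is an assumption for illustration: the function names, the overlap scoring, and the sample passages are hypothetical, and a real pipeline would normally use embedding-based search and not word counts.

```python
import re
from collections import Counter

def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query: str, passages: list, k: int = 2) -> list:
    """Return up to k passages that share the most word occurrences with the query."""
    query_counts = Counter(tokenize(query))
    scored = []
    for passage in passages:
        overlap = sum((query_counts & Counter(tokenize(passage))).values())
        scored.append((overlap, passage))
    scored.sort(key=lambda item: -item[0])  # stable sort keeps input order on ties
    return [passage for score, passage in scored[:k] if score > 0]

def build_prompt(query: str, passages: list, k: int = 2) -> str:
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

passages = [
    "Spider silk combines strength and extensibility.",
    "Collagen forms hierarchical fibrils.",
    "Graphene is a single layer of carbon atoms.",
]
print(build_prompt("Why is spider silk tough?", passages, k=1))
```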

Framework Features

Supported Tasks
Reasoning Optimization, Dynamic Knowledge Graph Construction, Cross-Domain Reasoning, Open-Domain Problem Solving, Generative Materials Informatics
Tags
AI Framework, Self-Learning, Reasoning Optimization, Reinforcement Learning, Preference Optimization, Recursive Reasoning, Dynamic Knowledge Graphs, Cross-Domain Reasoning, Materials Science, Open-Domain Problem Solving

Getting Started

Pricing
Free

Stats

202 GitHub Stars

Similar Frameworks

  • TPO
  • Phantom by ByteDance
  • AgentSociety by Tsinghua University