{"id":51285,"date":"2024-06-19T13:57:32","date_gmt":"2024-06-19T11:57:32","guid":{"rendered":"https:\/\/www.inovex.de\/?p=51285"},"modified":"2024-06-19T13:57:32","modified_gmt":"2024-06-19T11:57:32","slug":"mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning","status":"publish","type":"post","link":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/","title":{"rendered":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning"},"content":{"rendered":"<p>In the dynamic world of artificial intelligence, Large Language Models (LLMs) have emerged as groundbreaking tools, offering exciting possibilities for innovation and research. However, their effectiveness is often hampered by limitations in handling tasks that demand a deeper understanding beyond text or require nuanced common-sense reasoning and extensive world knowledge. Addressing these limitations is crucial for advancing towards the goal of Artificial General Intelligence (AGI).<\/p>\n<p>This blog post introduces a novel framework designed to empower LLMs with more sophisticated decision-making abilities through the strategic use of advanced planning algorithms. 
This approach is exemplified through a case study on Visual Question Answering (VQA) using Monte Carlo Tree Search (MCTS), demonstrating how our framework enables LLMs to act more autonomously and effectively within an environment.<!--more--><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\"><\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#LLMs-as-Agents\" >LLMs as Agents<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Instructing-and-Reasoning\" >Instructing and Reasoning<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#MCTS-as-Environment-and-Tools\" >MCTS as Environment and Tools<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Case-Study-WebQA-Tackling-a-Multi-modal-Multi-hop-Question-Answering-Benchmark\" >Case Study WebQA: Tackling a Multi-modal, Multi-hop Question Answering Benchmark<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Task-Description\" >Task 
Description<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Why-is-it-hard-and-important\" >Why is it hard (and important)?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Approach\" >Approach<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Results-and-Findings\" >Results and Findings<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#Broader-Picture-Future-Uses-of-LLMs\" >Broader Picture: Future Uses of LLMs<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"LLMs-as-Agents\"><\/span>LLMs as Agents<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The term \u201cagents\u201c might sound familiar to those who have followed the Machine Learning (ML) community for a while, since it was already frequently used in the context of Reinforcement Learning (RL) back when DeepMind introduced AlphaGo and AlphaZero. The agents back then were implemented as relatively small neural networks, compared to the LLM powerhouses we have today. 
Thus, the question arises: why not use even more powerful networks in the form of LLMs?<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Instructing-and-Reasoning\"><\/span>Instructing and Reasoning<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>An agent executes actions and observes their consequences, meaning how the world changes and what feedback it receives \u2013 which can be thought of as a reward or punishment. Doing so requires two \u201cskills\u201c of LLMs: understanding the instructions and being able to reason about the information available. Since we are all familiar with GPT by now, we know exactly what that means: you input text, i.e. the prompt, and you get text as output.<\/p>\n<p>Being able to understand instructions is crucial not only for grasping the task at hand but also for following the instructions written in the prompt precisely. Recognizing both the task and its demands is intuitive for humans but much harder to instill in an ML model.<\/p>\n<figure id=\"attachment_51288\" aria-describedby=\"caption-attachment-51288\" style=\"width: 940px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-51288 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/cot.png\" alt=\"\" width=\"940\" height=\"473\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/cot.png 940w, https:\/\/www.inovex.de\/wp-content\/uploads\/cot-300x151.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/cot-768x386.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/cot-400x201.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/cot-360x181.png 360w\" sizes=\"auto, (max-width: 940px) 100vw, 940px\" \/><figcaption id=\"caption-attachment-51288\" class=\"wp-caption-text\">Left: the standard way of questioning an LLM. 
Right: eliciting \u201cthought\u201c behavior as part of the prompt.<\/figcaption><\/figure>\n<p>Suppose we want to instruct the model with a simple task. What does \u201csimple\u201c even mean? In the figure above, on the left side, we can see that a seemingly simple task is answered incorrectly. This prompted the research community to explore prompting techniques that tackle such mistakes without touching the model parameters, which in the case of proprietary models might not even be possible.<\/p>\n<p>The approach on the right, called Chain-of-Thought prompting, is one such example: it changes the prompt structure and aims to elicit a more thoughtful response. Note that we still deal with a black-box model that is far from perfect, but it is a good start toward making the LLM behave in the intended way.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"MCTS-as-Environment-and-Tools\"><\/span>MCTS as Environment and Tools<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Now that we have talked about the agent, we need to define the \u201cworld\u201c, or environment, it operates in. This means setting the rules of the world the agent interacts with. For finite, well-defined games like chess this is quite simple and can be hardcoded (we ignore the value estimation process since it is out of scope), as we can check which moves are illegal, who won in the end, etc. 
To plan strategically, one needs to foresee the possible consequences of actions, preferably testing them in parallel, and such simulations \u2013 or rollouts, as RL enthusiasts might remember \u2013 should be assigned a reasonable reward.<\/p>\n<figure id=\"attachment_52436\" aria-describedby=\"caption-attachment-52436\" style=\"width: 1600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-52436 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1.jpg\" alt=\"\" width=\"1600\" height=\"900\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1.jpg 1600w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-300x169.jpg 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-1024x576.jpg 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-768x432.jpg 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-1536x864.jpg 1536w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-400x225.jpg 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-720x406.jpg 720w, https:\/\/www.inovex.de\/wp-content\/uploads\/rl-mcts-1-360x203.jpg 360w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\" \/><figcaption id=\"caption-attachment-52436\" class=\"wp-caption-text\">Left: RL formulation of action, state, and reward with the LLM as agent. Environment dynamics are handled by MCTS, which updates the tree based on the LLM&#8217;s decisions. Right: the four stages of MCTS that grow the tree, progressing and learning over time.<\/figcaption><\/figure>\n<p>Repeating this process of exploring different actions and analyzing their consequences allows the agent to iteratively refine its decision-making, much like trial-and-error learning.<\/p>\n<p>Additionally, actions might sometimes require functionality or capabilities not directly built into the agent. Hence, researchers have increasingly tried to pair LLMs with external tools. 
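To make the four MCTS stages from the figure above concrete, here is a minimal, illustrative sketch in Python. This is not the implementation of our framework; `Node`, `select`, `expand`, `simulate`, and `backpropagate` are generic stand-ins, using UCT as a common selection rule.

```python
import math
import random

class Node:
    """A search-tree node holding a state plus visit statistics."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

def select(node, c=1.4):
    """Selection: descend to a leaf, trading off exploitation vs. exploration (UCT)."""
    while node.children:
        node = max(
            node.children,
            key=lambda ch: ch.total_reward / (ch.visits + 1e-9)
            + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)),
        )
    return node

def expand(node, actions, apply_action):
    """Expansion: attach one child per available action; return one to simulate."""
    for action in actions(node.state):
        node.children.append(Node(apply_action(node.state, action), parent=node))
    return random.choice(node.children) if node.children else node

def simulate(node, rollout_policy):
    """Simulation (rollout): estimate the value of the new state."""
    return rollout_policy(node.state)

def backpropagate(node, reward):
    """Backpropagation: propagate the rollout reward up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent
```

One search iteration chains these four calls; repeating them many times is what makes the statistics in the tree meaningful.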
These tools can take various forms, including web search, API calls, and, more recently, instructing robotic systems through textual commands provided by the LLM itself, thereby bridging the gap between software and hardware.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Case-Study-WebQA-Tackling-a-Multi-modal-Multi-hop-Question-Answering-Benchmark\"><\/span>Case Study WebQA: Tackling a Multi-modal, Multi-hop Question Answering Benchmark<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Wow, that was a mouthful! We will first break down what we mean by that \u2013 what exactly the task is about and what its characteristics are \u2013 as well as how an LLM can be seen as an agent that tackles it.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Task-Description\"><\/span>Task Description<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>WebQA is a VQA benchmark: the objective is to answer textual questions by leveraging images as potential contextual cues, which may contain part or all of the information required to answer the overall question.<\/p>\n<p>Multi-modal means that the input encompasses not only textual data but also visual elements. Multi-hop, on the other hand, signifies the necessity of gathering information from multiple sources \u2013 be it text or image \u2013 to obtain all the necessary data. Lastly, the predicted answer is expected to be fluent text, which poses a greater challenge than a multiple-choice counterpart. 
This is because you not only have to come up with a reasonable answer but also one that matches the ground truth as closely as possible.<\/p>\n<figure id=\"attachment_52439\" aria-describedby=\"caption-attachment-52439\" style=\"width: 800px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-52439\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/webqa-example.png\" alt=\"\" width=\"800\" height=\"497\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/webqa-example.png 550w, https:\/\/www.inovex.de\/wp-content\/uploads\/webqa-example-300x187.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/webqa-example-400x249.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/webqa-example-360x224.png 360w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><figcaption id=\"caption-attachment-52439\" class=\"wp-caption-text\">Example of a WebQA question, showing all possible sources to choose from. The correct ones are indicated by the green check mark.<\/figcaption><\/figure>\n<p>The figure above shows all the elements we just discussed. Given many potential sources, the task involves identifying the correct ones, followed by a process of reasoning and ultimately formulating an answer based on the information gathered.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Why-is-it-hard-and-important\"><\/span><strong>Why is it hard (and important)?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">At first glance, VQA sounds a lot simpler than chess: we look at texts and images and have to answer some questions. As a human, one would first identify relevant sources (retrieval stage) and then combine the knowledge from those sources to answer the question (question-answering stage). 
While humans can indeed do those tasks relatively well, LLMs struggle for multiple reasons:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Lack of information: Although an LLM is trained on an impressive amount of text, the answer to a complex question often cannot be found internally. Thus, the LLM needs access to an outside database, since in its plain form it can only rely on its \u201cinternal\u201c knowledge.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Understanding images: Related to our first point, plain LLMs are not capable of loading and interpreting images. There are now more capable multimodal models; however, they are still a work in progress. Integrating different modalities is challenging since it brings even more considerations regarding how you train the models, how you evaluate them, etc.<\/span><\/li>\n<\/ol>\n<p>Upon overcoming those challenges, we will have models adept at comprehending both text and images and capable of reasoning over them. With such capabilities, we can harness these powerful machines to automate tasks and provide support across various domains.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Approach\"><\/span>Approach<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Now, let&#8217;s consolidate all aspects and components, bringing the LLM as a VQA agent to life! From a high level, the structure mirrors the classic ML approach: feed some inputs into an algorithm and expect some output.<\/span><\/p>\n<p>In our scenario, this black box is the fusion of the LLM acting as an agent within the MCTS environment we defined. Here, MCTS dynamically constructs a tree structure based on the actions undertaken by the LLM, thereby generating new states that either add new information or lead to a conclusive answer to the overarching question. 
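As an illustration of what such a state could carry, consider the following sketch. The `VQAState` schema is hypothetical, not the exact data structure of our framework: it holds the question, the evidence gathered so far, and an optional answer whose presence marks the state as terminal.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VQAState:
    """Illustrative contents of one tree state (hypothetical schema)."""
    question: str
    evidence: List[str] = field(default_factory=list)  # snippets gathered so far
    answer: Optional[str] = None  # filled in once an "Answer" action is taken

    @property
    def is_terminal(self) -> bool:
        # An "Answer" state ends the search along this path.
        return self.answer is not None
```

Each retrieval step would append to `evidence`, while answering sets `answer` and thereby closes the branch.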
Notably, there can be multiple \u201cAnswer\u201c states, called terminal states; the one with the highest total reward is chosen when the algorithm stops.<\/p>\n<p>The reward itself is determined by the LLM, which evaluates the utility of the action within the current state. While this method is straightforward and cost-effective, it is suboptimal and presents an opportunity for improvement.<\/p>\n<p>The retrieval system functions as the toolset for tasks beyond the capabilities of the LLM. It comprises a standard vector database, backed by an embedder, and a vision-language model that translates information from the image domain to the text domain. Answering, in contrast, is a purely textual task, requiring no external tool.<\/p>\n<p>The intuition behind this approach lies in the LLM&#8217;s ability to make decisions regarding two key aspects:<\/p>\n<p>1) Assessing the quality of a state, i.e., its contextual relevance for answering a question.<br \/>\n2) Evaluating whether additional information is required, identifying missing pieces, and formulating queries to address these gaps.<\/p>\n<p>This incremental nature aligns seamlessly with the multi-hop requirement of WebQA, where multiple pieces of information may be needed to fully address a question.<\/p>\n<figure id=\"attachment_52474\" aria-describedby=\"caption-attachment-52474\" style=\"width: 1600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-52474 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/overview.jpg\" alt=\"\" width=\"1600\" height=\"900\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/overview.jpg 1600w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-300x169.jpg 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-1024x576.jpg 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-768x432.jpg 768w, 
https:\/\/www.inovex.de\/wp-content\/uploads\/overview-1536x864.jpg 1536w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-400x225.jpg 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-720x406.jpg 720w, https:\/\/www.inovex.de\/wp-content\/uploads\/overview-360x203.jpg 360w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\" \/><figcaption id=\"caption-attachment-52474\" class=\"wp-caption-text\">(Simplified) Overview: between the inputs and the output sits our algorithm, using MCTS for the environment dynamics and the LLM as an agent that makes decisions about and executes actions at each decision point, i.e., node.<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">As components, we have the agent, represented by the LLM, along with additional models as tools that provide relevant evidence. The green and blue colors indicate which model is active at each step. In our core setup, the available actions are \u201cRetrieve\u201c and \u201cAnswer\u201c; at each node, the model can choose a fixed number of actions (in our example, two). The \u201cAnswer\u201c action marks a terminal state, indicating that the tree search stops at that point. As depicted, there are multiple terminal states (\u201cAnswer\u201c nodes) in the tree. The winning node is the terminal state with the highest total reward; the search thus uses the reward to estimate the best possible sequence of actions. It is essential to note that this is a simplified case where only two types of actions are considered; however, one is free to design any actions deemed appropriate for the task.<\/span><\/p>\n<p>Once again, let&#8217;s discuss actions within this framework. Each action leads to a new state. In the case of \u201cAnswer\u201c, the model is prompted to provide a final solution to the question based on the evidence collected thus far. 
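The expand-and-score loop just described could be sketched as follows. This is a simplified illustration rather than our actual implementation: `choose_actions`, `expand_state`, and `reward_fn` are hypothetical stand-ins for the LLM agent's action proposals, the environment update, and the LLM-based reward.

```python
class SearchNode:
    """Minimal node: a state plus the reward accumulated along its path."""
    def __init__(self, state, total_reward=0.0):
        self.state = state
        self.total_reward = total_reward

def search(root, choose_actions, expand_state, reward_fn, branching=2):
    """Grow the tree: at each node the agent proposes up to `branching` actions
    ("Retrieve" or "Answer"); "Answer" creates a terminal node. The winner is
    the terminal node with the highest total reward. (A real implementation
    would also cap the depth so endless retrieval cannot loop forever.)"""
    frontier, terminals = [root], []
    while frontier:
        node = frontier.pop()
        for action in choose_actions(node.state)[:branching]:
            new_state = expand_state(node.state, action)
            child = SearchNode(new_state, node.total_reward + reward_fn(new_state))
            (terminals if action == "Answer" else frontier).append(child)
    return max(terminals, key=lambda n: n.total_reward)
```

The `branching` parameter mirrors the fixed number of actions per node mentioned above, trading speed against exploration.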
For \u201cRetrieve\u201c, the process is more intricate: the reasoner formulates a textual query, which is then sent to the retriever. The retriever returns the highest-scoring evidence in text format. If the selected evidence includes images, a dedicated multimodal model extracts the relevant information from those images. Since \u201cRetrieve\u201c does not create a terminal node, the algorithm continues by asking the reasoner to select further actions from the available set.<\/p>\n<p><span style=\"font-weight: 400;\">As one can see, the model is free to gather evidence it perceives as lacking, thereby iteratively constructing a working memory. Simultaneously, the model explores multiple potential paths, rendering it more failure-tolerant than, for instance, Chain-of-Thought, where recovery from accumulated mistakes is typically challenging once the model has derailed.<\/span><\/p>\n<p>Furthermore, this framework is highly adaptable, allowing for the straightforward definition of new actions, customization of reward structures, and exploration of various parameters such as those governing the tree search. For instance, you can adjust the number of actions selected per step, thereby balancing speed against exploration.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Results-and-Findings\"><\/span><strong>Results and Findings<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>We examined the effectiveness of our approach by comparing it against methods fine-tuned and explicitly tailored for WebQA. Our approach demonstrates comparable performance while offering greater flexibility and ease of investigation. 
This is because we can analyze all actions and their associated rewards, enabling us to pinpoint where the LLM might have taken a wrong turn and make adjustments accordingly.<\/p>\n<p>The integration of MCTS with LLMs in VQA represents a novel approach that offers several advantages:<\/p>\n<p>1. Flexibility: Our framework can adapt to various question types and visual contexts, showcasing broad applicability across different VQA scenarios.<\/p>\n<p>2. Extensibility: The architecture of our system is designed for scalability, facilitating seamless integration of additional modules or updates as the field progresses.<\/p>\n<p>3. Robust Decision-Making: Through the use of MCTS, our approach excels in navigating complex decision spaces, enabling more nuanced and contextually appropriate responses. In contrast to CoT, MCTS allows the LLM to correct itself.<\/p>\n<p>4. Generalizability: Unlike models that necessitate extensive fine-tuning on specific datasets, our framework maintains a degree of generalizability, performing well across diverse datasets without extensive dataset-specific optimization.<\/p>\n<p>However, we also realized that, despite these advancements, challenges persist. Error propagation, for instance, occurs when a flawed query yields unreliable sources, potentially leading to erroneous conclusions. Current models also continue to struggle with accurately interpreting instructions. Moreover, hallucinations and overconfidence remain challenges, potentially skewing the accuracy of generated responses.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Broader-Picture-Future-Uses-of-LLMs\"><\/span>Broader Picture: Future Uses of LLMs<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Our framework&#8217;s performance not only showcases the flexibility of Large Language Models (LLMs) but also underscores their potential to serve as foundational tools for a wide range of tasks. 
The ability of our framework to adapt and integrate within the scope of VQA is a testament to the versatility of LLMs. The integration of MCTS with LLMs represents a significant step towards harnessing the computational and contextual strengths of these models, demonstrating that the power of LLMs extends well beyond mere language processing tasks. Such advancements will pave the way for more sophisticated, real-life use cases, such as interactive robots, advanced medical diagnostics, and enhanced virtual assistants.<\/p>\n<p>We anticipate the emergence of more sophisticated algorithms that enhance the applicability of LLMs in production environments. The implications for real-world applications are vast and varied: from improving accessibility technology to creating more immersive educational tools, the potential is boundless.<\/p>\n<p>If you are interested in collaborating on use cases or exploring how LLMs can be used to improve your business, feel free to reach out to us at inovex!<\/p>\n<p>In conclusion, our framework is a precursor to a broader adoption and refinement of LLMs across tasks and modalities. It is a clear indication that as we move forward, the integration of complex algorithms with LLMs will not only be common but also essential for creating versatile, efficient, and effective AI systems. We stand on the brink of a new era in AI, characterized by a profound and intuitive understanding of the world around us, driven by the remarkable advancements in LLMs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the dynamic world of artificial intelligence, Large Language Models (LLMs) have emerged as groundbreaking tools, offering exciting possibilities for innovation and research. However, their effectiveness is often hampered by limitations in handling tasks that demand a deeper understanding beyond text or require nuanced common-sense reasoning and extensive world knowledge. 
Addressing these limitations is crucial [&hellip;]<\/p>\n","protected":false},"author":192,"featured_media":54874,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"ep_exclude_from_search":false,"footnotes":""},"tags":[511,206,393,578,258],"service":[76,75],"coauthors":[{"id":192,"display_name":"Andreas Binder","user_nicename":"abinder"},{"id":311,"display_name":"Johanna Heinz","user_nicename":"jheinz2"},{"id":52,"display_name":"Florian Wilhelm","user_nicename":"fwilhelm"}],"class_list":["post-51285","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","tag-artificial-intelligence-2","tag-data-science","tag-image-retrieval","tag-information-extraction","tag-reinforcement-learning","service-artificial-intelligence","service-nlp"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex GmbH<\/title>\n<meta name=\"description\" content=\"Explore how our novel framework enhances LLMs&#039; decision-making through advanced planning algorithms like MCTS, demonstrated in Visual Question Answering (VQA).\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex GmbH\" \/>\n<meta property=\"og:description\" content=\"Explore how our novel framework enhances LLMs&#039; decision-making through advanced planning algorithms like MCTS, demonstrated in Visual 
Question Answering (VQA).\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/\" \/>\n<meta property=\"og:site_name\" content=\"inovex GmbH\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inovexde\" \/>\n<meta property=\"article:published_time\" content=\"2024-06-19T11:57:32+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1510\" \/>\n\t<meta property=\"og:image:height\" content=\"890\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andreas Binder, Johanna Heinz, Florian Wilhelm\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS-1024x604.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:site\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andreas Binder\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"12\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label3\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data3\" content=\"Andreas Binder, Johanna Heinz, Florian Wilhelm\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/\"},\"author\":{\"name\":\"Andreas 
Binder\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/ac4b0328408e4f08c3775b34a64ff887\"},\"headline\":\"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning\",\"datePublished\":\"2024-06-19T11:57:32+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/\"},\"wordCount\":2394,\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/Blogheader-MCTS.jpg\",\"keywords\":[\"Artificial Intelligence\",\"Data Science\",\"Image Retrieval\",\"Information Extraction\",\"Reinforcement Learning\"],\"articleSection\":[\"Analytics\"],\"inLanguage\":\"de\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/\",\"name\":\"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex GmbH\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/Blogheader-MCTS.jpg\",\"datePublished\":\"2024-06-19T11:57:32+00:00\",\"description\":\"Explore how our novel framework enhances LLMs' decision-making through advanced planning algorithms like MCTS, demonstrated in Visual 
Question Answering (VQA).\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/Blogheader-MCTS.jpg\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/Blogheader-MCTS.jpg\",\"width\":1510,\"height\":890,\"caption\":\"Grafik: Drei Menschen, die an einem Chatbot arbeiten, der mit einer Monte Carlo Tree Search arbeitet.\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"name\":\"inovex 
GmbH\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\",\"name\":\"inovex GmbH\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"width\":1921,\"height\":1081,\"caption\":\"inovex GmbH\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/inovexde\",\"https:\\\/\\\/x.com\\\/inovexgmbh\",\"https:\\\/\\\/www.instagram.com\\\/inovexlife\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/inovex\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UC7r66GT14hROB_RQsQBAQUQ\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/ac4b0328408e4f08c3775b34a64ff887\",\"name\":\"Andreas 
Binder\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=ge2c09b0b4033d43dec53e943c3da0bf1\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=g\",\"caption\":\"Andreas Binder\"},\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/author\\\/abinder\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex GmbH","description":"Explore how our novel framework enhances LLMs' decision-making through advanced planning algorithms like MCTS, demonstrated in Visual Question Answering (VQA).","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/","og_locale":"de_DE","og_type":"article","og_title":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex GmbH","og_description":"Explore how our novel framework enhances LLMs' decision-making through advanced planning algorithms like MCTS, demonstrated in Visual Question Answering (VQA).","og_url":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/","og_site_name":"inovex GmbH","article_publisher":"https:\/\/www.facebook.com\/inovexde","article_published_time":"2024-06-19T11:57:32+00:00","og_image":[{"width":1510,"height":890,"url":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg","type":"image\/jpeg"}],"author":"Andreas Binder, Johanna Heinz, 
Florian Wilhelm","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS-1024x604.jpg","twitter_creator":"@inovexgmbh","twitter_site":"@inovexgmbh","twitter_misc":{"Verfasst von":"Andreas Binder","Gesch\u00e4tzte Lesezeit":"12\u00a0Minuten","Written by":"Andreas Binder, Johanna Heinz, Florian Wilhelm"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#article","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/"},"author":{"name":"Andreas Binder","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/ac4b0328408e4f08c3775b34a64ff887"},"headline":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning","datePublished":"2024-06-19T11:57:32+00:00","mainEntityOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/"},"wordCount":2394,"publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg","keywords":["Artificial Intelligence","Data Science","Image Retrieval","Information Extraction","Reinforcement Learning"],"articleSection":["Analytics"],"inLanguage":"de"},{"@type":"WebPage","@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/","url":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/","name":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning - inovex 
GmbH","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#primaryimage"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg","datePublished":"2024-06-19T11:57:32+00:00","description":"Explore how our novel framework enhances LLMs' decision-making through advanced planning algorithms like MCTS, demonstrated in Visual Question Answering (VQA).","breadcrumb":{"@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#primaryimage","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/Blogheader-MCTS.jpg","width":1510,"height":890,"caption":"Grafik: Drei Menschen, die an einem Chatbot arbeiten, der mit einer Monte Carlo Tree Search arbeitet."},{"@type":"BreadcrumbList","@id":"https:\/\/www.inovex.de\/de\/blog\/mcts-meets-llms-enabling-complex-reasoning-and-strategic-planning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.inovex.de\/de\/"},{"@type":"ListItem","position":2,"name":"MCTS meets LLMs: Enabling Complex Reasoning and Strategic Planning"}]},{"@type":"WebSite","@id":"https:\/\/www.inovex.de\/de\/#website","url":"https:\/\/www.inovex.de\/de\/","name":"inovex 
GmbH","description":"","publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.inovex.de\/de\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/www.inovex.de\/de\/#organization","name":"inovex GmbH","url":"https:\/\/www.inovex.de\/de\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","width":1921,"height":1081,"caption":"inovex GmbH"},"image":{"@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/inovexde","https:\/\/x.com\/inovexgmbh","https:\/\/www.instagram.com\/inovexlife\/","https:\/\/www.linkedin.com\/company\/inovex","https:\/\/www.youtube.com\/channel\/UC7r66GT14hROB_RQsQBAQUQ"]},{"@type":"Person","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/ac4b0328408e4f08c3775b34a64ff887","name":"Andreas Binder","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/secure.gravatar.com\/avatar\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=ge2c09b0b4033d43dec53e943c3da0bf1","url":"https:\/\/secure.gravatar.com\/avatar\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4c0f2ebe076701ffd35ce6e6df6c4112ac28f53e06cd0d1c30865e7cf6eb6bb9?s=96&d=retro&r=g","caption":"Andreas 
Binder"},"url":"https:\/\/www.inovex.de\/de\/blog\/author\/abinder\/"}]}},"_links":{"self":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/51285","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/users\/192"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/comments?post=51285"}],"version-history":[{"count":5,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/51285\/revisions"}],"predecessor-version":[{"id":54582,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/51285\/revisions\/54582"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media\/54874"}],"wp:attachment":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media?parent=51285"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/tags?post=51285"},{"taxonomy":"service","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/service?post=51285"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/coauthors?post=51285"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}