{"id":12782,"date":"2024-06-10T10:39:06","date_gmt":"2024-06-10T10:39:06","guid":{"rendered":"https:\/\/dailyai.com\/?p=12782"},"modified":"2024-06-10T10:39:06","modified_gmt":"2024-06-10T10:39:06","slug":"natural-plan-benchmarking-llms-on-natural-language-planning","status":"publish","type":"post","link":"https:\/\/dailyai.com\/it\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/","title":{"rendered":"NATURAL PLAN: Benchmarking LLMs on natural language planning"},"content":{"rendered":"<p><strong>Google DeepMind researchers developed NATURAL PLAN, a benchmark for evaluating the capability of LLMs to plan real-world tasks based on natural language prompts.<\/strong><\/p>\n<p>The next evolution of AI is to have it leave the confines of a chat platform and take on agentic roles to complete tasks across platforms on our behalf. But that's harder than it sounds.<\/p>\n<p>Planning tasks like scheduling a meeting or compiling a holiday itinerary might seem simple for us. Humans are good at reasoning through multiple steps and predicting whether a course of action will accomplish the desired objective or not.<\/p>\n<p>You might find that easy, but even the best AI models struggle with planning. 
Could we benchmark them to see which LLM is best at planning?<\/p>\n<p>The NATURAL PLAN benchmark tests LLMs on 3 planning tasks:<\/p>\n<ul>\n<li><strong>Trip planning<\/strong> - Planning a trip itinerary under flight and destination constraints<\/li>\n<li><strong>Meeting planning<\/strong> - Scheduling meetings with multiple friends in different locations<\/li>\n<li><strong>Calendar scheduling<\/strong> - Scheduling work meetings between multiple people given existing schedules and various constraints<\/li>\n<\/ul>\n<p>The experiment began with few-shot prompting, where the models were given 5 examples of prompts and their correct answers. They were then given planning prompts of varying difficulty.<\/p>\n<p>Here's an example of a prompt and solution provided as an example to the models:<\/p>\n<figure id=\"attachment_12784\" aria-describedby=\"caption-attachment-12784\" style=\"width: 1342px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-12784 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example.png\" alt=\"\" width=\"1342\" height=\"808\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example.png 1342w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example-300x181.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example-1024x617.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example-768x462.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example-18x12.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-Prompt-example-60x36.png 60w\" sizes=\"auto, (max-width: 1342px) 100vw, 1342px\" \/><figcaption id=\"caption-attachment-12784\" 
class=\"wp-caption-text\">An example of a prompt and solution used in the Trip Planning experiment. Source: arXiv<\/figcaption><\/figure>\n<h2>Results<\/h2>\n<p>The researchers tested GPT-3.5, GPT-4, <a href=\"https:\/\/dailyai.com\/it\/2024\/05\/everything-you-need-to-know-about-openais-new-flagship-model-gpt-4o\/\">GPT-4o<\/a>, Gemini 1.5 Flash, and <a href=\"https:\/\/dailyai.com\/it\/2024\/02\/google-plays-another-ai-card-in-the-form-of-gemini-1-5-pro\/\"><span class=\"noTranslate\" data-no-translation=\"\">Gemini<\/span> 1.5 Pro<\/a>, none of which performed very well on these tests.<\/p>\n<p>The results must have gone down well in the DeepMind office, though, as Gemini 1.5 Pro came out on top.<\/p>\n<figure id=\"attachment_12785\" aria-describedby=\"caption-attachment-12785\" style=\"width: 1302px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-12785 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results.png\" alt=\"\" width=\"1302\" height=\"204\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results.png 1302w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results-300x47.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results-1024x160.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results-768x120.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results-18x3.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLAN-results-60x9.png 60w\" sizes=\"auto, (max-width: 1302px) 100vw, 1302px\" \/><figcaption id=\"caption-attachment-12785\" class=\"wp-caption-text\">NATURAL PLAN benchmark results. 
Source: arXiv<\/figcaption><\/figure>\n<p>As expected, the results got exponentially worse with more complex prompts, where the number of people or cities was increased. For example, look at how quickly accuracy dropped as more people were added to the Meeting Planning test.<\/p>\n<figure id=\"attachment_12786\" aria-describedby=\"caption-attachment-12786\" style=\"width: 1330px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-12786 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity.png\" alt=\"\" width=\"1330\" height=\"530\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity.png 1330w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity-300x120.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity-1024x408.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity-768x306.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity-18x7.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/NATURAL-PLANNING-results-vs-complexity-60x24.png 60w\" sizes=\"auto, (max-width: 1330px) 100vw, 1330px\" \/><figcaption id=\"caption-attachment-12786\" class=\"wp-caption-text\">The accuracy of results in the Meeting Planning test degraded exponentially as the prompts became more complex. Source: arXiv<\/figcaption><\/figure>\n<p>Could multi-shot prompting improve accuracy? 
The research results indicate that it can, but only if the model has a large enough context window.<\/p>\n<p>Gemini 1.5 Pro's larger context window enables it to leverage more in-context examples than the GPT models.<\/p>\n<p>The researchers found that in Trip Planning, increasing the number of shots from 1 to 800 improved the accuracy of Gemini 1.5 Pro from 2.7% to 39.9%.<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2406.04520\" target=\"_blank\" rel=\"noopener\">The paper<\/a> noted: \"These results show the promise of in-context planning, where the long-context capabilities enable LLMs to leverage further context to improve planning\".<\/p>\n<p>One strange result was that GPT-4o was really bad at Trip Planning. The researchers found that it struggled \"to understand and respect the flight connectivity and travel date constraints\".<\/p>\n<p>Another strange outcome was that self-correction led to a significant drop in performance across all models. 
When the models were prompted to check their work and make corrections, they made more mistakes.<\/p>\n<p>Interestingly, the stronger models, such as GPT-4 and Gemini 1.5 Pro, suffered bigger losses than GPT-3.5 when self-correcting.<\/p>\n<p>Agentic AI is an exciting prospect, and we're already seeing some practical use cases in <a href=\"https:\/\/dailyai.com\/it\/2024\/05\/ai-agents-multimodal-phi-3-unveiled-at-microsoft-build-2024\/\">Microsoft <span class=\"noTranslate\" data-no-translation=\"\">Copilot<\/span> agents<\/a>.<\/p>\n<p>But the results of the NATURAL PLAN benchmark tests show that there is still some way to go before AI can handle more complex planning.<\/p>\n<p>The DeepMind researchers concluded that \"NATURAL PLAN is very hard for state-of-the-art models to solve\".<\/p>\n<p>It seems AI won't be replacing travel agents and personal assistants just yet.<\/p>","protected":false},"excerpt":{"rendered":"<p>Google DeepMind researchers developed NATURAL PLAN, a benchmark for evaluating the capability of LLMs to plan real-world tasks based on natural language prompts. The next evolution of AI is to have it leave the confines of a chat platform and take on agentic roles to complete tasks across platforms on our behalf. But that's harder than it sounds. Planning tasks like scheduling a meeting or compiling a holiday itinerary might seem simple for us. Humans are good at reasoning through multiple steps and predicting whether a course of action will accomplish the desired objective or not. 
You might find that<\/p>","protected":false},"author":6,"featured_media":12787,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[147,118],"class_list":["post-12782","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-deepmind","tag-llms"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>NATURAL PLAN: Benchmarking LLMs on natural language planning | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/it\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/\" \/>\n<meta property=\"og:locale\" content=\"it_IT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"NATURAL PLAN: Benchmarking LLMs on natural language planning | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Google DeepMind researchers developed NATURAL PLAN, a benchmark for evaluating the capability of LLMs to plan real-world tasks based on natural language prompts. The next evolution of AI is to have it leave the confines of a chat platform and take on agentic roles to complete tasks across platforms on our behalf. But that\u2019s harder than it sounds. Planning tasks like scheduling a meeting or compiling a holiday itinerary might seem simple for us. Humans are good at reasoning through multiple steps and predicting whether a course of action will accomplish the desired objective or not. 
You might find that\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/it\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-06-10T10:39:06+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Eugene van der Watt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Scritto da\" \/>\n\t<meta name=\"twitter:data1\" content=\"Eugene van der Watt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tempo di lettura stimato\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minuti\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/\"},\"author\":{\"name\":\"Eugene van der Watt\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\"},\"headline\":\"NATURAL PLAN: Benchmarking LLMs on natural language 
planning\",\"datePublished\":\"2024-06-10T10:39:06+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/\"},\"wordCount\":606,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/Planning.webp\",\"keywords\":[\"DeepMind\",\"LLMS\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"it-IT\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/\",\"name\":\"NATURAL PLAN: Benchmarking LLMs on natural language planning | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/Planning.webp\",\"datePublished\":\"2024-06-10T10:39:06+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#breadcrumb\"},\"inLanguage\":\"it-IT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-pl
anning\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/Planning.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/06\\\/Planning.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/06\\\/natural-plan-benchmarking-llms-on-natural-language-planning\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"NATURAL PLAN: Benchmarking LLMs on natural language planning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"it-IT\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.c
om\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\",\"name\":\"Eugene van der Watt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"it-IT\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"caption\":\"Eugene van der Watt\"},\"description\":\"Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.\",\"sameAs\":[\"www.linkedin.com\\\/in\\\/eugene-van-der-watt-16828119\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/it\\\/author\\\/eugene\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"NATURAL PLAN: Benchmarking LLMs sulla pianificazione del linguaggio naturale | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/it\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/","og_locale":"it_IT","og_type":"article","og_title":"NATURAL PLAN: Benchmarking LLMs on natural language planning | DailyAI","og_description":"Google DeepMind researchers developed NATURAL PLAN, a benchmark for evaluating the capability of LLMs to plan real-world tasks based on natural language prompts. The next evolution of AI is to have it leave the confines of a chat platform and take on agentic roles to complete tasks across platforms on our behalf. But that\u2019s harder than it sounds. Planning tasks like scheduling a meeting or compiling a holiday itinerary might seem simple for us. 
Humans are good at reasoning through multiple steps and predicting whether a course of action will accomplish the desired objective or not. You might find that","og_url":"https:\/\/dailyai.com\/it\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/","og_site_name":"DailyAI","article_published_time":"2024-06-10T10:39:06+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp","type":"image\/webp"}],"author":"Eugene van der Watt","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Scritto da":"Eugene van der Watt","Tempo di lettura stimato":"4 minuti"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/"},"author":{"name":"Eugene van der Watt","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa"},"headline":"NATURAL PLAN: Benchmarking LLMs on natural language planning","datePublished":"2024-06-10T10:39:06+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/"},"wordCount":606,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp","keywords":["DeepMind","LLMS"],"articleSection":["Industry"],"inLanguage":"it-IT"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/","url":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/","name":"NATURAL 
PLAN: Benchmarking LLMs sulla pianificazione del linguaggio naturale | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp","datePublished":"2024-06-10T10:39:06+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#breadcrumb"},"inLanguage":"it-IT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/"]}]},{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/06\/Planning.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/06\/natural-plan-benchmarking-llms-on-natural-language-planning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"NATURAL PLAN: Benchmarking LLMs on natural language planning"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"La vostra dose quotidiana di notizie sull'intelligenza 
artificiale","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"it-IT"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa","name":"Eugene van der Watt","image":{"@type":"ImageObject","inLanguage":"it-IT","@id":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","caption":"Eugene van der Watt"},"description":"Eugene proviene da un background di ingegneria elettronica e ama tutto ci\u00f2 che \u00e8 tecnologico. 
Quando si prende una pausa dal consumo di notizie sull'intelligenza artificiale, lo si pu\u00f2 trovare al tavolo da biliardo.","sameAs":["www.linkedin.com\/in\/eugene-van-der-watt-16828119"],"url":"https:\/\/dailyai.com\/it\/author\/eugene\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/posts\/12782","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/comments?post=12782"}],"version-history":[{"count":3,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/posts\/12782\/revisions"}],"predecessor-version":[{"id":12789,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/posts\/12782\/revisions\/12789"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/media\/12787"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/media?parent=12782"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/categories?post=12782"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/it\/wp-json\/wp\/v2\/tags?post=12782"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}