{"id":11543,"date":"2024-04-14T16:29:00","date_gmt":"2024-04-14T16:29:00","guid":{"rendered":"https:\/\/dailyai.com\/?p=11543"},"modified":"2024-04-15T11:48:24","modified_gmt":"2024-04-15T11:48:24","slug":"xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa","status":"publish","type":"post","link":"https:\/\/dailyai.com\/da\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","title":{"rendered":"xAI forh\u00e5ndsviser Grok-1.5 og opretter et nyt benchmark kaldet RealWorldQA"},"content":{"rendered":"<p><strong>Elon Musks xAI har afsl\u00f8ret Grok-1.5, en multimodal AI-model, der er designet til at sl\u00e5 konkurrenterne i at forst\u00e5 scenarier fra den virkelige verden.\u00a0<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">Den nye Grok-1.5 f\u00f8lger i fodsporene p\u00e5 andre som GPT-4V og introducerer visuel behandling til at analysere alt fra dokumenter og diagrammer til grafer, sk\u00e6rmbilleder og fotografier.<\/span><\/p>\n<p><a href=\"https:\/\/x.ai\/blog\/grok-1.5\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Grok-1.5<\/span><\/a><span style=\"font-weight: 400;\"> vinder ogs\u00e5 terr\u00e6n i tekst-, kodnings- og matematikopgaver og scorer 50,6% p\u00e5 MATH-benchmarket, 90% p\u00e5 GSM8K-benchmarket og 74,1% p\u00e5 HumanEval-benchmarket.\u00a0<\/span><\/p>\n<p>Det kaster Grok-1.5 direkte ind i LLM-sv\u00e6rv\u00e6gtsgruppen med et gennemsnit, der er lidt lavere end Gemini Pro 1.5, GPT-4 og Claude 3 Opus.<\/p>\n<figure id=\"attachment_11546\" aria-describedby=\"caption-attachment-11546\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11546 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png\" alt=\"Grok\" width=\"1024\" height=\"343\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png 1024w, 
https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-300x100.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-768x257.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1536x515.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-60x20.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2.png 1633w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11546\" class=\"wp-caption-text\">Grok-1.5's konkurrencedygtige benchmarks for tekst, matematik og kodning. Kilde: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Grok-1.5 tilbyder ogs\u00e5 l\u00e6ngere kontekstforst\u00e5else p\u00e5 op til 128K tokens, en 16-dobling i forhold til forg\u00e6ngeren, men et godt stykke under det, Claude 3 Opus og Gemini 1.5 Pro kan pr\u00e6stere.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">En evaluering af Needle In A Haystack (NIAH) demonstrerede Grok-1.5's evne til at finde indlejret tekst i kontekster p\u00e5 op til 128K tokens i l\u00e6ngden.<\/span><\/p>\n<p>Det er dog Grok-1.5's synsf\u00e6rdigheder, som xAI presser h\u00e5rdest.<\/p>\n<p><span style=\"font-weight: 400;\">Demoer <\/span><span style=\"font-weight: 400;\">viser Grok-1.5, der konverterer blokskemaer til Python-kode, genererer godnathistorier inspireret af b\u00f8rnemalerier, skaber CSV-datas\u00e6t ud fra sk\u00e6rmbilleder og endda \"udvider\" memes.\u00a0<\/span><\/p>\n<p>Grok-1.5 topper ranglisten i nogle etablerede benchmarks som Mathvista og TextVQA og scorer h\u00f8jest i xAI's nyetablerede benchmark, RealWorldQA.<\/p>\n<figure id=\"attachment_11544\" aria-describedby=\"caption-attachment-11544\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11544 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png\" 
alt=\"\" width=\"1024\" height=\"695\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-300x204.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-768x522.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-60x41.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks.png 1309w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11544\" class=\"wp-caption-text\">Grok-1.5's imponerende syns-benchmarks. Kilde: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Under motorhjelmen drives Grok-1.5 af en brugerdefineret, distribueret tr\u00e6ningsramme, der g\u00f8r det muligt for xAI's team at prototype ideer og tr\u00e6ne nye arkitekturer i stor skala med minimal indsats.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> xAI blev <\/span><a href=\"http:\/\/v\"><span style=\"font-weight: 400;\">grundlagt sidste \u00e5r<\/span><\/a><span style=\"font-weight: 400;\"> og omfatter nogle af verdens bedste AI-forskere med det ultraambiti\u00f8se m\u00e5l at \"Forst\u00e5 universet\".\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Indtil videre har vi den vittige og besynderlige Grok-1, der fort\u00e6ller folk, hvordan man syntetiserer narkotika og <\/span><a href=\"https:\/\/dailyai.com\/da\/2023\/12\/xais-grok-drops-an-awkward-blooper-by-referring-to-openai\/\"><span style=\"font-weight: 400;\">kritiserer Musk og Tesla<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> Grok er ogs\u00e5 forbundet med X's postdatabase, som blandt andre unikke s\u00e6rheder har givet den en hel del tilh\u00e6ngere p\u00e5 trods af, at den ikke kan m\u00e5le sig med de f\u00f8rende i ren ydeevne.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Musks xAI-projekt udfordrer generativ 
AI's prim\u00e6rt lukkede \u00f8kosystem ved at g\u00f8re modellerne generelt tilg\u00e6ngelige under \u00e6gte <\/span><a href=\"https:\/\/dailyai.com\/da\/2024\/03\/elon-musks-xai-open-sources-its-llm-grok-1\/\"><span style=\"font-weight: 400;\">open source-licenser<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Kombineret med Meta, som har en lignende intention om at g\u00e5 imod konkurrenterne, kan xAI's \u00e5bne tese blive en torn i \u00f8jet p\u00e5 OpenAI's, Microsofts, Anthropics og Googles bestr\u00e6belser p\u00e5 at tjene penge.<\/span><\/p>\n<h2>RealWorldQA<\/h2>\n<p>I forbindelse med Grok-1.5's preview afsl\u00f8rede xAI ogs\u00e5 RealWorldQA, et nyt benchmark best\u00e5ende af over 700 billeder, som hver is\u00e6r er ledsaget af et sp\u00f8rgsm\u00e5l og et verificerbart svar.<\/p>\n<p><span style=\"font-weight: 400;\">Datas\u00e6ttet best\u00e5r prim\u00e6rt af anonymiserede billeder taget fra k\u00f8ret\u00f8jer og andre situationer i den virkelige verden.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">RealWorldQA-datas\u00e6ttet er designet til at evaluere den rumlige forst\u00e5else i Grok-1.5 og andre multimodale AI-modeller. 
xAI mente, at andre benchmarks manglede i denne afdeling.\u00a0<\/span><\/p>\n<figure id=\"attachment_11545\" aria-describedby=\"caption-attachment-11545\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11545 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png\" alt=\"Grok\" width=\"1024\" height=\"258\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-300x76.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-768x193.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1536x387.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-60x15.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld.png 1947w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11545\" class=\"wp-caption-text\">RealWorldQA-benchmarkdatas\u00e6ttet har til form\u00e5l at teste modellernes evne til at forst\u00e5 naturlige scener. Kilde: xAI<\/figcaption><\/figure>\n<p>Grok-1.5 klarer sig bedre end konkurrenterne i RealWorldQA, og det bliver interessant at se, om den sl\u00e5r igennem.<\/p>\n<p><span style=\"font-weight: 400;\">Selv om den ikke kan forst\u00e5 universet, vil Grok-1.5 indtage sin plads som endnu en topmodel i et stadigt voksende sortiment. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Det viser ogs\u00e5, hvordan generativ AI i sin nuv\u00e6rende form er ved at n\u00e5 toppen af sin form\u00e5en - men m\u00e5ske ikke s\u00e5 l\u00e6nge endnu.\u00a0<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Elon Musks xAI har afsl\u00f8ret Grok-1.5, en multimodal AI-model, der er designet til at sl\u00e5 konkurrenterne i forst\u00e5elsen af scenarier fra den virkelige verden.  
Den nye Grok-1.5 f\u00f8lger i fodsporene p\u00e5 andre som GPT-4V og introducerer visuel behandling til at analysere alt fra dokumenter og diagrammer til grafer, sk\u00e6rmbilleder og fotografier. Grok-1.5 vinder ogs\u00e5 terr\u00e6n i tekst-, kodnings- og matematikopgaver og scorer 50,6% p\u00e5 MATH-benchmarket, 90% p\u00e5 GSM8K-benchmarket og 74,1% p\u00e5 HumanEval-benchmarket.  Dette kaster Grok-1.5 lige ind i LLM-sv\u00e6rv\u00e6gtsklassen og giver i gennemsnit lidt lavere score end Gemini Pro 1.5, GPT-4 og Claude 3 Opus. Grok-1.5 tilbyder ogs\u00e5 l\u00e6ngere kontekstforst\u00e5else op til<\/p>","protected":false},"author":2,"featured_media":11548,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[188,481,223],"class_list":["post-11543","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-elon-musk","tag-grok","tag-xai"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/da\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:locale\" content=\"da_DK\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual 
processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. Grok-1.5 also offers longer context understanding up\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/da\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-14T16:29:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-15T11:48:24+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Skrevet af\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimeret l\u00e6setid\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutter\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"wordCount\":546,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"keywords\":[\"Elon Musk\",\"Grok\",\"xAI\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"da-DK\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | 
DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#breadcrumb\"},\"inLanguage\":\"da-DK\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-
benchmark-called-realworldqa\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"da-DK\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam 
Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/da\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"xAI forh\u00e5ndsviser Grok-1.5 og opretter et nyt benchmark kaldet RealWorldQA | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/da\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_locale":"da_DK","og_type":"article","og_title":"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI","og_description":"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. 
Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. Grok-1.5 also offers longer context understanding up","og_url":"https:\/\/dailyai.com\/da\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_site_name":"DailyAI","article_published_time":"2024-04-14T16:29:00+00:00","article_modified_time":"2024-04-15T11:48:24+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","type":"image\/webp"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Skrevet af":"Sam Jeans","Estimeret l\u00e6setid":"4 minutter"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"xAI previews Grok-1.5 and creates a new benchmark called 
RealWorldQA","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"wordCount":546,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","keywords":["Elon Musk","Grok","xAI"],"articleSection":["Industry"],"inLanguage":"da-DK"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","url":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","name":"xAI forh\u00e5ndsviser Grok-1.5 og opretter et nyt benchmark kaldet RealWorldQA | 
DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb"},"inLanguage":"da-DK","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"]}]},{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"xAI previews 
Grok-1.5 and creates a new benchmark called RealWorldQA"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"Din daglige dosis af AI-nyheder","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"da-DK"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam er videnskabs- og teknologiforfatter og har arbejdet i forskellige AI-startups. 
N\u00e5r han ikke skriver, kan han finde p\u00e5 at l\u00e6se medicinske tidsskrifter eller grave i kasser med vinylplader.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/da\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/11543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/comments?post=11543"}],"version-history":[{"count":6,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/11543\/revisions"}],"predecessor-version":[{"id":11553,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/11543\/revisions\/11553"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/media\/11548"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/media?parent=11543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/categories?post=11543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/tags?post=11543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}