{"id":11543,"date":"2024-04-14T16:29:00","date_gmt":"2024-04-14T16:29:00","guid":{"rendered":"https:\/\/dailyai.com\/?p=11543"},"modified":"2024-04-15T11:48:24","modified_gmt":"2024-04-15T11:48:24","slug":"xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa","status":"publish","type":"post","link":"https:\/\/dailyai.com\/sv\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","title":{"rendered":"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA"},"content":{"rendered":"<p><strong>Elon Musk's xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors at understanding real-world scenarios.\u00a0<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">Following in the footsteps of other models, such as GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs.<\/span><\/p>\n<p><a href=\"https:\/\/x.ai\/blog\/grok-1.5\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Grok-1.5<\/span><\/a><span style=\"font-weight: 400;\"> also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0<\/span><\/p>\n<p>This puts Grok-1.5 in the LLM heavyweight tier, with average scores slightly lower than those of Gemini 1.5 Pro, GPT-4, and Claude 3 Opus.<\/p>\n<figure id=\"attachment_11546\" aria-describedby=\"caption-attachment-11546\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11546 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png\" alt=\"Grok\" width=\"1024\" height=\"343\" 
srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-300x100.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-768x257.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1536x515.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-60x20.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2.png 1633w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11546\" class=\"wp-caption-text\">Grok-1.5's competitive benchmark results for text, math, and coding. Source: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Grok-1.5 also offers longer context understanding of up to 128K tokens, a 16-fold increase over its predecessor, but still far behind the context windows that Claude 3 Opus and Gemini 1.5 Pro offer.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A Needle In A Haystack (NIAH) evaluation demonstrated Grok-1.5's ability to locate embedded text in contexts of up to 128,000 tokens.<\/span><\/p>\n<p>However, it is Grok-1.5's vision capabilities that xAI is pushing hardest.<\/p>\n<p><span style=\"font-weight: 400;\">Demos <\/span><span style=\"font-weight: 400;\">show Grok-1.5 converting flowcharts into Python code, generating bedtime stories inspired by children's drawings, creating CSV datasets from screenshots, and even \"expanding\" memes.\u00a0<\/span><\/p>\n<p>Grok-1.5 tops the leaderboard in some established benchmarks, such as MathVista and TextVQA, and scores highest in xAI's newly established benchmark, RealWorldQA.<\/p>\n<figure id=\"attachment_11544\" aria-describedby=\"caption-attachment-11544\" style=\"width: 1024px\" class=\"wp-caption 
aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11544 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png\" alt=\"\" width=\"1024\" height=\"695\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-300x204.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-768x522.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-60x41.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks.png 1309w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11544\" class=\"wp-caption-text\">Grok-1.5's impressive vision benchmark results. Source: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Grok-1.5 is powered by a custom distributed training framework that enables xAI's team to prototype and train new architectures at scale with minimal effort.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">xAI was <\/span><a href=\"http:\/\/v\"><span style=\"font-weight: 400;\">founded last year<\/span><\/a><span style=\"font-weight: 400;\"> and includes some of the world's leading AI researchers, with the ultra-ambitious goal of \"understanding the universe\".\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">So far, we've had the witty and offbeat Grok-1, which tells people how to synthesize narcotics and <\/span><a href=\"https:\/\/dailyai.com\/sv\/2023\/12\/xais-grok-drops-an-awkward-blooper-by-referring-to-openai\/\"><span style=\"font-weight: 400;\">criticizes Musk and Tesla<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Grok is also connected to X's database of posts, which, among other 
unique quirks, has earned it a sizeable following despite not troubling the leaders in raw performance.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Musk's xAI project challenges the largely closed-source generative AI ecosystem by making its models publicly available under genuine <\/span><a href=\"https:\/\/dailyai.com\/sv\/2024\/03\/elon-musks-xai-open-sources-its-llm-grok-1\/\"><span style=\"font-weight: 400;\">open-source licenses<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Together with Meta, which has a similar intention to go against the grain of its competitors, xAI's open-source thesis could become a thorn in the side of the monetization efforts of OpenAI, Microsoft, Anthropic, and Google.<\/span><\/p>\n<h2>RealWorldQA<\/h2>\n<p>Alongside the Grok-1.5 preview, xAI also introduced RealWorldQA, a new benchmark consisting of over 700 images, each with a question and a verifiable answer.<\/p>\n<p><span style=\"font-weight: 400;\">The dataset consists mainly of anonymized images taken from vehicles and other real-world situations.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The RealWorldQA dataset is designed to evaluate the spatial understanding of Grok-1.5 and other multimodal AI models. 
xAI felt that other benchmarks were lacking in this area.\u00a0<\/span><\/p>\n<figure id=\"attachment_11545\" aria-describedby=\"caption-attachment-11545\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11545 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png\" alt=\"Grok\" width=\"1024\" height=\"258\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-300x76.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-768x193.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1536x387.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-60x15.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld.png 1947w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11545\" class=\"wp-caption-text\">The RealWorldQA benchmark dataset aims to test models' ability to understand natural scenes. Source: xAI<\/figcaption><\/figure>\n<p>Grok-1.5 outperforms its competitors on RealWorldQA, and it will be interesting to see whether the benchmark gains traction.<\/p>\n<p><span style=\"font-weight: 400;\">Even if Grok-1.5 falls short of understanding the universe, it will become yet another top-tier model in an ever-growing lineup. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It also shows how generative AI in its current form is reaching the peak of its abilities, though perhaps not for much longer.\u00a0<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Elon Musk's xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors at understanding real-world scenarios. Following in the footsteps of others, such as GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark. This puts Grok-1.5 in the LLM heavyweight tier, with average scores slightly lower than those of Gemini 1.5 Pro, GPT-4, and Claude 3 Opus. 
Grok-1.5 also offers longer context understanding up to<\/p>","protected":false},"author":2,"featured_media":11548,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[188,481,223],"class_list":["post-11543","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-elon-musk","tag-grok","tag-xai"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/sv\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:locale\" content=\"sv_SE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. 
Grok-1.5 also offers longer context understanding up\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/sv\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-14T16:29:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-15T11:48:24+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Skriven av\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Ber\u00e4knad l\u00e4stid\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minuter\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"xAI previews Grok-1.5 and creates 
a new benchmark called RealWorldQA\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"wordCount\":546,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"keywords\":[\"Elon Musk\",\"Grok\",\"xAI\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"sv-SE\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | 
DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#breadcrumb\"},\"inLanguage\":\"sv-SE\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-
benchmark-called-realworldqa\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"sv-SE\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam 
Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/sv\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"xAI f\u00f6rhandsgranskar Grok-1.5 och skapar ett nytt benchmark kallat RealWorldQA | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/sv\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_locale":"sv_SE","og_type":"article","og_title":"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI","og_description":"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. 
Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. Grok-1.5 also offers longer context understanding up","og_url":"https:\/\/dailyai.com\/sv\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_site_name":"DailyAI","article_published_time":"2024-04-14T16:29:00+00:00","article_modified_time":"2024-04-15T11:48:24+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","type":"image\/webp"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Skriven av":"Sam Jeans","Ber\u00e4knad l\u00e4stid":"4 minuter"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"xAI previews Grok-1.5 and creates a new benchmark called 
RealWorldQA","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"wordCount":546,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","keywords":["Elon Musk","Grok","xAI"],"articleSection":["Industry"],"inLanguage":"sv-SE"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","url":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","name":"xAI f\u00f6rhandsgranskar Grok-1.5 och skapar ett nytt benchmark kallat RealWorldQA | 
DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb"},"inLanguage":"sv-SE","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"]}]},{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"xAI previews 
Grok-1.5 and creates a new benchmark called RealWorldQA"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DagligaAI","description":"Din dagliga dos av AI-nyheter","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"sv-SE"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DagligaAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam \u00e4r en vetenskaps- och teknikskribent som har arbetat i olika AI-startups. 
N\u00e4r han inte skriver l\u00e4ser han medicinska tidskrifter eller gr\u00e4ver igenom l\u00e5dor med vinylskivor.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/sv\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/comments?post=11543"}],"version-history":[{"count":6,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11543\/revisions"}],"predecessor-version":[{"id":11553,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11543\/revisions\/11553"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/media\/11548"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/media?parent=11543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/categories?post=11543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/tags?post=11543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}