{"id":11543,"date":"2024-04-14T16:29:00","date_gmt":"2024-04-14T16:29:00","guid":{"rendered":"https:\/\/dailyai.com\/?p=11543"},"modified":"2024-04-15T11:48:24","modified_gmt":"2024-04-15T11:48:24","slug":"xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa","status":"publish","type":"post","link":"https:\/\/dailyai.com\/de\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","title":{"rendered":"xAI gibt eine Vorschau auf Grok-1.5 und erstellt einen neuen Benchmark namens RealWorldQA"},"content":{"rendered":"<p><strong>Elon Musks xAI hat Grok-1.5 vorgestellt, ein multimodales KI-Modell, das die Konkurrenz beim Verstehen realer Szenarien schlagen soll.\u00a0<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">Das neue Grok-1.5 tritt in die Fu\u00dfstapfen anderer Programme wie GPT-4V und erm\u00f6glicht die visuelle Verarbeitung von Dokumenten, Diagrammen, Tabellen, Screenshots und Fotos.<\/span><\/p>\n<p><a href=\"https:\/\/x.ai\/blog\/grok-1.5\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Grok-1.5<\/span><\/a><span style=\"font-weight: 400;\"> gewinnt auch bei Text-, Codierungs- und Mathematikaufgaben an Boden und erzielt 50,6% beim MATH-Benchmark, 90% beim GSM8K-Benchmark und 74,1% beim HumanEval-Benchmark.\u00a0<\/span><\/p>\n<p>Damit geh\u00f6rt Grok-1.5 zu den LLM-Schwergewichten, die im Durchschnitt etwas schlechter abschneiden als Gemini Pro 1.5, GPT-4 und Claude 3 Opus.<\/p>\n<figure id=\"attachment_11546\" aria-describedby=\"caption-attachment-11546\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11546 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png\" alt=\"Grok\" width=\"1024\" height=\"343\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1024x343.png 1024w, 
https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-300x100.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-768x257.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-1536x515.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2-60x20.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks2.png 1633w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11546\" class=\"wp-caption-text\">Grok-1.5's wettbewerbsf\u00e4hige Text-, Mathe- und Codierungsbenchmarks. Quelle: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Grok-1.5 bietet auch ein l\u00e4ngeres Kontextverst\u00e4ndnis mit bis zu 128K Token, eine 16-fache Steigerung im Vergleich zu seinem Vorg\u00e4nger, die jedoch weit hinter den Werten von Claude 3 Opus und Gemini 1.5 Pro liegt.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Eine Needle In A Haystack (NIAH)-Evaluierung zeigte die F\u00e4higkeit von Grok-1.5, eingebetteten Text in Kontexten von bis zu 128K Token L\u00e4nge zu finden.<\/span><\/p>\n<p>Die Sehf\u00e4higkeit von Grok-1.5 wird von xAI jedoch am st\u00e4rksten gef\u00f6rdert.<\/p>\n<p><span style=\"font-weight: 400;\">Demos <\/span><span style=\"font-weight: 400;\">zeigen, wie Grok-1.5 Blockschemata in Python-Code umwandelt, von Kinderbildern inspirierte Gute-Nacht-Geschichten erzeugt, CSV-Datens\u00e4tze aus Screenshots erstellt und sogar Memes \"erweitert\".\u00a0<\/span><\/p>\n<p>Grok-1.5 f\u00fchrt die Rangliste in einigen etablierten Benchmarks wie Mathvista und TextVQA an und erzielt die h\u00f6chste Punktzahl in dem von xAI neu eingef\u00fchrten Benchmark RealWorldQA.<\/p>\n<figure id=\"attachment_11544\" aria-describedby=\"caption-attachment-11544\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11544 size-large\" 
src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png\" alt=\"\" width=\"1024\" height=\"695\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-1024x695.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-300x204.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-768x522.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks-60x41.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/GrokBenchmarks.png 1309w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11544\" class=\"wp-caption-text\">Grok-1.5's beeindruckende Vision-Benchmarks. Quelle: xAI<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Unter der Haube wird Grok-1.5 von einem benutzerdefinierten, verteilten Trainingsframework angetrieben, das es dem xAI-Team erm\u00f6glicht, mit minimalem Aufwand Ideen zu prototypisieren und neue Architekturen in gro\u00dfem Ma\u00dfstab zu trainieren.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> xAI wurde <\/span><a href=\"http:\/\/v\"><span style=\"font-weight: 400;\">letztes Jahr gegr\u00fcndet<\/span><\/a><span style=\"font-weight: 400;\"> und umfasst einige der weltbesten KI-Forscher mit dem \u00e4u\u00dferst ehrgeizigen Ziel, \"das Universum zu verstehen\".\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Bislang haben wir die witzige und ausgefallene Grok-1, die den Leuten erkl\u00e4rt, wie man Rauschgift synthetisiert und <\/span><a href=\"https:\/\/dailyai.com\/de\/2023\/12\/xais-grok-drops-an-awkward-blooper-by-referring-to-openai\/\"><span style=\"font-weight: 400;\">kritisiert Musk und Tesla<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> Grok ist auch mit der Postdatenbank von X verbunden, was ihm neben anderen einzigartigen Eigenheiten eine gro\u00dfe Fangemeinde beschert hat, obwohl es die 
Spitzenreiter bei der reinen Leistung nicht in Bedr\u00e4ngnis bringt.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Musks xAI-Projekt stellt das \u00fcberwiegend geschlossene \u00d6kosystem der generativen KI in Frage, indem es seine Modelle allgemein unter echten <\/span><a href=\"https:\/\/dailyai.com\/de\/2024\/03\/elon-musks-xai-open-sources-its-llm-grok-1\/\"><span style=\"font-weight: 400;\">Open-Source-Lizenzen<\/span><\/a><span style=\"font-weight: 400;\"> ver\u00f6ffentlicht.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In Kombination mit Meta, das eine \u00e4hnliche Absicht hat, sich gegen die Konkurrenz zu stellen, k\u00f6nnte der offene Ansatz von xAI ein Dorn im Auge der Monetarisierungsbem\u00fchungen von OpenAI, Microsoft, Anthropic und Google werden.<\/span><\/p>\n<h2>RealWorldQA<\/h2>\n<p>Im Rahmen der Grok-1.5-Vorschau stellte xAI auch RealWorldQA vor, einen neuen Benchmark, der aus \u00fcber 700 Bildern besteht, die jeweils mit einer Frage und einer \u00fcberpr\u00fcfbaren Antwort versehen sind.<\/p>\n<p><span style=\"font-weight: 400;\">Der Datensatz besteht haupts\u00e4chlich aus anonymisierten Bildern, die in Fahrzeugen und anderen realen Situationen aufgenommen wurden.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Der RealWorldQA-Datensatz wurde entwickelt, um das r\u00e4umliche Verst\u00e4ndnis von Grok-1.5 und anderen multimodalen KI-Modellen zu bewerten. 
xAI war der Meinung, dass andere Benchmarks in diesem Bereich unzureichend waren.\u00a0<\/span><\/p>\n<figure id=\"attachment_11545\" aria-describedby=\"caption-attachment-11545\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-11545 size-large\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png\" alt=\"Grok\" width=\"1024\" height=\"258\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1024x258.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-300x76.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-768x193.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-1536x387.png 1536w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld-60x15.png 60w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/realworld.png 1947w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-11545\" class=\"wp-caption-text\">Mit dem RealWorldQA-Benchmark-Datensatz soll die F\u00e4higkeit der Modelle getestet werden, nat\u00fcrliche Szenen zu verstehen. Quelle: xAI<\/figcaption><\/figure>\n<p>Grok-1.5 schneidet bei RealWorldQA besser ab als die Konkurrenz, und es wird interessant sein zu sehen, ob es sich durchsetzt.<\/p>\n<p><span style=\"font-weight: 400;\">Auch wenn Grok-1.5 nicht in der Lage ist, das Universum zu verstehen, wird es seinen Platz als ein weiteres Spitzenmodell in einer immer gr\u00f6\u00dfer werdenden Produktpalette einnehmen. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Das zeigt auch, dass die generative KI in ihrer derzeitigen Form den H\u00f6hepunkt ihrer Leistungsf\u00e4higkeit erreicht hat - wenn auch vielleicht nicht mehr lange.\u00a0<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Elon Musks xAI hat Grok-1.5 vorgestellt, ein multimodales KI-Modell, das die Konkurrenz beim Verstehen realer Szenarien schlagen soll.  Das neue Grok-1.5 tritt in die Fu\u00dfstapfen anderer Modelle wie GPT-4V und f\u00fchrt die visuelle Verarbeitung ein, um alles von Dokumenten und Diagrammen bis hin zu Tabellen, Screenshots und Fotos zu analysieren. Grok-1.5 gewinnt auch bei Text-, Codierungs- und Matheaufgaben an Boden: 50,6% beim MATH-Benchmark, 90% beim GSM8K-Benchmark und 74,1% beim HumanEval-Benchmark.  Damit geh\u00f6rt Grok-1.5 zu den LLM-Schwergewichten, die im Durchschnitt etwas schlechter abschneiden als Gemini Pro 1.5, GPT-4 und Claude 3 Opus. Grok-1.5 bietet auch l\u00e4ngeres Kontextverst\u00e4ndnis bis<\/p>","protected":false},"author":2,"featured_media":11548,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[188,481,223],"class_list":["post-11543","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-elon-musk","tag-grok","tag-xai"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/de\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" 
\/>\n<meta property=\"og:title\" content=\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. Grok-1.5 also offers longer context understanding up\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/de\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-14T16:29:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-15T11:48:24+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" 
\/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"4\u00a0Minuten\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"},\"wordCount\":546,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"keywords\":[\"Elon 
Musk\",\"Grok\",\"xAI\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"de\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\",\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"datePublished\":\"2024-04-14T16:29:00+00:00\",\"dateModified\":\"2024-04-15T11:48:24+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-sh
ould-b.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linked
in.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/de\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"xAI gibt eine Vorschau auf Grok-1.5 und erstellt einen neuen Benchmark namens RealWorldQA | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/de\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_locale":"de_DE","og_type":"article","og_title":"xAI previews Grok-1.5 and creates a new benchmark called RealWorldQA | DailyAI","og_description":"Elon Musk&#8217;s xAI has revealed Grok-1.5, a multimodal AI model designed to beat competitors in understanding real-world scenarios.\u00a0 Following in the footsteps of others, like GPT-4V, the new Grok-1.5 introduces visual processing to analyze anything from documents and diagrams to charts, screenshots, and photographs. Grok-1.5 also gains ground in text, coding, and math tasks, scoring 50.6% on the MATH benchmark, 90% on the GSM8K benchmark, and 74.1% on the HumanEval benchmark.\u00a0 This throws Grok-1.5 right into the LLM heavyweight tier, averaging slightly lower scores than Gemini Pro 1.5, GPT-4, and Claude 3 Opus. 
Grok-1.5 also offers longer context understanding up","og_url":"https:\/\/dailyai.com\/de\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","og_site_name":"DailyAI","article_published_time":"2024-04-14T16:29:00+00:00","article_modified_time":"2024-04-15T11:48:24+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","type":"image\/webp"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Verfasst von":"Sam Jeans","Gesch\u00e4tzte Lesezeit":"4\u00a0Minuten"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"xAI previews Grok-1.5 and creates a new benchmark called 
RealWorldQA","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"},"wordCount":546,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","keywords":["Elon Musk","Grok","xAI"],"articleSection":["Industry"],"inLanguage":"de"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","url":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/","name":"xAI gibt eine Vorschau auf Grok-1.5 und erstellt einen neuen Benchmark namens RealWorldQA | 
DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","datePublished":"2024-04-14T16:29:00+00:00","dateModified":"2024-04-15T11:48:24+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-14-17.28.41-Create-an-image-with-the-word-GROK-in-a-clear-legible-bold-sans-serif-font-centered-on-a-high-quality-landscape-canvas.-The-background-should-b.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/04\/xai-previews-grok-1-5-and-creates-a-new-benchmark-called-realworldqa\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"xAI previews Grok-1.5 
and creates a new benchmark called RealWorldQA"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"Ihre t\u00e4gliche Dosis an AI-Nachrichten","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam ist ein Wissenschafts- und Technologiewissenschaftler, der in verschiedenen KI-Startups gearbeitet hat. 
Wenn er nicht gerade schreibt, liest er medizinische Fachzeitschriften oder kramt in Kisten mit Schallplatten.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/de\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/posts\/11543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/comments?post=11543"}],"version-history":[{"count":6,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/posts\/11543\/revisions"}],"predecessor-version":[{"id":11553,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/posts\/11543\/revisions\/11553"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/media\/11548"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/media?parent=11543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/categories?post=11543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/de\/wp-json\/wp\/v2\/tags?post=11543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}