{"id":11426,"date":"2024-04-08T17:45:24","date_gmt":"2024-04-08T17:45:24","guid":{"rendered":"https:\/\/dailyai.com\/?p=11426"},"modified":"2024-04-09T08:28:17","modified_gmt":"2024-04-09T08:28:17","slug":"inside-big-techs-tussle-over-ai-training-data","status":"publish","type":"post","link":"https:\/\/dailyai.com\/sv\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/","title":{"rendered":"Inside Big Tech\u2019s tussle over AI training data"},"content":{"rendered":"<p><b>In the frantic pursuit of AI training data, tech giants OpenAI, Google, and Meta have reportedly bypassed corporate policies, altered their rules, and discussed circumventing copyright law.\u00a0<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A <\/span><a href=\"https:\/\/www.nytimes.com\/2024\/04\/06\/technology\/tech-giants-harvest-data-artificial-intelligence.html?smid=nytcore-ios-share&amp;sgrp=c-cb\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">New York Times investigation<\/span><\/a><span style=\"font-weight: 400;\"> reveals the lengths these companies have gone to harvest online information to feed their data-hungry AI systems.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In late 2021, OpenAI researchers developed a speech recognition tool called Whisper to transcribe YouTube videos when facing a shortage of reputable English-language text data.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Despite internal discussions about potentially violating YouTube\u2019s rules, which prohibit using its videos for \"independent\" applications, the NYT found that OpenAI ultimately transcribed over one million hours of YouTube content. Greg Brockman, OpenAI\u2019s president, personally helped collect the videos. 
The transcribed text was then fed into GPT-4.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Google also allegedly transcribed YouTube videos to harvest text for its AI models, potentially infringing the copyrights of video creators. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">This comes days after YouTube\u2019s CEO said that such activity would violate the <\/span><a href=\"https:\/\/dailyai.com\/sv\/2024\/04\/youtube-ceo-warns-openai-about-potential-terms-of-service-violation\/\"><span style=\"font-weight: 400;\">company\u2019s terms of service<\/span><\/a><span style=\"font-weight: 400;\"> and undermine creators.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In June 2023, Google\u2019s legal department requested changes to the company\u2019s privacy policy so that publicly available content from Google Docs and other Google apps could be used for a wider range of AI products.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Meta, facing its own data shortage, has considered various options for acquiring more training data.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Executives discussed paying for book licensing rights, buying the publisher Simon &amp; Schuster, and even harvesting copyrighted material from the internet without permission, at the risk of lawsuits.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Meta\u2019s lawyers argued that using data to train AI systems should fall under \"fair use,\" citing a 2015 court decision involving Google\u2019s book-scanning project.<\/span><\/p>\n<h2>Ethical questions and the future of AI training data<\/h2>\n<p><span style=\"font-weight: 
400;\">These tech companies\u2019 collective actions highlight the critical importance of online data in the booming AI industry.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">These practices have raised questions about copyright infringement and fair compensation for creators.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Justine Bateman, a filmmaker and author, told the Copyright Office that AI models were taking content, including her writing and films, without permission or payment. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">\"This is the largest theft in the United States, period,\" she said in an interview.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In the visual arts, MidJourney and other image models have been <\/span><a href=\"https:\/\/dailyai.com\/sv\/2024\/01\/16000-artist-names-leaked-as-midjourney-styles\/\"><span style=\"font-weight: 400;\">shown to generate copyrighted<\/span><\/a><span style=\"font-weight: 400;\"> content, such as scenes from Marvel films.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With some experts predicting that high-quality online data could be exhausted by 2026, companies are exploring alternative methods, such as generating synthetic data with AI models.\u00a0<\/span><span style=\"font-weight: 400;\">Synthetic training data comes with its own risks and challenges, however, and can adversely <\/span><a href=\"https:\/\/dailyai.com\/sv\/2023\/06\/what-happens-when-ai-starts-consuming-its-own-output\/\"><span style=\"font-weight: 400;\">affect the quality of models<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OpenAI CEO Sam Altman himself acknowledged that online data is a finite resource in a speech at a tech conference in May 2023: \"It will run out,\" he said.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Sy Damle, a lawyer representing Andreessen Horowitz, a Silicon Valley venture capital firm, also discussed the challenge: \"The only practical way for these tools to exist is if they can be trained on massive amounts of data without having to license that data. The data needed is so massive that even collective licensing really can\u2019t work.\"<\/span><\/p>\n<p>The NYT and OpenAI are locked in a <a href=\"https:\/\/dailyai.com\/sv\/2023\/08\/the-new-york-times-may-sue-openai-over-copyright-claims\/\">bitter copyright lawsuit<\/a>, with the Times seeking what would likely be millions in damages.<\/p>\n<p>OpenAI hit back, accusing the Times of <a href=\"https:\/\/dailyai.com\/sv\/2024\/02\/openai-blasts-the-new-york-times-claiming-they-hacked-their-evidence\/\">\"hacking\" its models<\/a> to retrieve examples of copyright infringement.<\/p>\n<p>By \"hacking,\" OpenAI means jailbreaking or red-teaming, which involves targeting the model with specially formulated prompts designed to manipulate its outputs.<\/p>\n<p>The NYT wrote that it would not need to jailbreak models if AI companies were transparent about the data they used.<\/p>\n<p>There is no doubt that this inside investigation further casts Big Tech\u2019s data heist as ethically and legally unacceptable.<\/p>\n<p><span style=\"font-weight: 400;\">With lawsuits mounting,<\/span><span style=\"font-weight: 400;\">\u00a0the legal landscape surrounding the use of online data for AI training is extremely uncertain.\u00a0<\/span><\/p>","protected":false},
"excerpt":{"rendered":"<p>In the frantic pursuit of AI training data, tech giants OpenAI, Google, and Meta have reportedly bypassed corporate policies, altered their rules, and discussed circumventing copyright law. A New York Times investigation reveals the lengths these companies have gone to harvest online information to feed their data-hungry AI systems. In late 2021, OpenAI researchers developed a speech recognition tool called Whisper to transcribe YouTube videos when facing a shortage of reputable English-language text data. Despite internal discussions about potentially violating YouTube\u2019s rules, which prohibit using its videos for \"independent\" applications, the NYT found that OpenAI ultimately transcribed over one million hours<\/p>","protected":false},"author":2,"featured_media":11427,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[88],"tags":[197],"class_list":["post-11426","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ethics","tag-copyright"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Inside Big Tech\u2019s tussle over AI training data | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/sv\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/\" \/>\n<meta property=\"og:locale\" content=\"sv_SE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Inside Big Tech\u2019s tussle over AI training data | DailyAI\" \/>\n<meta property=\"og:description\" content=\"In the frantic pursuit of AI training data, tech giants OpenAI, Google, and Meta have reportedly bypassed corporate policies, altered their rules, and discussed 
circumventing copyright law.\u00a0 A New York Times investigation reveals the lengths these companies have gone to harvest online information to feed their data-hungry AI systems. In late 2021, OpenAI researchers developed a speech recognition tool called Whisper to transcribe YouTube videos when facing a shortage of reputable English-language text data.\u00a0 Despite internal discussions about potentially violating YouTube&#8217;s rules, which prohibit using its videos for &#8220;independent&#8221; applications,\u00a0 NYT found that OpenAI ultimately transcribed over one million hours\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/sv\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-08T17:45:24+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-09T08:28:17+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Skriven av\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Ber\u00e4knad l\u00e4stid\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minuter\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"Inside Big Tech\u2019s tussle over AI training data\",\"datePublished\":\"2024-04-08T17:45:24+00:00\",\"dateModified\":\"2024-04-09T08:28:17+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/\"},\"wordCount\":621,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp\",\"keywords\":[\"Copyright\"],\"articleSection\":[\"Ethics &amp; Society\"],\"inLanguage\":\"sv-SE\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/\",\"name\":\"Inside Big Tech\u2019s tussle over AI training data | 
DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp\",\"datePublished\":\"2024-04-08T17:45:24+00:00\",\"dateModified\":\"2024-04-09T08:28:17+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#breadcrumb\"},\"inLanguage\":\"sv-SE\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/04\\\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp\",\"width\":1792,\"height\":1024,\"caption\":\"Data\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/04\\\/inside-big-techs-tussle-over-ai-training-data\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\
":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Inside Big Tech\u2019s tussle over AI training data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"sv-SE\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam 
Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"sv-SE\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/sv\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Inblick i Big Techs kamp om AI-tr\u00e4ningsdata | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/sv\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/","og_locale":"sv_SE","og_type":"article","og_title":"Inside Big Tech\u2019s tussle over AI training data | DailyAI","og_description":"In the frantic pursuit of AI training data, tech giants OpenAI, Google, and Meta have reportedly bypassed corporate policies, altered their rules, and discussed circumventing copyright law.\u00a0 A New York Times investigation reveals the lengths these companies have gone to harvest online information to feed their data-hungry AI systems. 
In late 2021, OpenAI researchers developed a speech recognition tool called Whisper to transcribe YouTube videos when facing a shortage of reputable English-language text data.\u00a0 Despite internal discussions about potentially violating YouTube&#8217;s rules, which prohibit using its videos for &#8220;independent&#8221; applications,\u00a0 NYT found that OpenAI ultimately transcribed over one million hours","og_url":"https:\/\/dailyai.com\/sv\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/","og_site_name":"DailyAI","article_published_time":"2024-04-08T17:45:24+00:00","article_modified_time":"2024-04-09T08:28:17+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp","type":"image\/webp"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Skriven av":"Sam Jeans","Ber\u00e4knad l\u00e4stid":"3 minuter"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"Inside Big Tech\u2019s tussle over AI training 
data","datePublished":"2024-04-08T17:45:24+00:00","dateModified":"2024-04-09T08:28:17+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/"},"wordCount":621,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp","keywords":["Copyright"],"articleSection":["Ethics &amp; Society"],"inLanguage":"sv-SE"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/","url":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/","name":"Inblick i Big Techs kamp om AI-tr\u00e4ningsdata | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp","datePublished":"2024-04-08T17:45:24+00:00","dateModified":"2024-04-09T08:28:17+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#breadcrumb"},"inLanguage":"sv-SE","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/"]}]},{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/dailyai.com\/2024\/04\/inside-
big-techs-tussle-over-ai-training-data\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/04\/DALL\u00b7E-2024-04-08-18.42.46-Visualize-a-dramatic-and-futuristic-scene-inside-a-vast-data-center-filled-with-towering-server-racks-emitting-blue-and-red-lights-casting-a-vibrant.webp","width":1792,"height":1024,"caption":"Data"},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/04\/inside-big-techs-tussle-over-ai-training-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"Inside Big Tech\u2019s tussle over AI training data"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DagligaAI","description":"Din dagliga dos av 
AI-nyheter","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"sv-SE"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DagligaAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"sv-SE","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam \u00e4r en vetenskaps- och teknikskribent som har arbetat i olika AI-startups. 
N\u00e4r han inte skriver l\u00e4ser han medicinska tidskrifter eller gr\u00e4ver igenom l\u00e5dor med vinylskivor.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/sv\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11426","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/comments?post=11426"}],"version-history":[{"count":7,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11426\/revisions"}],"predecessor-version":[{"id":11434,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/posts\/11426\/revisions\/11434"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/media\/11427"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/media?parent=11426"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/categories?post=11426"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/sv\/wp-json\/wp\/v2\/tags?post=11426"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}