{"id":3939,"date":"2023-08-07T18:18:57","date_gmt":"2023-08-07T18:18:57","guid":{"rendered":"https:\/\/dailyai.com\/?p=3939"},"modified":"2023-08-09T09:59:48","modified_gmt":"2023-08-09T09:59:48","slug":"openai-inconspicuously-unveils-its-own-data-scraper-gptbot","status":"publish","type":"post","link":"https:\/\/dailyai.com\/fr\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/","title":{"rendered":"OpenAI inconspicuously unveils its own data scraper, GPTBot"},"content":{"rendered":"<p><b>OpenAI has discreetly unveiled GPTBot, a dedicated web scraper for collecting training data.<\/b><\/p>\n<p><strong>Edit<\/strong>: It is currently unclear whether GPTBot is the same bot, or an updated version of the bot, that OpenAI used to scrape data alongside Common Crawl in 2018\/2019, or whether it is a new or evolved version. Either way, this is the first time OpenAI has published information on how to prevent it from scraping website data.<\/p>\n<p><span style=\"font-weight: 400;\">OpenAI has published information about GPTBot on its <\/span><a href=\"https:\/\/platform.openai.com\/docs\/gptbot\"><span style=\"font-weight: 400;\">website here<\/span><\/a><span style=\"font-weight: 400;\">, including details on how website administrators can prevent it from crawling and scraping their sites.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To prevent GPTBot from crawling a website, administrators can adjust the settings in the robots.txt file. 
This file, a standard website-management tool that dates back roughly thirty years, tells crawlers which areas of a website are off limits.\u00a0<\/span><\/p>\n<p>To briefly distinguish crawling from scraping: crawlers traverse the content of websites, while scrapers extract the data from it. It is a two-part process, although the two are usually referred to collectively as \"scraping\".<\/p>\n<p><span style=\"font-weight: 400;\">OpenAI has also disclosed the IP address block used by GPTBot, <\/span><a href=\"https:\/\/openai.com\/gptbot-ranges.txt\"><span style=\"font-weight: 400;\">available here<\/span><\/a><span style=\"font-weight: 400;\">, which offers another way to inhibit the bot's activity.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Some have questioned whether this gives OpenAI an additional layer of protection against allegations of unauthorized data use.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> OpenAI and other AI developers are facing <a href=\"https:\/\/dailyai.com\/fr\/2023\/08\/inside-the-battle-between-artists-and-ai-image-generators\/\">major lawsuits<\/a> over how they used people's data without permission.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Website administrators must now proactively prevent their sites from being scraped for training data, so the onus is on them to keep their site's data out of training datasets.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It is worth noting that GPTBot is not the only tool of its kind. 
OpenAI has used other datasets to train its models, including the Common Crawl dataset.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Like GPTBot, the CCBot crawler can also be controlled by adding specific lines to the robots.txt file.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">How to prevent ChatGPT from crawling your site's data<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">OpenAI will use GPTBot for targeted data scraping, but it can be prevented from scanning entire websites or specific web pages. Read OpenAI's <\/span><a href=\"https:\/\/platform.openai.com\/docs\/gptbot\"><span style=\"font-weight: 400;\">full documentation here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OpenAI has published the following information:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPTBot is identified by its user agent token \"GPTBot\". 
The full user-agent string associated with it is: \"Mozilla\/5.0 AppleWebKit\/537.36 (KHTML, like Gecko; compatible; GPTBot\/1.0; +https:\/\/openai.com\/gptbot)\".<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Editing the robots.txt file makes it possible to block GPTBot from accessing an entire website or selected parts of it.\u00a0<\/span><\/p>\n<p><strong>To block GPTBot from accessing a site, administrators can edit their website's robots.txt file as follows:<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">User-agent: GPTBot<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Disallow: \/<\/span><\/p>\n<p><strong>Parts of a website can be allowed or disallowed as follows:<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">User-agent: GPTBot<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Allow: \/directory-1\/<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Disallow: \/directory-2\/<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OpenAI has also made public the IP address ranges used by GPTBot, <\/span><a href=\"https:\/\/openai.com\/gptbot-ranges.txt\"><span style=\"font-weight: 400;\">available here<\/span><\/a><span style=\"font-weight: 400;\">. Although only one range has been listed, more may be added in due course.<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>OpenAI has discreetly unveiled GPTBot, a dedicated web scraper for collecting training data. Edit: It is currently unclear whether GPTBot is the same bot, or an updated version of the bot, that OpenAI used to scrape data alongside Common Crawl in 2018\/2019, or whether it is a new or evolved version. 
Either way, this is the first time OpenAI has published information on how to prevent it from scraping website data. OpenAI has published information about GPTBot on its website here, including details on how website administrators can prevent it from crawling and scraping their sites. To prevent GPTBot from crawling a website, administrators can adjust the settings in the robots.txt file.<\/p>","protected":false},"author":2,"featured_media":3940,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[115,238,93],"class_list":["post-3939","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-chatgpt","tag-data-scraping","tag-openai"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>OpenAI inconspicuously unveils its own data scraper, GPTBot | DailyAI<\/title>\n<meta name=\"description\" content=\"OpenAI discretely unveiled GPTBot, a dedicated web crawler.OpenAI discretely unveiled GPTBot, a dedicated web scraper for collecting training data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/fr\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/\" \/>\n<meta property=\"og:locale\" content=\"fr_FR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"OpenAI inconspicuously unveils its own data scraper, GPTBot | DailyAI\" \/>\n<meta property=\"og:description\" content=\"OpenAI discretely unveiled GPTBot, a dedicated web crawler.OpenAI discretely unveiled GPTBot, a dedicated web scraper for collecting training data.\" \/>\n<meta 
property=\"og:url\" content=\"https:\/\/dailyai.com\/fr\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-07T18:18:57+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-08-09T09:59:48+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"667\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"\u00c9crit par\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Dur\u00e9e de lecture estim\u00e9e\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"OpenAI inconspicuously unveils its own data scraper, 
GPTBot\",\"datePublished\":\"2023-08-07T18:18:57+00:00\",\"dateModified\":\"2023-08-09T09:59:48+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/\"},\"wordCount\":455,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/shutterstock_2283461521-1.jpg\",\"keywords\":[\"ChatGPT\",\"Data scraping\",\"OpenAI\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"fr-FR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/\",\"name\":\"OpenAI inconspicuously unveils its own data scraper, GPTBot | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/shutterstock_2283461521-1.jpg\",\"datePublished\":\"2023-08-07T18:18:57+00:00\",\"dateModified\":\"2023-08-09T09:59:48+00:00\",\"description\":\"OpenAI discretely unveiled GPTBot, a dedicated web crawler.OpenAI discretely unveiled GPTBot, a dedicated web scraper for collecting training 
data.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#breadcrumb\"},\"inLanguage\":\"fr-FR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/shutterstock_2283461521-1.jpg\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/08\\\/shutterstock_2283461521-1.jpg\",\"width\":1000,\"height\":667,\"caption\":\"OpenAI GPTBot\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/08\\\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"OpenAI inconspicuously unveils its own data scraper, GPTBot\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI 
News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"fr-FR\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. 
When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/fr\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"OpenAI d\u00e9voile discr\u00e8tement son propre scraper de donn\u00e9es, GPTBot | DailyAI","description":"OpenAI a discr\u00e8tement d\u00e9voil\u00e9 GPTBot, un crawler web d\u00e9di\u00e9.OpenAI a discr\u00e8tement d\u00e9voil\u00e9 GPTBot, un scraper web d\u00e9di\u00e9 \u00e0 la collecte de donn\u00e9es d'entra\u00eenement.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/fr\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/","og_locale":"fr_FR","og_type":"article","og_title":"OpenAI inconspicuously unveils its own data scraper, GPTBot | DailyAI","og_description":"OpenAI discretely unveiled GPTBot, a dedicated web crawler.OpenAI discretely unveiled GPTBot, a dedicated web scraper for collecting training data.","og_url":"https:\/\/dailyai.com\/fr\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/","og_site_name":"DailyAI","article_published_time":"2023-08-07T18:18:57+00:00","article_modified_time":"2023-08-09T09:59:48+00:00","og_image":[{"width":1000,"height":667,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg","type":"image\/jpeg"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"\u00c9crit par":"Sam Jeans","Dur\u00e9e de lecture estim\u00e9e":"2 
minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"OpenAI inconspicuously unveils its own data scraper, GPTBot","datePublished":"2023-08-07T18:18:57+00:00","dateModified":"2023-08-09T09:59:48+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/"},"wordCount":455,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg","keywords":["ChatGPT","Data scraping","OpenAI"],"articleSection":["Industry"],"inLanguage":"fr-FR"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/","url":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/","name":"OpenAI d\u00e9voile discr\u00e8tement son propre scraper de donn\u00e9es, GPTBot | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg","datePublished":"2023-08-07T18:18:57+00:00","dateModified":"2023-08-09T09:59:48+00:00","description":"OpenAI a discr\u00e8tement d\u00e9voil\u00e9 
GPTBot, un crawler web d\u00e9di\u00e9.OpenAI a discr\u00e8tement d\u00e9voil\u00e9 GPTBot, un scraper web d\u00e9di\u00e9 \u00e0 la collecte de donn\u00e9es d'entra\u00eenement.","breadcrumb":{"@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#breadcrumb"},"inLanguage":"fr-FR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/"]}]},{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/08\/shutterstock_2283461521-1.jpg","width":1000,"height":667,"caption":"OpenAI GPTBot"},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2023\/08\/openai-inconspicuously-unveils-its-own-data-scraper-gptbot\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"OpenAI inconspicuously unveils its own data scraper, GPTBot"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"Votre dose quotidienne de nouvelles sur 
l'IA","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"fr-FR"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam est un r\u00e9dacteur scientifique et technologique qui a travaill\u00e9 dans diverses start-ups sp\u00e9cialis\u00e9es dans l'IA. 
Lorsqu'il n'\u00e9crit pas, on peut le trouver en train de lire des revues m\u00e9dicales ou de fouiller dans des bo\u00eetes de disques vinyles.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/fr\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/posts\/3939","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/comments?post=3939"}],"version-history":[{"count":6,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/posts\/3939\/revisions"}],"predecessor-version":[{"id":3992,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/posts\/3939\/revisions\/3992"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/media\/3940"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/media?parent=3939"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/categories?post=3939"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/fr\/wp-json\/wp\/v2\/tags?post=3939"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}