{"id":6804,"date":"2023-10-26T19:21:21","date_gmt":"2023-10-26T19:21:21","guid":{"rendered":"https:\/\/dailyai.com\/?p=6804"},"modified":"2023-10-26T21:16:11","modified_gmt":"2023-10-26T21:16:11","slug":"new-research-into-datasets-reveals-systemic-ethical-and-legal-issues","status":"publish","type":"post","link":"https:\/\/dailyai.com\/pt\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/","title":{"rendered":"New research into datasets reveals systemic ethical and legal issues"},"content":{"rendered":"<p><b>AI revolves around data, but where does it come from? Are datasets legal and ethical? How can developers know for sure?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Training machine learning models, such as large language models (LLMs), requires large volumes of text data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Plenty of datasets are available on platforms such as Kaggle, GitHub, and Hugging Face, but they sit in a legal and ethical gray area, mainly because of licensing and fair use issues.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The <a href=\"https:\/\/www.dataprovenance.org\/\">Data Provenance Initiative<\/a>, a collaboration between AI researchers and legal professionals, analyzed thousands of datasets to clarify their true origins. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">It <\/span><span style=\"font-weight: 400;\">focused on more than 1,800 datasets available on platforms including Hugging Face, GitHub, and Papers With Code. 
<\/span><span style=\"font-weight: 400;\">The datasets are predominantly designed for fine-tuning open-source models such as Llama-2.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The study found that approximately 70% of these datasets either lacked clear licensing information or were tagged with overly permissive licenses.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With a glaring lack of clarity about copyright and commercial-use restrictions, AI developers risk accidentally breaking the law or infringing copyright.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Shayne Longpre, a PhD candidate at the MIT Media Lab who led the audit, stressed that the problem is not the fault of the hosting platforms but a systemic issue in the machine learning community.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">2023 has seen a <a href=\"https:\/\/dailyai.com\/pt\/2023\/09\/george-r-r-martin-and-17-other-writers-file-lawsuit-against-openai\/\">flurry of lawsuits<\/a> targeting major AI developers such as Meta, Anthropic, and OpenAI, which are under intense pressure to adopt more transparent data collection practices. Regulations such as the <a href=\"https:\/\/dailyai.com\/pt\/2023\/06\/eu-ai-act-passes-crucial-vote-and-enters-its-final-stages\/\">EU AI Act<\/a> are set to enforce exactly that.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The Data Provenance Initiative lets machine learning developers <\/span><a href=\"https:\/\/www.dataprovenance.org\/\"><span style=\"font-weight: 400;\">explore the audited datasets here<\/span><\/a><span style=\"font-weight: 400;\">. 
The initiative also analyzes patterns across the datasets, shedding light on their geographic and institutional origins.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Most datasets are built in the English-speaking Global North, which highlights sociocultural imbalances.<\/span><\/p>\n<figure id=\"attachment_6805\" aria-describedby=\"caption-attachment-6805\" style=\"width: 973px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-6805 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution.png\" alt=\"AI data provenance\" width=\"973\" height=\"529\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution.png 973w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-300x163.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-768x418.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-370x201.png 370w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-800x435.png 800w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-20x11.png 20w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-740x402.png 740w, https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/datadistribution-88x48.png 88w\" sizes=\"auto, (max-width: 973px) 100vw, 973px\" \/><figcaption id=\"caption-attachment-6805\" class=\"wp-caption-text\">The Data Provenance Initiative found that the datasets predominantly represent English-speaking countries and the Global North. 
Source: <a href=\"https:\/\/www.dataprovenance.org\/paper.pdf\">DataProvenance.org<\/a>.<\/figcaption><\/figure>\n<h2><span style=\"font-weight: 400;\">More about the study<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">This large-scale analysis of datasets revealed systematic problems in how data is collected and distributed. The initiative also produced a paper explaining its findings, <\/span><a href=\"https:\/\/www.dataprovenance.org\/paper.pdf\"><span style=\"font-weight: 400;\">published here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">More on the methods and findings of the study:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Analyzing datasets for origin and labeling<\/b><span style=\"font-weight: 400;\">: The study systematically audited more than 1,800 fine-tuning datasets to analyze data provenance, licensing, and documentation.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Evidence of mislabeling<\/b><span style=\"font-weight: 400;\">: The results highlighted differences in the types of data available under different licenses and the implications for legal interpretations of copyright and fair use. 
The audit found a high rate of license miscategorization, with more than 72% of datasets not specifying a license and a 50% error rate among those that do.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Unreliable data provenance<\/b><span style=\"font-weight: 400;\">: The research draws attention to unreliable data provenance, stressing the need for standards to trace data lineage, ensure proper attribution, and encourage responsible data use.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Geographic distribution: <\/b><span style=\"font-weight: 400;\">The study highlights a severe lack of representation and attribution for datasets originating in the Global South. Most datasets center on the English language and are culturally tied to Europe, North America, and English-speaking Oceania.<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">This study highlights systemic and structural problems in how data is created, distributed, and used. 
Data is a critical resource for AI and, like natural resources, it is finite.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">There is concern that AI technology will eventually outgrow current datasets and possibly even <\/span><a href=\"https:\/\/dailyai.com\/pt\/2023\/06\/what-happens-when-ai-starts-consuming-its-own-output\/\"><span style=\"font-weight: 400;\">start consuming its own output<\/span><\/a><span style=\"font-weight: 400;\">, meaning AI models will learn from AI-generated text.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This could erode model quality, which means that high-quality, ethical, and legal data may become very valuable. <\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>AI revolves around data, but where does it come from? Are datasets legal and ethical? How can developers know for sure? Training machine learning models, such as large language models (LLMs), requires large volumes of text data. Plenty of datasets are available on platforms such as Kaggle, GitHub, and Hugging Face, but they sit in a legal and ethical gray area, mainly because of licensing and fair use issues. The Data Provenance Initiative, a collaboration between AI researchers and legal professionals, analyzed thousands of datasets to clarify their true origins. 
It focused on more than 1,800<\/p>","protected":false},"author":2,"featured_media":6806,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[454,453,105],"class_list":["post-6804","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-data","tag-datasets","tag-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>New research into datasets reveals systemic ethical and legal issues | DailyAI<\/title>\n<meta name=\"description\" content=\"AI revolves around data, but where does it come from? Is it legal to use? It might be labeled as such, but is it really?\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/pt\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/\" \/>\n<meta property=\"og:locale\" content=\"pt_PT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"New research into datasets reveals systemic ethical and legal issues | DailyAI\" \/>\n<meta property=\"og:description\" content=\"AI revolves around data, but where does it come from? Is it legal to use? 
It might be labeled as such, but is it really?\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/pt\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-26T19:21:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-10-26T21:16:11+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"583\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Sam Jeans\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sam Jeans\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tempo estimado de leitura\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/\"},\"author\":{\"name\":\"Sam Jeans\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\"},\"headline\":\"New research into datasets reveals systemic ethical and legal 
issues\",\"datePublished\":\"2023-10-26T19:21:21+00:00\",\"dateModified\":\"2023-10-26T21:16:11+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/\"},\"wordCount\":576,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/shutterstock_1166248483.jpg\",\"keywords\":[\"Data\",\"Datasets\",\"machine learning\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"pt-PT\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/\",\"name\":\"New research into datasets reveals systemic ethical and legal issues | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/shutterstock_1166248483.jpg\",\"datePublished\":\"2023-10-26T19:21:21+00:00\",\"dateModified\":\"2023-10-26T21:16:11+00:00\",\"description\":\"AI revolves around data, but where does it come from? Is it legal to use? 
It might be labeled as such, but is it really?\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#breadcrumb\"},\"inLanguage\":\"pt-PT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-PT\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/shutterstock_1166248483.jpg\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/shutterstock_1166248483.jpg\",\"width\":1000,\"height\":583},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2023\\\/10\\\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"New research into datasets reveals systemic ethical and legal issues\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI 
News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"pt-PT\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-PT\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/711e81f945549438e8bbc579efdeb3c9\",\"name\":\"Sam Jeans\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-PT\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g\",\"caption\":\"Sam Jeans\"},\"description\":\"Sam is a science and technology writer who has worked in various AI startups. 
When he\u2019s not writing, he can be found reading medical journals or digging through boxes of vinyl records.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/sam-jeans-6746b9142\\\/\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/pt\\\/author\\\/samjeans\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Nova investiga\u00e7\u00e3o sobre conjuntos de dados revela quest\u00f5es \u00e9ticas e jur\u00eddicas sist\u00e9micas | DailyAI","description":"A IA gira em torno dos dados, mas de onde \u00e9 que eles v\u00eam? A sua utiliza\u00e7\u00e3o \u00e9 legal? Pode ser rotulado como tal, mas ser\u00e1 mesmo?","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/pt\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/","og_locale":"pt_PT","og_type":"article","og_title":"New research into datasets reveals systemic ethical and legal issues | DailyAI","og_description":"AI revolves around data, but where does it come from? Is it legal to use? 
It might be labeled as such, but is it really?","og_url":"https:\/\/dailyai.com\/pt\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/","og_site_name":"DailyAI","article_published_time":"2023-10-26T19:21:21+00:00","article_modified_time":"2023-10-26T21:16:11+00:00","og_image":[{"width":1000,"height":583,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg","type":"image\/jpeg"}],"author":"Sam Jeans","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Escrito por":"Sam Jeans","Tempo estimado de leitura":"3 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/"},"author":{"name":"Sam Jeans","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9"},"headline":"New research into datasets reveals systemic ethical and legal issues","datePublished":"2023-10-26T19:21:21+00:00","dateModified":"2023-10-26T21:16:11+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/"},"wordCount":576,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg","keywords":["Data","Datasets","machine 
learning"],"articleSection":["Industry"],"inLanguage":"pt-PT"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/","url":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/","name":"Nova investiga\u00e7\u00e3o sobre conjuntos de dados revela quest\u00f5es \u00e9ticas e jur\u00eddicas sist\u00e9micas | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg","datePublished":"2023-10-26T19:21:21+00:00","dateModified":"2023-10-26T21:16:11+00:00","description":"A IA gira em torno dos dados, mas de onde \u00e9 que eles v\u00eam? A sua utiliza\u00e7\u00e3o \u00e9 legal? 
Pode ser rotulado como tal, mas ser\u00e1 mesmo?","breadcrumb":{"@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#breadcrumb"},"inLanguage":"pt-PT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/"]}]},{"@type":"ImageObject","inLanguage":"pt-PT","@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/10\/shutterstock_1166248483.jpg","width":1000,"height":583},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2023\/10\/new-research-into-datasets-reveals-systemic-ethical-and-legal-issues\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"New research into datasets reveals systemic ethical and legal issues"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"A sua dose di\u00e1ria de not\u00edcias sobre 
IA","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"pt-PT"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"pt-PT","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/711e81f945549438e8bbc579efdeb3c9","name":"Sam Jeans","image":{"@type":"ImageObject","inLanguage":"pt-PT","@id":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a24a4a8f8e2a1a275b7491dc9c9f032c401eabf23c3206da4628dc84b6dac5c8?s=96&d=robohash&r=g","caption":"Sam Jeans"},"description":"Sam \u00e9 um escritor de ci\u00eancia e tecnologia que trabalhou em v\u00e1rias startups de IA. 
Quando n\u00e3o est\u00e1 a escrever, pode ser encontrado a ler revistas m\u00e9dicas ou a vasculhar caixas de discos de vinil.","sameAs":["https:\/\/www.linkedin.com\/in\/sam-jeans-6746b9142\/"],"url":"https:\/\/dailyai.com\/pt\/author\/samjeans\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/posts\/6804","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/comments?post=6804"}],"version-history":[{"count":11,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/posts\/6804\/revisions"}],"predecessor-version":[{"id":6837,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/posts\/6804\/revisions\/6837"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/media\/6806"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/media?parent=6804"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/categories?post=6804"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/pt\/wp-json\/wp\/v2\/tags?post=6804"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}