{"id":13539,"date":"2024-07-22T10:04:27","date_gmt":"2024-07-22T10:04:27","guid":{"rendered":"https:\/\/dailyai.com\/?p=13539"},"modified":"2024-07-22T10:04:27","modified_gmt":"2024-07-22T10:04:27","slug":"llm-refusal-training-easily-bypassed-with-past-tense-prompts","status":"publish","type":"post","link":"https:\/\/dailyai.com\/es\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","title":{"rendered":"LLM refusal training easily bypassed with past tense prompts"},"content":{"rendered":"<p><strong>Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs.<\/strong><\/p>\n<p>AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning from human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts.<\/p>\n<p>This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. 
We\u2019ve covered a range of <a href=\"https:\/\/dailyai.com\/es\/2024\/06\/microsoft-reveal-skeleton-key-jailbreak-which-works-across-different-ai-models\/\">interesting jailbreak techniques<\/a> that bypass these guardrails, but the method the EPFL researchers tested is by far the simplest.<\/p>\n<p>The researchers took a dataset of 100 harmful behaviors and used GPT-3.5 to rewrite the prompts in the past tense.<\/p>\n<p>Here\u2019s an example of the method explained in <a href=\"https:\/\/arxiv.org\/pdf\/2407.11969\" target=\"_blank\" rel=\"noopener\">their paper<\/a>.<\/p>\n<figure id=\"attachment_13541\" aria-describedby=\"caption-attachment-13541\" style=\"width: 1180px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13541 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense.png\" alt=\"\" width=\"1180\" height=\"574\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense.png 1180w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-300x146.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-1024x498.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-768x374.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-18x9.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-60x29.png 60w\" sizes=\"auto, (max-width: 1180px) 100vw, 1180px\" \/><figcaption id=\"caption-attachment-13541\" class=\"wp-caption-text\">Using an LLM to rewrite dangerous prompts in the past tense. 
Source: arXiv<\/figcaption><\/figure>\n<p>They then evaluated the responses to these rewritten prompts from these 8 LLMs: Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, <a href=\"https:\/\/dailyai.com\/es\/2024\/07\/openai-releases-gpt-4o-mini-a-high-performance-super-low-cost-model\/\">GPT-4o-mini<\/a>, GPT-4o, and R2D2.<\/p>\n<p>They used several LLMs to judge the outputs and classify each one as either a failed or a successful jailbreak attempt.<\/p>\n<p>Simply changing the tense of the prompt had a surprisingly significant effect on the attack success rate (ASR). GPT-4o and GPT-4o mini were especially susceptible to this technique.<\/p>\n<p>The ASR of this \"simple attack on GPT-4o increases from 1% using direct requests to 88% using 20 past tense reformulation attempts on harmful requests\".<\/p>\n<p>Here\u2019s an example of how GPT-4o becomes compliant when you simply rewrite the prompt in the past tense. 
I used ChatGPT for this and the vulnerability has not been patched yet.<\/p>\n<figure id=\"attachment_13542\" aria-describedby=\"caption-attachment-13542\" style=\"width: 1254px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13542 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses.png\" alt=\"\" width=\"1254\" height=\"1058\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses.png 1254w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-300x253.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-1024x864.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-768x648.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-14x12.png 14w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-60x51.png 60w\" sizes=\"auto, (max-width: 1254px) 100vw, 1254px\" \/><figcaption id=\"caption-attachment-13542\" class=\"wp-caption-text\">ChatGPT using GPT-4o refuses a present tense prompt but complies when the prompt is rewritten in the past tense. Source: ChatGPT<\/figcaption><\/figure>\n<p>Refusal training using RLHF and SFT trains a model to successfully generalize to reject harmful prompts even if it hasn\u2019t seen the specific prompt before.<\/p>\n<p>When the prompt is written in the past tense, the LLMs seem to lose the ability to generalize. 
The other LLMs didn\u2019t fare much better than GPT-4o, although Llama-3 8B seemed to be the most resilient.<\/p>\n<figure id=\"attachment_13543\" aria-describedby=\"caption-attachment-13543\" style=\"width: 1268px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13543 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts.png\" alt=\"\" width=\"1268\" height=\"492\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts.png 1268w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-300x116.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-1024x397.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-768x298.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-18x7.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-60x23.png 60w\" sizes=\"auto, (max-width: 1268px) 100vw, 1268px\" \/><figcaption id=\"caption-attachment-13543\" class=\"wp-caption-text\">Attack success rates using present and past tense dangerous prompts. 
Source: arXiv<\/figcaption><\/figure>\n<p>Rewriting the prompt in the future tense also increased the ASR, but was less effective than past tense rewriting.<\/p>\n<p>The researchers concluded that this could be because \"the fine-tuning datasets may contain a higher proportion of harmful requests expressed in the future tense or as hypothetical events\".<\/p>\n<p>They also suggested that \"the model\u2019s internal reasoning might interpret future-oriented requests as potentially more harmful, whereas past-tense statements, such as historical events, could be perceived as more benign\".<\/p>\n<h2>Can it be fixed?<\/h2>\n<p>Further experiments demonstrated that adding past tense prompts to the fine-tuning datasets effectively reduced susceptibility to this jailbreak technique.<\/p>\n<p>While effective, this approach requires anticipating the kinds of dangerous prompts that a user may input.<\/p>\n<p>The researchers suggest that evaluating a model\u2019s output before presenting it to the user is an easier solution.<\/p>\n<p>As simple as this jailbreak is, it doesn\u2019t seem that the leading AI companies have found a way to patch it yet.<\/p>","protected":false},"excerpt":{"rendered":"<p>Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs. AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning from human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts. 
This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. We\u2019ve covered a range of interesting jailbreak techniques that bypass these guardrails, but the method the EPFL researchers tested is by far the simplest.<\/p>","protected":false},"author":6,"featured_media":13544,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[163,118],"class_list":["post-13539","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-ai-risks","tag-llms"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>LLM refusal training easily bypassed with past tense prompts | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/es\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/\" \/>\n<meta property=\"og:locale\" content=\"es_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LLM refusal training easily bypassed with past tense prompts | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs. AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts. This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. 
We\u2019ve covered a range of interesting jailbreak techniques that bypass these guardrails but the method the EPFL researchers tested is by far the simplest.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/es\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-07-22T10:04:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Eugene van der Watt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"Eugene van der Watt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Tiempo de lectura\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"},\"author\":{\"name\":\"Eugene van der Watt\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\"},\"headline\":\"LLM refusal training easily bypassed with past tense 
prompts\",\"datePublished\":\"2024-07-22T10:04:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"},\"wordCount\":569,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"keywords\":[\"AI risks\",\"LLMS\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"es\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\",\"name\":\"LLM refusal training easily bypassed with past tense prompts | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"datePublished\":\"2024-07-22T10:04:27+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#breadcrumb\"},\"inLanguage\":\"es\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm
-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"LLM refusal training easily bypassed with past tense prompts\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"es\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",
\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\",\"name\":\"Eugene van der Watt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"es\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"caption\":\"Eugene van der Watt\"},\"description\":\"Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.\",\"sameAs\":[\"www.linkedin.com\\\/in\\\/eugene-van-der-watt-16828119\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/es\\\/author\\\/eugene\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"La formaci\u00f3n sobre el rechazo del LLM se salta f\u00e1cilmente con indicaciones en pasado | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/es\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","og_locale":"es_ES","og_type":"article","og_title":"LLM refusal training easily bypassed with past tense prompts | DailyAI","og_description":"Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs. 
AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts. This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. We\u2019ve covered a range of interesting jailbreak techniques that bypass these guardrails but the method the EPFL researchers tested is by far the simplest.","og_url":"https:\/\/dailyai.com\/es\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","og_site_name":"DailyAI","article_published_time":"2024-07-22T10:04:27+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","type":"image\/webp"}],"author":"Eugene van der Watt","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Escrito por":"Eugene van der Watt","Tiempo de lectura":"4 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"},"author":{"name":"Eugene van der Watt","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa"},"headline":"LLM refusal training easily bypassed with past tense 
prompts","datePublished":"2024-07-22T10:04:27+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"},"wordCount":569,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","keywords":["AI risks","LLMS"],"articleSection":["Industry"],"inLanguage":"es"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","url":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","name":"La formaci\u00f3n sobre el rechazo del LLM se salta f\u00e1cilmente con indicaciones en pasado | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","datePublished":"2024-07-22T10:04:27+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#breadcrumb"},"inLanguage":"es","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"]}]},{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/upl
oads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"LLM refusal training easily bypassed with past tense prompts"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"Su dosis diaria de noticias sobre IA","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"es"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa","name":"Eugene van der 
Watt","image":{"@type":"ImageObject","inLanguage":"es","@id":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","caption":"Eugene van der Watt"},"description":"Eugene es ingeniero electr\u00f3nico y le encanta todo lo relacionado con la tecnolog\u00eda. Cuando descansa de consumir noticias sobre IA, lo encontrar\u00e1 jugando al billar.","sameAs":["www.linkedin.com\/in\/eugene-van-der-watt-16828119"],"url":"https:\/\/dailyai.com\/es\/author\/eugene\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/posts\/13539","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/comments?post=13539"}],"version-history":[{"count":3,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/posts\/13539\/revisions"}],"predecessor-version":[{"id":13546,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/posts\/13539\/revisions\/13546"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/media\/13544"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/media?parent=13539"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/categories?post=13539"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/es\/wp-json\/wp\/v2\/tags?post=13539"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}