{"id":13539,"date":"2024-07-22T10:04:27","date_gmt":"2024-07-22T10:04:27","guid":{"rendered":"https:\/\/dailyai.com\/?p=13539"},"modified":"2024-07-22T10:04:27","modified_gmt":"2024-07-22T10:04:27","slug":"llm-refusal-training-easily-bypassed-with-past-tense-prompts","status":"publish","type":"post","link":"https:\/\/dailyai.com\/da\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","title":{"rendered":"LLM-afslagstr\u00e6ning let omg\u00e5et med prompter i datid"},"content":{"rendered":"<p><strong>Forskere fra Swiss Federal Institute of Technology Lausanne (EPFL) fandt ud af, at det at skrive farlige sp\u00f8rgsm\u00e5l i datid gik uden om de mest avancerede LLM'ers tr\u00e6ning i at afvise sp\u00f8rgsm\u00e5l.<\/strong><\/p>\n<p>AI-modeller justeres ofte ved hj\u00e6lp af teknikker som supervised fine-tuning (SFT) eller reinforcement learning human feedback (RLHF) for at sikre, at modellen ikke reagerer p\u00e5 farlige eller u\u00f8nskede opfordringer.<\/p>\n<p>Denne afvisningstr\u00e6ning tr\u00e6der i kraft, n\u00e5r du sp\u00f8rger ChatGPT til r\u00e5ds om, hvordan man laver en bombe eller stoffer. Vi har d\u00e6kket en r\u00e6kke <a href=\"https:\/\/dailyai.com\/da\/2024\/06\/microsoft-reveal-skeleton-key-jailbreak-which-works-across-different-ai-models\/\">Interessante jailbreak-teknikker<\/a> der omg\u00e5r disse sikkerhedsforanstaltninger, men den metode, EPFL-forskerne testede, er langt den enkleste.<\/p>\n<p>Forskerne tog et datas\u00e6t med 100 skadelige adf\u00e6rdsm\u00f8nstre og brugte GPT-3.5 til at omskrive sp\u00f8rgsm\u00e5lene til datid.<\/p>\n<p>Her er et eksempel p\u00e5 den metode, der forklares i <a href=\"https:\/\/arxiv.org\/pdf\/2407.11969\" target=\"_blank\" rel=\"noopener\">deres papir<\/a>.<\/p>\n<figure id=\"attachment_13541\" aria-describedby=\"caption-attachment-13541\" style=\"width: 1180px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13541 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense.png\" alt=\"\" width=\"1180\" height=\"574\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense.png 1180w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-300x146.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-1024x498.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-768x374.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-18x9.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Rewrite-prompt-in-past-tense-60x29.png 60w\" sizes=\"auto, (max-width: 1180px) 100vw, 1180px\" \/><figcaption id=\"caption-attachment-13541\" class=\"wp-caption-text\">Brug af en LLM til at omskrive farlige beskeder i datid. Kilde: arXiv<\/figcaption><\/figure>\n<p>De evaluerede derefter svarene p\u00e5 disse omskrevne prompter fra disse 8 LLM'er: Llama-3 8B, Claude-3.5 Sonnet, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, <a href=\"https:\/\/dailyai.com\/da\/2024\/07\/openai-releases-gpt-4o-mini-a-high-performance-super-low-cost-model\/\">GPT-4o-mini<\/a>, GPT-4o og R2D2.<\/p>\n<p>De brugte flere LLM'er til at bed\u00f8mme output og klassificere dem som enten et mislykket eller et vellykket jailbreak-fors\u00f8g.<\/p>\n<p>Bare det at \u00e6ndre tempoet i prompten havde en overraskende stor effekt p\u00e5 angrebets succesrate (ASR). GPT-4o og GPT-4o mini var s\u00e6rligt modtagelige for denne teknik.<\/p>\n<p>ASR for dette \"simple angreb p\u00e5 GPT-4o stiger fra 1% ved hj\u00e6lp af direkte anmodninger til 88% ved hj\u00e6lp af 20 fors\u00f8g p\u00e5 omformulering af skadelige anmodninger i datid.\"<\/p>\n<p>Her er et eksempel p\u00e5, hvor kompatibel GPT-4o bliver, n\u00e5r man blot omskriver prompten til datid. Jeg brugte ChatGPT til dette, og s\u00e5rbarheden er ikke blevet patchet endnu.<\/p>\n<figure id=\"attachment_13542\" aria-describedby=\"caption-attachment-13542\" style=\"width: 1254px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13542 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses.png\" alt=\"\" width=\"1254\" height=\"1058\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses.png 1254w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-300x253.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-1024x864.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-768x648.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-14x12.png 14w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Present-and-past-tense-prompt-responses-60x51.png 60w\" sizes=\"auto, (max-width: 1254px) 100vw, 1254px\" \/><figcaption id=\"caption-attachment-13542\" class=\"wp-caption-text\">ChatGPT med GPT-4o afviser en prompt i nutid, men overholder den, n\u00e5r den omskrives til datid. Kilde: ChatGPT: ChatGPT<\/figcaption><\/figure>\n<p>Afvisningstr\u00e6ning ved hj\u00e6lp af RLHF og SFT tr\u00e6ner en model til at kunne generalisere til at afvise skadelige opfordringer, selv om den ikke har set den specifikke opfordring f\u00f8r.<\/p>\n<p>N\u00e5r sp\u00f8rgsm\u00e5let er skrevet i datid, ser det ud til, at LLM'erne mister evnen til at generalisere. De andre LLM'er klarede sig ikke meget bedre end GPT-4o, selvom Llama-3 8B s\u00e5 ud til at v\u00e6re mest modstandsdygtig.<\/p>\n<figure id=\"attachment_13543\" aria-describedby=\"caption-attachment-13543\" style=\"width: 1268px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13543 size-full\" src=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts.png\" alt=\"\" width=\"1268\" height=\"492\" srcset=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts.png 1268w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-300x116.png 300w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-1024x397.png 1024w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-768x298.png 768w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-18x7.png 18w, https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/ASR-using-past-tense-prompts-60x23.png 60w\" sizes=\"auto, (max-width: 1268px) 100vw, 1268px\" \/><figcaption id=\"caption-attachment-13543\" class=\"wp-caption-text\">Succesrater for angreb ved hj\u00e6lp af farlige prompter i nutid og datid. Kilde: arXiv<\/figcaption><\/figure>\n<p>Ved at omskrive prompten til fremtid s\u00e5 man en stigning i ASR, men det var mindre effektivt end prompten i datid.<\/p>\n<p>Forskerne konkluderede, at det kunne skyldes, at \"de finjusterende datas\u00e6t kan indeholde en st\u00f8rre andel af skadelige anmodninger udtrykt i fremtidsform eller som hypotetiske begivenheder.\"<\/p>\n<p>De foreslog ogs\u00e5, at \"modellens interne r\u00e6sonnement kan fortolke fremtidsorienterede anmodninger som potentielt mere skadelige, mens udsagn i fortid, som f.eks. historiske begivenheder, kan opfattes som mere godartede.\"<\/p>\n<h2>Kan det fikses?<\/h2>\n<p>Yderligere eksperimenter viste, at tilf\u00f8jelse af datidsprompter til de finjusterende datas\u00e6t effektivt reducerede modtageligheden over for denne jailbreak-teknik.<\/p>\n<p>Selv om denne tilgang er effektiv, kr\u00e6ver den, at man forudser den slags farlige beskeder, som en bruger kan indtaste.<\/p>\n<p>Forskerne foresl\u00e5r, at det er en nemmere l\u00f8sning at evaluere en models output, f\u00f8r den pr\u00e6senteres for brugeren.<\/p>\n<p>Selv om dette jailbreak er enkelt, ser det ikke ud til, at de f\u00f8rende AI-virksomheder har fundet en m\u00e5de at patche det p\u00e5 endnu.<\/p>","protected":false},"excerpt":{"rendered":"<p>Forskere fra Swiss Federal Institute of Technology Lausanne (EPFL) fandt ud af, at det at skrive farlige opfordringer i datid gik uden om afvisningstr\u00e6ningen af de mest avancerede LLM'er. AI-modeller justeres almindeligvis ved hj\u00e6lp af teknikker som supervised fine-tuning (SFT) eller reinforcement learning human feedback (RLHF) for at sikre, at modellen ikke reagerer p\u00e5 farlige eller u\u00f8nskede beskeder. Denne afvisningstr\u00e6ning tr\u00e6der i kraft, n\u00e5r du beder ChatGPT om r\u00e5d om, hvordan man laver en bombe eller stoffer. Vi har beskrevet en r\u00e6kke interessante jailbreak-teknikker, som omg\u00e5r disse sikkerhedsforanstaltninger, men den metode, som EPFL-forskerne testede, er langt den enkleste.<\/p>","protected":false},"author":6,"featured_media":13544,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[163,118],"class_list":["post-13539","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry","tag-ai-risks","tag-llms"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>LLM refusal training easily bypassed with past tense prompts | DailyAI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dailyai.com\/da\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/\" \/>\n<meta property=\"og:locale\" content=\"da_DK\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LLM refusal training easily bypassed with past tense prompts | DailyAI\" \/>\n<meta property=\"og:description\" content=\"Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs. AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts. This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. We\u2019ve covered a range of interesting jailbreak techniques that bypass these guardrails but the method the EPFL researchers tested is by far the simplest.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dailyai.com\/da\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/\" \/>\n<meta property=\"og:site_name\" content=\"DailyAI\" \/>\n<meta property=\"article:published_time\" content=\"2024-07-22T10:04:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1792\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Eugene van der Watt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:site\" content=\"@DailyAIOfficial\" \/>\n<meta name=\"twitter:label1\" content=\"Skrevet af\" \/>\n\t<meta name=\"twitter:data1\" content=\"Eugene van der Watt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimeret l\u00e6setid\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutter\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"},\"author\":{\"name\":\"Eugene van der Watt\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\"},\"headline\":\"LLM refusal training easily bypassed with past tense prompts\",\"datePublished\":\"2024-07-22T10:04:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"},\"wordCount\":569,\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"keywords\":[\"AI risks\",\"LLMS\"],\"articleSection\":[\"Industry\"],\"inLanguage\":\"da-DK\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\",\"name\":\"LLM refusal training easily bypassed with past tense prompts | DailyAI\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"datePublished\":\"2024-07-22T10:04:27+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#breadcrumb\"},\"inLanguage\":\"da-DK\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#primaryimage\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2024\\\/07\\\/Jailbreak-AI-model-with-past-tense.webp\",\"width\":1792,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/2024\\\/07\\\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dailyai.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"LLM refusal training easily bypassed with past tense prompts\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#website\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"name\":\"DailyAI\",\"description\":\"Your Daily Dose of AI News\",\"publisher\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/dailyai.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"da-DK\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#organization\",\"name\":\"DailyAI\",\"url\":\"https:\\\/\\\/dailyai.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/Daily-Ai_TL_colour.png\",\"width\":4501,\"height\":934,\"caption\":\"DailyAI\"},\"image\":{\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/DailyAIOfficial\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/dailyaiofficial\\\/\",\"https:\\\/\\\/www.youtube.com\\\/@DailyAIOfficial\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/#\\\/schema\\\/person\\\/7ce525c6d0c79838b7cc7cde96993cfa\",\"name\":\"Eugene van der Watt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"da-DK\",\"@id\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"url\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"contentUrl\":\"https:\\\/\\\/dailyai.com\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/Eugine_Profile_Picture-96x96.png\",\"caption\":\"Eugene van der Watt\"},\"description\":\"Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.\",\"sameAs\":[\"www.linkedin.com\\\/in\\\/eugene-van-der-watt-16828119\"],\"url\":\"https:\\\/\\\/dailyai.com\\\/da\\\/author\\\/eugene\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"LLM-afslagstr\u00e6ning kan nemt omg\u00e5s med prompter i datid | DailyAI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dailyai.com\/da\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","og_locale":"da_DK","og_type":"article","og_title":"LLM refusal training easily bypassed with past tense prompts | DailyAI","og_description":"Researchers from the Swiss Federal Institute of Technology Lausanne (EPFL) found that writing dangerous prompts in the past tense bypassed the refusal training of the most advanced LLMs. AI models are commonly aligned using techniques like supervised fine-tuning (SFT) or reinforcement learning human feedback (RLHF) to make sure the model doesn\u2019t respond to dangerous or undesirable prompts. This refusal training kicks in when you ask ChatGPT for advice on how to make a bomb or drugs. We\u2019ve covered a range of interesting jailbreak techniques that bypass these guardrails but the method the EPFL researchers tested is by far the simplest.","og_url":"https:\/\/dailyai.com\/da\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","og_site_name":"DailyAI","article_published_time":"2024-07-22T10:04:27+00:00","og_image":[{"width":1792,"height":1024,"url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","type":"image\/webp"}],"author":"Eugene van der Watt","twitter_card":"summary_large_image","twitter_creator":"@DailyAIOfficial","twitter_site":"@DailyAIOfficial","twitter_misc":{"Skrevet af":"Eugene van der Watt","Estimeret l\u00e6setid":"4 minutter"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#article","isPartOf":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"},"author":{"name":"Eugene van der Watt","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa"},"headline":"LLM refusal training easily bypassed with past tense prompts","datePublished":"2024-07-22T10:04:27+00:00","mainEntityOfPage":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"},"wordCount":569,"publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"image":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","keywords":["AI risks","LLMS"],"articleSection":["Industry"],"inLanguage":"da-DK"},{"@type":"WebPage","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","url":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/","name":"LLM-afslagstr\u00e6ning kan nemt omg\u00e5s med prompter i datid | DailyAI","isPartOf":{"@id":"https:\/\/dailyai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"image":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage"},"thumbnailUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","datePublished":"2024-07-22T10:04:27+00:00","breadcrumb":{"@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#breadcrumb"},"inLanguage":"da-DK","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/"]}]},{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#primaryimage","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2024\/07\/Jailbreak-AI-model-with-past-tense.webp","width":1792,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/dailyai.com\/2024\/07\/llm-refusal-training-easily-bypassed-with-past-tense-prompts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dailyai.com\/"},{"@type":"ListItem","position":2,"name":"LLM refusal training easily bypassed with past tense prompts"}]},{"@type":"WebSite","@id":"https:\/\/dailyai.com\/#website","url":"https:\/\/dailyai.com\/","name":"DailyAI","description":"Din daglige dosis af AI-nyheder","publisher":{"@id":"https:\/\/dailyai.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dailyai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"da-DK"},{"@type":"Organization","@id":"https:\/\/dailyai.com\/#organization","name":"DailyAI","url":"https:\/\/dailyai.com\/","logo":{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/06\/Daily-Ai_TL_colour.png","width":4501,"height":934,"caption":"DailyAI"},"image":{"@id":"https:\/\/dailyai.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DailyAIOfficial","https:\/\/www.linkedin.com\/company\/dailyaiofficial\/","https:\/\/www.youtube.com\/@DailyAIOfficial"]},{"@type":"Person","@id":"https:\/\/dailyai.com\/#\/schema\/person\/7ce525c6d0c79838b7cc7cde96993cfa","name":"Eugene van der Watt","image":{"@type":"ImageObject","inLanguage":"da-DK","@id":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","url":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","contentUrl":"https:\/\/dailyai.com\/wp-content\/uploads\/2023\/07\/Eugine_Profile_Picture-96x96.png","caption":"Eugene van der Watt"},"description":"Eugene har en baggrund som elektronikingeni\u00f8r og elsker alt, hvad der har med teknologi at g\u00f8re. N\u00e5r han tager en pause fra at l\u00e6se AI-nyheder, kan du finde ham ved snookerbordet.","sameAs":["www.linkedin.com\/in\/eugene-van-der-watt-16828119"],"url":"https:\/\/dailyai.com\/da\/author\/eugene\/"}]}},"_links":{"self":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/13539","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/comments?post=13539"}],"version-history":[{"count":3,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/13539\/revisions"}],"predecessor-version":[{"id":13546,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/posts\/13539\/revisions\/13546"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/media\/13544"}],"wp:attachment":[{"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/media?parent=13539"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/categories?post=13539"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dailyai.com\/da\/wp-json\/wp\/v2\/tags?post=13539"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}