## Bar Chart: Attack Success Rate (ASR) by Prompt Injection Technique and Defense Mechanism

### Overview
This bar chart compares the Attack Success Rate (ASR) of twelve prompt injection techniques against five defense mechanisms. The chart is divided into two rows of six techniques each, and each bar shows the ASR for one technique and defense combination.

### Components/Axes
*   **X-axis:** Prompt Injection Techniques: Deletion Characters, Diacritics, Emoji Smuggling, Full Width Text, Homoglyphs, Numbers, Bidirectional Text, Spaces, Underline Accent Marks, Unicode Tags Smuggling, Upside Down Text, Zero Width.
*   **Y-axis:** Attack Success Rate (ASR), ranging from 0.0 to 1.0.
*   **Legend:**
    *   Azure Prompt Shield (Light Green)
    *   Protect AI v1 (Pale Orange)
    *   Meta Prompt Guard (Medium Green)
    *   Vijil Prompt Injection (Blue)
    *   NeMo Guard Jailbreak Detect (Coral)

### Detailed Analysis

**Row 1: Deletion Characters to Numbers**

*   **Deletion Characters:**
    *   Azure Prompt Shield: ~0.04
    *   Protect AI v1: ~0.24
    *   Meta Prompt Guard: ~0.68
    *   Vijil Prompt Injection: ~0.84
    *   NeMo Guard Jailbreak Detect: ~0.88
*   **Diacritics:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.04
    *   Meta Prompt Guard: ~0.08
    *   Vijil Prompt Injection: ~0.76
    *   NeMo Guard Jailbreak Detect: ~0.84
*   **Emoji Smuggling:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.00
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Full Width Text:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.00
    *   Vijil Prompt Injection: ~0.84
    *   NeMo Guard Jailbreak Detect: ~0.88
*   **Homoglyphs:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.00
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Numbers:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.00
    *   Vijil Prompt Injection: ~0.84
    *   NeMo Guard Jailbreak Detect: ~0.88

**Row 2: Bidirectional Text to Zero Width**

*   **Bidirectional Text:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.28
    *   Meta Prompt Guard: ~0.84
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Spaces:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.04
    *   Meta Prompt Guard: ~0.84
    *   Vijil Prompt Injection: ~0.84
    *   NeMo Guard Jailbreak Detect: ~0.88
*   **Underline Accent Marks:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.88
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Unicode Tags Smuggling:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.84
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Upside Down Text:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.00
    *   Meta Prompt Guard: ~0.84
    *   Vijil Prompt Injection: ~0.88
    *   NeMo Guard Jailbreak Detect: ~0.92
*   **Zero Width:**
    *   Azure Prompt Shield: ~0.00
    *   Protect AI v1: ~0.20
    *   Meta Prompt Guard: ~0.84
    *   Vijil Prompt Injection: ~0.84
    *   NeMo Guard Jailbreak Detect: ~0.88
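
The approximate values listed above can be collected into a small table to summarize them. The following is a minimal sketch, not part of the original chart: the names `DEFENSES`, `ASR`, `mean_asr_by_defense`, and `defenses_bypassed` are hypothetical, the numbers are the approximate readings from the lists above, and the 0.5 threshold is an arbitrary assumption chosen only for illustration.

```python
from statistics import mean

# Defense mechanisms, in the order used for each row of values below.
DEFENSES = (
    "Azure Prompt Shield",
    "Protect AI v1",
    "Meta Prompt Guard",
    "Vijil Prompt Injection",
    "NeMo Guard Jailbreak Detect",
)

# Approximate ASR values as read from the chart (not exact measurements).
ASR = {
    "Deletion Characters":    (0.04, 0.24, 0.68, 0.84, 0.88),
    "Diacritics":             (0.00, 0.04, 0.08, 0.76, 0.84),
    "Emoji Smuggling":        (0.00, 0.00, 0.00, 0.88, 0.92),
    "Full Width Text":        (0.00, 0.00, 0.00, 0.84, 0.88),
    "Homoglyphs":             (0.00, 0.00, 0.00, 0.88, 0.92),
    "Numbers":                (0.00, 0.00, 0.00, 0.84, 0.88),
    "Bidirectional Text":     (0.00, 0.28, 0.84, 0.88, 0.92),
    "Spaces":                 (0.00, 0.04, 0.84, 0.84, 0.88),
    "Underline Accent Marks": (0.00, 0.00, 0.88, 0.88, 0.92),
    "Unicode Tags Smuggling": (0.00, 0.00, 0.84, 0.88, 0.92),
    "Upside Down Text":       (0.00, 0.00, 0.84, 0.88, 0.92),
    "Zero Width":             (0.00, 0.20, 0.84, 0.84, 0.88),
}


def mean_asr_by_defense(table):
    """Average ASR per defense across all techniques in the table."""
    return {
        name: mean(row[i] for row in table.values())
        for i, name in enumerate(DEFENSES)
    }


def defenses_bypassed(table, threshold=0.5):
    """Count, per technique, the defenses whose ASR meets the threshold."""
    return {
        technique: sum(value >= threshold for value in row)
        for technique, row in table.items()
    }


if __name__ == "__main__":
    for name, value in mean_asr_by_defense(ASR).items():
        print(f"{name:<28} mean ASR ~ {value:.2f}")
    for technique, count in defenses_bypassed(ASR).items():
        print(f"{technique:<24} defenses with ASR >= 0.5: {count}")
```

Because the inputs are visual estimates, the resulting averages are only indicative and should not be treated as reported results.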

### Key Observations

*   **Vijil Prompt Injection and NeMo Guard Jailbreak Detect** exhibit the highest ASR for every technique, with values between ~0.76 and ~0.92.
*   **Azure Prompt Shield** has the lowest ASR throughout: ~0.00 for every technique except Deletion Characters (~0.04).
*   **Protect AI v1** shows moderate ASR for three techniques (Deletion Characters ~0.24, Bidirectional Text ~0.28, Zero Width ~0.20) and stays at or below ~0.04 for the rest.
*   **Meta Prompt Guard** splits sharply: ASR is ~0.68 for Deletion Characters and ~0.84 to ~0.88 for all Row 2 techniques, but at or below ~0.08 for Diacritics, Emoji Smuggling, Full Width Text, Homoglyphs, and Numbers.
*   Emoji Smuggling, Full Width Text, Homoglyphs, and Numbers achieve high ASR only against Vijil Prompt Injection and NeMo Guard Jailbreak Detect; the other three defenses hold them at ~0.00. Deletion Characters, Bidirectional Text, and Zero Width are the most broadly effective, with ASR of ~0.20 or higher against four of the five defenses.

### Interpretation

The data suggests that the defenses differ widely in robustness. Azure Prompt Shield and Protect AI v1 keep ASR low across the full range of techniques, although even Azure Prompt Shield is not at zero everywhere (~0.04 for Deletion Characters). Vijil Prompt Injection and NeMo Guard Jailbreak Detect, by contrast, are bypassed by most attempts regardless of technique, indicating a need for more robust protection strategies. Meta Prompt Guard resists some character-level obfuscations (for example Emoji Smuggling and Homoglyphs) but not others (for example Bidirectional Text and Zero Width), which shows how technique-specific evasion can be and supports layered defense approaches. The high ASR against the weaker defenses suggests that these techniques exploit gaps in how input is processed and sanitized before classification. The low ASR for Azure Prompt Shield could reflect more conservative filtering, potentially at the cost of false positives, but the chart reports only attack success and does not measure false positive rates, so it cannot confirm a trade-off between security and usability. Further investigation is needed to understand the specific vulnerabilities exploited by each technique and to develop more effective countermeasures.
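
To make the layered-defense point concrete, the sketch below computes an upper bound on the ASR of a stack of defenses. It rests on assumptions that the chart does not state: an input is blocked if any layer flags it, and each defense's ASR was measured on the same set of attacks. Under those assumptions an attack beats the stack only if it beats every layer, so the stacked ASR cannot exceed the smallest individual ASR. The names `DEFENSES`, `ASR`, and `stacked_asr_upper_bound` are hypothetical, and the values are the approximate chart readings.

```python
# Defense mechanisms, in the order used for each row of values below.
DEFENSES = (
    "Azure Prompt Shield",
    "Protect AI v1",
    "Meta Prompt Guard",
    "Vijil Prompt Injection",
    "NeMo Guard Jailbreak Detect",
)

# Approximate ASR values as read from the chart (not exact measurements).
ASR = {
    "Deletion Characters":    (0.04, 0.24, 0.68, 0.84, 0.88),
    "Diacritics":             (0.00, 0.04, 0.08, 0.76, 0.84),
    "Emoji Smuggling":        (0.00, 0.00, 0.00, 0.88, 0.92),
    "Full Width Text":        (0.00, 0.00, 0.00, 0.84, 0.88),
    "Homoglyphs":             (0.00, 0.00, 0.00, 0.88, 0.92),
    "Numbers":                (0.00, 0.00, 0.00, 0.84, 0.88),
    "Bidirectional Text":     (0.00, 0.28, 0.84, 0.88, 0.92),
    "Spaces":                 (0.00, 0.04, 0.84, 0.84, 0.88),
    "Underline Accent Marks": (0.00, 0.00, 0.88, 0.88, 0.92),
    "Unicode Tags Smuggling": (0.00, 0.00, 0.84, 0.88, 0.92),
    "Upside Down Text":       (0.00, 0.00, 0.84, 0.88, 0.92),
    "Zero Width":             (0.00, 0.20, 0.84, 0.84, 0.88),
}


def stacked_asr_upper_bound(table, stack):
    """Per technique, the smallest ASR among the stacked defenses.

    This is an upper bound on the stack's ASR, not an estimate of it:
    the true value depends on how correlated the layers' failures are.
    """
    indices = [DEFENSES.index(name) for name in stack]
    return {
        technique: min(row[i] for i in indices)
        for technique, row in table.items()
    }


if __name__ == "__main__":
    stack = ("Protect AI v1", "Meta Prompt Guard")
    for technique, bound in stacked_asr_upper_bound(ASR, stack).items():
        print(f"{technique:<24} stacked ASR <= {bound:.2f}")
```

The bound is deliberately conservative. Estimating the actual stacked ASR would require per-attack outcomes for each defense, which the chart does not provide.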