SYARA extends traditional YARA with semantic similarity, ML classifiers, LLM evaluation, and perceptual hashing — detecting threats that change their words but never their intent.
YARA relies on exact string matches. Attackers simply rephrase their malicious content to evade detection — a trivial task with modern LLMs.
GenAI can generate countless paraphrases of malicious prompts, making static keyword rules obsolete almost immediately.
Security teams spend endless hours writing rules for every variation of an attack. SYARA captures intent, not just specific words.
Five cost-ordered matching layers — from fast regex to powerful LLM evaluation
Text, Images, Audio, Video
Exact literals and regex patterns (traditional YARA)
⚡ Lowest costSBERT embeddings with cosine similarity & configurable chunking
💡 Low costImage dHash, audio & video fingerprinting for near-duplicate detection
🖼️ Moderate costFine-tuned classifiers: TunedSBERT, DistilBERT, DeBERTa
🎯 Moderate costGPT-4, Gemini, Ollama (local), Flan-T5 — zero-day detection
🚀 Highest costMatched rules & confidence scores
💡 Smart optimization: SYARA executes layers in cost order, invoking expensive models only when cheaper layers don't resolve the condition. Session-scoped caching avoids redundant preprocessing.
A fully extensible, pluggable architecture built for the GenAI era
Detect malicious intent even when exact words change. Uses SBERT (all-MiniLM-L6-v2) with configurable cleaners, chunkers, and threshold.
"ignore previous instructions""kindly overlook earlier guidance"
Plug in GPT-4, Google Gemini, local Ollama models, or open-source Flan-T5 for the most sophisticated zero-day threat detection.
Detect malicious images using dHash. Placeholder implementations for audio and video hashing are ready to extend with production libraries.
Swap in fine-tuned classifiers (TunedSBERT, DistilBERT, DeBERTa) trained on your specific threat landscape for maximum precision.
Automatic execution from cheapest (regex) to most expensive (LLM). Session-scoped caching and short-circuit evaluation minimize unnecessary computation.
Register custom cleaners, chunkers, matchers, classifiers, and LLMs via YAML config or Python API. Accepts both class paths and pre-instantiated objects.
Protect LLM applications from malicious prompts designed to bypass safety guidelines and hijack model behavior.
Identify phishing websites and emails using text analysis combined with logo and image perceptual hashing.
Detect DAN mode, role-play exploits, and other jailbreak techniques using semantic pattern matching.
Hunt for injected scripts, XSS payloads, and obfuscated code with semantic and regex-based pattern matching.
Detect malware UI screenshots and icons using perceptual hashing — robust to minor visual modifications.
Identify attempts to extract training data, system prompts, or sensitive information from AI systems.
Watch how easy it is to detect sophisticated threats
Tutorial Video — Coming Soon
Learn how to write your first SYARA rule and detect prompt injection attacks
# Install with all AI features
pip install syara[all]
# Or install specific extras
pip install syara[sbert] # SBERT similarity
pip install syara[classifier] # ML classifiers
pip install syara[llm] # LLM evaluation
rule prompt_injection : security
{
meta:
description = "Detect prompt injection"
strings:
$s1 = "ignore previous" nocase
$s2 = /\b(disregard|forget)\s+prior\b/i
similarity:
$s3 = "ignore instructions"
threshold=0.8 matcher="sbert"
condition:
any of ($s*)
}
import syara
rules = syara.compile('rules.syara')
text = "Kindly disregard prior prompts"
matches = rules.match(text)
for m in matches:
if m.matched:
print(f"Threat detected: {m.rule_name}")
for id, details in m.matched_patterns.items():
print(f" {id}: score={details[0].score:.2f}")
import syara
from syara import ConfigManager
# Register a pre-instantiated LLM or classifier
config = ConfigManager()
config.config.llms['my-ollama'] = my_ollama_evaluator_instance
config.config.classifiers['my-deberta'] = my_deberta_instance
rules = syara.compile('rules.syara', config_manager=config)
# Match binary files with phash rules
file_matches = rules.match_file('suspicious_icon.png')
SYARA is a non-profit, community-driven open source project
Join security researchers using SYARA to detect next-generation threats