A comprehensive, zero-dependency Python input validation library designed for web applications, APIs, and data processing pipelines.
- Simple API - Function-based validators that return (is_valid, error_message) tuples
- Fully Typed - Complete type hints for IDE autocomplete and static analysis
- Zero Dependencies - Uses only Python standard library
- Practical Validators - Email, URL, phone, UUID, IP addresses, credit cards, and more
- File Security - Path traversal prevention, filename validation, MIME type checking
- Sanitizers - Clean and normalize user input safely
This library provides input validation, not security controls. It helps you verify that data matches expected formats.
- Validates data formats (email, phone, UUID, IP address, etc.)
- Prevents path traversal attacks (../ sequences)
- Validates and sanitizes filenames
- Provides HTML escaping for safe output
- Enforces business rules (length limits, allowed characters, etc.)
For injection attacks, you must use proper architectural defenses:
| Attack | Proper Defense | Why Validation Alone Fails |
|---|---|---|
| SQL Injection | Parameterized queries | Attackers can encode/obfuscate payloads infinitely |
| Command Injection | subprocess.run([...], shell=False) | Shell metacharacters vary by context |
| XSS | Context-aware output encoding | Input validation can't predict output context |
This library intentionally does not include SQL/command/XSS "detection" validators because they create a false sense of security. A blocklist approach cannot catch all attack variants and may block legitimate input.
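To make the point concrete, here is a small sketch using the standard-library sqlite3 module. The `naive_blocklist_ok` function is hypothetical and is not part of this library; it stands in for the kind of "detection" validator the library deliberately omits. A payload that uses tabs instead of spaces passes the blocklist and still injects, while the parameterized query treats the same payload as plain data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def naive_blocklist_ok(value: str) -> bool:
    """Hypothetical blocklist check: rejects a few 'dangerous' tokens."""
    blocked = ("--", ";", " or ", "drop")
    lowered = value.lower()
    return not any(token in lowered for token in blocked)


# Tabs instead of spaces: the blocklist misses it, the SQL parser does not.
payload = "x'\tOR\t'1'='1"
passed_blocklist = naive_blocklist_ok(payload)

# String formatting lets the payload rewrite the WHERE clause.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{payload}'"
).fetchall()

# A parameterized query passes the payload as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (payload,)
).fetchall()

print(passed_blocklist, len(unsafe_rows), len(safe_rows))
```

The formatted query returns every row in the table, and the parameterized one returns none, because no user is literally named with the payload string.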
Copy validator.py to your project:
# Copy directly
curl -O https://raw.githubusercontent.com/yourusername/py-validator/main/validator.py
# Or clone the repo
git clone https://github.com/yourusername/py-validator.git
from validator import (
validate_email,
validate_path_safe,
validate_filename,
validate_ip_address,
sanitize_html,
)
# Validate an email
result = validate_email("user@example.com")
if result.is_valid:
    print("Email is valid!")
else:
    print(f"Error: {result.error}")
# Prevent path traversal attacks
result = validate_path_safe("../../../etc/passwd")
# ValidationResult(is_valid=False, error="Path contains traversal sequences")
# Validate uploaded filenames
result = validate_filename("report.pdf", allowed_extensions={".pdf", ".docx"})
# ValidationResult(is_valid=True, error=None)
# Validate IP addresses
result = validate_ip_address("192.168.1.1")
# ValidationResult(is_valid=True, error=None)
# Sanitize user input for HTML display (PRIMARY XSS defense)
safe_html = sanitize_html("<script>alert('xss')</script>")
# "&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;"
All validators return a ValidationResult named tuple:
from validator import ValidationResult
result: ValidationResult = validate_email("user@example.com")
result.is_valid # bool - True if validation passed
result.error # str | None - Error message if validation failed
Validates email addresses against RFC 5322 (simplified).
validate_email("user@example.com") # ✓ Valid
validate_email("user.name@example.com") # ✓ Valid
validate_email("invalid-email") # ✗ Invalid email format
validate_url(url: str, allowed_schemes: list[str] = None, require_tld: bool = True) -> ValidationResult
Validates URL format with configurable scheme restrictions.
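As a rough illustration of what a scheme restriction involves, the sketch below uses `urllib.parse.urlsplit` from the standard library. The helper name `scheme_allowed` and its default of `("http", "https")` are assumptions made for this example; the library's actual defaults and checks may differ.

```python
from urllib.parse import urlsplit


def scheme_allowed(url: str, allowed_schemes: tuple[str, ...] = ("http", "https")) -> bool:
    """Return True if the URL has an allowed scheme and a network location."""
    parts = urlsplit(url)
    # urlsplit lowercases the scheme; "javascript:alert(1)" has no netloc.
    return parts.scheme in allowed_schemes and bool(parts.netloc)


print(scheme_allowed("https://example.com/path"))
print(scheme_allowed("ftp://files.example.com", allowed_schemes=("ftp", "sftp")))
print(scheme_allowed("javascript:alert(1)"))
```

An allowlist of schemes is used here rather than a list of forbidden ones, which matches the library's stated preference against blocklists.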
validate_url("https://example.com/path") # ✓ Valid
validate_url("ftp://files.example.com",
allowed_schemes=["ftp", "sftp"]) # ✓ Valid
validate_url("javascript:alert(1)") # ✗ Scheme not allowed
Validates international phone number formats.
validate_phone("+1-555-123-4567") # ✓ Valid
validate_phone("+44 20 7123 4567") # ✓ Valid
validate_phone("(123) 456-7890") # ✓ Valid
validate_username(username: str, min_length: int = 3, max_length: int = 32, ...) -> ValidationResult
Validates usernames with configurable rules.
validate_username("john_doe") # ✓ Valid
validate_username("ab") # ✗ Too short (min 3)
validate_username("_invalid") # ✗ Must start with alphanumeric
Validates password complexity requirements.
validate_password_strength("SecureP@ss123") # ✓ Valid
validate_password_strength("weak") # ✗ Too short, missing requirements
Validates UUID format (versions 1-5).
validate_uuid("550e8400-e29b-41d4-a716-446655440000") # ✓ Valid
validate_uuid("not-a-uuid") # ✗ Invalid format
Validates string length constraints.
validate_length("hello", min_length=3, max_length=10) # ✓ Valid
validate_length("hi", min_length=3) # ✗ Too short
Validates that a string is not empty or whitespace-only.
validate_not_empty("hello") # ✓ Valid
validate_not_empty(" ") # ✗ Empty after stripping
validate_alphanumeric(value: str, allow_spaces: bool = False, allow_underscores: bool = False) -> ValidationResult
Validates alphanumeric strings.
validate_alphanumeric("Hello123") # ✓ Valid
validate_alphanumeric("Hello World", allow_spaces=True) # ✓ Valid
validate_alphanumeric("Hello!") # ✗ Invalid character
Validates URL-friendly slugs (lowercase, numbers, hyphens).
validate_slug("my-blog-post") # ✓ Valid
validate_slug("my-post-123") # ✓ Valid
validate_slug("My Blog Post") # ✗ Invalid format
Validates hexadecimal strings.
validate_hex_string("deadbeef") # ✓ Valid
validate_hex_string("abc123", expected_length=6) # ✓ Valid
validate_hex_string("0xDEADBEEF") # ✓ Valid (0x prefix allowed)
Validates that a string contains only ASCII characters.
validate_ascii("Hello, World!") # ✓ Valid
validate_ascii("Héllo") # ✗ Contains non-ASCII
Validates that a string contains only printable characters.
validate_printable("Hello\nWorld") # ✓ Valid
validate_printable("Hello\x00World") # ✗ Contains control character
Validates that a value is one of the allowed choices.
validate_choice("red", ["red", "green", "blue"]) # ✓ Valid
validate_choice("purple", ["red", "green", "blue"]) # ✗ Not in choices
Validates that a string contains only specified characters.
validate_contains_only("123-456", "0123456789-") # ✓ Valid
validate_contains_only("abc", "0123456789") # ✗ Invalid characters
Validates that a string is valid JSON.
validate_json('{"name": "John"}') # ✓ Valid
validate_json('{invalid}') # ✗ Invalid JSON
Validates Base64 encoded strings.
validate_base64("SGVsbG8gV29ybGQ=") # ✓ Valid
validate_base64("not-valid!") # ✗ Invalid Base64
Validates semantic version strings (SemVer 2.0.0).
validate_semver("1.0.0") # ✓ Valid
validate_semver("1.2.3-alpha.1+build.123") # ✓ Valid
validate_semver("1.2") # ✗ Invalid format
Validates credit card numbers using the Luhn algorithm.
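For readers unfamiliar with the Luhn checksum, a minimal sketch follows. The function name `luhn_valid` is hypothetical, and stripping hyphens and spaces is an assumption based on the "separators OK" example; the library's own implementation may add length or issuer checks.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    cleaned = number.replace("-", "").replace(" ", "")
    if not (cleaned.isascii() and cleaned.isdigit()) or len(cleaned) < 2:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4532015112830366"))
print(luhn_valid("4532-0151-1283-0366"))
print(luhn_valid("1234567890123456"))
```

Passing the Luhn check only shows that the number is well formed; it says nothing about whether the card exists or is active.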
validate_credit_card("4532015112830366") # ✓ Valid
validate_credit_card("4532-0151-1283-0366") # ✓ Valid (separators OK)
validate_credit_card("1234567890123456") # ✗ Invalid (fails Luhn)
Validates against a custom regex pattern.
validate_regex("ABC123", r"^[A-Z]+\d+$") # ✓ Valid
validate_integer(value: int | str, min_value: int = None, max_value: int = None) -> ValidationResult
Validates integers with optional range constraints.
validate_integer(42) # ✓ Valid
validate_integer("42") # ✓ Valid (string coercion)
validate_integer(42, min_value=0, max_value=100) # ✓ Valid
validate_integer(-5, min_value=0) # ✗ Below minimum
validate_float(value: float | str, min_value: float = None, max_value: float = None, max_decimals: int = None) -> ValidationResult
Validates floats with precision control.
validate_float(3.14) # ✓ Valid
validate_float(3.14, max_decimals=2) # ✓ Valid
validate_float(3.14159, max_decimals=2) # ✗ Too many decimals
Validates positive numbers.
validate_positive(42) # ✓ Valid
validate_positive(0, allow_zero=True) # ✓ Valid
validate_positive(-5) # ✗ Not positive
validate_range(value: float | int | str, min_value: float, max_value: float, inclusive: bool = True) -> ValidationResult
Validates numeric range boundaries.
validate_range(50, 0, 100) # ✓ Valid
validate_range(100, 0, 100, inclusive=False) # ✗ Exclusive bounds
Validates network port numbers (1-65535).
validate_port(8080) # ✓ Valid
validate_port(70000) # ✗ Out of range
Validates IPv4 addresses.
validate_ipv4("192.168.1.1") # ✓ Valid
validate_ipv4("256.1.1.1") # ✗ Invalid octet
Validates IPv6 addresses.
validate_ipv6("2001:0db8:85a3:0000:0000:8a2e:0370:7334") # ✓ Valid
validate_ipv6("::1") # ✓ Valid
Validates any IP address (IPv4 or IPv6).
validate_ip_address("192.168.1.1") # ✓ Valid
validate_ip_address("::1") # ✓ Valid
Validates IP networks in CIDR notation.
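A CIDR check of this kind can be sketched with the standard-library `ipaddress` module. The helper `is_cidr` and its `strict` parameter are illustrative assumptions; whether the library rejects networks with host bits set is not stated here.

```python
import ipaddress


def is_cidr(value: str, strict: bool = True) -> bool:
    """Return True if `value` parses as an IPv4 or IPv6 network."""
    try:
        # With strict=True, host bits set below the prefix raise ValueError.
        ipaddress.ip_network(value, strict=strict)
    except ValueError:
        return False
    return True


print(is_cidr("192.168.1.0/24"))
print(is_cidr("192.168.1.5/24"))
print(is_cidr("192.168.1.5/24", strict=False))
print(is_cidr("10.0.0.0/33"))
```

Note that `ipaddress.ip_network` also accepts a bare address such as "192.168.1.1" and treats it as a /32 network, so a stricter validator may want to require the slash explicitly.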
validate_ip_network("192.168.1.0/24") # ✓ Valid
validate_ip_network("10.0.0.0/8") # ✓ Valid
Validates MAC addresses.
validate_mac_address("00:11:22:33:44:55") # ✓ Valid
validate_mac_address("00-11-22-33-44-55") # ✓ Valid
validate_mac_address("0011.2233.4455") # ✓ Valid
Validates date strings against a format.
validate_date("2024-01-15") # ✓ Valid
validate_date("15/01/2024", format="%d/%m/%Y") # ✓ Valid
Validates datetime strings.
validate_datetime("2024-01-15 14:30:00") # ✓ Valid
validate_date_range(date_string: str, min_date: str = None, max_date: str = None, format: str = "%Y-%m-%d") -> ValidationResult
Validates dates within a range.
validate_date_range("2024-06-15",
    min_date="2024-01-01",
    max_date="2024-12-31") # ✓ Valid
Validates ISO 8601 datetime formats.
validate_iso8601("2024-01-15") # ✓ Valid
validate_iso8601("2024-01-15T14:30:00Z") # ✓ Valid
validate_iso8601("2024-01-15T14:30:00+00:00") # ✓ Valid
validate_iso8601("2024-01-15T14:30:00.123456Z") # ✓ Valid
These validators provide real security value by preventing common attack patterns.
Prevents path traversal attacks. Use this for any user-provided file paths.
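The following sketch shows one way such a check could work, as a purely lexical test using the standard library. The function name `looks_traversal_free` is hypothetical and the rules (one round of percent-decoding, rejecting absolute paths and `..` segments) are assumptions drawn from the examples in this section, not the library's actual code.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote


def looks_traversal_free(user_path: str) -> bool:
    """Lexical check: no '..' segments, no absolute paths, no NUL bytes."""
    # Decode once so "%2e%2e%2f" is seen as "../"; treat backslashes as separators.
    decoded = unquote(user_path).replace("\\", "/")
    if "\x00" in decoded:
        return False
    path = PurePosixPath(decoded)
    if path.is_absolute():
        return False
    return ".." not in path.parts


print(looks_traversal_free("data/file.txt"))
print(looks_traversal_free("../etc/passwd"))
print(looks_traversal_free("%2e%2e%2fpasswd"))
print(looks_traversal_free("/etc/passwd"))
```

A lexical check like this does not follow symlinks and only decodes one layer of percent-encoding, so it is best combined with constraining the resolved path to a base directory.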
validate_path_safe("data/file.txt") # ✓ Safe
validate_path_safe("../etc/passwd") # ✗ Traversal detected
validate_path_safe("%2e%2e%2fpasswd") # ✗ Encoded traversal
validate_path_safe("/etc/passwd") # ✗ Absolute path
validate_filename(filename: str, allowed_extensions: set[str] = None, block_dangerous: bool = True) -> ValidationResult
Validates filenames for safe filesystem use.
validate_filename("document.pdf") # ✓ Safe
validate_filename("script.exe") # ✗ Dangerous extension
validate_filename("photo.jpg",
allowed_extensions={".jpg", ".png"}) # ✓ Safe
validate_filename("../passwd") # ✗ Path separator
Validates against an extension whitelist.
validate_file_extension("photo.jpg", {".jpg", ".png", ".gif"}) # ✓ Valid
validate_file_extension("script.php", {".jpg", ".png"}) # ✗ Not allowed
Validates MIME type matches file extension.
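One plausible way to compare a declared MIME type with a filename is the standard-library `mimetypes` module, sketched below. The helper `mime_matches_extension` is hypothetical; the library may use its own extension table instead.

```python
import mimetypes


def mime_matches_extension(mime_type: str, filename: str) -> bool:
    """Return True if the declared MIME type matches the one guessed from the name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions yield None, which never matches.
    return guessed is not None and guessed == mime_type.lower()


print(mime_matches_extension("image/jpeg", "photo.jpg"))
print(mime_matches_extension("application/php", "image.jpg"))
```

Both inputs usually come from the client, so this comparison catches inconsistencies but does not prove what the file actually contains.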
validate_mime_type("image/jpeg", "photo.jpg") # ✓ Valid
validate_mime_type("application/php", "image.jpg") # ✗ Mismatch
Cleans and normalizes strings.
sanitize_string(" Hello World ") # "Hello World"
sanitize_string("text", max_length=3) # "tex"
Primary defense against XSS. Escapes HTML special characters.
Always use this when displaying user input in HTML context.
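For reference, the standard library's `html.escape` performs this kind of escaping. The sketch below assumes `sanitize_html` behaves similarly, which this document does not confirm; the exact entity used for quotes may differ.

```python
import html

# html.escape replaces &, <, > and, by default, both quote characters.
escaped_script = html.escape("<script>alert('xss')</script>")
escaped_math = html.escape("5 > 3 && 2 < 4")

print(escaped_script)
print(escaped_math)
```

Escaping of this kind is appropriate for HTML element content and quoted attribute values; other contexts such as inline JavaScript or URLs need their own encoding.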
sanitize_html("<script>alert('xss')</script>")
# "&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;"
sanitize_html("5 > 3 && 2 < 4")
# "5 &gt; 3 &amp;&amp; 2 &lt; 4"
Makes filenames safe for all filesystems.
sanitize_filename("my<file>:name.txt") # "my_file__name.txt"
sanitize_filename("../../../etc/passwd") # "passwd"
sanitize_filename("CON.txt") # "_CON.txt" (Windows reserved)
Normalizes and constrains paths to a base directory.
sanitize_path("data/file.txt") # "data/file.txt"
sanitize_path("../config/db.json", base_dir="/app") # "/app/config/db.json"
sanitize_path("../../../etc/passwd", base_dir="/app") # None (escapes base)
Run multiple validators and collect all errors.
result = validate_all("ab@x", [
validate_email,
(validate_length, {"min_length": 10}),
])
# ValidationResult(is_valid=False, error="Invalid email format; Value must be at least 10 characters")
Create a reusable composite validator.
email_validator = create_validator(
validate_email,
(validate_length, {"max_length": 100}),
)
result = email_validator("user@example.com")
from validator import (
validate_filename,
validate_file_extension,
validate_mime_type,
validate_path_safe,
sanitize_filename,
)
def validate_upload(filename: str, mime_type: str, save_dir: str) -> tuple[bool, str | None, str | None]:
    """
    Validate an uploaded file before saving.
    Returns (is_valid, error_message, safe_filename).
    """
    ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".pdf"}
    # 1. Validate filename is safe
    result = validate_filename(filename, allowed_extensions=ALLOWED_EXTENSIONS)
    if not result.is_valid:
        return False, result.error, None
    # 2. Verify MIME type matches extension
    result = validate_mime_type(mime_type, filename)
    if not result.is_valid:
        return False, result.error, None
    # 3. Validate save path doesn't escape upload directory
    result = validate_path_safe(save_dir)
    if not result.is_valid:
        return False, result.error, None
    # 4. Generate safe filename
    safe_name = sanitize_filename(filename)
    return True, None, safe_name
from validator import (
validate_email,
validate_username,
validate_password_strength,
validate_choice,
sanitize_string,
sanitize_html,
)
def validate_registration(data: dict) -> dict[str, str]:
    """Validate user registration data. Returns dict of field errors."""
    errors = {}
    # Email
    email = sanitize_string(data.get("email", ""))
    result = validate_email(email)
    if not result.is_valid:
        errors["email"] = result.error
    # Username
    username = sanitize_string(data.get("username", ""))
    result = validate_username(username)
    if not result.is_valid:
        errors["username"] = result.error
    # Password
    result = validate_password_strength(data.get("password", ""))
    if not result.is_valid:
        errors["password"] = result.error
    # Role (must be valid choice)
    role = data.get("role", "")
    result = validate_choice(role, ["user", "admin", "moderator"])
    if not result.is_valid:
        errors["role"] = result.error
    return errors
def display_user_profile(user: dict) -> dict:
    """Prepare user data for HTML display."""
    return {
        "name": sanitize_html(user["name"]), # XSS prevention
        "bio": sanitize_html(user["bio"]),
        "email": user["email"], # Already validated, not user-displayed
    }
from validator import (
validate_path_safe,
validate_filename,
sanitize_path,
)
from pathlib import Path
IMPORT_DIR = Path("/app/data/imports")
def process_import_request(user_path: str) -> Path:
    """
    Safely resolve a user-provided file path for import.
    Raises ValueError if path is unsafe.
    """
    # 1. Validate no path traversal
    result = validate_path_safe(user_path)
    if not result.is_valid:
        raise ValueError(f"Invalid path: {result.error}")
    # 2. Validate filename
    filename = Path(user_path).name
    result = validate_filename(filename, allowed_extensions={".csv", ".json", ".xml"})
    if not result.is_valid:
        raise ValueError(f"Invalid file: {result.error}")
    # 3. Constrain to import directory
    safe_path = sanitize_path(user_path, base_dir=IMPORT_DIR)
    if safe_path is None:
        raise ValueError("Path escapes allowed directory")
    return Path(safe_path)
from validator import (
validate_ip_address,
validate_ip_network,
validate_port,
validate_mac_address,
)
def validate_network_config(config: dict) -> dict[str, str]:
    """Validate network configuration. Returns dict of errors."""
    errors = {}
    # Server IP
    result = validate_ip_address(config.get("server_ip", ""))
    if not result.is_valid:
        errors["server_ip"] = result.error
    # Port
    result = validate_port(config.get("port", ""))
    if not result.is_valid:
        errors["port"] = result.error
    # Allowed network (CIDR)
    if "allowed_network" in config:
        result = validate_ip_network(config["allowed_network"])
        if not result.is_valid:
            errors["allowed_network"] = result.error
    # Device MAC
    if "device_mac" in config:
        result = validate_mac_address(config["device_mac"])
        if not result.is_valid:
            errors["device_mac"] = result.error
    return errors
This library is one part of a secure application. Here's how to handle common security concerns:
# WRONG - Never do this
query = f"SELECT * FROM users WHERE name = '{username}'"
# RIGHT - Use parameterized queries
cursor.execute("SELECT * FROM users WHERE name = ?", (username,))
# WRONG - Never do this
os.system(f"convert {filename} output.png")
# RIGHT - Use subprocess with shell=False
import subprocess
subprocess.run(["convert", filename, "output.png"], shell=False)
# Use sanitize_html when displaying user content
from validator import sanitize_html
user_comment = "<script>alert('xss')</script>"
safe_comment = sanitize_html(user_comment)
# Now safe to include in HTML: &lt;script&gt;...
# Or use a template engine with auto-escaping (Jinja2, Django templates)
from validator import validate_path_safe, sanitize_path
user_file = request.args.get("file")
# Validate first
result = validate_path_safe(user_file)
if not result.is_valid:
    return "Invalid path", 400
# Constrain to allowed directory
safe_path = sanitize_path(user_file, base_dir="/app/uploads")
if safe_path is None:
    return "Access denied", 403
- Python 3.10+
- No external dependencies
MIT License - see LICENSE for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
When contributing, please keep these principles in mind:
- Validators validate, they don't secure - Input validation checks format, not intent
- No blocklist-based security - Blocklists can always be bypassed
- Clear naming - Function names should describe exactly what they check
- Honest documentation - Don't oversell security properties