| 1 | +# RFC-011: Safari Data Storage |
| 2 | + |
| 3 | +**Author**: moonD4rk |
| 4 | +**Status**: Living Document |
| 5 | +**Created**: 2026-04-21 |
| 6 | + |
| 7 | +## 1. Overview |
| 8 | + |
| 9 | +Safari is **macOS-only** and sandboxed under App Sandbox. Most of Safari's user data lives inside `~/Library/Containers/com.apple.Safari/Data/Library/` (the container root) and requires **Full Disk Access (TCC)** for third-party processes to read. A few legacy files still reside at `~/Library/Safari/` for backwards compatibility. |
| 10 | + |
| 11 | +Unlike Chromium and Firefox, Safari does **not** encrypt bookmarks, history, cookies, downloads, or localStorage — all are stored in plaintext on disk. Passwords are the only encrypted category and are delegated to the macOS login Keychain (see [RFC-006](006-key-retrieval-mechanisms.md) §7). |
| 12 | + |
| 13 | +Safari 17 (September 2023) introduced **multi-profile support**. Profile discovery therefore has two layers: a synthetic "default" profile mapped to the pre-profile legacy paths, plus one or more named profiles enumerated from `SafariTabs.db`. |
| 14 | + |
| 15 | +## 2. Profile Structure |
| 16 | + |
| 17 | +Each `profileContext` (in `browser/safari/profiles.go`) tracks five fields: |
| 18 | + |
| 19 | +| Field | Meaning | |
| 20 | +|-------|---------| |
| 21 | +| `name` | Human-readable profile name, disambiguated for duplicates | |
| 22 | +| `uuidUpper` | UUID in uppercase (used by `Safari/Profiles/<UUID>/` directories) | |
| 23 | +| `uuidLower` | UUID in lowercase (used by `WebKit/WebsiteDataStore/<uuid>/` directories) | |
| 24 | +| `legacyHome` | `~/Library/Safari` | |
| 25 | +| `container` | `~/Library/Containers/com.apple.Safari/Data/Library` | |
| 26 | + |
| 27 | +Empty `uuidUpper` marks the synthetic default profile. |
| 28 | + |
| 29 | +### 2.1 Profile Discovery |
| 30 | + |
| 31 | +The default profile is always emitted first. Named profiles come from `SafariTabs.db`: |
| 32 | + |
| 33 | +```sql |
| 34 | +SELECT external_uuid, title FROM bookmarks |
| 35 | +WHERE subtype = 2 AND external_uuid != 'DefaultProfile' |
| 36 | +``` |
| 37 | + |
| 38 | +`DefaultProfile` is Safari's sentinel string for the pre-profile era; it is filtered out because it is already represented by the synthetic default. |
| 39 | + |
| 40 | +If the DB cannot be opened (missing or permission-denied), the extractor falls back to scanning `Safari/Profiles/` for any directory whose name is a canonical 8-4-4-4-12 UUID and synthesizing the name as `profile-<uuid[:8]>`. This makes profile discovery robust even when TCC blocks the SQL read. |
| 41 | + |
| 42 | +Duplicate display names are disambiguated with `-2`, `-3`, … suffixes, deterministically by discovery order. |
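The directory fallback and the suffix rule above can be sketched as follows. This is a minimal illustration, not the code in `browser/safari/profiles.go`: the names `fallbackName` and `dedupeNames` are hypothetical, and the sketch assumes the synthesized name keeps the directory's own casing.

```go
package main

import (
	"fmt"
	"regexp"
)

// canonicalUUID matches the 8-4-4-4-12 hex layout in either case.
var canonicalUUID = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// fallbackName (hypothetical helper) synthesizes a display name for a
// profile directory. It reports false when the directory name is not a
// canonical UUID, so the caller can skip the entry.
func fallbackName(dirName string) (string, bool) {
	if !canonicalUUID.MatchString(dirName) {
		return "", false
	}
	return "profile-" + dirName[:8], true
}

// dedupeNames (hypothetical helper) appends -2, -3, ... to repeated
// names, in discovery order. The first occurrence keeps its name.
func dedupeNames(names []string) []string {
	seen := make(map[string]int, len(names))
	out := make([]string, 0, len(names))
	for _, n := range names {
		seen[n]++
		if c := seen[n]; c > 1 {
			n = fmt.Sprintf("%s-%d", n, c)
		}
		out = append(out, n)
	}
	return out
}

func main() {
	name, ok := fallbackName("5604E6F5-02ED-4E40-8249-63DE7BC986C8")
	fmt.Println(name, ok)
	fmt.Println(dedupeNames([]string{"Work", "Personal", "Work"}))
}
```

The sketch does not guard against a suffixed name colliding with a profile that is literally titled `Work-2`; whether the real implementation handles that case is not stated here.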
| 43 | + |
| 44 | +### 2.2 UUID Case Asymmetry |
| 45 | + |
| 46 | +Safari uses two different casings for the same profile UUID across the container: |
| 47 | + |
| 48 | +| Path prefix | Casing | Example | |
| 49 | +|-------------|:------:|---------| |
| 50 | +| `Safari/Profiles/<UUID>/` | Uppercase | `5604E6F5-02ED-4E40-8249-63DE7BC986C8` | |
| 51 | +| `WebKit/WebsiteDataStore/<uuid>/` | Lowercase | `5604e6f5-02ed-4e40-8249-63de7bc986c8` | |
| 52 | + |
| 53 | +`profileContext` stores both to avoid case-folding at every call site. |
| 54 | + |
| 55 | +## 3. Data File Locations |
| 56 | + |
| 57 | +### 3.1 Default Profile |
| 58 | + |
| 59 | +| Category | Path | Format | |
| 60 | +|----------|------|--------| |
| 61 | +| History | `~/Library/Safari/History.db` | SQLite | |
| 62 | +| Cookie | `Container/Cookies/Cookies.binarycookies`, then `~/Library/Cookies/Cookies.binarycookies` | BinaryCookies | |
| 63 | +| Bookmark | `~/Library/Safari/Bookmarks.plist` | plist | |
| 64 | +| Download | `~/Library/Safari/Downloads.plist` | plist | |
| 65 | +| LocalStorage | `Container/WebKit/WebsiteData/Default/` | WebKit Origins dir | |
| 66 | +| Password | macOS Keychain | — | |
| 67 | + |
| 68 | +The Cookie path is resolved in priority order — the first candidate that exists wins. Modern (macOS 14+) installs keep cookies in the sandboxed container; the legacy path is kept as a fallback for upgraded systems. |
| 69 | + |
| 70 | +### 3.2 Named Profiles |
| 71 | + |
| 72 | +| Category | Path | Format | |
| 73 | +|----------|------|--------| |
| 74 | +| History | `Container/Safari/Profiles/<UUID>/History.db` | SQLite | |
| 75 | +| Cookie | `Container/WebKit/WebsiteDataStore/<uuid>/Cookies/Cookies.binarycookies` | BinaryCookies | |
| 76 | +| Download | `~/Library/Safari/Downloads.plist` (filtered by UUID) | plist | |
| 77 | +| LocalStorage | `Container/WebKit/WebsiteDataStore/<uuid>/Origins/` | WebKit Origins dir | |
| 78 | + |
| 79 | +Bookmark is intentionally **omitted** from named profiles: `Bookmarks.plist` is a shared plist with no per-entry profile tag, so it is attributed to the default profile only. Duplicate bookmarks would otherwise be emitted per profile. |
| 80 | + |
| 81 | +`Downloads.plist` is shared across all profiles, but each entry carries a `DownloadEntryProfileUUIDStringKey`; the extractor filters at read time so each profile sees only its own downloads. |
| 82 | + |
| 83 | +Passwords live in the user-scope Keychain, not on a per-profile basis — only the default profile emits passwords to avoid duplicates across the output. |
| 84 | + |
| 85 | +## 4. Data Storage Formats |
| 86 | + |
| 87 | +### 4.1 History (History.db — SQLite) |
| 88 | + |
| 89 | +```sql |
| 90 | +SELECT url, title, visit_count, visit_time |
| 91 | +FROM history_items |
| 92 | +LEFT JOIN history_visits ON history_items.id = history_visits.history_item |
| 93 | +``` |
| 94 | + |
| 95 | +Schema notes: |
| 96 | +- `visit_time` is a `REAL` column using the **Core Data epoch** (see Section 5) |
| 97 | +- One item → many visits; the extractor takes the most recent visit per item |
| 98 | +- Results are sorted by `visit_count` descending |
| 99 | + |
| 100 | +### 4.2 Cookies (Cookies.binarycookies — binary) |
| 101 | + |
| 102 | +Apple's proprietary BinaryCookies format — not SQLite, not a documented format. Parsed by the [go-binarycookies](https://github.com/moond4rk/go-binarycookies) library. |
| 103 | + |
| 104 | +High-level layout: |
| 105 | + |
| 106 | +``` |
| 107 | +| "cook" magic | page_count | page_sizes[] | pages[] | |
| 108 | +|--------------|------------|------------------|--------------------------| |
| 109 | +| 4B | 4B (BE) | page_count × 4B | variable | |
| 110 | +``` |
| 111 | + |
| 112 | +Each page is an index-of-cookies table followed by per-cookie records. A cookie record carries flags (`isSecure`, `isHTTPOnly`), URL/name/path/value offsets into the record, and creation / expiry timestamps in Core Data epoch. |
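To make the header layout concrete, the sketch below reads the magic, the page count, and the page-size table. It is an illustration only and does not reflect the go-binarycookies API; `parseHeader` is a hypothetical name, and the page-size entries are assumed to be big-endian like `page_count`.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// parseHeader (hypothetical helper) validates the "cook" magic and
// returns the size of each page, in file order.
func parseHeader(data []byte) ([]uint32, error) {
	if len(data) < 8 || !bytes.Equal(data[:4], []byte("cook")) {
		return nil, errors.New("binarycookies: bad magic or short header")
	}
	count := binary.BigEndian.Uint32(data[4:8])
	// Reject a page count that the remaining bytes cannot hold.
	if uint64(len(data)-8) < uint64(count)*4 {
		return nil, errors.New("binarycookies: truncated page-size table")
	}
	sizes := make([]uint32, count)
	for i := range sizes {
		off := 8 + 4*i
		// Assumption: page sizes are big-endian, like page_count.
		sizes[i] = binary.BigEndian.Uint32(data[off : off+4])
	}
	return sizes, nil
}

func main() {
	// A synthetic header declaring two pages of 16 and 32 bytes.
	header := []byte{'c', 'o', 'o', 'k', 0, 0, 0, 2, 0, 0, 0, 16, 0, 0, 0, 32}
	sizes, err := parseHeader(header)
	if err != nil {
		panic(err)
	}
	fmt.Println(sizes)
}
```

The page bodies start right after the size table, so a caller could slice `pages[]` by walking the returned sizes cumulatively.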
| 113 | + |
| 114 | +Cookie values are **plaintext** — no per-cookie encryption. This is a fundamental divergence from Chromium, which encrypts `encrypted_value` with the OS master key. |
| 115 | + |
| 116 | +### 4.3 Bookmarks (Bookmarks.plist — property list) |
| 117 | + |
| 118 | +A nested dictionary tree with a `WebBookmarkType` discriminator at each node: |
| 119 | + |
| 120 | +| Type | Meaning | Additional keys | |
| 121 | +|------|---------|-----------------| |
| 122 | +| `WebBookmarkTypeList` | Folder | `Children` (array) | |
| 123 | +| `WebBookmarkTypeLeaf` | URL entry | `URLString`, `URIDictionary.title` | |
| 124 | + |
| 125 | +The extractor walks the tree recursively, collecting leaf nodes into a flat list. Folder names are not preserved (only URL + title pairs are exported). |
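The recursive walk can be sketched as below. The `node` struct is a hypothetical, already-decoded view of the plist that mirrors the keys in the table; how the plist is decoded into it is left out, and `flatten` is not the extractor's actual function name.

```go
package main

import "fmt"

// node is a hypothetical decoded form of one Bookmarks.plist entry.
type node struct {
	Type     string // WebBookmarkType
	URL      string // URLString
	Title    string // URIDictionary.title
	Children []node // Children (folders only)
}

// bookmark is the flat URL + title pair that gets exported.
type bookmark struct {
	URL   string
	Title string
}

// flatten appends every leaf under n to out, depth-first, and drops
// folder names along the way.
func flatten(n node, out []bookmark) []bookmark {
	switch n.Type {
	case "WebBookmarkTypeLeaf":
		out = append(out, bookmark{URL: n.URL, Title: n.Title})
	case "WebBookmarkTypeList":
		for _, child := range n.Children {
			out = flatten(child, out)
		}
	}
	// Any other type is ignored in this sketch.
	return out
}

func main() {
	root := node{Type: "WebBookmarkTypeList", Children: []node{
		{Type: "WebBookmarkTypeLeaf", URL: "https://example.com", Title: "Example"},
		{Type: "WebBookmarkTypeList", Children: []node{
			{Type: "WebBookmarkTypeLeaf", URL: "https://example.org", Title: "Nested"},
		}},
	}}
	for _, b := range flatten(root, nil) {
		fmt.Println(b.Title, b.URL)
	}
}
```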
| 126 | + |
| 127 | +### 4.4 Downloads (Downloads.plist — property list) |
| 128 | + |
| 129 | +A flat structure with a `DownloadHistory` array. Relevant keys per entry: |
| 130 | + |
| 131 | +| Key | Meaning | |
| 132 | +|-----|---------| |
| 133 | +| `DownloadEntryURL` | Source URL | |
| 134 | +| `DownloadEntryPath` | Local filesystem path | |
| 135 | +| `DownloadEntryBytesReceivedSoFar` | Bytes downloaded | |
| 136 | +| `DownloadEntryProfileUUIDStringKey` | Owning profile's uppercase UUID, or `"DefaultProfile"` | |
| 137 | + |
| 138 | +The extractor filters by the caller-provided owner UUID so each profile reports its own downloads. MIME type and start/end times are not stored by Safari — `MimeType` is always empty in the output. |
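As a rough illustration of the per-profile filter, the sketch below keeps only entries whose profile key equals the caller's owner string. The `download` struct and `filterByOwner` are hypothetical names, and exact string equality is an assumption; the sketch does not cover entries that lack the key.

```go
package main

import "fmt"

// download is a hypothetical decoded form of one DownloadHistory entry.
type download struct {
	URL           string // DownloadEntryURL
	Path          string // DownloadEntryPath
	BytesReceived int64  // DownloadEntryBytesReceivedSoFar
	ProfileUUID   string // DownloadEntryProfileUUIDStringKey
}

// filterByOwner returns the entries that belong to owner, which is an
// uppercase profile UUID or "DefaultProfile".
func filterByOwner(entries []download, owner string) []download {
	var out []download
	for _, e := range entries {
		if e.ProfileUUID == owner {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	entries := []download{
		{URL: "https://example.com/a.zip", ProfileUUID: "DefaultProfile"},
		{URL: "https://example.com/b.zip", ProfileUUID: "5604E6F5-02ED-4E40-8249-63DE7BC986C8"},
	}
	for _, e := range filterByOwner(entries, "DefaultProfile") {
		fmt.Println(e.URL)
	}
}
```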
| 139 | + |
| 140 | +### 4.5 Passwords (macOS Keychain) |
| 141 | + |
| 142 | +Safari does **not** persist passwords to a file in its container. All credentials live in `login.keychain-db`, accessible via `InternetPassword` records. The extractor reads them directly through [keychainbreaker](https://github.com/moond4rk/keychainbreaker) and reconstructs the URL from `(protocol, server, port, path)`. |
| 143 | + |
| 144 | +Default port handling: |
| 145 | + |
| 146 | +| Protocol | Default port | URL rendering | |
| 147 | +|----------|-------------:|---------------| |
| 148 | +| `https` | 443 | `https://host/path` (port omitted) | |
| 149 | +| `http` | 80 | `http://host/path` (port omitted) | |
| 150 | +| `ftp` | 21 | `ftp://host/path` (port omitted) | |
| 151 | +| Other | — | `scheme://host:port/path` | |
| 152 | + |
| 153 | +The `htps` FourCC protocol code emitted by some Keychain entries is normalized to `https`. |
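The port table and the `htps` normalization can be combined into one small function. The sketch below is an illustration under assumptions: `buildURL` is a hypothetical name, a port of 0 is treated as "not set" and omitted, and `path` is expected to be empty or to start with `/`.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// defaultPorts lists the schemes whose default port is omitted.
var defaultPorts = map[string]int{"https": 443, "http": 80, "ftp": 21}

// buildURL (hypothetical helper) renders a URL from the Keychain tuple
// (protocol, server, port, path).
func buildURL(protocol, server string, port int, path string) string {
	if protocol == "htps" {
		protocol = "https" // normalize the FourCC code
	}
	host := server
	def, known := defaultPorts[protocol]
	// Assumption: port 0 means "unset" and is never rendered.
	if port != 0 && !(known && port == def) {
		host = net.JoinHostPort(server, strconv.Itoa(port))
	}
	return protocol + "://" + host + path
}

func main() {
	fmt.Println(buildURL("htps", "example.com", 443, "/login"))
	fmt.Println(buildURL("http", "example.com", 8080, "/"))
	fmt.Println(buildURL("smb", "fileserver.local", 445, "/share"))
}
```

`net.JoinHostPort` is used so that an IPv6 literal server would be bracketed correctly when a port is appended.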
| 154 | + |
| 155 | +Partial-extraction mode: if the Keychain cannot be unlocked (no `--keychain-pw` supplied, or the password is wrong), metadata-only records are still emitted — URL, username, timestamps — with `PlainPassword` left blank. See [RFC-006](006-key-retrieval-mechanisms.md) §7 for the full credential-extraction architecture. |
| 156 | + |
| 157 | +### 4.6 LocalStorage (WebKit Origins — nested SQLite) |
| 158 | + |
| 159 | +Safari 17+ stores localStorage under a **partition-aware nested tree**, rooted at: |
| 160 | + |
| 161 | +| Profile | Root path | |
| 162 | +|---------|-----------| |
| 163 | +| Default | `Container/WebKit/WebsiteData/Default/` | |
| 164 | +| Named | `Container/WebKit/WebsiteDataStore/<uuid>/Origins/` | |
| 165 | + |
| 166 | +Under the root, two levels of hashed directories lead to the actual data: |
| 167 | + |
| 168 | +``` |
| 169 | +<root>/<top-frame-hash>/<frame-hash>/ |
| 170 | +├── origin ← binary-serialized origins (top + frame) |
| 171 | +└── LocalStorage/ |
| 172 | + ├── localstorage.sqlite3 ← ItemTable(key TEXT UNIQUE, value BLOB NOT NULL) |
| 173 | + ├── localstorage.sqlite3-shm |
| 174 | + └── localstorage.sqlite3-wal |
| 175 | +``` |
| 176 | + |
| 177 | +`top-frame-hash == frame-hash` for **first-party** storage. They differ for **partitioned third-party** storage (an iframe with a different origin than the top document). The named profile root additionally carries a `salt` sibling file used by WebKit's origin-hashing — skipped at traversal time. |
| 178 | + |
| 179 | +The flat `WebsiteDataStore/<uuid>/LocalStorage/<scheme>_<host>_<port>.localstorage` layout used by older WebKit is **empty on modern Safari** and is not supported. |
| 180 | + |
| 181 | +#### Origin file format |
| 182 | + |
| 183 | +Two `origin` blocks back-to-back — top-frame then frame. Each block: |
| 184 | + |
| 185 | +``` |
| 186 | +| scheme record | host record | port section | |
| 187 | +|--------------------------|--------------------------|-----------------| |
| 188 | +| uint32_le len | enc byte | uint32_le len | enc byte | 0x00 | |
| 189 | +| <len bytes> | <len bytes> | | |
| 190 | + or |
| 191 | + | 0x01 | uint16_le port | |
| 192 | +``` |
| 193 | + |
| 194 | +- `enc byte`: `0x01` = Latin-1/ASCII (common), `0x00` = UTF-16 LE |
| 195 | +- Port section: `0x00` marker means "use scheme default" (stored as port 0 in the parsed struct); `0x01` marker is followed by a 2-byte little-endian port |
| 196 | + |
| 197 | +The extractor reads both blocks and reports the **frame origin URL** — that is what JavaScript's `window.localStorage` actually exposes in the partitioned case. If only the top-frame block is parseable, the extractor falls back to it. |
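A minimal parser for one origin block, following the diagram above, might look like the sketch below. All names (`origin`, `readLatin1`, `parseOrigin`, `record`) are hypothetical. Only the common Latin-1 encoding byte `0x01` is handled; the UTF-16 LE branch is deliberately left out because the sketch does not settle whether `len` counts bytes or code units in that case.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// origin is a hypothetical parsed form of one block.
type origin struct {
	Scheme string
	Host   string
	Port   uint16 // 0 means "use the scheme default"
}

var errTruncated = errors.New("origin: truncated input")

// readLatin1 decodes one length-prefixed record and returns the rest.
func readLatin1(b []byte) (string, []byte, error) {
	if len(b) < 5 {
		return "", nil, errTruncated
	}
	n := binary.LittleEndian.Uint32(b[:4])
	enc := b[4]
	b = b[5:]
	if enc != 0x01 {
		return "", nil, fmt.Errorf("origin: unsupported encoding byte 0x%02x", enc)
	}
	if uint64(n) > uint64(len(b)) {
		return "", nil, errTruncated
	}
	runes := make([]rune, n)
	for i, c := range b[:n] {
		runes[i] = rune(c) // Latin-1 bytes map 1:1 to code points
	}
	return string(runes), b[n:], nil
}

// parseOrigin reads scheme, host, and port section; it returns the
// unread remainder so the caller can parse the next block.
func parseOrigin(b []byte) (origin, []byte, error) {
	var o origin
	var err error
	if o.Scheme, b, err = readLatin1(b); err != nil {
		return o, nil, err
	}
	if o.Host, b, err = readLatin1(b); err != nil {
		return o, nil, err
	}
	if len(b) < 1 {
		return o, nil, errTruncated
	}
	switch b[0] {
	case 0x00: // scheme default, Port stays 0
		return o, b[1:], nil
	case 0x01: // explicit 2-byte little-endian port
		if len(b) < 3 {
			return o, nil, errTruncated
		}
		o.Port = binary.LittleEndian.Uint16(b[1:3])
		return o, b[3:], nil
	}
	return o, nil, fmt.Errorf("origin: unknown port marker 0x%02x", b[0])
}

// record builds a Latin-1 record for the synthetic example below.
func record(s string) []byte {
	b := make([]byte, 5, 5+len(s))
	binary.LittleEndian.PutUint32(b, uint32(len(s)))
	b[4] = 0x01
	return append(b, s...)
}

func main() {
	data := append(record("https"), record("example.com")...)
	data = append(data, 0x00) // top-frame: default port
	data = append(data, record("https")...)
	data = append(data, record("cdn.example.net")...)
	data = append(data, 0x01, 0xFB, 0x20) // frame: port 8443

	top, rest, err := parseOrigin(data)
	if err != nil {
		panic(err)
	}
	frame, _, err := parseOrigin(rest)
	if err != nil {
		panic(err)
	}
	fmt.Printf("top=%+v frame=%+v\n", top, frame)
}
```

Returning the unread remainder from `parseOrigin` lets the caller read the top-frame block first and then attempt the frame block, which fits the fallback described above when only the first block parses.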
| 198 | + |
| 199 | +#### ItemTable |
| 200 | + |
| 201 | +```sql |
| 202 | +SELECT key, value FROM ItemTable |
| 203 | +``` |
| 204 | + |
| 205 | +Schema: `(key TEXT UNIQUE ON CONFLICT REPLACE, value BLOB NOT NULL ON CONFLICT FAIL)`. |
| 206 | + |
| 207 | +Values are **UTF-16 LE** encoded JS strings. Oversized values (≥ 2048 bytes) are replaced with a size marker in the output — this matches the cap used by the Chromium extractor ([RFC-002](002-chromium-data-storage.md) §4.8) and keeps JSON/CSV exports bounded. |
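Decoding a value and applying the size cap can be sketched as follows. The marker text, the constant name, and `decodeValue` are hypothetical, and the sketch assumes the 2048-byte threshold applies to the raw BLOB length rather than the decoded string.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// maxValueBytes is the cap described above (hypothetical constant name).
const maxValueBytes = 2048

// decodeValue turns a UTF-16 LE BLOB into a Go string, or returns a
// size marker when the BLOB is at or above the cap.
func decodeValue(raw []byte) string {
	if len(raw) >= maxValueBytes {
		// Marker wording is illustrative, not the extractor's output.
		return fmt.Sprintf("<value of %d bytes omitted>", len(raw))
	}
	units := make([]uint16, len(raw)/2) // an odd trailing byte is dropped
	for i := range units {
		units[i] = binary.LittleEndian.Uint16(raw[2*i:])
	}
	return string(utf16.Decode(units))
}

func main() {
	fmt.Println(decodeValue([]byte{'h', 0, 'i', 0}))
	fmt.Println(decodeValue(make([]byte, 4096)))
}
```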
| 208 | + |
| 209 | +## 5. Time Formats |
| 210 | + |
| 211 | +Safari uses the **Core Data epoch** — 2001-01-01 00:00:00 UTC, which is **978,307,200 seconds** after the Unix epoch. To convert a Core Data timestamp to Unix time, add `978307200` seconds. |
| 212 | + |
| 213 | +| Data Type | Field | Storage | |
| 214 | +|-----------|-------|---------| |
| 215 | +| History | `visit_time` | REAL seconds, Core Data epoch | |
| 216 | +| Cookies | `creation`, `expiry` | REAL seconds, Core Data epoch | |
| 217 | +| Downloads | — | No timestamp stored | |
| 218 | +| Passwords | Keychain `Created` | Already Unix time (via keychainbreaker) | |
| 219 | +| LocalStorage | — | No timestamp stored | |
| 220 | + |
| 221 | +Bookmarks carry no timestamp in Safari's plist representation. |
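The epoch conversion described in this section is a single addition. A minimal sketch, with hypothetical names, that also preserves the fractional seconds carried by the `REAL` columns:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// coreDataEpochOffset is the number of seconds between the Unix epoch
// and 2001-01-01 00:00:00 UTC.
const coreDataEpochOffset = 978307200

// coreDataToTime converts Core Data seconds to a UTC time.Time.
func coreDataToTime(sec float64) time.Time {
	whole, frac := math.Modf(sec)
	return time.Unix(int64(whole)+coreDataEpochOffset, int64(frac*1e9)).UTC()
}

func main() {
	fmt.Println(coreDataToTime(0).Format(time.RFC3339))
	fmt.Println(coreDataToTime(700000000.25).Format(time.RFC3339Nano))
}
```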
| 222 | + |
| 223 | +## 6. Encryption |
| 224 | + |
| 225 | +Safari's encryption story is deliberately thin: |
| 226 | + |
| 227 | +| Category | Encryption | |
| 228 | +|----------|------------| |
| 229 | +| History | None (plaintext SQLite) | |
| 230 | +| Cookies | None (plaintext binary format) | |
| 231 | +| Bookmarks | None (plaintext plist) | |
| 232 | +| Downloads | None (plaintext plist) | |
| 233 | +| LocalStorage | None (plaintext SQLite; UTF-16 LE is an encoding, not encryption) | |
| 234 | +| Passwords | macOS Keychain — see [RFC-006](006-key-retrieval-mechanisms.md) §7 | |
| 235 | + |
| 236 | +The only encrypted category is passwords. Because they are not stored in Safari's own files at all, there is no Safari-specific cipher, key derivation, or master-key retrieval to document. See RFC-006 for the `InternetPassword` extraction path. |
| 237 | + |
| 238 | +## 7. Platform Specifics |
| 239 | + |
| 240 | +- **macOS-only**. There is no Safari on Windows or Linux. |
| 241 | +- **Full Disk Access (TCC)** is required to read the sandboxed container. Without it, cookies / history / downloads / localStorage reads fail with permission errors at stat or open time. Legacy paths under `~/Library/Safari/` sometimes remain readable without FDA, but are mostly empty on modern systems. |
| 242 | +- **Live-file safety**: `SafariTabs.db`, `History.db`, and `localstorage.sqlite3` can be written to by a running Safari instance. All live SQL reads use `?mode=ro&immutable=1`, which disables WAL replay and locking — the extractor sees a consistent snapshot of the main DB as of read time. Uncommitted WAL content is intentionally not replayed to avoid race-induced corruption. |
| 243 | +- **Multi-profile availability**: requires Safari 17 (macOS 14 Sonoma) or newer. Older Safari versions have only the default profile; discovery degrades cleanly via the ReadDir fallback described in §2.1. |
| 244 | +- **File acquisition**: all per-profile files are copied into a `filemanager.Session` temp directory before extraction, except the discovery-time `SafariTabs.db` read which opens the live file directly. See [RFC-008](008-file-acquisition-and-platform-quirks.md) for the general pattern. |
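The `?mode=ro&immutable=1` parameters from the live-file bullet above are SQLite URI-filename options. A sketch of building such a data source name is shown below; `readOnlyDSN` is a hypothetical helper, the path is assumed to be absolute, and the sketch assumes whichever SQLite driver is in use accepts `file:` URIs.

```go
package main

import (
	"fmt"
	"net/url"
)

// readOnlyDSN (hypothetical helper) builds a file: URI that asks SQLite
// for a read-only, immutable open. Going through net/url escapes
// characters such as spaces, '?' and '#' in the path.
// Assumption: path is absolute.
func readOnlyDSN(path string) string {
	u := url.URL{
		Scheme:   "file",
		Path:     path,
		RawQuery: "mode=ro&immutable=1",
	}
	return u.String()
}

func main() {
	fmt.Println(readOnlyDSN("/tmp/History.db"))
}
```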
| 245 | + |
| 246 | +## 8. Key Differences from Chromium and Firefox |
| 247 | + |
| 248 | +| Aspect | Chromium | Firefox | Safari | |
| 249 | +|--------|----------|---------|--------| |
| 250 | +| Platform | Cross-platform | Cross-platform | **macOS-only** | |
| 251 | +| Profile discovery | `Preferences` sentinel file | Any data file present | `SafariTabs.db` SQL + dir fallback | |
| 252 | +| Profile naming | `Default`, `Profile 1`, … | `<prefix>.default-release` | Human-readable title from SafariTabs.db | |
| 253 | +| Password storage | Encrypted SQLite (`Login Data`) | Encrypted JSON (`logins.json`) | **macOS Keychain** (no file) | |
| 254 | +| Cookie encryption | Encrypted with OS master key | Plaintext | **Plaintext** | |
| 255 | +| Cookie format | SQLite | SQLite | Proprietary BinaryCookies binary | |
| 256 | +| History | SQLite | SQLite (`places.sqlite`) | SQLite (Core Data epoch) | |
| 257 | +| Bookmark | JSON | SQLite (`places.sqlite`) | **plist** | |
| 258 | +| Download | SQLite (`History`, shared) | SQLite (`places.sqlite`, shared) | **plist** (filtered by UUID) | |
| 259 | +| LocalStorage | LevelDB | SQLite (`webappsstore.sqlite`) | Nested **WebKit Origins** SQLite | |
| 260 | +| LocalStorage partitioning | No | No | **Yes** (top-frame + frame hashes) | |
| 261 | +| CreditCard / SessionStorage | Supported | Not supported | Not supported | |
| 262 | +| Encryption scope | Passwords, cookies, credit cards | Passwords only | Passwords only | |
| 263 | +| Time format | WebKit microseconds since 1601 | Mixed (μs for most, ms for passwords) | Core Data seconds since 2001 | |
| 264 | + |
| 265 | +## Related RFCs |
| 266 | + |
| 267 | +| RFC | Topic | |
| 268 | +|-----|-------| |
| 269 | +| [RFC-001](001-project-architecture.md) | Project architecture and directory layout | |
| 270 | +| [RFC-006](006-key-retrieval-mechanisms.md) | §7 covers Safari Keychain credential extraction | |
| 271 | +| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition via `filemanager.Session` | |