Why Base64 exists
Base64 is a binary-to-text encoding defined in RFC 4648. It transforms arbitrary binary data into a string drawn from 64 safe ASCII characters (letters, digits, and two symbols) so binary content can travel through channels that only accept text.
Historically, Base64 solved the problem of sending binary attachments through email, which in its original form (SMTP) was 7-bit text only. Modern usage extends to JSON payloads, data URLs, JWTs, and anywhere binary data must be transported as text.
The standard alphabet
RFC 4648’s standard Base64 alphabet is A-Z (indices 0-25), a-z (26-51), 0-9 (52-61), and two symbols: + (62) and / (63). The padding character = fills out the final group of output when the input length is not a multiple of 3 bytes.
Each Base64 character encodes 6 bits. Three input bytes (24 bits) map to four output characters, a 33% size expansion. This cost follows directly from the alphabet size: at 6 bits per character, each 8-bit byte needs 8/6 ≈ 1.33 characters on average.
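The size arithmetic can be checked with a few lines of Python. This is a minimal sketch: the helper name `encoded_length` is made up for illustration, and it assumes padded standard Base64 with no line wrapping.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded standard Base64 output for n_bytes of input.

    Every group of up to 3 input bytes becomes exactly 4 output characters.
    """
    return 4 * ((n_bytes + 2) // 3)


for n in (1, 2, 3, 100, 1000):
    actual = len(base64.b64encode(bytes(n)))  # bytes(n) is n zero bytes
    print(f"{n} bytes -> predicted {encoded_length(n)}, actual {actual}")
```

For large inputs the ratio settles at 4/3; for very small inputs padding makes the relative overhead larger (1 byte becomes 4 characters).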
Encoding walkthrough
To encode "Man" (M=77, a=97, n=110 in ASCII):
- Binary: 01001101 01100001 01101110 (24 bits)
- Regroup into 6-bit blocks: 010011 010110 000101 101110
- Decimal: 19, 22, 5, 46
- Map to alphabet: T, W, F, u
- Output: "TWFu" — 3 bytes become 4 characters exactly
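The steps above can be reproduced by hand in Python. This is an illustrative sketch, not how a production encoder is written: `ALPHABET` and `encode_group` are names invented here, and the function only handles a full 3-byte group.

```python
import base64

# Standard RFC 4648 alphabet: index 0 is "A", index 63 is "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles full 3-byte groups only")
    # Pack the 3 bytes into one 24-bit integer.
    value = int.from_bytes(three_bytes, "big")
    # Extract four 6-bit indices, most significant first.
    indices = [(value >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))                      # TWFu
print(base64.b64encode(b"Man").decode("ascii"))  # TWFu
```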
Padding rules
When input length is not a multiple of 3, the last group is zero-padded to 12 bits (1 leftover byte) or 18 bits (2 leftover bytes), and the output is padded with = characters to maintain a length divisible by 4.
Specifically: 1 input byte → 2 chars + "==", 2 input bytes → 3 chars + "=", 3 input bytes → 4 chars + no padding. Decoders use padding to verify the expected input length; some modern variants omit padding, and the decoder infers the missing characters from the length of the encoded string.
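The three cases can be observed with Python's standard `base64` module. A small sketch; the helper name `encode_and_count_padding` is hypothetical.

```python
import base64


def encode_and_count_padding(data: bytes):
    """Return (encoded text, number of '=' characters) for standard Base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded, encoded.count("=")


for sample in (b"M", b"Ma", b"Man"):
    encoded, pad = encode_and_count_padding(sample)
    print(f"{len(sample)} byte(s) -> {encoded!r} with {pad} padding character(s)")
```

The outputs are "TQ==", "TWE=", and "TWFu": the encoded length is always 4 here, and only the number of = characters changes.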
Variants for different contexts
Standard Base64 uses + and /, which collide with URL syntax (+ is decoded as a space in form-encoded query strings, / is a path separator). RFC 4648 defines a URL- and filename-safe alphabet to solve this, and other variants differ in padding and line wrapping.
- Standard Base64: uses + and /, with = padding — classic for MIME
- Base64URL: uses - and _ instead, with optional padding — safe in URLs and filenames
- MIME Base64: line-wrapped at 76 characters with CRLF — legacy email compatibility
- Base64 with no padding: seen in JWT, Git, and modern APIs; decoder reconstructs length
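The variants listed above can be compared side by side with Python's standard library. This is a sketch under stated assumptions: the sample bytes are chosen only because they map to indices 62 and 63, and `base64.encodebytes` stands in for MIME-style wrapping even though it ends lines with "\n" rather than CRLF.

```python
import base64

# 0xFB 0xFF 0xFE regroups into the 6-bit indices 62, 63, 63, 62.
data = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(data).decode("ascii")          # uses + and /
url_safe = base64.urlsafe_b64encode(data).decode("ascii")  # uses - and _
print(standard, url_safe)

# Unpadded form: strip the trailing "=" after encoding.
unpadded = base64.urlsafe_b64encode(b"M").decode("ascii").rstrip("=")
print(unpadded)

# Line-wrapped form: at most 76 characters per line.
wrapped = base64.encodebytes(bytes(100)).decode("ascii")
print([len(line) for line in wrapped.splitlines()])
```

An email library would normally handle the CRLF line endings; the point here is only the 76-character wrap.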
When Base64 is appropriate
Base64 is the right tool when binary must travel through a text-only channel or be embedded in a text-only format. It is not appropriate when a binary channel is available — the 33% overhead is pure cost with no benefit in that case.
- Embedding images in CSS data: URLs (inlining small icons)
- Encoding binary in JSON bodies (APIs, config files)
- JWT payloads (base64url-encoded JSON with a signature)
- Email attachments (MIME Content-Transfer-Encoding: base64)
- Storing binary blobs in databases that only accept text
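As one example of the JSON case, binary data can be carried in a string field and recovered on the other side. This is a minimal sketch; the field names `name` and `data_b64` and both function names are assumptions made for illustration, not part of any particular API.

```python
import base64
import json


def to_json_payload(name: str, blob: bytes) -> str:
    """Serialize a binary blob into a JSON document as a Base64 string."""
    encoded = base64.b64encode(blob).decode("ascii")
    return json.dumps({"name": name, "data_b64": encoded})


def from_json_payload(payload: str) -> bytes:
    """Recover the original bytes from the JSON document."""
    return base64.b64decode(json.loads(payload)["data_b64"])


blob = bytes(range(256))  # every possible byte value
payload = to_json_payload("sample.bin", blob)
print(len(blob), len(payload))
print(from_json_payload(payload) == blob)  # True
```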
Common mistakes
Base64 is not encryption. It is trivially reversible — anyone who sees a Base64 string can decode it in a browser console. Using Base64 to "hide" credentials or secrets is a recurring security mistake.
Other traps: mixing standard and URL-safe alphabets, stripping padding without telling the decoder, double-encoding (Base64 of Base64, inflating by a factor of 1.78), and encoding already-text content unnecessarily (often UTF-8 strings in JSON, which is already text-safe).
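Two of these mistakes are easy to demonstrate. The sketch below uses a made-up credential string and an arbitrary 900-byte input; neither comes from a real system.

```python
import base64

# 1. Base64 hides nothing: decoding needs no key.
secret = b"user:password"  # hypothetical credential
encoded = base64.b64encode(secret)
print(base64.b64decode(encoded) == secret)  # True

# 2. Double-encoding compounds the overhead: (4/3) * (4/3) is about 1.78.
raw = bytes(900)
once = base64.b64encode(raw)
twice = base64.b64encode(once)
print(len(raw), len(once), len(twice))  # 900 1200 1600
print(round(len(twice) / len(raw), 2))  # 1.78
```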
Founder of UtilizAí, with a background in Blockchain, Cryptocurrencies and Finance in the Digital Era, plus complementary studies in Theology, Philosophy and ongoing coursework in Speech-Language Pathology.
Frequently asked questions
Is Base64 encryption?
No. Base64 is encoding, not encryption. It is a reversible, deterministic transformation that anyone can decode without a key. Use it for binary-to-text conversion only. For confidentiality, encrypt the data first (AES, ChaCha20) and use Base64 only to carry the resulting ciphertext as text.
Why the 33% size overhead?
Base64 uses 64 distinct symbols to represent 256 possible byte values. Since 64 = 2⁶, each character carries 6 bits, meaning 3 input bytes (24 bits) require 4 output characters. The ratio 4/3 ≈ 1.33 is the theoretical minimum for a 64-symbol alphabet.
When should I use Base64URL instead of standard Base64?
Anywhere the encoded string will appear in a URL, filename, or other context where + and / cause problems. JWTs use Base64URL exclusively for this reason, and many web APIs follow the same convention.
Can I skip padding?
Technically yes, and many modern applications (JWT, for example) omit padding. But your decoder must handle the length reconstruction. If you control both ends, no padding is fine; if you interoperate with third parties, keep padding unless a specification explicitly says otherwise.
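A decoder that accepts unpadded input can restore the = characters from the string length before decoding. This is a minimal sketch assuming the Base64URL alphabet; the function name `b64url_decode_unpadded` is hypothetical.

```python
import base64


def b64url_decode_unpadded(text: str) -> bytes:
    """Decode Base64URL text whose trailing '=' padding may be missing."""
    # A remainder of 1 can never occur: one character carries only 6 bits,
    # which is not enough for a whole byte.
    if len(text) % 4 == 1:
        raise ValueError("invalid Base64 length")
    # Add back 0, 1, or 2 '=' characters to reach a multiple of 4.
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


for token in ("TQ", "TWE", "TWFu"):
    print(token, b64url_decode_unpadded(token))
```

Input that already carries correct padding passes through unchanged, because its length is already a multiple of 4.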
What is a data URL?
A URL scheme (RFC 2397) that embeds data directly in the URL using the format data:[<mediatype>][;base64],<data>. Small images in CSS or HTML often use data URLs to eliminate a round-trip to the server, at the cost of 33% more bytes in the HTML/CSS itself.
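Building a data URL in this format takes one line once the payload is encoded. A small sketch; the function name `make_data_url` and its default media type are assumptions chosen for illustration.

```python
import base64


def make_data_url(data: bytes, media_type: str = "application/octet-stream") -> str:
    """Build a Base64 data URL of the form data:<mediatype>;base64,<data>."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"


print(make_data_url(b"Man", "text/plain"))  # data:text/plain;base64,TWFu
```

Data URLs use the standard alphabet with padding, not Base64URL.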