Base64 Encoding Explained: Alphabet, Padding, and When to Use It

Understand Base64 encoding: what problem it solves, the standard alphabet, padding rules, variants (URL-safe, MIME), performance tradeoffs, and common mistakes.

Why Base64 exists

Base64 is a binary-to-text encoding defined in RFC 4648. It transforms arbitrary binary data into a string drawn from 64 safe ASCII characters (letters, digits, and two symbols) so that binary content can travel through channels that only accept text.

Historically, Base64 solved the problem of sending binary attachments through email, which in its original form (SMTP) was 7-bit text only. Modern usage extends to JSON payloads, data URLs, JWTs, and anywhere binary data must be transported as text.

The standard alphabet

RFC 4648's standard Base64 alphabet is A-Z (indices 0-25), a-z (26-51), 0-9 (52-61), and two symbols: + (62) and / (63). The padding character = fills out the final output group when the input length is not a multiple of 3 bytes.
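As a minimal sketch, the alphabet and its index mapping can be rebuilt in Python from the standard library's string constants. The names ALPHABET and INDEX are illustrative only and do not come from any library.

```python
import string

# Standard Base64 alphabet from RFC 4648: A-Z, a-z, 0-9, then + and /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Reverse lookup a decoder would use: character -> 6-bit value
INDEX = {char: value for value, char in enumerate(ALPHABET)}

print(len(ALPHABET))               # 64
print(ALPHABET[19], ALPHABET[46])  # T u
print(INDEX["+"], INDEX["/"])      # 62 63
```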

Each Base64 character encodes 6 bits. Three input bytes (24 bits) map to four output characters, a 33% size expansion. This is the fixed cost of a 64-symbol alphabet: each character carries only 6 of a byte's 8 bits, so every input byte needs 8/6 ≈ 1.33 output characters on average.
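The size relationship can be checked with a short sketch. The helper encoded_length is a hypothetical name introduced here; it computes the padded output length as 4 × ceil(n / 3) and compares it with what Python's base64 module actually produces.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input: 4 * ceil(n / 3)."""
    return 4 * ((n_bytes + 2) // 3)

# Compare the formula against the real encoder for a few input sizes
for n in (1, 2, 3, 10, 1000):
    actual = len(base64.b64encode(bytes(n)))
    print(n, encoded_length(n), actual)
```

For large inputs the ratio of the two lengths approaches 4/3; for very small inputs, padding makes the relative overhead larger.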

Encoding walkthrough

To encode "Man" (M=77, a=97, n=110 in ASCII):

  • Binary: 01001101 01100001 01101110 (24 bits)
  • Regroup into 6-bit blocks: 010011 010110 000101 101110
  • Decimal: 19, 22, 5, 46
  • Map to alphabet: T, W, F, u
  • Output: "TWFu" — 3 bytes become 4 characters exactly
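The steps above can be sketched in a few lines of Python. This is an illustration for full 3-byte groups only, not a complete encoder; encode_group is a hypothetical helper name.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles full 3-byte groups only")
    # Pack the three bytes into one 24-bit integer
    bits = int.from_bytes(three_bytes, "big")
    # Slice it into four 6-bit values, most significant first
    sextets = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Map each 6-bit value to its alphabet character
    return "".join(ALPHABET[s] for s in sextets)

print(encode_group(b"Man"))  # TWFu
```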

Padding rules

When the input length is not a multiple of 3, the bits of the final group are zero-padded up to the next multiple of 6 (16 bits become 18, or 8 bits become 12), and the output is padded with = characters so that its length stays divisible by 4.

Specifically: 1 input byte → 2 chars + "==", 2 input bytes → 3 chars + "=", 3 input bytes → 4 chars + no padding. Decoders can use the padding to check that the encoded text has the expected length; some modern variants omit padding, and the decoder infers the missing characters from the length of the encoded text.
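A quick way to see the three cases is to encode 1, 2, and 3 bytes with Python's base64 module. The dictionary name results is only there to collect the output for this sketch.

```python
import base64

# Encode 1, 2, and 3 bytes and record how many '=' characters appear
results = {}
for data in (b"M", b"Ma", b"Man"):
    encoded = base64.b64encode(data).decode("ascii")
    results[len(data)] = encoded
    print(len(data), encoded, encoded.count("="))
# 1 TQ== 2
# 2 TWE= 1
# 3 TWFu 0
```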

Variants for different contexts

Standard Base64 uses + and /, which collide with URL syntax (+ is decoded as a space in form-encoded query strings, and / is a path separator). RFC 4648 defines an alternate URL- and filename-safe alphabet to solve this.

  • Standard Base64: uses + and /, with = padding — classic for MIME
  • Base64URL: uses - and _ instead, with optional padding — safe in URLs and filenames
  • MIME Base64: line-wrapped at 76 characters with CRLF — legacy email compatibility
  • Base64 with no padding: seen in JWTs and many modern APIs; the decoder reconstructs the length
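The variants listed above can be compared side by side in Python. The input bytes are chosen so the output lands on indices 62 and 63; mime_wrap is a hypothetical helper written for this sketch (the standard library's base64.encodebytes also wraps at 76 characters, but it uses a bare newline rather than CRLF).

```python
import base64

raw = bytes([0xFB, 0xFF, 0xFE])  # bit pattern that maps to indices 62, 63, 63, 62

standard = base64.b64encode(raw).decode("ascii")          # uses + and /
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # uses - and _
print(standard, url_safe)  # +//+ -__-

# Unpadded form: strip the trailing '=' after encoding
unpadded = base64.urlsafe_b64encode(b"M").decode("ascii").rstrip("=")
print(unpadded)  # TQ

def mime_wrap(encoded: str, width: int = 76) -> str:
    """Break an encoded string into CRLF-separated lines of at most `width` chars."""
    return "\r\n".join(encoded[i:i + width] for i in range(0, len(encoded), width))

# 100 input bytes encode to 136 characters, which wrap into lines of 76 and 60
wrapped = mime_wrap(base64.b64encode(bytes(100)).decode("ascii"))
print([len(line) for line in wrapped.split("\r\n")])  # [76, 60]
```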

When Base64 is appropriate

Base64 is the right tool when binary must travel through a text-only channel or be embedded in a text-only format. It is not appropriate when a binary channel is available — the 33% overhead is pure cost with no benefit in that case.

  • Embedding images in CSS data: URLs (inlining small icons)
  • Encoding binary in JSON bodies (APIs, config files)
  • JWT payloads (base64url-encoded JSON with a signature)
  • Email attachments (MIME Content-Transfer-Encoding: base64)
  • Storing binary blobs in databases that only accept text
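As one illustration of the JSON case, the sketch below round-trips a small binary blob through a JSON document. The field names filename and data are assumptions made for the example, not part of any particular API.

```python
import base64
import json

blob = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])  # arbitrary binary, not valid UTF-8

# JSON has no bytes type, so the blob travels as a Base64 string
payload = json.dumps({
    "filename": "icon.png",
    "data": base64.b64encode(blob).decode("ascii"),
})

# The receiver parses the JSON, then decodes the field back to bytes
received = json.loads(payload)
restored = base64.b64decode(received["data"])
print(restored == blob)  # True
```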

Common mistakes

Base64 is not encryption. It is trivially reversible — anyone who sees a Base64 string can decode it in a browser console. Using Base64 to "hide" credentials or secrets is a recurring security mistake.
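To make the point concrete, here is a minimal sketch with made-up credentials: decoding takes one standard-library call and no key.

```python
import base64

# Hypothetical credentials, used only for illustration
encoded = base64.b64encode(b"user:password").decode("ascii")
print(encoded)  # dXNlcjpwYXNzd29yZA==

# Anyone holding the string can reverse it without any secret
decoded = base64.b64decode(encoded)
print(decoded)  # b'user:password'
```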

Other traps: mixing standard and URL-safe alphabets, stripping padding without telling the decoder, double-encoding (Base64 of Base64, which inflates the data by a factor of about 1.78, that is (4/3)²), and encoding content that is already text (often UTF-8 strings in JSON, which is already text-safe).
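The double-encoding cost is easy to measure. This sketch uses 900 zero bytes as a stand-in payload; the content does not affect the lengths.

```python
import base64

raw = bytes(900)                # 900 zero bytes as a stand-in payload
once = base64.b64encode(raw)    # 1200 characters
twice = base64.b64encode(once)  # 1600 characters

print(len(once) / len(raw))   # about 1.33
print(len(twice) / len(raw))  # about 1.78
```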

About the author
Renato Candido dos Passos
Founder and specialist in Blockchain, Speech-Language Pathology, and Finance

Founder of UtilizAí, with a background in Blockchain, Cryptocurrencies and Finance in the Digital Era, plus complementary studies in Theology, Philosophy and ongoing coursework in Speech-Language Pathology.

Frequently asked questions

Is Base64 encryption?

No. Base64 is encoding, not encryption. It is a reversible, deterministic transformation that anyone can decode without a key. Use it for binary-to-text conversion only. For confidentiality, encrypt the data first (for example with AES or ChaCha20), then Base64-encode the ciphertext if it has to travel as text.

Why the 33% size overhead?

Base64 uses 64 distinct symbols to represent 256 possible byte values. Since 64 = 2⁶, each character carries 6 bits, meaning 3 input bytes (24 bits) require 4 output characters. The ratio 4/3 ≈ 1.33 is the theoretical minimum for a 64-symbol alphabet.

When should I use Base64URL instead of standard Base64?

Anywhere the encoded string will appear in a URL, filename, HTTP header, or other context where + and / cause problems. JWTs use Base64URL exclusively for this reason, and many web APIs have adopted it for tokens and identifiers.

Can I skip padding?

Technically yes, and many modern applications (JWT, for example) omit padding. But your decoder must handle the length reconstruction. If you control both ends, no padding is fine; if you interoperate with third parties, keep padding unless a specification explicitly says otherwise.
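A minimal sketch of the length reconstruction, assuming Base64URL input whose trailing = characters were stripped. The helper name decode_unpadded is hypothetical.

```python
import base64

def decode_unpadded(text: str) -> bytes:
    """Decode Base64URL text whose trailing '=' padding was stripped."""
    # Valid Base64 length is a multiple of 4; add back the missing '=' characters
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

print(decode_unpadded("TQ"))    # b'M'
print(decode_unpadded("TWE"))   # b'Ma'
print(decode_unpadded("TWFu"))  # b'Man'
```

Input that already carries its full padding passes through unchanged, since no extra = characters are added when the length is already a multiple of 4.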

What is a data URL?

A URL scheme (RFC 2397) that embeds data directly in the URL using the format data:[<mediatype>][;base64],<data>. Small images in CSS or HTML often use data URLs to eliminate a round-trip to the server, at the cost of 33% more bytes in the HTML/CSS itself.
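Following that format, a data URL can be assembled in a couple of lines. This is a sketch only; to_data_url is a hypothetical helper, and the caller is assumed to pass a valid media type.

```python
import base64

def to_data_url(data: bytes, media_type: str) -> str:
    """Build a base64 data URL of the form data:<mediatype>;base64,<data>."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

print(to_data_url(b"Man", "text/plain"))  # data:text/plain;base64,TWFu
```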
