The Complete Guide to Base91 Encoding: Principles, Algorithm, and Real-World Applications
An in-depth guide to Base91 (basE91) encoding: understand how it works, compare it with Base64 and Base85, and discover its applications in data compression, embedded systems, and more. Includes an online Base91 encoder/decoder tool.
In the world of binary-to-text encoding, Base64 is the undisputed standard, and Base85 has carved out its niche with superior efficiency in domains like PDF and Git. However, there’s an even more efficient encoding scheme that remains relatively unknown: Base91 (also stylized as basE91). By leveraging 91 printable ASCII characters and a clever variable-length encoding strategy, Base91 keeps encoding overhead at roughly 23% for typical data, making it one of the most space-efficient printable ASCII encoding schemes available. This article provides a deep dive into Base91’s principles, algorithm details, and practical applications.
Need to encode or decode Base91 quickly? Try our Online Base91 Encoder/Decoder.
1. What is Base91?
Base91 is a binary-to-text encoding scheme designed by Joachim Henke. It uses 91 printable ASCII characters to represent binary data and is one of the most efficient encoding schemes that relies solely on printable ASCII characters.
Core idea: Use as many printable characters as possible to minimize encoding overhead.
Why 91?
ASCII has 94 visible printable characters in the range 0x21 to 0x7E (or 95 printable characters if you also count the space character at 0x20). basE91 builds its alphabet from those 94 visible printable characters and excludes 3 that are problematic in various contexts:
| Excluded Character | Reason |
|---|---|
| - (hyphen) | Has special meaning in command-line arguments |
| \ (backslash) | Escape character in most programming languages and shells |
| ' (single quote) | Used for string delimiting in shells and programming languages |
The Base91 alphabet is:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"
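As a quick check of the character count, the following sketch rebuilds the set of 94 visible ASCII characters, removes the three excluded ones, and compares the result with the alphabet string above. The variable names are illustrative and are not taken from the basE91 sources. Because the alphabet is not in ASCII order, the comparison is done on sets.

```python
# Alphabet string as given in this article (basE91 order, not ASCII order).
BASE91_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)

# The 94 visible printable ASCII characters, 0x21 through 0x7E.
visible_ascii = {chr(code) for code in range(0x21, 0x7F)}
excluded = {"-", "\\", "'"}

print(len(visible_ascii))                                # 94
print(len(BASE91_ALPHABET))                              # 91
print(set(BASE91_ALPHABET) == visible_ascii - excluded)  # True
```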
Encoding Efficiency Comparison
| Encoding | Overhead (worst case) | Overhead (average) | Encoded Size (1000-byte input) |
|---|---|---|---|
| Hexadecimal | 100% | 100% | 2000 chars |
| Base32 | 60% | 60% | 1600 chars |
| Base64 | 33% | 33% | 1336 chars |
| Base85 | 25% | 25% | 1250 chars |
| Base91 | 23% | ≈23% | ≈1230 chars |
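The figures in the last column can be reproduced with a few lines of arithmetic. The sketch below assumes padded Base32 and Base64, Base85 without delimiters or shortcuts, and uniformly random input for Base91, where a pair carries 14 bits with probability 89/8192. The helper name encoded_sizes is hypothetical, and the Base91 figure is an estimate rather than an exact length.

```python
import math

def encoded_sizes(n_bytes: int) -> dict:
    """Estimate encoded length in characters for n_bytes of input."""
    # For random input, 89 of the 8192 possible 13-bit values trigger a 14-bit chunk.
    avg_bits_per_pair = 13 + 89 / 8192
    return {
        "hex": 2 * n_bytes,
        "base32": math.ceil(n_bytes / 5) * 8,   # 5 bytes -> 8 chars, padded
        "base64": math.ceil(n_bytes / 3) * 4,   # 3 bytes -> 4 chars, padded
        "base85": math.ceil(n_bytes * 5 / 4),   # 4 bytes -> 5 chars
        "base91": round(2 * 8 * n_bytes / avg_bits_per_pair),  # estimate only
    }

n = 1000
sizes = encoded_sizes(n)
for name, size in sizes.items():
    print(f"{name:7s} {size:5d} chars  ({size / n - 1:.1%} overhead)")
```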
2. How Base91 Works
Unlike Base64 and Base85, which use fixed-size grouping strategies, Base91 employs a variable-length encoding mechanism — the key to its superior efficiency.
2.1 Encoding Process
The Base91 encoding algorithm operates on a bit stream:
- Build a bit stream: Treat the input data as a continuous stream of bits, reading from the least significant bit (LSB) of each byte first.
- Extract 13 bits: Once more than 13 bits are buffered, take the low 13 bits of the stream to form a value v (range: 0–8191).
- Choose 13 or 14 bits: If v > 88, encode those 13 bits directly. Otherwise, read 1 additional bit and recompute v from the low 14 bits, so this output pair carries 14 bits.
- Split into two characters: Divide v by 91 to get quotient q and remainder r. Output alphabet[r] and alphabet[q] as two characters.
- Handle the tail: When the input ends, up to 13 bits may remain in the buffer; process them as the final value. Output 1 character if at most 7 bits remain and the value is less than 91; otherwise output 2 characters.
2.2 Why is the Threshold 88?
This is the most elegant aspect of Base91’s design. The key insight is:
- 13 bits can represent values from 0 to 8191
- 14 bits can represent values from 0 to 16383
- 91² = 8281
If the low 13-bit value is between 89 and 8191, Base91 encodes exactly those 13 bits, because every such value already fits in the 0..8280 range addressable by 2 Base91 characters.
If the low 13-bit value is between 0 and 88, Base91 uses that range as a signal that this chunk should consume 14 bits instead. In that case the final 14-bit value is either 0..88 or 8192..8280, which still fits inside the 0..8280 range of 91² states.
This strategy allows the algorithm to pack as many data bits as possible into each pair of output characters without wasting encoding space.
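The claim that the 13/14-bit rule stays inside the two-character range, and never produces the same value for two different chunks, can be checked exhaustively. The sketch below enumerates every chunk an encoder could emit under the rule described above. It is a verification aid written for this article, not part of any Base91 library.

```python
# Every value the encoder can emit, mapped to the number of bits it carries.
emitted = {}

# 13-bit chunks: low 13 bits in 89..8191 are encoded as-is.
for v in range(89, 8192):
    emitted[v] = 13

# 14-bit chunks: low 13 bits in 0..88, plus one extra high bit (0 or 1).
for low in range(0, 89):
    for high_bit in (0, 1):
        v = low | (high_bit << 13)
        assert v not in emitted  # no collision with any other chunk
        emitted[v] = 14

print(len(emitted))   # 8281, i.e. 91 * 91
print(max(emitted))   # 8280, the largest value two characters can hold

# The decoder's test (v & 8191) > 88 recovers the chunk width for every value.
assert all((13 if (v & 8191) > 88 else 14) == bits for v, bits in emitted.items())
```

Every one of the 91² two-character combinations is used exactly once, which is why the threshold sits at 88 and not at some other value.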
2.3 Encoding Example
Let’s encode the string "Hi" (2 bytes):
Step 1: Get byte values and build the bit buffer
'H' = 72 = 0x48
'i' = 105 = 0x69
b = 72 + (105 << 8) = 26952
n = 16
Step 2: Extract 13 bits
Take the low 13 bits:
v = 26952 & 8191 = 2376
Step 3: Check threshold
2376 > 88, so encode 13 bits
Step 4: Divide by 91
2376 ÷ 91 = 26 remainder 10
r = 10 → alphabet[10] = 'K'
q = 26 → alphabet[26] = 'a'
Step 5: Handle remaining bits
b = 26952 >> 13 = 3
n = 3
3 < 91, output alphabet[3] = 'D'
Final encoded result: KaD
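The same trace can be reproduced in Python. This is a sketch that hard-codes the two-byte input and follows the five steps above one by one; it is not a general encoder, and the variable names simply mirror the walkthrough.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)

data = b"Hi"
b = data[0] | (data[1] << 8)   # 26952: bytes enter the buffer LSB first
n = 16

v = b & 8191                   # low 13 bits
assert v > 88                  # so this pair carries 13 bits
q, r = divmod(v, 91)
out = ALPHABET[r] + ALPHABET[q]
b >>= 13
n -= 13

# Tail: 3 bits are left holding the value 3, which fits in one character.
out += ALPHABET[b % 91]
if n > 7 or b > 90:
    out += ALPHABET[b // 91]

print(v, q, r, b, n)  # 2376 26 10 3 3
print(out)            # KaD
```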
2.4 Decoding Process
Decoding is the reverse of encoding:
- Map each encoded character back to its index in the alphabet via a lookup table.
- Read 2 values at a time: r and q, and compute v = q × 91 + r.
- If (v & 8191) > 88, the pair represents a 13-bit chunk; otherwise it represents a 14-bit chunk.
- If only 1 character remains at the end, its index value is placed above any bits still pending in the buffer, and one final byte is emitted.
- Convert the bit stream back to bytes, 8 bits at a time.
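As an illustration, the sketch below decodes the three-character string KaD by hand, following the steps above. It is hard-coded for one pair plus one trailing character and is not meant as a general decoder.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)
index = {ch: i for i, ch in enumerate(ALPHABET)}

r, q, last = (index[ch] for ch in "KaD")   # 10, 26, 3

out = bytearray()
v = q * 91 + r                             # 26 * 91 + 10 = 2376
b = v
n = 13 if (v & 8191) > 88 else 14          # 2376 > 88, so 13 bits
while n >= 8:
    out.append(b & 0xFF)                   # 72, the byte for 'H'
    b >>= 8
    n -= 8

# Trailing single character: place it above the 5 pending bits, emit one byte.
b |= last << n                             # 9 | (3 << 5) = 105
out.append(b & 0xFF)                       # 105, the byte for 'i'

print(bytes(out))  # b'Hi'
```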
3. Detailed Comparison with Other Encodings
3.1 Base91 vs Base64
| Feature | Base64 | Base91 |
|---|---|---|
| Alphabet size | 64 | 91 |
| Encoding overhead | 33% | ≈23% |
| Encoding strategy | Fixed (3→4) | Variable (13/14 bits → 2 chars) |
| Standardization | RFC 4648 | No formal RFC |
| Library support | Extremely broad | Limited |
| URL-safe variant | Base64url | None |
| Native browser support | Yes (btoa/atob) | No |
Encoded size comparison (1024 bytes of random data):
Raw data: 1024 bytes
Base64: 1368 characters (33.6% overhead)
Base91: ≈1259 characters (≈22.9% overhead)
Savings: approximately 8%
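A comparison like this can be reproduced empirically. The sketch below uses Python's standard base64 module and a small hypothetical helper, base91_length, that only counts output characters using the 13/14-bit rule from section 2.1 instead of building the string. The exact Base91 length depends on the random bytes, so no fixed output is shown for it.

```python
import base64
import random

def base91_length(data: bytes) -> int:
    """Count the characters a Base91 encoder would emit for data."""
    b = n = count = 0
    for byte in data:
        b |= byte << n
        n += 8
        if n > 13:
            # 13 bits if the low 13 bits exceed 88, otherwise 14 bits.
            bits = 13 if (b & 8191) > 88 else 14
            b >>= bits
            n -= bits
            count += 2
    if n > 0:
        count += 2 if (n > 7 or b > 90) else 1
    return count

rng = random.Random(0)                      # fixed seed for repeatability
data = bytes(rng.randrange(256) for _ in range(1024))

b64_len = len(base64.b64encode(data))       # 1368 for any 1024-byte input
b91_len = base91_length(data)
print(b64_len, b91_len, f"{1 - b91_len / b64_len:.1%} saved")
```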
3.2 Base91 vs Base85
| Feature | Base85 | Base91 |
|---|---|---|
| Alphabet size | 85 | 91 |
| Encoding overhead | 25% | ≈23% |
| Encoding strategy | Fixed (4→5) | Variable length |
| Multiple variants | Yes (Ascii85, Z85, etc.) | Essentially one |
| Primary applications | PDF, Git | Data transfer, embedded systems |
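The efficiency gain of Base91 over Base85 is modest (about 2 percentage points of overhead), but Base91’s overhead drops further on inputs where many 13-bit windows have a value of 88 or less, such as runs of zero bytes, because those output pairs carry 14 bits instead of 13.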
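The range of Base91's overhead follows directly from the number of bits each output pair carries. The short calculation below compares the two extremes with Base85's fixed 4-to-5 mapping, counting every output character as 8 bits. It ignores the few tail characters at the end of a message, so treat the numbers as limits for long inputs.

```python
# Overhead = output bits / input bits - 1, with 8 bits per output character.
base85 = (5 * 8) / 32 - 1            # 4 bytes -> 5 characters
base91_worst = (2 * 8) / 13 - 1      # every pair carries 13 bits
base91_best = (2 * 8) / 14 - 1       # every pair carries 14 bits

print(f"Base85:          {base85:.1%}")        # 25.0%
print(f"Base91 (13-bit): {base91_worst:.1%}")  # 23.1%
print(f"Base91 (14-bit): {base91_best:.1%}")   # 14.3%
```

Random data sits very close to the 13-bit figure, since only 89 of the 8192 possible 13-bit values trigger a 14-bit chunk.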
4. Programming Examples
4.1 JavaScript
// Base91 alphabet
const BASE91_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"';

// Build decode lookup table
const BASE91_DECODE_TABLE = new Array(256).fill(-1);
for (let i = 0; i < BASE91_ALPHABET.length; i++) {
  BASE91_DECODE_TABLE[BASE91_ALPHABET.charCodeAt(i)] = i;
}

// Base91 encode
function base91Encode(data) {
  const bytes = typeof data === 'string'
    ? new TextEncoder().encode(data)
    : new Uint8Array(data);
  let result = '';
  let n = 0; // Number of bits in the buffer
  let b = 0; // Bit buffer
  for (let i = 0; i < bytes.length; i++) {
    // Append byte to bit buffer (LSB first)
    b |= bytes[i] << n;
    n += 8;
    // When buffer has more than 13 bits, extract and encode
    if (n > 13) {
      let v = b & 8191; // Take low 13 bits
      if (v > 88) {
        b >>= 13;
        n -= 13;
      } else {
        // Low values 0..88 signal that this pair should carry 14 bits
        v = b & 16383;
        b >>= 14;
        n -= 14;
      }
      // Output two characters
      result += BASE91_ALPHABET[v % 91];
      result += BASE91_ALPHABET[Math.floor(v / 91)];
    }
  }
  // Handle remaining bits
  if (n > 0) {
    result += BASE91_ALPHABET[b % 91];
    if (n > 7 || b > 90) {
      result += BASE91_ALPHABET[Math.floor(b / 91)];
    }
  }
  return result;
}

// Base91 decode
function base91Decode(str) {
  const bytes = [];
  let b = 0; // Bit buffer
  let n = 0; // Number of bits in buffer
  let v = -1; // Current value being assembled
  for (let i = 0; i < str.length; i++) {
    const d = BASE91_DECODE_TABLE[str.charCodeAt(i)];
    // Skip invalid characters; code units above 255 are not in the table at all
    if (d === undefined || d === -1) continue;
    if (v === -1) {
      v = d; // Read first character (remainder)
    } else {
      v += d * 91; // Read second character (quotient), rebuild v
      b |= v << n;
      // Determine how many bits this value carries
      n += (v & 8191) > 88 ? 13 : 14;
      // Extract complete bytes from buffer
      while (n >= 8) {
        bytes.push(b & 0xFF);
        b >>= 8;
        n -= 8;
      }
      v = -1;
    }
  }
  // Handle the last unpaired character
  if (v !== -1) {
    b |= v << n;
    n += 8; // Flush the trailing single character as one final byte
    while (n >= 8) {
      bytes.push(b & 0xFF);
      b >>= 8;
      n -= 8;
    }
  }
  return new Uint8Array(bytes);
}

// Usage example
const encoded = base91Encode('Hello, World!');
console.log(encoded);
// >OwJh>}AQ;r@@Y?F
const decoded = new TextDecoder().decode(base91Decode(encoded));
console.log(decoded);
// Hello, World!
4.2 Python
# Base91 alphabet
BASE91_ALPHABET = (
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    'abcdefghijklmnopqrstuvwxyz'
    '0123456789'
    '!#$%&()*+,./:;<=>?@[]^_`{|}~"'
)

# Build decode table
BASE91_DECODE_TABLE = {ch: i for i, ch in enumerate(BASE91_ALPHABET)}


def base91_encode(data: bytes) -> str:
    """Encode bytes to a Base91 string"""
    result = []
    n = 0  # Number of bits in buffer
    b = 0  # Bit buffer
    for byte in data:
        b |= byte << n
        n += 8
        if n > 13:
            v = b & 8191  # Take low 13 bits
            if v > 88:
                b >>= 13
                n -= 13
            else:
                v = b & 16383  # Low values 0..88 mean this chunk uses 14 bits
                b >>= 14
                n -= 14
            result.append(BASE91_ALPHABET[v % 91])
            result.append(BASE91_ALPHABET[v // 91])
    # Handle remaining bits
    if n > 0:
        result.append(BASE91_ALPHABET[b % 91])
        if n > 7 or b > 90:
            result.append(BASE91_ALPHABET[b // 91])
    return ''.join(result)


def base91_decode(encoded: str) -> bytes:
    """Decode a Base91 string to bytes"""
    result = []
    b = 0
    n = 0
    v = -1
    for ch in encoded:
        d = BASE91_DECODE_TABLE.get(ch, -1)
        if d == -1:
            continue  # Skip characters outside the alphabet
        if v == -1:
            v = d  # First character of a pair (remainder)
        else:
            v += d * 91  # Second character (quotient), rebuild v
            b |= v << n
            n += 13 if (v & 8191) > 88 else 14
            while n >= 8:
                result.append(b & 0xFF)
                b >>= 8
                n -= 8
            v = -1
    # Handle the last unpaired character
    if v != -1:
        b |= v << n
        n += 8
        while n >= 8:
            result.append(b & 0xFF)
            b >>= 8
            n -= 8
    return bytes(result)


# Usage example
data = b"Hello, World!"
encoded = base91_encode(data)
print(f"Encoded: {encoded}")
# Encoded: >OwJh>}AQ;r@@Y?F
decoded = base91_decode(encoded)
print(f"Decoded: {decoded}")
# Decoded: b'Hello, World!'
5. Real-World Applications
5.1 Data Embedding and Transmission
When binary data must be transmitted through text-only channels, Base91 saves approximately 8% compared to Base64. For large-scale data transfers, this difference adds up to significant savings:
| Raw Data Size | Base64 Encoded | Base91 Encoded | Savings |
|---|---|---|---|
| 1 KB | 1.33 KB | 1.23 KB | ~8% |
| 100 KB | 133 KB | 123 KB | ~8% |
| 1 MB | 1.33 MB | 1.23 MB | ~8% |
| 10 MB | 13.3 MB | 12.3 MB | ~8% |
5.2 Embedded Systems
In storage- and bandwidth-constrained embedded systems, every byte matters. Base91’s high encoding efficiency makes it an ideal choice for firmware updates, configuration data transfers, and other scenarios where minimizing payload size is critical.
5.3 Logging and Debugging
When embedding binary data in log files (such as error dumps or core dump summaries), Base91 can represent more data with shorter text, reducing log file bloat.
5.4 Email and Messaging
In messaging systems that don’t support MIME, Base91 can serve as a more efficient alternative to Base64. However, since Base91 is not part of any email standard, compatibility requires additional consideration.
6. Advantages and Limitations
6.1 Advantages
- Best-in-class encoding efficiency: Among printable ASCII encoding schemes, Base91 has nearly the lowest overhead
- Simple algorithm: Despite being variable-length, the implementation is relatively straightforward
- Stream processing: Supports byte-by-byte input, suitable for streaming encode/decode operations
- No padding characters: Unlike Base64, which requires = padding, Base91 output is compact
6.2 Limitations
- No formal standard: No RFC or international standard exists for Base91
- Limited library support: Very few programming languages natively support Base91; most require third-party libraries
- Character set concerns: Contains " and other special characters that may need escaping in JSON, XML, and similar formats
- Less universal than Base64: In mainstream areas like web development and email, Base64 remains the de facto standard
- Variable-length complexity: Compared to Base64’s fixed 3:4 mapping, Base91’s variable-length nature makes manual calculation and debugging more difficult
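The escaping concern is easy to see with Python's standard json and html modules. The sample string below is made up for illustration: it is simply a handful of characters from the Base91 alphabet, not the encoding of any particular data.

```python
import html
import json

sample = 'ab"<&>cd'                 # arbitrary Base91 alphabet characters

as_json = json.dumps(sample)                    # JSON string literal
as_xml_text = html.escape(sample, quote=True)   # safe for XML/HTML text and attributes

print(as_json)        # "ab\"<&>cd"
print(as_xml_text)    # ab&quot;&lt;&amp;&gt;cd
print(len(sample), len(as_json) - 2, len(as_xml_text))  # 8 9 23
```

Each escaped character eats into the space saved over Base64, so the real gain depends on the container format as well as the encoding.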
7. Frequently Asked Questions
Is there a difference between Base91 and basE91?
No. basE91 is the original name of the encoding scheme (as named by its author, Joachim Henke), where the capital E is a stylistic choice. Base91 is the more commonly used simplified name. Both refer to the same encoding scheme.
Is Base91 suitable for URLs?
Not really. Base91’s character set includes #, %, &, ?, /, and other characters that have special meanings in URLs. For encoding data in URLs, Base64url encoding is recommended instead.
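To see the cost, the sketch below percent-encodes the Base91 string for "Hello, World!" from section 4 using urllib.parse.quote from Python's standard library. Passing safe="" is a choice made here so that every reserved character is escaped, as would be needed inside a query parameter value.

```python
from urllib.parse import quote, unquote

encoded = ">OwJh>}AQ;r@@Y?F"        # Base91 for "Hello, World!" from section 4
in_url = quote(encoded, safe="")    # escape everything except unreserved characters

print(in_url)                       # %3EOwJh%3E%7DAQ%3Br%40%40Y%3FF
print(len(encoded), len(in_url))    # 16 30
```

Seven of the sixteen characters triple in size, which more than cancels the advantage over Base64url for this input.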
Is there anything more efficient than Base91?
In theory, a Base95 encoding using all 95 printable ASCII characters could further reduce overhead, but at the cost of handling spaces, backslashes, and other highly problematic characters. Base91 represents an excellent balance between efficiency and practicality.
Is Base91 encoding the same as encryption?
No. Like all Base encodings, Base91 is purely an encoding scheme and provides no security whatsoever. Anyone can easily decode Base91 data. If you need to protect data, encrypt it first (e.g., using AES), then apply Base91 encoding.
Why does Base91 use variable-length encoding instead of fixed grouping?
This is the result of efficiency optimization. Since 91² = 8281 and 2¹³ = 8192, 2 Base91 characters can cover all 13-bit values plus 89 extra states. Base91 uses those extra states by treating low 13-bit values 0..88 as a signal to consume 14 bits instead, producing values in 0..88 or 8192..8280. That is how it squeezes one extra bit into some output pairs: about 1% of them for random data, and more for inputs with many low-valued 13-bit windows.
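The gain can be put in numbers by comparing three quantities: the information two base-91 symbols could carry in theory, what a fixed-width scheme would be limited to, and what the 13/14-bit rule averages on uniformly random input. The calculation below is a sketch of that comparison. The random-input assumption matters, because other inputs shift the average.

```python
import math

ideal_bits_per_pair = 2 * math.log2(91)   # information capacity of two symbols
fixed_bits_per_pair = 13                  # a fixed-width scheme must round down
avg_bits_per_pair = 13 + 89 / 8192        # 13/14-bit rule on uniformly random input

print(f"{ideal_bits_per_pair:.4f}")  # 13.0156
print(f"{avg_bits_per_pair:.4f}")    # 13.0109
```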
8. Conclusion
Base91 is a highly efficient binary-to-text encoding scheme that achieves near-optimal encoding efficiency within the constraint of 91 printable ASCII characters, thanks to its carefully designed variable-length encoding strategy.
| Encoding | Overhead | Best For |
|---|---|---|
| Hexadecimal | 100% | Debugging, low-level data inspection |
| Base32 | 60% | Case-insensitive contexts |
| Base64 | 33% | Web APIs, email, general purpose |
| Base85 | 25% | PDF, Git, ZeroMQ |
| Base91 | ≈23% | Bandwidth-sensitive, storage-constrained scenarios |
While Base91 doesn’t match Base64 in universality and ecosystem support, it’s an excellent choice when minimizing encoding overhead is a priority.
Want to try Base91 encoding and decoding yourself? Use our Online Base91 Encoder/Decoder for quick Base91 conversions.