Overview
Test Data: 4,916 production agent metadata records from the 8004scan database
Test Summary
Dataset Distribution
| Size Range | Count | Percentage | Average Size |
|---|---|---|---|
| < 1KB | 2,872 | 58.4% | 0.51 KB |
| 1-2KB | 2,015 | 41.0% | 1.19 KB |
| 2-3KB | 13 | 0.3% | 2.26 KB |
| 3-5KB | 15 | 0.3% | 3.76 KB |
| > 5KB | 1 | 0.02% | 5.76 KB |
Key Insight: 99.4% of real agents use metadata < 2KB.
Small Metadata (< 2KB) - 99.4% of Agents
Test: 30 samples, average 0.73 KB (751 bytes), ~33,000 gas uncompressed
| Algorithm | Level | Compression Ratio | Gas Saved | Compression Speed | Decompression Speed |
|---|---|---|---|---|---|
| Brotli | 11 | 47.7% | ~6,180 gas (17.4%) | 2.62ms | 0.02ms |
| Brotli | 9 | 41.2% | ~5,327 gas (15.0%) | 1.43ms | 0.02ms |
| Zstd | 22 | 34.9% | ~4,570 gas (12.8%) | 0.15ms | 0.01ms |
| Zstd | 15 | 34.8% | ~4,561 gas (12.8%) | 0.11ms | 0.01ms |
| Zstd | 9 | 34.4% | ~4,533 gas (12.7%) | 0.06ms | 0.01ms |
| Gzip | 9 | 34.7% | ~4,613 gas (12.9%) | 0.05ms | 0.02ms |
| Gzip | 6 | 34.7% | ~4,613 gas (12.9%) | 0.05ms | 0.03ms |
| LZ4 | 12 | 14.5% | ~2,110 gas (5.8%) | 0.04ms | 0.00ms |
| LZ4 | 9 | 14.5% | ~2,110 gas (5.8%) | 0.03ms | 0.00ms |
Recommendations for Small Metadata
| Priority | Algorithm | Reason |
|---|---|---|
| 1st Choice | Zstd-15 | Best balance: 34.8% ratio, 0.11ms speed, excellent cross-platform support |
| 2nd Choice | Brotli-11 | Highest ratio (47.7%) but slower (2.62ms), good for static content |
| Speed Priority | LZ4-9 | Fastest (0.03ms) but lowest ratio (14.5%), only if speed critical |
Medium Metadata (2-5KB) - 0.6% of Agents
Test: 2 samples, average 3.11 KB (3,182 bytes), ~71,920 gas uncompressed
| Algorithm | Level | Compression Ratio | Gas Saved | Compression Speed | Decompression Speed |
|---|---|---|---|---|---|
| Brotli | 11 | 66.8% | ~35,288 gas (46.9%) | 6.91ms | 0.04ms |
| Brotli | 9 | 62.3% | ~33,144 gas (43.8%) | 3.75ms | 0.03ms |
| Zstd | 22 | 59.3% | ~31,672 gas (41.8%) | 1.23ms | 0.02ms |
| Zstd | 15 | 59.2% | ~31,576 gas (41.6%) | 0.68ms | 0.02ms |
| Zstd | 9 | 58.9% | ~31,472 gas (41.5%) | 0.20ms | 0.03ms |
| Gzip | 9 | 59.5% | ~31,704 gas (41.9%) | 0.10ms | 0.04ms |
| Gzip | 6 | 59.5% | ~31,704 gas (41.9%) | 0.12ms | 0.07ms |
| LZ4 | 12 | 45.2% | ~24,920 gas (32.2%) | 0.13ms | 0.01ms |
| LZ4 | 9 | 45.2% | ~24,912 gas (32.1%) | 0.07ms | 0.01ms |
Recommendations for Medium Metadata
| Priority | Algorithm | Reason |
|---|---|---|
| 1st Choice | Zstd-15 | Excellent ratio (59.2%), fast (0.68ms), production-ready |
| 2nd Choice | Brotli-11 | Best ratio (66.8%) but slower (6.91ms), worth it for rare large metadata |
| Speed Priority | Gzip-9 | Very fast (0.10ms), good ratio (59.5%), best compatibility |
Key Findings
1. Real Compression Ratios Lower Than Expected
Previous Estimates: 60-70% compression ratio
Actual Results:
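- Small metadata (<2KB): 35-48% compression ratio (excluding LZ4), well below the estimate
- Medium metadata (2-5KB): 59-67% compression ratio (excluding LZ4), close to the estimate
Reason: Real agent metadata is already fairly compact with minimal repetition, and the small payloads that make up 99.4% of the dataset give a compressor little redundancy to exploit.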
2. Gas Savings Still Worthwhile
Despite lower compression ratios, gas savings remain valuable:
- Small metadata (99.4% of agents): Save 4,000-6,000 gas per registration
- Medium metadata (0.6% of agents): Save 31,000-35,000 gas per registration
For platforms with 1,000+ agents, cumulative savings are significant.
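As a rough worked example of the cumulative figure, the sketch below multiplies the Zstd-15 per-registration savings from the tables by the observed size mix. It assumes every agent compresses its metadata and ignores the single > 5KB record; the function name is hypothetical.

```python
# Per-registration savings for Zstd-15, taken from the tables in this report.
SMALL_GAS_SAVED = 4_561     # metadata < 2KB
MEDIUM_GAS_SAVED = 31_576   # metadata 2-5KB

# Share of agents with small metadata, from the dataset distribution.
SMALL_SHARE = 0.994


def cumulative_gas_saved(agent_count: int) -> int:
    """Estimate total gas saved if every agent registers compressed metadata."""
    small_agents = round(agent_count * SMALL_SHARE)
    medium_agents = agent_count - small_agents  # everything else counted as medium
    return small_agents * SMALL_GAS_SAVED + medium_agents * MEDIUM_GAS_SAVED


print(cumulative_gas_saved(1_000))  # 4723090
```

Under these assumptions, 1,000 registrations save about 4.7 million gas in total.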
3. Zstd-15 is the Clear Winner
Why Zstd-15:
- ✅ Excellent compression ratio (35-59%)
- ✅ Fast speed (0.11-0.68ms)
- ✅ Cross-platform support (Python, Node.js, Rust, Go)
- ✅ Production-proven in many systems (Facebook, Linux kernel)
Brotli-11 Alternative:
- Better compression (48-67%) but 10-24x slower than Zstd-15 (2.62ms vs 0.11ms on small metadata, 6.91ms vs 0.68ms on medium)
- Good for static content, pre-computed compression
- Worse cross-platform support (native in browsers, libraries elsewhere)
4. LZ4 Not Recommended
LZ4 Results:
- Lowest compression ratio (14-45%)
- Marginal speed advantage (0.03ms vs 0.11ms for Zstd-15)
- Speed difference negligible for typical use cases
Conclusion: Zstd-15's superior compression ratio outweighs LZ4's minimal speed advantage.
Production Recommendations
Default Algorithm
Recommended: Zstd level 15
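```python
# Backend (Python)
import zstandard as zstd

# json_bytes: the agent metadata JSON, already encoded as UTF-8 bytes
compressor = zstd.ZstdCompressor(level=15)
compressed = compressor.compress(json_bytes)
```

```typescript
// Frontend (TypeScript)
// Note: Use gzip instead of zstd for browser compatibility
import { gzipSync } from "fflate";

// json_bytes: the agent metadata JSON as a Uint8Array.
// gzipSync returns the compressed bytes directly; fflate's compress()
// is the asynchronous variant and delivers its result through a callback.
const compressed = gzipSync(json_bytes, { level: 9 });
```

When to Use Compression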
| Metadata Size | Recommendation | Gas Saved |
|---|---|---|
| < 500 bytes | ❌ Don't compress | Minimal savings, overhead not worth it |
| 500-2000 bytes | ⚖️ Optional | ~2,000-6,000 gas |
| 2-5KB | ✅ Recommended | ~31,000-35,000 gas |
| > 5KB | ✅✅ Strongly recommended | 35,000+ gas |
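The thresholds in this table could be encoded as a small helper like the one below. This is only a sketch: the function name and return labels are hypothetical, and because the table leaves sizes between 2,000 bytes and 2KB unassigned, the sketch assumes they fall into the "recommended" tier.

```python
def compression_recommendation(size_bytes: int) -> str:
    """Map a metadata size in bytes to the recommendation tiers in the table."""
    if size_bytes < 500:
        return "skip"                    # savings too small to justify the overhead
    if size_bytes <= 2000:
        return "optional"
    if size_bytes <= 5 * 1024:           # assumption: 2,001 bytes up to 5KB
        return "recommended"
    return "strongly recommended"


print(compression_recommendation(751))    # optional
print(compression_recommendation(3182))   # recommended
```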
Implementation Checklist
- [x] Parser supports `enc=zstd` parameter in Data URI
- [x] Zip bomb protection (100KB decompression limit)
- [x] Algorithm whitelist (zstd, gzip, br, lz4 only)
- [x] Async decompression in Celery workers
- [ ] Frontend compression UI with gas savings preview
- [ ] Analytics dashboard tracking compression adoption
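The parser, whitelist, and zip bomb items could fit together roughly as sketched below. The Data URI layout (`data:<mime>;enc=<alg>;base64,<payload>`) and all function names are assumptions for illustration, not the actual 8004scan parser. Only gzip is implemented, using the Python standard library so the sketch runs without extra packages; zstd, br, and lz4 would each need their own bounded decompressor.

```python
import base64
import gzip
import zlib

MAX_DECOMPRESSED_BYTES = 100 * 1024                 # 100KB limit from the checklist
ALLOWED_ENCODINGS = {"zstd", "gzip", "br", "lz4"}   # whitelist from the checklist


def bounded_gunzip(payload: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Decompress gzip data, refusing to produce more than `limit` bytes."""
    decompressor = zlib.decompressobj(wbits=31)     # wbits=31 selects the gzip container
    # Request at most limit + 1 bytes so an oversized payload is detectable
    # without ever materializing the full output.
    output = decompressor.decompress(payload, limit + 1)
    if len(output) > limit:
        raise ValueError("decompressed metadata exceeds the size limit")
    if not decompressor.eof:
        raise ValueError("truncated or malformed gzip stream")
    return output


def decode_metadata_uri(uri: str) -> bytes:
    """Decode a URI of the assumed form data:<mime>;enc=<alg>;base64,<payload>."""
    header, _, payload = uri.partition(",")
    params = header.split(";")[1:]
    encoding = next((p[4:] for p in params if p.startswith("enc=")), None)
    raw = base64.b64decode(payload)                 # assumption: payload is always base64
    if encoding is None:
        return raw                                  # uncompressed metadata
    if encoding not in ALLOWED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    if encoding == "gzip":
        return bounded_gunzip(raw)
    raise NotImplementedError(f"{encoding} is whitelisted but not covered by this sketch")


blob = gzip.compress(b'{"name":"demo-agent"}')
uri = "data:application/json;enc=gzip;base64," + base64.b64encode(blob).decode("ascii")
print(decode_metadata_uri(uri))  # b'{"name":"demo-agent"}'
```

Passing `max_length` to the streaming decompressor caps memory use at the limit, which is the point of the protection: a payload that would expand past 100KB is rejected before it is fully inflated.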
Test Methodology
Data Source
- Database: 8004scan production PostgreSQL
- Table: `agents.metadata_json`
- Total Records: 4,916 agents
- Chains: Ethereum Sepolia + Base Sepolia
Test Process
- Fetch real metadata from database
- Test each algorithm at multiple compression levels
- Measure compression ratio, gas savings, speed
- Verify decompression correctness
- Aggregate statistics
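The process above can be sketched as a small harness. This is an illustration, not the script that produced the tables: it covers only gzip from the standard library, treats "compression ratio" as the percentage reduction in size, and aggregates over total bytes, all of which are assumptions. The function name and sample payload are hypothetical.

```python
import gzip
import time


def benchmark_gzip(samples: list[bytes], level: int) -> dict:
    """Compress each sample, verify the round trip, and aggregate the metrics."""
    total_original = total_compressed = 0
    compress_seconds = decompress_seconds = 0.0
    for data in samples:
        start = time.perf_counter()
        compressed = gzip.compress(data, compresslevel=level)
        compress_seconds += time.perf_counter() - start

        start = time.perf_counter()
        restored = gzip.decompress(compressed)
        decompress_seconds += time.perf_counter() - start

        if restored != data:                         # verify decompression correctness
            raise AssertionError("round trip mismatch")
        total_original += len(data)
        total_compressed += len(compressed)

    count = len(samples)
    return {
        "level": level,
        "ratio_pct": 100 * (1 - total_compressed / total_original),
        "gas_saved_avg": 16 * (total_original - total_compressed) / count,
        "compress_ms_avg": 1000 * compress_seconds / count,
        "decompress_ms_avg": 1000 * decompress_seconds / count,
    }


# Hypothetical sample; the real test used metadata fetched from the database.
sample = b'{"name":"demo","description":"' + b"agent " * 50 + b'"}'
for level in (6, 9):
    print(benchmark_gzip([sample], level))
```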
Gas Calculation Formula
```text
Gas = (data_size_bytes × 16) + 21,000
```
- 16 gas/byte: EVM calldata cost for a non-zero byte (zero bytes cost 4 gas, so the formula is an upper bound)
- 21,000 gas: Base transaction cost
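A minimal sketch of the formula in Python, applied to the 751-byte small-metadata average from this report. The function names and the 490-byte compressed size are hypothetical, chosen only to show the arithmetic.

```python
CALLDATA_GAS_PER_BYTE = 16   # per-byte calldata cost used by the formula
BASE_TX_GAS = 21_000         # base transaction cost


def registration_gas(data_size_bytes: int) -> int:
    """Estimate gas as (data_size_bytes * 16) + 21,000."""
    return data_size_bytes * CALLDATA_GAS_PER_BYTE + BASE_TX_GAS


def gas_saved(original_bytes: int, compressed_bytes: int) -> int:
    """Gas saved by sending the compressed payload instead of the original."""
    return registration_gas(original_bytes) - registration_gas(compressed_bytes)


print(registration_gas(751))   # 33016
print(gas_saved(751, 490))     # 4176
```

Because the base cost cancels out, the saving depends only on the number of bytes removed: each byte saved is worth 16 gas under this formula.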
Conclusion
TLDR:
- Real compression ratios for small metadata (35-48%) are lower than theoretical estimates (60-70%); medium metadata (59-67%) comes close to them
- Gas savings (4,000-35,000 per agent) are still worthwhile for production
- Recommended: Zstd-15 for best balance of compression and speed
- Alternative: Brotli-11 for maximum compression (rare large metadata)
- 99.4% of agents use small metadata (<2KB) and save ~4,500 gas each with Zstd-15
Next Steps:
- Update documentation with real test data ✅
- Set default compression to Zstd-15 in backend
- Add frontend compression UI with gas preview
- Track compression adoption metrics