JSON vs YAML vs TOML vs XML: A Developer's Decision Guide for 2026
A team I worked with spent four hours debugging a Kubernetes deployment that kept rolling back on staging but worked on the developer's laptop. The config looked identical. Every YAML validator said it was fine. The logs said nothing useful. The bug turned out to be a country code in a values file: country: NO. YAML 1.1 parsed that as the boolean false. The deployment controller interpreted the resulting false as "disable this feature" and rolled back. Four hours of engineering time for the string "NO." That's the kind of bug you get when your data format is trying to help you.
JSON, YAML, TOML, and XML all store structured data. They look similar from a distance. They behave very differently in practice. Picking the wrong one for the job costs time at best and correctness at worst. This guide walks through where each format wins, where each breaks, and how to convert between them without losing data.
The Quick Decision Matrix
If you only remember one thing from this article, remember this table:
| Use Case | Best Format | Why |
|---|---|---|
| Web API payload | JSON | Universal, fast, natively supported |
| Kubernetes / Helm / CI configs | YAML | Ecosystem lock-in (but know the gotchas) |
| Application config (Rust/Python/Go) | TOML | Safe, readable, comments allowed |
| Document markup (SVG, DOCX) | XML | Mixed content, rich attributes |
| Tabular data import/export | CSV | Spreadsheet-native, simple |
| Enterprise/government messaging | XML | Schema validation, namespaces |
| Database serialization | JSON | Native JSON/JSONB columns in Postgres, JSON-like documents in MongoDB, etc. |
| Infrastructure as Code | HCL or YAML | Depends on tool (Terraform vs Ansible) |
| LLM prompts & structured outputs | JSON | Native JSON mode in all major APIs |
| Human-edited settings file | TOML or JSONC | Readable, comments supported |
The rest of this guide explains why.
JSON: The Default Choice
JSON won the data format wars for web APIs sometime around 2012 and has held the crown for over a decade. In 2026 it's the default assumption for every new API, most databases (MongoDB, DynamoDB, CouchDB, Postgres JSONB), most configuration formats that don't have another reason to be different, and every LLM structured output mode.
What JSON gets right:
- Universal parsing. Nearly every mainstream language ships a JSON parser in its standard library or has a de facto standard one. Every text editor syntax-highlights it. Every developer can read it.
- Fast. JSON parsing consistently beats YAML and XML by 2-10x in benchmarks across Node, Python, Go, and Java.
- Strict typing. `42` is always a number. `"42"` is always a string. `true` is always a boolean. No ambiguity.
- Simple grammar. Six value types (string, number, boolean, null, object, array). No advanced features to trip over.
What JSON gets wrong:
- No comments. The original spec forbids them. This is JSON's single biggest limitation for configuration files. Workarounds: JSONC (JSON with comments, supported in VS Code settings) or JSON5 (looser variant with comments and trailing commas).
- Trailing commas forbidden. Makes editing arrays painful. Again JSON5 fixes this.
- Verbose for human editing. Every string needs quotes. Every key needs quotes. Every element needs a comma.
- No native date type. You store dates as ISO 8601 strings and parse them manually.
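Here is a minimal sketch of those trade-offs using Python's standard `json` module. The sample payload and the helper name `is_strict_json` are made up for illustration. It shows that types come back exactly as written, that comments and trailing commas are rejected, and that a date arrives as a plain string you parse yourself.

```python
import json
from datetime import datetime

payload = (
    '{"count": 42, "code": "42", "active": true, '
    '"last_login": "2026-04-17T14:32:00Z"}'
)
record = json.loads(payload)

# Strict typing: the parser never guesses, so 42 and "42" stay distinct.
types = {key: type(value).__name__ for key, value in record.items()}


def is_strict_json(text: str) -> bool:
    """Return True if `text` parses as standard JSON (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# No native date type: the value is a string until you convert it.
# Swapping "Z" for "+00:00" keeps this working on Python versions before 3.11.
last_login = datetime.fromisoformat(record["last_login"].replace("Z", "+00:00"))
```

With this in place, `is_strict_json('[1, 2, 3,]')` and `is_strict_json('{"a": 1} // note')` are both false, which is exactly the friction JSONC and JSON5 exist to remove.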
JSON syntax in four examples:
// Object
{
  "name": "Alice",
  "age": 32,
  "email": "alice@example.com"
}
// Array
[1, 2, 3, 4, 5]
// Nested
{
  "user": {
    "id": 42,
    "roles": ["admin", "editor"],
    "active": true,
    "last_login": "2026-04-17T14:32:00Z"
  }
}
// Arrays of objects
[
  { "id": 1, "name": "Alice" },
  { "id": 2, "name": "Bob" }
]
Everyday JSON work gets easier with a JSON formatter for pretty-printing, a JSON validator for catching syntax errors, and a JSON minifier for shrinking payloads for production. To see how JSON maps to other structures, try JSON to TypeScript, JSON to SQL, or JSON to Go.
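If you would rather script those three chores than paste into a web tool, they fit in a few lines of Python's standard `json` module. This is a rough sketch, not a replacement for a full formatter; the function names `pretty`, `minify`, and `validate` are placeholders chosen for this example.

```python
import json
from typing import Optional


def pretty(text: str) -> str:
    """Re-serialize JSON with two-space indentation."""
    return json.dumps(json.loads(text), indent=2, ensure_ascii=False)


def minify(text: str) -> str:
    """Re-serialize JSON with no optional whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)


def validate(text: str) -> Optional[str]:
    """Return None for valid JSON, otherwise a 'line, column: message' string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

One caveat worth knowing: by default Python's parser also accepts `NaN` and `Infinity`, which are not standard JSON, so treat `validate` as a syntax check rather than a strict conformance test.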
YAML: The Configuration Specialist (With Sharp Edges)
YAML dominates one specific territory: DevOps and cloud-native tooling. Kubernetes manifests, Docker Compose, GitHub Actions, GitLab CI, Ansible playbooks, Helm charts, CircleCI, AWS SAM, Serverless Framework. If your infrastructure tool expects a config file, it probably wants YAML.
What YAML gets right:
- Human-readable. Indentation-based structure reads like outlined prose.
- Comments supported. Unlike JSON.
- Multi-line strings. Block scalars (`|` and `>`) handle long text cleanly.
- Anchors and aliases. Reuse blocks within a file without duplicating them.
What YAML gets catastrophically wrong:
- Implicit type coercion (the "Norway problem"). Unquoted strings can be silently cast. `NO` becomes `false` in YAML 1.1. `yes`, `on`, `off` all parse as booleans.
- Ambiguous numeric parsing. `1.10` in a version field becomes the number `1.1`, losing the trailing zero.
- Time parsing surprises. `12:30` becomes the integer `750` in YAML 1.1 (base-60 interpretation: 12 × 60 + 30).
- Significant whitespace. Tabs are not allowed for indentation, and a miscounted indent can silently produce a different structure.
- Slow. YAML parsing is 3-10x slower than JSON in most languages.
- Complex spec. The YAML 1.2 specification is 80 pages. Most implementations subset it inconsistently.
The safety rule: always quote strings that could be misinterpreted as booleans, numbers, dates, or special values. If your YAML has a string that looks like true, false, yes, no, on, off, null, a country or region code (NO for Norway, ON for Ontario), a version number, or a timestamp, wrap it in quotes.
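To make the quoting rule checkable, here is a standard-library-only Python sketch that guesses how a YAML 1.1 parser would type an unquoted scalar. It is a deliberately simplified model, not a YAML parser: the patterns are trimmed versions of the YAML 1.1 rules (no hex, octal, or `.inf`), and the names `yaml11_guess`, `needs_quotes`, and `base60_to_int` are invented for this example.

```python
import re

# Boolean spellings listed by YAML 1.1: lower, Title, and UPPER case of each word.
# Some parsers skip the single-letter y/n forms.
_BOOL_WORDS = ("y", "n", "yes", "no", "true", "false", "on", "off")
YAML11_BOOLS = {form for w in _BOOL_WORDS for form in (w, w.capitalize(), w.upper())}
YAML11_NULLS = {"", "~", "null", "Null", "NULL"}

# Simplified patterns, tried in this order.
_PATTERNS = (
    ("int", re.compile(r"[-+]?(0|[1-9][0-9_]*)")),
    ("int", re.compile(r"[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+")),  # base 60, e.g. 12:30
    ("float", re.compile(r"[-+]?[0-9][0-9_]*\.[0-9]*([eE][-+]?[0-9]+)?")),
    ("timestamp", re.compile(r"[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}([Tt ].+)?")),
)


def yaml11_guess(value: str) -> str:
    """Guess the type YAML 1.1 would assign to an unquoted scalar."""
    if value in YAML11_NULLS:
        return "null"
    if value in YAML11_BOOLS:
        return "bool"
    for kind, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return kind
    return "str"


def needs_quotes(value: str) -> bool:
    """True if the string should be quoted to stay a string."""
    return yaml11_guess(value) != "str"


def base60_to_int(value: str) -> int:
    """Integer a YAML 1.1 parser produces for an unsigned base-60 scalar."""
    total = 0
    for part in value.replace("_", "").split(":"):
        total = total * 60 + int(part)
    return total
```

Under this model `yaml11_guess("NO")` is `"bool"`, `yaml11_guess("1.10")` is `"float"`, and `base60_to_int("12:30")` is `750`, while `"1.2.3"` and `"prod.example.com"` are left alone. A check like `needs_quotes` could run in a linter or pre-commit hook over values you template into YAML.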
YAML syntax essentials:
# Comment
name: Alice
age: 32
email: alice@example.com
# Nested object
user:
  id: 42
  roles:
    - admin
    - editor
  active: true
  last_login: "2026-04-17T14:32:00Z"  # quote dates!
# Array of objects
users:
  - id: 1
    name: Alice
  - id: 2
    name: Bob
# Multi-line string
description: |
  This is a multi-line
  string that preserves
  newlines exactly.
# Anchor and alias
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults  # inherit defaults
  host: prod.example.com
Validate before you deploy. A YAML validator and YAML to JSON converter can round-trip your config and highlight surprises before production does.
TOML: The Configuration Language That Actually Wants to Help
TOML (Tom's Obvious, Minimal Language) was designed specifically to fix YAML's worst sins. If you're writing a human-edited configuration file for a new project, TOML is the 2026 default.
What TOML gets right:
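- Explicit types. Strings must be quoted, so `"NO"` is always a string and an unquoted `NO` is a parse error rather than a silent boolean. `42` is always an integer. No coercion surprises.
- Comments supported.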
- No significant whitespace. Indentation is optional and cosmetic.
- Native date/time types. Includes local date, local time, and offset datetime as first-class values.
- Reads cleanly. INI-file inspired layout that non-developers can understand.
- Simple spec. TOML's specification fits in a single page.
What TOML gets wrong:
- Deep nesting is awkward. Multiple levels of tables get verbose.
- No anchors/aliases. You can't reuse blocks like YAML allows.
- Still gaining adoption. Parser quality varies by language.
TOML example:
# Simple values at the top
title = "My Config"
debug = false
port = 8080
# Native date/time (top-level keys must come before the first table header)
created_at = 2026-04-17T14:32:00Z
local_date = 2026-04-17
local_time = 14:32:00
# Table (nested object)
[database]
host = "localhost"
port = 5432
user = "admin"
ssl = true
# Nested table
[database.connection_pool]
min_size = 5
max_size = 50
timeout_ms = 3000
# Array of tables
[[users]]
name = "Alice"
role = "admin"
[[users]]
name = "Bob"
role = "editor"
Where TOML shines today: Rust (every Cargo.toml), Python (pyproject.toml for packaging), Hugo (static site config), and many CLI tools (starship, Helix, Ruff). If you're picking a config format for a new project in 2026, start with TOML and only move to YAML if you need ecosystem compatibility. Use a TOML to JSON converter when you need to interoperate with JSON-based tools.
XML: Alive and Well in Specific Domains
XML is dead for new web APIs but very much alive everywhere else. If you work in finance, healthcare, government, legal, or with Microsoft Office document formats, you need XML.
Where XML is mandatory in 2026:
- Financial messaging: ISO 20022, FIXML (the XML encoding of the FIX protocol), SWIFT MX.
- Healthcare: HL7 CDA, FHIR (though FHIR also supports JSON).
- Government & legal: XBRL for financial filings, LegalRuleML, court filing systems.
- SOAP web services: Still prevalent in enterprise B2B integrations.
- Office documents: DOCX, XLSX, PPTX are all ZIP archives containing XML.
- SVG graphics: Vector graphics are XML.
- RSS/Atom feeds: Still the standard for syndication.
- Build and project files: Maven (pom.xml), .NET csproj, Android manifests.
What XML gets right:
- Rich attributes on elements. Separates metadata from content cleanly.
- Namespaces. Safely combine vocabularies from different standards.
- Schema validation. XSD and RelaxNG let you enforce complex constraints.
- Mature tooling. XPath, XSLT, and XQuery provide powerful query and transformation.
- Mixed content. Text and elements interleaved cleanly (like HTML).
What XML gets wrong:
- Verbose. Every element has an opening and closing tag.
- Slow to parse. DOM parsers build large in-memory trees; SAX parsers are fast but awkward.
- Complexity. Namespaces, DTDs, entities, CDATA, processing instructions. Easy to write; hard to write well.
- Entity expansion vulnerabilities. XXE attacks in misconfigured parsers.
Example:
<?xml version="1.0" encoding="UTF-8"?>
<user id="42" active="true">
  <name>Alice</name>
  <email>alice@example.com</email>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
  <last_login>2026-04-17T14:32:00Z</last_login>
</user>
When converting between XML and JSON, the semantic mismatch can bite: attributes vs elements, mixed content, repeated elements becoming arrays. Use an XML to JSON converter and JSON to XML converter with awareness that the mapping is lossy in both directions.
Size and Performance Comparison
Same data, four formats. Approximate file sizes and parsing times:
| Format | Bytes (simple) | Bytes (nested) | Parse time (Node.js) | Write time |
|---|---|---|---|---|
| JSON | 87 | 412 | 1.0x (baseline) | 1.0x |
| JSON (minified) | 65 | 298 | 1.0x | 1.0x |
| YAML | 78 | 385 | 3.8x | 4.2x |
| TOML | 92 | 438 | 1.5x | 1.7x |
| XML | 142 | 672 | 2.5x | 2.3x |
The practical implication: for high-throughput systems (millions of messages per second), JSON is the only reasonable choice among human-readable formats. For lower throughput where human editing matters, any of the four works. For sub-millisecond requirements, skip all of these and use binary formats (Protocol Buffers, MessagePack, CBOR, Avro).
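Numbers like these vary with the parser, the runtime, and the shape of the data, so it is worth measuring on your own payloads. The sketch below is one rough way to do that for JSON and XML using only Python's standard library; the sample record and the XML mapping are assumptions made for this example, and YAML or TOML would need their own libraries to join the comparison.

```python
import json
import timeit
import xml.etree.ElementTree as ET

record = {"id": 42, "name": "Alice", "roles": ["admin", "editor"], "active": True}

json_pretty = json.dumps(record, indent=2)
json_min = json.dumps(record, separators=(",", ":"))

# Hand-built XML for the same record (one possible mapping among many).
root = ET.Element("user", id=str(record["id"]), active=str(record["active"]).lower())
ET.SubElement(root, "name").text = record["name"]
roles = ET.SubElement(root, "roles")
for role in record["roles"]:
    ET.SubElement(roles, "role").text = role
xml_text = ET.tostring(root, encoding="unicode")

# Size of each encoding in UTF-8 bytes.
sizes = {
    "json": len(json_pretty.encode("utf-8")),
    "json_min": len(json_min.encode("utf-8")),
    "xml": len(xml_text.encode("utf-8")),
}


def seconds_per_parse(parse, text, runs=20_000):
    """Average wall-clock seconds for one call to `parse(text)`."""
    return timeit.timeit(lambda: parse(text), number=runs) / runs


timings = {
    "json": seconds_per_parse(json.loads, json_min),
    "xml": seconds_per_parse(ET.fromstring, xml_text),
}
```

Print `sizes` and `timings` on the machine you care about, and treat the ratio rather than the absolute figures as the useful result.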
Conversion Pitfalls
Converting between these formats loses data in predictable ways. Know the pitfalls:
JSON ↔ YAML
- JSON comments (if using JSONC) are stripped when converting to YAML.
- YAML anchors/aliases don't exist in JSON. They expand to duplicated data.
- YAML multi-line strings become single-line strings with `\n` escapes in JSON.
- YAML's implicit typing can corrupt data when going JSON to YAML and back.
JSON ↔ XML
- XML attributes don't map cleanly to JSON. Common conventions: `@attr` prefix or a separate `attributes` object.
- XML mixed content (text + child elements) has no natural JSON equivalent.
- Repeated XML elements become arrays; single elements become objects, so the same field can change shape from one document to the next. The relative order of differently named sibling elements may not survive.
- XML namespaces are usually lost in JSON conversion.
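The array-versus-object problem is easiest to see in code. Below is a naive XML-to-dict mapping built on Python's `xml.etree.ElementTree`, using the `@attr` convention mentioned above plus a `#text` key for element text. Both conventions and the function name `element_to_dict` are choices made for this sketch, not a standard, and mixed content is not handled.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Naive mapping: attributes become '@name' keys, repeated tags become lists."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Second occurrence of a tag: promote the existing value to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if text:
        if not node:
            return text  # leaf element with no attributes: plain string
        node["#text"] = text
    return node


one_role = element_to_dict(ET.fromstring("<roles><role>admin</role></roles>"))
two_roles = element_to_dict(
    ET.fromstring("<roles><role>admin</role><role>editor</role></roles>")
)
user = element_to_dict(ET.fromstring('<user id="42"><name>Alice</name></user>'))
```

`one_role["role"]` comes out as the string `"admin"` while `two_roles["role"]` is a list, so consumers have to handle both shapes unless the converter is told which elements are always arrays. Note also that `user["@id"]` is the string `"42"`: XML carries no number type without a schema.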
JSON ↔ TOML
- TOML has no null value. JSON `null` becomes an omitted key or empty string.
- TOML date/time types don't exist in JSON. They become ISO 8601 strings.
- Deeply nested JSON objects become hard-to-read TOML tables.
YAML ↔ TOML
- YAML boolean/number coercions may need explicit quoting in TOML to preserve string type.
- YAML's anchors and complex types (sets, ordered maps) don't exist in TOML.
Which Format for Which Language Ecosystem
Language communities have strong preferences:
- JavaScript/TypeScript: JSON for everything. package.json is the universal config.
- Python: JSON for APIs, TOML for project config (pyproject.toml, introduced by PEP 518 with project metadata standardized in PEP 621), YAML for anything Ansible-adjacent.
- Rust: TOML everywhere. Cargo, rustfmt, Clippy, most CLI tools.
- Go: JSON for APIs, YAML for Kubernetes-related configs, TOML or YAML for application config.
- Java: XML for legacy (pom.xml), YAML for Spring Boot config, JSON for REST APIs.
- C#/.NET: JSON for everything modern (appsettings.json), XML for legacy and csproj.
- Ruby: YAML historically, JSON for APIs.
When to Use Binary Formats Instead
If you need extreme throughput, smaller payloads, or schema evolution, reach for a binary format instead:
- Protocol Buffers (protobuf): Schema-first, language-agnostic, excellent tooling (especially in gRPC contexts).
- MessagePack: JSON-like data model in binary. Much smaller and faster than JSON.
- CBOR: Like MessagePack, standardized by IETF.
- Apache Avro: Data-heavy pipelines (Kafka, Hadoop ecosystem).
- FlatBuffers: Zero-copy deserialization for real-time use cases.
The tradeoff is always readability. You can't cat a protobuf file and understand it. For human-editable configuration, stay with text formats. For service-to-service data exchange at scale, binary often wins.
Validation and Schema Tools
Whatever format you pick, validate it. Tools by format:
- JSON Schema: the standard for JSON validation. Use Ajv (Node), jsonschema (Python), or similar.
- OpenAPI: API schema that uses JSON Schema for request/response shapes.
- YAML Schema: YAML has no widely adopted schema language of its own, so YAML files are typically validated against JSON Schema. VS Code's YAML extension auto-validates against known schemas.
- TOML validation: typed TOML libraries in Rust/Python catch errors at parse time. Less tooling than JSON Schema.
- XML Schema (XSD): mature and powerful. RelaxNG is a simpler alternative.
For generating JSON Schema from sample data, try a JSON Schema generator. For validation in the browser, a JSON validator and YAML validator catch syntax issues fast.
Real-World Recommendations
- Building a public web API: JSON by default. Reach for something else only when the integration demands it (SOAP, gRPC).
- Deploying to Kubernetes: YAML. Quote every string that could be ambiguous.
- Writing a Rust/Python/Go CLI: TOML for config.
- Integrating with banking or healthcare: XML. Learn XPath and XSD.
- Editing Office documents programmatically: Unzip, edit XML, rezip.
- Storing data in a database: Native types where possible, JSON for flexible documents.
- Sending data between microservices at scale: JSON for simplicity, protobuf for performance.
- LLM structured output: JSON with a schema.
Tools for Working With Data Formats
- JSON formatter · JSON validator · JSON minifier
- YAML to JSON · YAML validator · JSON to YAML
- TOML to JSON
- XML formatter · XML to JSON · JSON to XML
- JSON to CSV · CSV to JSON · Excel to JSON
- JSON Schema generator · JSON to TypeScript · JSON to Go
- JSON Path tester for querying JSON with JSONPath expressions.
Frequently Asked Questions
Which data format is best for web APIs?
JSON, decisively. The large majority of public web APIs use JSON. It's fast, universally supported, and native in every browser. Use JSON unless you have a specific reason to use something else (SOAP for enterprise B2B, protobuf for gRPC performance).
Why is YAML dangerous for configuration?
Implicit type coercion. NO becomes false. 1.10 becomes the number 1.1. Dates and times coerce silently. These bugs are hard to catch and production-breaking. Always quote ambiguous strings in YAML, or prefer TOML where the coercion doesn't happen.
When should I use TOML?
Human-edited configuration files where readability matters and you don't want YAML's type surprises. Rust's Cargo, Python's pyproject.toml, Hugo's config, and many CLI tools use TOML. It's the safest choice for new projects' config.
Is XML obsolete?
Dead for new web APIs, very alive in specific domains: finance (ISO 20022, FIXML), healthcare (HL7), government systems, SOAP services, Office documents, SVG graphics, RSS feeds. If you work in these domains you need XML. If you're building a new web service, use JSON.
Can I mix formats in a project?
Yes, and most projects do. A web app might use JSON for its API, YAML for its deployment config, TOML for its build tool, and XML for icons (SVG). Pick the right format for each layer.
What's JSON5? Should I use it?
JSON5 extends JSON with comments, trailing commas, unquoted keys, and single-quoted strings. It's not a web standard but it's useful for config files. A number of JavaScript tools accept JSON5 config files; VS Code settings use the narrower JSONC variant (JSON plus comments). For API payloads, stick with strict JSON.
Further Reading
- json.org — the canonical JSON reference.
- YAML 1.2 specification — the current spec.
- toml.io — the TOML spec and tutorial.
- W3C XML — XML specifications and related standards.
- Data Format Cheat Sheet: JSON, CSV, XML, YAML — the conversion-focused companion.
- 10 JSON Mistakes Developers Make — common pitfalls to avoid.
- Free JSON Tools: Format, Validate, Convert — the practical JSON workflow.
Data formats are like programming languages: picking the wrong one doesn't always kill a project, but picking the right one makes everything easier. JSON for APIs, TOML for config, YAML when Kubernetes makes you, XML when the industry makes you. Most of the time you won't need to think about it. The one time you will, quote your strings.