Report Format

Every Reconlify comparison produces a JSON report file. By default it is written to report.json, or you can specify a path with --out.

The report is deterministic — the same config and input files always produce the same result. This makes reports suitable for CI/CD validation, audit trails, and automated pipelines.

How to read a Reconlify report

This section walks through a complete report from a real comparison. The scenario: you are reconciling a payment processor export (6 transactions) against an internal ledger (5 transactions). One transaction has a wrong amount, and one is missing from the ledger.

The full report

{
  "type": "tabular",
  "version": "1.3",
  "generated_at": "2026-03-09T14:00:00+00:00",
  "config_hash": "a1b2c3d4e5f6...",
  "summary": {
    "source_rows": 6,
    "target_rows": 5,
    "missing_in_target": 1,
    "missing_in_source": 0,
    "rows_with_mismatches": 1,
    "mismatched_cells": 1,
    "comparison_time_seconds": 0.03
  },
  "details": {
    "format": "csv",
    "keys": ["pay_id"],
    "compared_columns": ["amount", "currency", "merchant", "method", "pay_date"],
    "column_mapping": {
      "pay_id": "transaction_id",
      "merchant": "vendor_name",
      "amount": "total_amount"
    },
    "read_rows_source": 6,
    "read_rows_target": 5,
    "filters_applied": {},
    "column_stats": {
      "amount": { "mismatched_count": 1 },
      "currency": { "mismatched_count": 0 },
      "merchant": { "mismatched_count": 0 },
      "method": { "mismatched_count": 0 },
      "pay_date": { "mismatched_count": 0 }
    }
  },
  "samples": {
    "value_mismatches": [
      {
        "line_number_source": 5,
        "line_number_target": 4,
        "key": { "pay_id": "P-3004" },
        "columns": {
          "amount": {
            "source": "320.75",
            "target": "230.75"
          }
        }
      }
    ],
    "missing_in_target": [
      {
        "line_number_source": 7,
        "key": { "pay_id": "P-3006" },
        "row": {
          "pay_id": "P-3006",
          "merchant": "Stark Industries",
          "amount": "7500.00",
          "currency": "USD",
          "pay_date": "2026-02-18",
          "method": "wire"
        }
      }
    ],
    "missing_in_source": [],
    "excluded": []
  },
  "warnings": []
}

Summary metrics

Start with the summary section. It tells you whether the datasets match and what kind of differences exist.

"summary": {
  "source_rows": 6,
  "target_rows": 5,
  "missing_in_target": 1,
  "missing_in_source": 0,
  "rows_with_mismatches": 1,
  "mismatched_cells": 1,
  "comparison_time_seconds": 0.03
}
Field Value What it means
source_rows 6 The source file has 6 rows (after filtering)
target_rows 5 The target file has 5 rows (after filtering)
missing_in_target 1 One source row has no matching key in the target
missing_in_source 0 No target rows are unmatched — every target row has a source counterpart
rows_with_mismatches 1 One row exists on both sides but has a value difference
mismatched_cells 1 One individual cell differs across all mismatched rows
comparison_time_seconds 0.03 The comparison took 30 milliseconds

If all difference counts are zero, the datasets match and the exit code is 0. Here, three counts are non-zero (missing_in_target, rows_with_mismatches, and mismatched_cells), so the exit code is 1.
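As a sketch, a CI step can reproduce this decision from the report itself. The helper below is illustrative Python, not part of Reconlify; it reads only the documented summary fields:

```python
def has_differences(report: dict) -> bool:
    """True when any summary difference count in a Reconlify report is non-zero."""
    summary = report["summary"]
    return any(
        summary[field]
        for field in (
            "missing_in_target",
            "missing_in_source",
            "rows_with_mismatches",
            "mismatched_cells",
        )
    )

# Trimmed example; a real report carries the full summary shown above.
clean = {"summary": {"missing_in_target": 0, "missing_in_source": 0,
                     "rows_with_mismatches": 0, "mismatched_cells": 0}}
print(has_differences(clean))  # False
```

In a pipeline, load the file written via --out with json.load and fail the step when has_differences returns True.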

Mismatch samples

The samples.value_mismatches array shows rows that matched by key but have different values. Each entry identifies the row and shows only the columns that differ.

{
  "line_number_source": 5,
  "line_number_target": 4,
  "key": { "pay_id": "P-3004" },
  "columns": {
    "amount": {
      "source": "320.75",
      "target": "230.75"
    }
  }
}
Field What it tells you
line_number_source Row is on line 5 of the source file (line numbers include the header line)
line_number_target Same row is on line 4 of the target file
key The business key that matched this row — pay_id: P-3004
columns.amount.source Source value: 320.75
columns.amount.target Target value: 230.75

Only columns with differences appear in columns. The other four compared columns (currency, merchant, method, pay_date) matched and are omitted.

The key and columns fields use logical (source-side) column names. If you configured column_mapping, the report still shows amount — not the target's total_amount. Check details.column_mapping to see the translation.
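If you need to find a mismatched cell in the target file itself, you can translate the logical name through details.column_mapping. A small illustrative helper (the function name is ours, not a Reconlify API):

```python
def target_column_name(report: dict, logical_name: str) -> str:
    """Translate a logical (source-side) column name into the target
    file's column name. Unmapped columns keep the same name on both sides."""
    mapping = report.get("details", {}).get("column_mapping", {})
    return mapping.get(logical_name, logical_name)

report = {"details": {"column_mapping": {"amount": "total_amount"}}}
print(target_column_name(report, "amount"))    # total_amount
print(target_column_name(report, "currency"))  # currency
```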

Missing rows

The samples.missing_in_target array shows rows that exist in the source but have no matching key in the target.

{
  "line_number_source": 7,
  "key": { "pay_id": "P-3006" },
  "row": {
    "pay_id": "P-3006",
    "merchant": "Stark Industries",
    "amount": "7500.00",
    "currency": "USD",
    "pay_date": "2026-02-18",
    "method": "wire"
  }
}
Field What it tells you
line_number_source Row is on line 7 of the source file
key The key that was not found in the target — pay_id: P-3006
row The full row data from the source file

There is no line_number_target because the row does not exist in the target. The row field includes all columns — not just compared ones — so you have full context for investigating why this row is missing.

samples.missing_in_source works the same way but for target rows with no matching source key. Each entry has line_number_target and row from the target file.
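To pull the unmatched keys out of a report for investigation, a sketch like this works against both sample arrays (the helper name is ours):

```python
def missing_keys(report: dict) -> dict:
    """Collect the keys of unmatched rows from both missing-row sample arrays."""
    samples = report["samples"]
    return {
        side: [entry["key"] for entry in samples[side]]
        for side in ("missing_in_target", "missing_in_source")
    }

report = {"samples": {
    "missing_in_target": [{"line_number_source": 7, "key": {"pay_id": "P-3006"}}],
    "missing_in_source": [],
}}
print(missing_keys(report))
# {'missing_in_target': [{'pay_id': 'P-3006'}], 'missing_in_source': []}
```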

What to do with this report

  1. Check summary first. If all counts are zero, you are done.
  2. Check details.column_stats. If mismatches are concentrated in one column, the cause may be a formatting difference you can fix with a targeted rule (tolerance, string rule, or normalization) rather than a genuine data issue.
  3. Inspect samples.value_mismatches. The key and line numbers let you find the exact rows in the original files. The columns section shows exactly what differs.
  4. Inspect samples.missing_in_target and missing_in_source. Missing rows indicate data that exists on one side only — investigate whether it is a timing issue, a failed import, or unexpected data.
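The steps above can be sketched as a small triage helper. This is illustrative Python over the documented report fields, not a Reconlify feature:

```python
def triage(report: dict) -> list[str]:
    """Walk a report in the order described above and return findings."""
    summary = report["summary"]
    if not any(summary[f] for f in ("missing_in_target", "missing_in_source",
                                    "rows_with_mismatches", "mismatched_cells")):
        return ["datasets match"]
    findings = []
    # Step 2: which columns carry the mismatches?
    stats = report["details"].get("column_stats", {})
    noisy = [col for col, s in stats.items() if s["mismatched_count"]]
    if noisy:
        findings.append("columns with mismatches: " + ", ".join(noisy))
    # Steps 3 and 4: surface the sampled differences.
    for m in report["samples"]["value_mismatches"]:
        findings.append(f"value mismatch at {m['key']}")
    for m in report["samples"]["missing_in_target"]:
        findings.append(f"missing in target: {m['key']}")
    for m in report["samples"]["missing_in_source"]:
        findings.append(f"missing in source: {m['key']}")
    return findings
```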

Report root object

{
  "type": "tabular",
  "version": "1.3",
  "generated_at": "2026-01-15T12:00:00+00:00",
  "config_hash": "abc123...",
  "summary": { },
  "details": { },
  "samples": { },
  "warnings": [],
  "error": { }
}
Field Description
type "tabular" or "text" — matches the config type
version Report schema version (currently "1.3")
generated_at ISO-8601 UTC timestamp of when the report was created
config_hash SHA-256 hash of the config — identical configs produce the same hash
summary High-level result counts
details Metadata about what was compared and how
samples Example rows or lines that differed
warnings Non-fatal advisory messages (an empty array when there are none)
error Error details, present only when the run fails (exit code 2)

Exit codes

Code Meaning Report state
0 No differences found Summary counts are all zero
1 Differences found Summary contains non-zero difference counts
2 Error (bad config, missing file, etc.) error object is present, summary is zeroed

Exit code 1 is not an error — it means the comparison ran successfully and found differences.


Tabular reports

Produced when type is "tabular".

Summary

{
  "source_rows": 1000,
  "target_rows": 998,
  "missing_in_target": 2,
  "missing_in_source": 0,
  "rows_with_mismatches": 5,
  "mismatched_cells": 8,
  "comparison_time_seconds": 0.42
}
  • source_rows / target_rows — row counts after filtering
  • missing_in_target — rows in source with no matching key in target
  • missing_in_source — rows in target with no matching key in source
  • rows_with_mismatches — rows matched by key but with column differences
  • mismatched_cells — total cell-level mismatches across all rows
  • comparison_time_seconds — wall-clock time for the comparison
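These counts are linked: because keys must be unique (duplicates abort the run with DUPLICATE_KEYS), the rows matched by key can be counted from either side and both counts must agree. A quick consistency check, as a sketch:

```python
def matched_rows(summary: dict) -> int:
    """Rows present on both sides, counted from either direction.
    With unique keys, both expressions must agree."""
    from_source = summary["source_rows"] - summary["missing_in_target"]
    from_target = summary["target_rows"] - summary["missing_in_source"]
    assert from_source == from_target
    return from_source

summary = {"source_rows": 1000, "target_rows": 998,
           "missing_in_target": 2, "missing_in_source": 0}
print(matched_rows(summary))  # 998
```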

Details

{
  "format": "csv",
  "keys": ["order_id"],
  "compared_columns": ["amount", "status"],
  "column_mapping": { "amount": "total_amount" },
  "read_rows_source": 1003,
  "read_rows_target": 998,
  "filters_applied": { },
  "column_stats": { "amount": { "mismatched_count": 5 } },
  "csv": { "delimiter": ",", "encoding": "utf-8", "header": true }
}
  • keys — columns used to match rows
  • compared_columns — columns checked for differences (sorted, uses logical names)
  • column_mapping — source-to-target column name translations (included only when mapping is configured)
  • read_rows_source / read_rows_target — raw row counts before filtering. Compare with summary.source_rows to see how many rows were excluded.
  • filters_applied — breakdown of rows excluded by exclude_keys and row_filters, with per-side counts
  • column_stats — per-column mismatch counts (see Column statistics)
  • csv — effective CSV parsing settings

Tabular samples

Samples are organized into four categories:

{
  "missing_in_target": [],
  "missing_in_source": [],
  "value_mismatches": [],
  "excluded": []
}

missing_in_target / missing_in_source — rows present on one side only. Each entry includes the key values, full row data, and the line number from the file where the row exists.

value_mismatches — rows matched by key but with differing values. Each entry shows the key, line numbers on both sides, and only the columns that differ (with source and target values).

excluded — rows removed by filters. Each entry shows which side the row came from, the key, row data, and the reason ("exclude_keys" or "row_filters").

Samples are capped at a configurable limit. Disable with output.include_row_samples: false for summary-only reports.


Text reports

Produced when type is "text".

Summary

{
  "total_lines_source": 150,
  "total_lines_target": 148,
  "different_lines": 3,
  "comparison_time_seconds": 0.001
}
  • total_lines_source / total_lines_target — line counts after normalization and line dropping
  • different_lines — number of line-level differences

Details

{
  "mode": "line_by_line",
  "read_lines_source": 155,
  "read_lines_target": 150,
  "ignored_blank_lines_source": 5,
  "ignored_blank_lines_target": 2,
  "rules_applied": {
    "drop_lines_count": 0,
    "replace_rules_count": 1,
    "replacement_lines_affected": 42,
    "replacement_applications": 42
  },
  "normalize": { },
  "unordered_stats": { }
}
  • read_lines_source / read_lines_target — raw line counts before processing
  • ignored_blank_lines — lines removed by ignore_blank_lines
  • rules_applied — rule activity counts: lines dropped by regex (drop_lines_count), configured replacement rules (replace_rules_count), lines changed by replacements (replacement_lines_affected), and individual substitutions made (replacement_applications)
  • normalize — the effective normalization settings used
  • unordered_stats — present only in unordered_lines mode (see below)

Text samples (line_by_line)

Each entry includes raw and processed content for both sides, plus original file line numbers:

{
  "line_number_source": 10,
  "line_number_target": 10,
  "raw_source": "2026-03-01 10:00:04 [INFO] Response sent: 200 OK (95ms)",
  "raw_target": "2026-03-09 14:22:13 [WARN] Response sent: 404 Not Found (52ms)",
  "processed_source": "<TS> [INFO] Response sent: 200 OK (<DUR>)",
  "processed_target": "<TS> [WARN] Response sent: 404 Not Found (<DUR>)"
}
  • raw_source / raw_target — original file content
  • processed_source / processed_target — values after normalization and regex replacements

The comparison is performed on the processed values. The raw values are included so you can see the original content and trace back to the file.
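For a rough idea of how such processed values come about, here is a sketch with made-up patterns standing in for the configured replacement rules (these are not Reconlify's built-ins):

```python
import re

# Hypothetical rules resembling the sample above: mask timestamps and durations.
RULES = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "<TS>"),
    (re.compile(r"\d+ms"), "<DUR>"),
]

def process(line: str) -> str:
    """Apply each replacement rule in order, as the comparison would."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

raw = "2026-03-01 10:00:04 [INFO] Response sent: 200 OK (95ms)"
print(process(raw))  # <TS> [INFO] Response sent: 200 OK (<DUR>)
```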

Text samples (unordered_lines)

In unordered_lines mode, samples is always empty. Instead, aggregated data appears in samples_agg:

{
  "samples_agg": [
    {
      "line": "[worker-1] Batch C complete: 150 items",
      "source_count": 1,
      "target_count": 0,
      "source_line_numbers": [6],
      "target_line_numbers": []
    }
  ]
}

Each entry shows a distinct line whose occurrence count differs between source and target. Entries are sorted by largest count difference first.
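Conceptually, this aggregation compares the two files as multisets of lines. A sketch of the idea (not Reconlify's implementation):

```python
from collections import Counter

def aggregate_mismatches(source_lines, target_lines):
    """Keep distinct lines whose occurrence counts differ, sorted by
    largest count difference first, as in samples_agg."""
    src, tgt = Counter(source_lines), Counter(target_lines)
    diffs = [
        {"line": line, "source_count": src[line], "target_count": tgt[line]}
        for line in src.keys() | tgt.keys()
        if src[line] != tgt[line]
    ]
    return sorted(diffs,
                  key=lambda d: abs(d["source_count"] - d["target_count"]),
                  reverse=True)

print(aggregate_mismatches(["a", "b", "a"], ["a", "b"]))
# [{'line': 'a', 'source_count': 2, 'target_count': 1}]
```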

The unordered_stats section in details provides an aggregate breakdown:

{
  "source_only_lines": 1,
  "target_only_lines": 0,
  "distinct_mismatched_lines": 1
}
  • source_only_lines — excess line occurrences in source
  • target_only_lines — excess line occurrences in target
  • distinct_mismatched_lines — how many unique line contents have differing counts
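These three numbers can be derived from the same per-line occurrence counts; as a sketch:

```python
from collections import Counter

def unordered_stats(source_lines, target_lines):
    """Derive the unordered_stats fields from per-line occurrence counts."""
    src, tgt = Counter(source_lines), Counter(target_lines)
    lines = src.keys() | tgt.keys()
    return {
        "source_only_lines": sum(max(src[l] - tgt[l], 0) for l in lines),
        "target_only_lines": sum(max(tgt[l] - src[l], 0) for l in lines),
        "distinct_mismatched_lines": sum(1 for l in lines if src[l] != tgt[l]),
    }

# One line appears only in the source, as in the example above:
print(unordered_stats(["a", "b"], ["a"]))
# {'source_only_lines': 1, 'target_only_lines': 0, 'distinct_mismatched_lines': 1}
```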

Line number metadata

Both samples and samples_agg entries include line number fields that point back to the original files. These fields are especially useful when investigating log differences, where you need to find specific occurrences in files that may contain thousands of lines.

In line_by_line samples, each entry has scalar line numbers:

  • line_number_source — the line number in the source file
  • line_number_target — the line number in the target file

These are single values because line_by_line mode compares one source line against one target line at each position.

In unordered_lines samples_agg, each entry has array line numbers:

  • source_line_numbers — line numbers in the source file where this line appears
  • target_line_numbers — line numbers in the target file where this line appears

These fields are arrays because the same line may appear multiple times in a file. For example:

{
  "line": "WARN: retry limit exceeded",
  "source_count": 3,
  "target_count": 1,
  "source_line_numbers": [5, 12, 48],
  "target_line_numbers": [9]
}

This means the line "WARN: retry limit exceeded" appears three times in the source file (at lines 5, 12, and 48) and once in the target file (at line 9). The count difference of two is what Reconlify flags as a mismatch.

The line numbers refer to positions in the original file before any normalization or line dropping. This lets you open the raw file, go to the reported line number, and see the original content — even if normalization changed or removed surrounding lines during comparison.
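One way such original line numbers can be preserved through processing is to attach them before any filtering happens, for example:

```python
def numbered_nonblank(lines):
    """Pair each kept line with its 1-based position in the original file,
    so dropped blank lines do not shift the reported numbers."""
    return [(number, line)
            for number, line in enumerate(lines, start=1)
            if line.strip()]

print(numbered_nonblank(["first", "", "third"]))  # [(1, 'first'), (3, 'third')]
```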


Error reports

When a run fails (exit code 2), the report includes an error object:

{
  "code": "DUPLICATE_KEYS",
  "message": "Duplicate key found in source file",
  "details": "Key { order_id: '1042' } appears 2 times in source"
}

Error codes:

Code Description
CONFIG_VALIDATION_ERROR Invalid YAML or schema violation
RUNTIME_ERROR File not found, I/O failure, or unexpected exception
DUPLICATE_KEYS Non-unique keys after filtering (tabular only)
INVALID_ROW_FILTERS Filter references a column that does not exist
INVALID_COLUMN_MAPPING Mapped target column missing or creates a collision

When error is present, all summary counts are zeroed and samples are empty. The comparison did not complete.
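Consumers should therefore check for the error object before trusting the summary. An illustrative helper (not part of Reconlify):

```python
def describe_outcome(report: dict) -> str:
    """Map a report back to its exit-code meaning, checking error first:
    when the error object is present, the summary counts are zeroed."""
    if "error" in report:
        return "failed (exit 2): " + report["error"]["code"]
    s = report["summary"]
    total = (s["missing_in_target"] + s["missing_in_source"]
             + s["rows_with_mismatches"] + s["mismatched_cells"])
    return "differences found (exit 1)" if total else "match (exit 0)"

print(describe_outcome({"error": {"code": "DUPLICATE_KEYS",
                                  "message": "Duplicate key found in source file"}}))
# failed (exit 2): DUPLICATE_KEYS
```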


Column statistics

When enabled (default), the details.column_stats section shows per-column mismatch counts:

{
  "amount": { "mismatched_count": 5 },
  "status": { "mismatched_count": 3 },
  "currency": { "mismatched_count": 0 }
}

This tells you which columns contribute the most differences. If 500 rows have mismatches but column_stats shows they are all in amount, you know exactly where to focus — and can likely fix it with a tolerance rule or normalization.
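Programmatically, picking the noisiest column is a one-liner over this map; a sketch:

```python
def noisiest_column(column_stats: dict):
    """Return the column with the highest mismatched_count,
    or None when every column matched."""
    name, stats = max(column_stats.items(),
                      key=lambda item: item[1]["mismatched_count"])
    return name if stats["mismatched_count"] else None

stats = {"amount": {"mismatched_count": 5},
         "status": {"mismatched_count": 3},
         "currency": {"mismatched_count": 0}}
print(noisiest_column(stats))  # amount
```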

Disable with output.include_column_stats: false.