Report Format

Every Reconlify comparison produces a JSON report file. By default it is written to report.json, or you can specify a path with --out.

The report is deterministic — the same config and input files always produce the same result. This makes reports suitable for CI/CD validation, audit trails, and automated pipelines.

How to read a Reconlify report

This section walks through a complete report from a real comparison. The scenario: you are reconciling a payment processor export (6 transactions) against an internal ledger (5 transactions). One transaction has a wrong amount, and one is missing from the ledger.

The full report

{
  "type": "tabular",
  "version": "1.3",
  "generated_at": "2026-03-09T14:00:00+00:00",
  "config_hash": "a1b2c3d4e5f6...",
  "summary": {
    "source_rows": 6,
    "target_rows": 5,
    "missing_in_target": 1,
    "missing_in_source": 0,
    "rows_with_mismatches": 1,
    "mismatched_cells": 1,
    "comparison_time_seconds": 0.03
  },
  "details": {
    "format": "csv",
    "keys": ["pay_id"],
    "compared_columns": ["amount", "currency", "merchant", "method", "pay_date"],
    "column_mapping": {
      "pay_id": "transaction_id",
      "merchant": "vendor_name",
      "amount": "total_amount"
    },
    "read_rows_source": 6,
    "read_rows_target": 5,
    "filters_applied": {},
    "column_stats": {
      "amount": { "mismatched_count": 1 },
      "currency": { "mismatched_count": 0 },
      "merchant": { "mismatched_count": 0 },
      "method": { "mismatched_count": 0 },
      "pay_date": { "mismatched_count": 0 }
    }
  },
  "samples": {
    "value_mismatches": [
      {
        "line_number_source": 5,
        "line_number_target": 4,
        "key": { "pay_id": "P-3004" },
        "columns": {
          "amount": {
            "source": "320.75",
            "target": "230.75"
          }
        }
      }
    ],
    "missing_in_target": [
      {
        "line_number_source": 7,
        "key": { "pay_id": "P-3006" },
        "row": {
          "pay_id": "P-3006",
          "merchant": "Stark Industries",
          "amount": "7500.00",
          "currency": "USD",
          "pay_date": "2026-02-18",
          "method": "wire"
        }
      }
    ],
    "missing_in_source": [],
    "excluded": []
  },
  "warnings": []
}

Summary metrics

Start with the summary section. It tells you whether the datasets match and what kind of differences exist.

"summary": {
  "source_rows": 6,
  "target_rows": 5,
  "missing_in_target": 1,
  "missing_in_source": 0,
  "rows_with_mismatches": 1,
  "mismatched_cells": 1,
  "comparison_time_seconds": 0.03
}
Field Value What it means
source_rows 6 The source file has 6 rows (after filtering)
target_rows 5 The target file has 5 rows (after filtering)
missing_in_target 1 One source row has no matching key in the target
missing_in_source 0 No target rows are unmatched — every target row has a source counterpart
rows_with_mismatches 1 One row exists on both sides but has a value difference
mismatched_cells 1 One individual cell differs across all mismatched rows
comparison_time_seconds 0.03 The comparison took 30 milliseconds

If all difference counts are zero, the datasets match and the exit code is 0. Here, three counts are non-zero (missing_in_target, rows_with_mismatches, and mismatched_cells), so the exit code is 1.
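As a sketch, a CI step can reproduce this decision from the report itself. The helper below is illustrative Python, not part of Reconlify; it reads only the documented summary fields:

```python
def has_differences(report: dict) -> bool:
    """True when any summary difference count in a Reconlify report is non-zero."""
    summary = report["summary"]
    return any(
        summary[field]
        for field in (
            "missing_in_target",
            "missing_in_source",
            "rows_with_mismatches",
            "mismatched_cells",
        )
    )

# Trimmed example; a real report carries the full summary shown above.
clean = {"summary": {"missing_in_target": 0, "missing_in_source": 0,
                     "rows_with_mismatches": 0, "mismatched_cells": 0}}
print(has_differences(clean))  # False
```

In a pipeline, load the file written via --out with json.load and fail the step when has_differences returns True.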

Mismatch samples

The samples.value_mismatches array shows rows that matched by key but have different values. Each entry identifies the row and shows only the columns that differ.

{
  "line_number_source": 5,
  "line_number_target": 4,
  "key": { "pay_id": "P-3004" },
  "columns": {
    "amount": {
      "source": "320.75",
      "target": "230.75"
    }
  }
}
Field What it tells you
line_number_source Row is on line 5 of the source file (line numbers include the header line)
line_number_target Same row is on line 4 of the target file
key The business key that matched this row — pay_id: P-3004
columns.amount.source Source value: 320.75
columns.amount.target Target value: 230.75

Only columns with differences appear in columns. The other four compared columns (currency, merchant, method, pay_date) matched and are omitted.

The key and columns fields use logical (source-side) column names. If you configured column_mapping, the report still shows amount — not the target's total_amount. Check details.column_mapping to see the translation.
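If you need to find a mismatched cell in the target file itself, you can translate the logical name through details.column_mapping. A small illustrative helper (the function name is ours, not a Reconlify API):

```python
def target_column_name(report: dict, logical_name: str) -> str:
    """Translate a logical (source-side) column name into the target
    file's column name. Unmapped columns keep the same name on both sides."""
    mapping = report.get("details", {}).get("column_mapping", {})
    return mapping.get(logical_name, logical_name)

report = {"details": {"column_mapping": {"amount": "total_amount"}}}
print(target_column_name(report, "amount"))    # total_amount
print(target_column_name(report, "currency"))  # currency
```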

Missing rows

The samples.missing_in_target array shows rows that exist in the source but have no matching key in the target.

{
  "line_number_source": 7,
  "key": { "pay_id": "P-3006" },
  "row": {
    "pay_id": "P-3006",
    "merchant": "Stark Industries",
    "amount": "7500.00",
    "currency": "USD",
    "pay_date": "2026-02-18",
    "method": "wire"
  }
}
Field What it tells you
line_number_source Row is on line 7 of the source file
key The key that was not found in the target — pay_id: P-3006
row The full row data from the source file

There is no line_number_target because the row does not exist in the target. The row field includes all columns — not just compared ones — so you have full context for investigating why this row is missing.

samples.missing_in_source works the same way but for target rows with no matching source key. Each entry has line_number_target and row from the target file.
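To pull the unmatched keys out of a report for investigation, a sketch like this works against both sample arrays (the helper name is ours):

```python
def missing_keys(report: dict) -> dict:
    """Collect the keys of unmatched rows from both missing-row sample arrays."""
    samples = report["samples"]
    return {
        side: [entry["key"] for entry in samples[side]]
        for side in ("missing_in_target", "missing_in_source")
    }

report = {"samples": {
    "missing_in_target": [{"line_number_source": 7, "key": {"pay_id": "P-3006"}}],
    "missing_in_source": [],
}}
print(missing_keys(report))
# {'missing_in_target': [{'pay_id': 'P-3006'}], 'missing_in_source': []}
```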

What to do with this report

  1. Check summary first. If all counts are zero, you are done.
  2. Check details.column_stats. If mismatches are concentrated in one column, the cause may be a formatting difference you can fix with a targeted rule (tolerance, string rule, or normalization) rather than a genuine data issue.
  3. Inspect samples.value_mismatches. The key and line numbers let you find the exact rows in the original files. The columns section shows exactly what differs.
  4. Inspect samples.missing_in_target and missing_in_source. Missing rows indicate data that exists on one side only — investigate whether it is a timing issue, a failed import, or unexpected data.
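The steps above can be sketched as a small triage helper. This is illustrative Python over the documented report fields, not a Reconlify feature:

```python
def triage(report: dict) -> list[str]:
    """Walk a report in the order described above and return findings."""
    summary = report["summary"]
    if not any(summary[f] for f in ("missing_in_target", "missing_in_source",
                                    "rows_with_mismatches", "mismatched_cells")):
        return ["datasets match"]
    findings = []
    # Step 2: which columns carry the mismatches?
    stats = report["details"].get("column_stats", {})
    noisy = [col for col, s in stats.items() if s["mismatched_count"]]
    if noisy:
        findings.append("columns with mismatches: " + ", ".join(noisy))
    # Steps 3 and 4: surface the sampled differences.
    for m in report["samples"]["value_mismatches"]:
        findings.append(f"value mismatch at {m['key']}")
    for m in report["samples"]["missing_in_target"]:
        findings.append(f"missing in target: {m['key']}")
    for m in report["samples"]["missing_in_source"]:
        findings.append(f"missing in source: {m['key']}")
    return findings
```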

Report root object

{
  "type": "tabular",
  "version": "1.3",
  "generated_at": "2026-01-15T12:00:00+00:00",
  "config_hash": "abc123...",
  "summary": { },
  "details": { },
  "samples": { },
  "warnings": [],
  "error": { }
}
Field Description
type "tabular" or "text" — matches the config type
version Report schema version (currently "1.3")
generated_at ISO-8601 UTC timestamp of when the report was created
config_hash SHA-256 hash of the config — identical configs produce the same hash
summary High-level result counts
details Metadata about what was compared and how
samples Example rows or lines that differed
warnings Non-fatal advisory messages (an empty array when there are none)
error Error details, present only when the run fails (exit code 2)

Exit codes

Code Meaning Report state
0 No differences found Summary counts are all zero
1 Differences found Summary contains non-zero difference counts
2 Error (bad config, missing file, etc.) error object is present, summary is zeroed

Exit code 1 is not an error — it means the comparison ran successfully and found differences.


Tabular reports

Produced when type is "tabular".

Summary

{
  "source_rows": 1000,
  "target_rows": 998,
  "missing_in_target": 2,
  "missing_in_source": 0,
  "rows_with_mismatches": 5,
  "mismatched_cells": 8,
  "comparison_time_seconds": 0.42
}
  • source_rows / target_rows — row counts after filtering
  • missing_in_target — rows in source with no matching key in target
  • missing_in_source — rows in target with no matching key in source
  • rows_with_mismatches — rows matched by key but with column differences
  • mismatched_cells — total cell-level mismatches across all rows
  • comparison_time_seconds — wall-clock time for the comparison
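These counts are linked: because keys must be unique (duplicates abort the run with DUPLICATE_KEYS), the rows matched by key can be counted from either side and both counts must agree. A quick consistency check, as a sketch:

```python
def matched_rows(summary: dict) -> int:
    """Rows present on both sides, counted from either direction.
    With unique keys, both expressions must agree."""
    from_source = summary["source_rows"] - summary["missing_in_target"]
    from_target = summary["target_rows"] - summary["missing_in_source"]
    assert from_source == from_target
    return from_source

summary = {"source_rows": 1000, "target_rows": 998,
           "missing_in_target": 2, "missing_in_source": 0}
print(matched_rows(summary))  # 998
```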

Details

{
  "format": "csv",
  "keys": ["order_id"],
  "compared_columns": ["amount", "status"],
  "column_mapping": { "amount": "total_amount" },
  "read_rows_source": 1003,
  "read_rows_target": 998,
  "filters_applied": { },
  "column_stats": { "amount": { "mismatched_count": 5 } },
  "csv": { "delimiter": ",", "encoding": "utf-8", "header": true }
}
  • keys — columns used to match rows
  • compared_columns — columns checked for differences (sorted, uses logical names)
  • column_mapping — source-to-target column name translations (included only when mapping is configured)
  • read_rows_source / read_rows_target — raw row counts before filtering. Compare with summary.source_rows to see how many rows were excluded.
  • filters_applied — breakdown of rows excluded by exclude_keys and row_filters, with per-side counts
  • column_stats — per-column mismatch counts (see Column statistics)
  • csv — effective CSV parsing settings

Tabular samples

Samples are organized into four categories:

{
  "missing_in_target": [],
  "missing_in_source": [],
  "value_mismatches": [],
  "excluded": []
}

missing_in_target / missing_in_source — rows present on one side only. Each entry includes the key values, full row data, and the line number from the file where the row exists.

value_mismatches — rows matched by key but with differing values. Each entry shows the key, line numbers on both sides, and only the columns that differ (with source and target values).

excluded — rows removed by filters. Each entry shows which side the row came from, the key, row data, and the reason ("exclude_keys" or "row_filters").

Samples are capped at a configurable limit. Disable with output.include_row_samples: false for summary-only reports.


Text reports

Produced when type is "text".

Summary

{
  "total_lines_source": 150,
  "total_lines_target": 148,
  "different_lines": 3,
  "comparison_time_seconds": 0.001
}
  • total_lines_source / total_lines_target — line counts after normalization and line dropping
  • different_lines — number of line-level differences

Details

{
  "mode": "line_by_line",
  "read_lines_source": 155,
  "read_lines_target": 150,
  "ignored_blank_lines_source": 5,
  "ignored_blank_lines_target": 2,
  "rules_applied": {
    "drop_lines_count": 0,
    "replace_rules_count": 1,
    "replacement_lines_affected": 42,
    "replacement_applications": 42
  },
  "normalize": { },
  "unordered_stats": { }
}
  • read_lines_source / read_lines_target — raw line counts before processing
  • ignored_blank_lines — lines removed by ignore_blank_lines
  • rules_applied — rule activity counts: lines dropped by regex (drop_lines_count), configured replacement rules (replace_rules_count), lines changed by replacements (replacement_lines_affected), and individual substitutions made (replacement_applications)
  • normalize — the effective normalization settings used
  • unordered_stats — present only in unordered_lines mode (see below)

Text samples (line_by_line)

Each entry includes raw and processed content for both sides, plus original file line numbers:

{
  "line_number_source": 10,
  "line_number_target": 10,
  "raw_source": "2026-03-01 10:00:04 [INFO] Response sent: 200 OK (95ms)",
  "raw_target": "2026-03-09 14:22:13 [WARN] Response sent: 404 Not Found (52ms)",
  "processed_source": "<TS> [INFO] Response sent: 200 OK (<DUR>)",
  "processed_target": "<TS> [WARN] Response sent: 404 Not Found (<DUR>)"
}
  • raw_source / raw_target — original file content
  • processed_source / processed_target — values after normalization and regex replacements

The comparison is performed on the processed values. The raw values are included so you can see the original content and trace back to the file.
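For a rough idea of how such processed values come about, here is a sketch with made-up patterns standing in for the configured replacement rules (these are not Reconlify's built-ins):

```python
import re

# Hypothetical rules resembling the sample above: mask timestamps and durations.
RULES = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "<TS>"),
    (re.compile(r"\d+ms"), "<DUR>"),
]

def process(line: str) -> str:
    """Apply each replacement rule in order, as the comparison would."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

raw = "2026-03-01 10:00:04 [INFO] Response sent: 200 OK (95ms)"
print(process(raw))  # <TS> [INFO] Response sent: 200 OK (<DUR>)
```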

Text samples (unordered_lines)

In unordered_lines mode, samples is always empty. Instead, aggregated data appears in samples_agg:

{
  "samples_agg": [
    {
      "line": "[worker-1] Batch C complete: 150 items",
      "source_count": 1,
      "target_count": 0,
      "source_line_numbers": [6],
      "target_line_numbers": []
    }
  ]
}

Each entry shows a distinct line whose occurrence count differs between source and target. Entries are sorted by largest count difference first.
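Conceptually, this aggregation compares the two files as multisets of lines. A sketch of the idea (not Reconlify's implementation):

```python
from collections import Counter

def aggregate_mismatches(source_lines, target_lines):
    """Keep distinct lines whose occurrence counts differ, sorted by
    largest count difference first, as in samples_agg."""
    src, tgt = Counter(source_lines), Counter(target_lines)
    diffs = [
        {"line": line, "source_count": src[line], "target_count": tgt[line]}
        for line in src.keys() | tgt.keys()
        if src[line] != tgt[line]
    ]
    return sorted(diffs,
                  key=lambda d: abs(d["source_count"] - d["target_count"]),
                  reverse=True)

print(aggregate_mismatches(["a", "b", "a"], ["a", "b"]))
# [{'line': 'a', 'source_count': 2, 'target_count': 1}]
```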

The unordered_stats section in details provides an aggregate breakdown:

{
  "source_only_lines": 1,
  "target_only_lines": 0,
  "distinct_mismatched_lines": 1
}
  • source_only_lines — excess line occurrences in source
  • target_only_lines — excess line occurrences in target
  • distinct_mismatched_lines — how many unique line contents have differing counts
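These three numbers can be derived from the same per-line occurrence counts; as a sketch:

```python
from collections import Counter

def unordered_stats(source_lines, target_lines):
    """Derive the unordered_stats fields from per-line occurrence counts."""
    src, tgt = Counter(source_lines), Counter(target_lines)
    lines = src.keys() | tgt.keys()
    return {
        "source_only_lines": sum(max(src[l] - tgt[l], 0) for l in lines),
        "target_only_lines": sum(max(tgt[l] - src[l], 0) for l in lines),
        "distinct_mismatched_lines": sum(1 for l in lines if src[l] != tgt[l]),
    }

# One line appears only in the source, as in the example above:
print(unordered_stats(["a", "b"], ["a"]))
# {'source_only_lines': 1, 'target_only_lines': 0, 'distinct_mismatched_lines': 1}
```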

Line number metadata

Both samples and samples_agg entries include line number fields that point back to the original files. These fields are especially useful when investigating log differences, where you need to find specific occurrences in files that may contain thousands of lines.

In line_by_line samples, each entry has scalar line numbers:

  • line_number_source — the line number in the source file
  • line_number_target — the line number in the target file

These are single values because line_by_line mode compares one source line against one target line at each position.

In unordered_lines samples_agg, each entry has array line numbers:

  • source_line_numbers — line numbers in the source file where this line appears
  • target_line_numbers — line numbers in the target file where this line appears

These fields are arrays because the same line may appear multiple times in a file. For example:

{
  "line": "WARN: retry limit exceeded",
  "source_count": 3,
  "target_count": 1,
  "source_line_numbers": [5, 12, 48],
  "target_line_numbers": [9]
}

This means the line "WARN: retry limit exceeded" appears three times in the source file (at lines 5, 12, and 48) and once in the target file (at line 9). The count difference of two is what Reconlify flags as a mismatch.

The line numbers refer to positions in the original file before any normalization or line dropping. This lets you open the raw file, go to the reported line number, and see the original content — even if normalization changed or removed surrounding lines during comparison.
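One way such original line numbers can be preserved through processing is to attach them before any filtering happens, for example:

```python
def numbered_nonblank(lines):
    """Pair each kept line with its 1-based position in the original file,
    so dropped blank lines do not shift the reported numbers."""
    return [(number, line)
            for number, line in enumerate(lines, start=1)
            if line.strip()]

print(numbered_nonblank(["first", "", "third"]))  # [(1, 'first'), (3, 'third')]
```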


Error reports

When a run fails (exit code 2), the report includes an error object:

{
  "code": "DUPLICATE_KEYS",
  "message": "Duplicate key found in source file",
  "details": "Key { order_id: '1042' } appears 2 times in source"
}

Error codes:

Code Description
CONFIG_VALIDATION_ERROR Invalid YAML or schema violation
RUNTIME_ERROR File not found, I/O failure, or unexpected exception
DUPLICATE_KEYS Non-unique keys after filtering (tabular only)
INVALID_ROW_FILTERS Filter references a column that does not exist
INVALID_COLUMN_MAPPING Mapped target column missing or creates a collision

When error is present, all summary counts are zeroed and samples are empty. The comparison did not complete.
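Consumers should therefore check for the error object before trusting the summary. An illustrative helper (not part of Reconlify):

```python
def describe_outcome(report: dict) -> str:
    """Map a report back to its exit-code meaning, checking error first:
    when the error object is present, the summary counts are zeroed."""
    if "error" in report:
        return "failed (exit 2): " + report["error"]["code"]
    s = report["summary"]
    total = (s["missing_in_target"] + s["missing_in_source"]
             + s["rows_with_mismatches"] + s["mismatched_cells"])
    return "differences found (exit 1)" if total else "match (exit 0)"

print(describe_outcome({"error": {"code": "DUPLICATE_KEYS",
                                  "message": "Duplicate key found in source file"}}))
# failed (exit 2): DUPLICATE_KEYS
```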


Column statistics

When enabled (default), the details.column_stats section shows per-column mismatch counts:

{
  "amount": { "mismatched_count": 5 },
  "status": { "mismatched_count": 3 },
  "currency": { "mismatched_count": 0 }
}

This tells you which columns contribute the most differences. If 500 rows have mismatches but column_stats shows they are all in amount, you know exactly where to focus — and can likely fix it with a tolerance rule or normalization.
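Programmatically, picking the noisiest column is a one-liner over this map; a sketch:

```python
def noisiest_column(column_stats: dict):
    """Return the column with the highest mismatched_count,
    or None when every column matched."""
    name, stats = max(column_stats.items(),
                      key=lambda item: item[1]["mismatched_count"])
    return name if stats["mismatched_count"] else None

stats = {"amount": {"mismatched_count": 5},
         "status": {"mismatched_count": 3},
         "currency": {"mismatched_count": 0}}
print(noisiest_column(stats))  # amount
```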

Disable with output.include_column_stats: false.