json-repair 0.58.7


pip install json-repair


Released: Mar 26, 2026


Meta
Author: Stefano Baccianella
Requires Python: >=3.10

Classifiers

Programming Language
  • Python :: 3

Operating System
  • OS Independent


This simple package can be used to fix an invalid JSON string. To see all the cases it handles, check out the unit tests.



Think about sponsoring this library!

This library is free for everyone and is maintained and developed as a side project, so if you find it useful for your work, consider becoming a sponsor via this link: https://github.com/sponsors/mangiucugna



Demo

If you are unsure whether this library will fix your specific problem, or simply want your JSON validated online, you can visit the demo site on GitHub Pages: https://mangiucugna.github.io/json_repair/

Or hear an audio deep dive generated by Google's NotebookLM for an introduction to the module.


Motivation

Some LLMs are a bit iffy when it comes to returning well-formed JSON data: sometimes they skip a bracket and sometimes they add some words to it, because that's what an LLM does. Luckily, the mistakes LLMs make are simple enough to be fixed without destroying the content.

I searched for a lightweight Python package that was able to reliably fix this problem but couldn't find any.

So I wrote one.

Supported use cases

Fixing Syntax Errors in JSON

  • Missing quotes, misplaced commas, unescaped characters, and incomplete key-value pairs.
  • Improperly formatted values (true, false, null) and corrupted key-value structures.

Repairing Malformed JSON Arrays and Objects

  • Incomplete or broken arrays/objects are completed by adding the necessary elements (e.g., commas, brackets) or default values (null, "").
  • JSON that includes extra non-JSON characters, such as comments or improperly placed characters, is cleaned up while maintaining a valid structure.

Auto-Completion for Missing JSON Values

  • Automatically completes missing values in JSON fields with reasonable defaults (like empty strings or null), ensuring validity.

How to use

Install the library with pip

pip install json-repair

Then you can use it in your code like this:

from json_repair import repair_json

good_json_string = repair_json(bad_json_string)
# If the string was super broken this will return an empty string

You can use this library to completely replace json.loads():

import json_repair

decoded_object = json_repair.loads(json_string)

or just

import json_repair

decoded_object = json_repair.repair_json(json_string, return_objects=True)

Avoid this antipattern

Some users of this library adopt the following pattern:

obj = {}
try:
    obj = json.loads(string)
except json.JSONDecodeError as e:
    obj = json_repair.loads(string)
    ...

This is wasteful because json_repair already does that strict json.loads() check for you by default. The normal flow is:

  • try the built-in json.loads() / json.load() first
  • if that succeeds, return the decoded object
  • if that fails, run the repair parser

Use the default call unless you explicitly want to skip that initial validation step:

import json_repair

decoded_object = json_repair.loads(json_string)
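
The default flow described above can be sketched with the standard library alone. This is a simplified illustration, not the library's actual code: loads_with_fallback and fake_repair are hypothetical names, and fake_repair is only a stand-in that records when the repair step would run.

```python
import json
from typing import Any, Callable


def loads_with_fallback(text: str, repair: Callable[[str], Any]) -> Any:
    """Parse strictly first; call the repair step only when that fails."""
    try:
        # Fast path: valid JSON never reaches the repair step.
        return json.loads(text)
    except json.JSONDecodeError:
        # Slow path: hand the invalid text to the (hypothetical) repair callable.
        return repair(text)


calls = []


def fake_repair(text: str) -> dict:
    # Stand-in for a repair parser; it only records that it was invoked.
    calls.append(text)
    return {}


valid = loads_with_fallback('{"a": 1}', fake_repair)
broken = loads_with_fallback('{"a": 1', fake_repair)
print(valid, broken, calls)
```

Wrapping json_repair.loads() in your own try/except repeats the strict parse from the fast path a second time, which is why the pattern is discouraged.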

Read json from a file or file descriptor

json_repair also provides a drop-in replacement for json.load():

import json_repair

try:
    # Opening the file inside the try block avoids using an undefined
    # file_descriptor when open() fails
    with open(fname, 'rb') as file_descriptor:
        decoded_object = json_repair.load(file_descriptor)
except OSError:
    ...

and another method to read from a file:

import json_repair

try:
    decoded_object = json_repair.from_file(json_file)
except OSError:
    # IOError is an alias of OSError in Python 3, so one handler covers both
    ...

Keep in mind that the library will not catch any IO-related exceptions; you need to handle those yourself.

Non-Latin characters

When working with non-Latin characters (such as Chinese, Japanese, or Korean), you need to pass ensure_ascii=False to repair_json() in order to preserve the non-Latin characters in the output.

Here's an example using Chinese characters:

repair_json("{'test_chinese_ascii':'统一码'}")

will return

{"test_chinese_ascii": "\u7edf\u4e00\u7801"}

Passing ensure_ascii=False instead:

repair_json("{'test_chinese_ascii':'统一码'}", ensure_ascii=False)

will return

{"test_chinese_ascii": "统一码"}

JSON dumps parameters

More generally, repair_json accepts all the parameters that json.dumps accepts and passes them through (for example, indent).
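
Since these keyword arguments are forwarded to json.dumps, their effect can be previewed with the standard library alone. The sketch below is not json_repair code; it only shows what indent and ensure_ascii do to serialized output, using a made-up object.

```python
import json

decoded = {"test_chinese_ascii": "统一码", "n": 1}

# Default json.dumps settings: non-ASCII characters are escaped, output is one line.
compact = json.dumps(decoded)

# indent pretty-prints; ensure_ascii=False keeps the original characters.
pretty = json.dumps(decoded, indent=2, ensure_ascii=False)

print(compact)
print(pretty)
```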

Performance considerations

By default, json_repair first tries the standard-library JSON loader and only falls back to the repair parser when strict JSON parsing fails.

If you already know the input is invalid JSON and want to skip that initial validation step, pass skip_json_loads=True:

from json_repair import repair_json

good_json_string = repair_json(bad_json_string, skip_json_loads=True)

This is an explicit tradeoff:

  • default behavior: validate with stdlib JSON first, then repair only if needed
  • skip_json_loads=True: skip the validation fast path and go straight to the repair parser

json_repair intentionally keeps the validation path on the standard library. It does not auto-detect or auto-use third-party JSON libraries, which keeps behavior predictable and avoids extra overhead on the common path.

Some rules of thumb:

  • Setting return_objects=True will always be faster because the parser already returns an object and doesn't have to serialize it to JSON
  • skip_json_loads=True is faster only if you know for certain that the string is not valid JSON
  • If you are having issues with escaping, pass the string as a raw string, like: r"string with escaping\""
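
The escaping advice can be illustrated without json_repair at all. In the sketch below (with a made-up payload) the two literals produce the same text, but the raw string is written exactly as the JSON should read, while the plain literal needs every backslash doubled.

```python
import json

# Without a raw string, every backslash meant for JSON must itself be escaped.
plain = "{\"key\": \"a \\\"quoted\\\" word\"}"

# With a raw string, the JSON text is written exactly as it should be parsed.
raw = r'{"key": "a \"quoted\" word"}'

same_text = plain == raw
decoded = json.loads(raw)
print(same_text, decoded)
```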

When to use your own JSON library

If you want non-stdlib JSON semantics or a different performance profile, use your preferred JSON library yourself instead of expecting json_repair to switch parsers automatically. orjson is a common example people ask about, and the same pattern applies to any other JSON library.

Recommended patterns:

Strict JSON first, repair only if needed:

import json_repair

decoded_object = json_repair.loads(json_string)

Known-bad input, so skip the validation step:

from json_repair import repair_json

decoded_object = repair_json(bad_json_string, return_objects=True, skip_json_loads=True)

orjson first, json_repair only as a fallback:

import json_repair
import orjson

try:
    decoded_object = orjson.loads(json_string)
except orjson.JSONDecodeError:
    decoded_object = json_repair.loads(json_string, skip_json_loads=True)

Strict mode

By default json_repair does its best to “fix” input, even when the JSON is far from valid.
In some scenarios you want the opposite behavior and need the parser to error out instead of repairing; pass strict=True to repair_json, loads, load, or from_file to enable that mode:

from json_repair import repair_json

repair_json(bad_json_string, strict=True)

The CLI exposes the same behavior with json_repair --strict input.json (or piping data via stdin).

In strict mode the parser raises ValueError as soon as it encounters structural issues such as duplicate keys, missing : separators, empty keys/values introduced by stray commas, multiple top-level elements, or other ambiguous constructs. This is useful when you just need validation with friendlier error messages while still benefiting from json_repair’s resilience elsewhere in your stack.

Strict mode still honors skip_json_loads=True; combining them lets you skip the initial json.loads check but still enforce strict parsing rules.

Schema-guided repairs

Schema-guided repairs are currently considered in beta. Bugs are to be expected.

You can guide repairs with a JSON Schema (or a Pydantic v2 model). When enabled, the parser will:

  • Fill missing values (defaults, required fields).
  • Coerce scalars where safe (e.g., "1" -> 1 for integer fields, and "yes"/"no"/1/0 for booleans).
  • Drop properties/items that the schema disallows.

Schema mode can be selected with schema_repair_mode:

  • standard (default): existing schema-guided behavior.
  • salvage: includes standard and also:
    • drops invalid array items when individual items cannot be repaired;
    • maps arrays to objects by property order when schema/object shape is unambiguous;
    • unwraps a root single-item array to an object when the root schema expects an object ([{...}] -> {...});
    • fills missing required fields only when a safe value can be inferred (default, const, first enum, or empty array/object when allowed by schema constraints).

This is especially useful when you need deterministic, schema-valid outputs for downstream validation, storage, or typed processing. If the input cannot be repaired to satisfy the schema, json_repair raises ValueError.

Install the optional dependencies:

pip install 'json-repair[schema]'

(For CLI usage, you can also use pipx install 'json-repair[schema]'.)

When schema is provided, schema guidance is always applied (for both valid and invalid JSON). Schema guidance is mutually exclusive with strict=True.

from json_repair import repair_json

schema = {
    "type": "object",
    "properties": {"value": {"type": "integer"}},
    "required": ["value"],
}

repair_json('{"value": "1"}', schema=schema, return_objects=True)

repair_json(
    '{"items":[{"id":1,"score":85.6},{"id":2,"score":"N/A"}]}',
    schema={
        "type": "object",
        "properties": {
            "items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {"id": {"type": "integer"}, "score": {"type": "number"}},
                    "required": ["id", "score"],
                },
            }
        },
        "required": ["items"],
    },
    schema_repair_mode="salvage",
    return_objects=True,
)

Pydantic v2 model example:

from pydantic import BaseModel, Field
from json_repair import repair_json


class Payload(BaseModel):
    value: int
    tags: list[str] = Field(default_factory=list)


repair_json(
    '{"value": "1", "tags": }',
    schema=Payload,
    skip_json_loads=True,
    return_objects=True,
)

Use json_repair with streaming

Sometimes you are streaming data and want to repair the JSON as it arrives. Normally this won't work, but you can pass stream_stable=True to repair_json() or loads() to make it work:

stream_output = repair_json(stream_input, stream_stable=True)

Use json_repair from CLI

Install the library for command-line use with:

pipx install json-repair

To see all available options:

$ json_repair -h
usage: json_repair [-h] [-i] [-o TARGET] [--ensure_ascii] [--indent INDENT]
                   [--skip-json-loads] [--schema SCHEMA] [--schema-model MODEL]
                   [--strict] [--schema-repair-mode {standard,salvage}] [filename]

Repair and parse JSON files.

positional arguments:
  filename              The JSON file to repair (if omitted, reads from stdin)

options:
  -h, --help            show this help message and exit
  -i, --inline          Replace the file inline instead of returning the output to stdout
  -o TARGET, --output TARGET
                        If specified, the output will be written to TARGET filename instead of stdout
  --ensure_ascii        Pass ensure_ascii=True to json.dumps()
  --indent INDENT       Number of spaces for indentation (Default 2)
  --skip-json-loads     Skip initial json.loads validation
  --schema SCHEMA       Path to a JSON Schema file that guides repairs
  --schema-model MODEL  Pydantic v2 model in 'module:ClassName' form that guides repairs
  --strict              Raise on duplicate keys, missing separators, empty keys/values, and similar structural issues instead of repairing them
  --schema-repair-mode {standard,salvage}
                        Schema repair mode: standard (default) or salvage (best-effort array/object salvage)

Adding to requirements

Please pin this library only on the major version!

We use TDD and strict semantic versioning, so there will be frequent updates and no breaking changes in minor and patch versions. To pin only the major version of this library in your requirements.txt, specify the package name followed by the major version and a wildcard for minor and patch versions. For example:

json_repair==0.*

In this example, any version that starts with 0. will be acceptable, allowing for updates on minor and patch versions.


How to cite

If you are using this library in your academic work (as I know many folks are), please find the BibTeX here:

@software{Baccianella_JSON_Repair_-_2025,
    author  = "Stefano {Baccianella}",
    month   = "feb",
    title   = "JSON Repair - A python module to repair invalid JSON, commonly used to parse the output of LLMs",
    url     = "https://github.com/mangiucugna/json_repair",
    version = "0.39.1",
    year    = 2025
}

Thank you for citing my work and please send me a link to the paper if you can!


How it works

This module will parse the JSON file following the BNF definition:

<json> ::= <primitive> | <container>

<primitive> ::= <number> | <string> | <boolean>
; Where:
; <number> is a valid real number expressed in one of a number of given formats
; <string> is a string of valid characters enclosed in quotes
; <boolean> is one of the literal strings 'true', 'false', or 'null' (unquoted)

<container> ::= <object> | <array>
<array> ::= '[' [ <json> *(', ' <json>) ] ']' ; A sequence of JSON values separated by commas
<object> ::= '{' [ <member> *(', ' <member>) ] '}' ; A sequence of 'members'
<member> ::= <string> ': ' <json> ; A pair consisting of a name, and a JSON value

If something is wrong (a missing bracket or quote, for example) it will use a few simple heuristics to fix the JSON string:

  • Add the missing closing bracket or brace if the parser believes that the array or object should be closed
  • Quote strings or add missing single quotes
  • Adjust whitespace and remove line breaks
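
To make the first heuristic concrete, here is a deliberately simplified, standard-library-only sketch of closing unbalanced containers. It is not the library's parser: close_unbalanced is a hypothetical helper that only tracks open brackets and braces outside of strings and appends whatever is still open at the end of the text.

```python
import json


def close_unbalanced(text: str) -> str:
    """Append the closing quote/brackets that a truncated JSON text is missing."""
    closers = {"{": "}", "[": "]"}
    stack = []  # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()
    # Close an unterminated string first, then unwind the open containers.
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(stack))


repaired = close_unbalanced('{"items": [1, 2, {"ok": true')
print(repaired)
print(json.loads(repaired))
```

A toy like this breaks down quickly on real input (a missing comma, single quotes, stray prose), which is where the library's additional heuristics come in.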

I am sure some corner cases will be missing. If you have examples, please open an issue or, even better, push a PR.

Contributing

If you want to contribute, start with CONTRIBUTING.md and read the Code Wiki writeup for a tour of the codebase and key entry points: https://codewiki.google/github.com/mangiucugna/json_repair

How to develop

Use uv to set up the dev environment and run tooling:

uv sync --group dev
uv run pre-commit run --all-files
uv run pytest

Make sure that the GitHub Actions that run after you push a new commit don't fail either.

How to release

You will need owner access to this repository.

  • Edit pyproject.toml and update the version number appropriately using semver notation
  • Commit and push all changes to the repository before continuing or the next steps will fail
  • Run python -m build
  • Create a new release on GitHub, making sure to tag all the issues solved and contributors. Create the new tag, same as the one in the build configuration
  • Once the release is created, a new GitHub Actions workflow will start to publish on PyPI; make sure it doesn't fail

Docs demo API deployment (PythonAnywhere)

  • The docs site is deployed by GitHub Pages (pages-build-deployment).
  • After a successful Pages deployment on main, .github/workflows/pythonanywhere-sync.yml uploads docs/app.py to PythonAnywhere at /home/mangiucugna/json_repair/app.py and reloads mangiucugna.pythonanywhere.com.
  • Required repository Actions secret: PythonAnywhere API token (PYTHONANYWHERE_API_TOKEN).

Release history

0.58.7 Mar 26, 2026
0.58.6 Mar 16, 2026
0.58.5 Mar 07, 2026
0.58.4 Mar 05, 2026
0.58.3 Mar 03, 2026
0.58.2 Mar 02, 2026
0.58.1 Feb 28, 2026
0.58.0 Feb 17, 2026
0.57.1 Feb 08, 2026
0.57.0 Feb 07, 2026
0.56.0 Feb 03, 2026
0.55.2 Feb 02, 2026
0.55.1 Jan 23, 2026
0.55.0 Jan 01, 2026
0.54.3 Dec 15, 2025
0.54.2 Nov 25, 2025
0.54.1 Nov 19, 2025
0.54 Nov 18, 2025
0.53.1 Nov 18, 2025
0.53.0 Nov 08, 2025
0.52.5 Nov 06, 2025
0.52.4 Nov 01, 2025
0.52.3 Oct 22, 2025
0.52.2 Oct 20, 2025
0.52.1 Oct 18, 2025
0.52.0 Oct 05, 2025
0.51.0 Sep 19, 2025
0.50.1 Sep 06, 2025
0.50.0 Aug 20, 2025
0.49.0 Aug 10, 2025
0.48.0 Jul 25, 2025
0.47.8 Jul 17, 2025
0.47.7 Jul 13, 2025
0.47.6 Jul 01, 2025
0.47.5 Jun 30, 2025
0.47.4 Jun 27, 2025
0.47.3 Jun 24, 2025
0.47.2 Jun 23, 2025
0.47.1 Jun 19, 2025
0.47.0 Jun 19, 2025
0.46.2 Jun 06, 2025
0.46.1 Jun 04, 2025
0.46.0 May 22, 2025
0.45.1 May 21, 2025
0.45.0 May 20, 2025
0.44.1 Apr 30, 2025
0.44.0 Apr 29, 2025
0.43.0 Apr 28, 2025
0.42.0 Apr 22, 2025
0.41.1 Apr 14, 2025
0.41.0 Apr 10, 2025
0.40.0 Mar 19, 2025
0.39.1 Feb 23, 2025
0.39.0 Feb 18, 2025
0.38.0 Feb 17, 2025
0.37.0 Feb 16, 2025
0.36.1 Feb 13, 2025
0.36.0 Feb 11, 2025
0.35.0 Dec 31, 2024
0.34.0 Dec 26, 2024
0.33.0 Dec 23, 2024
0.32.0 Dec 18, 2024
0.31.0 Dec 13, 2024
0.30.3 Dec 04, 2024
0.30.2 Nov 14, 2024
0.30.1 Nov 05, 2024
0.30.0 Oct 09, 2024
0.29.10 Oct 07, 2024
0.29.9 Oct 07, 2024
0.29.8 Oct 04, 2024
0.29.7 Sep 29, 2024
0.29.6 Sep 28, 2024
0.29.5 Sep 26, 2024
0.29.4 Sep 22, 2024
0.29.3 Sep 22, 2024
0.29.2 Sep 09, 2024
0.29.1 Sep 05, 2024
0.29.0 Sep 04, 2024
0.28.4 Aug 28, 2024
0.28.3 Aug 19, 2024
0.28.2 Aug 19, 2024
0.28.1 Aug 19, 2024
0.28.0 Aug 16, 2024
0.27.2 Aug 11, 2024
0.27.1 Aug 11, 2024
0.27.0 Aug 08, 2024
0.26.0 Aug 02, 2024
0.25.3 Jul 10, 2024
0.25.2 Jun 27, 2024
0.25.1 Jun 20, 2024
0.25.0 Jun 19, 2024
0.24.0 Jun 18, 2024
0.23.1 Jun 02, 2024
0.23.0 Jun 02, 2024
0.22.0 Jun 01, 2024
0.21.0 May 30, 2024
0.20.1 May 26, 2024
0.20.0 May 25, 2024
0.19.2 May 21, 2024
0.19.1 May 13, 2024
0.19.0 May 12, 2024
0.18.0 May 09, 2024
0.17.4 May 08, 2024
0.17.3 May 07, 2024
0.17.2 May 07, 2024
0.17.1 May 06, 2024
0.17.0 May 03, 2024
0.16.3 Apr 30, 2024
0.16.2 Apr 30, 2024
0.16.1 Apr 30, 2024
0.16.0 Apr 29, 2024
0.15.6 Apr 29, 2024
0.15.5 Apr 28, 2024
0.15.4 Apr 28, 2024
0.15.3 Apr 25, 2024
0.15.2 Apr 23, 2024
0.15.1 Apr 23, 2024
0.15.0 Apr 21, 2024
0.14.0 Apr 19, 2024
0.13.1 Apr 18, 2024
0.13.0 Apr 11, 2024
0.12.3 Apr 10, 2024
0.12.2 Apr 09, 2024
0.12.1 Apr 08, 2024
0.12.0 Apr 08, 2024
0.11.1 Apr 02, 2024
0.11.0 Apr 01, 2024
0.10.1 Mar 06, 2024
0.10.0 Mar 06, 2024
0.9.0 Feb 24, 2024
0.8.1 Feb 12, 2024
0.8.0 Jan 28, 2024
0.7.0 Jan 27, 2024
0.6.2 Jan 24, 2024
0.6.1 Jan 23, 2024
0.6.0 Jan 22, 2024
0.5.1 Jan 18, 2024
0.5.0 Jan 16, 2024
0.4.5 Dec 06, 2023
0.4.4 Dec 05, 2023
0.4.3 Nov 22, 2023
0.4.2 Nov 22, 2023
0.4.1 Nov 20, 2023
0.4.0 Nov 20, 2023
0.3.0 Nov 16, 2023
0.2.0 Oct 17, 2023
0.1.10 Oct 16, 2023
0.1.9 Oct 16, 2023
0.1.8 Oct 11, 2023
0.1.7 Sep 08, 2023
0.1.6 Sep 07, 2023
0.1.5 Sep 07, 2023
