identify 2.3.6


pip install identify==2.3.6

Meta
Author: Chris Kuehl
Requires Python: >=3.6.1

Classifiers

License
  • OSI Approved :: MIT License

Programming Language
  • Python :: 3
  • Python :: 3 :: Only
  • Python :: 3.6
  • Python :: 3.7
  • Python :: 3.8
  • Python :: 3.9
  • Python :: 3.10
  • Python :: Implementation :: CPython
  • Python :: Implementation :: PyPy

identify


File identification library for Python.

Given a file (or some information about a file), return a set of standardized tags identifying what the file is.

Installation

pip install identify

Usage

With a file on disk

If you have an actual file on disk, you can get the most information possible (a superset of all other methods):

>>> from identify import identify
>>> identify.tags_from_path('/path/to/file.py')
{'file', 'text', 'python', 'non-executable'}
>>> identify.tags_from_path('/path/to/file-with-shebang')
{'file', 'text', 'shell', 'bash', 'executable'}
>>> identify.tags_from_path('/bin/bash')
{'file', 'binary', 'executable'}
>>> identify.tags_from_path('/path/to/directory')
{'directory'}
>>> identify.tags_from_path('/path/to/symlink')
{'symlink'}

When using a file on disk, the checks performed are:

  • File type (file, symlink, directory, socket)
  • Mode (is it executable?)
  • File name (mostly based on extension)
  • If executable, the shebang is read and the interpreter interpreted
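
The shebang check in the last bullet can be sketched with the standard library alone. This is an illustration, not identify's own code: the helper names parse_shebang and interpreter_name are hypothetical, and skipping a leading /usr/bin/env is an assumption about how the interpreter is located.

```python
import shlex


def parse_shebang(first_line: bytes) -> tuple:
    """Return the command from a shebang line, or () if there is none.

    Hypothetical helper for illustration; not part of identify's API.
    """
    if not first_line.startswith(b'#!'):
        return ()
    try:
        text = first_line[2:].decode('utf-8').strip()
        cmd = tuple(shlex.split(text))
    except ValueError:
        # Undecodable bytes or unbalanced quotes: treat as no shebang.
        return ()
    # In "#!/usr/bin/env python3" the interpreter is the second word.
    if cmd and cmd[0] == '/usr/bin/env':
        cmd = cmd[1:]
    return cmd


def interpreter_name(first_line: bytes) -> str:
    """Return the basename of the interpreter, or '' if none is found."""
    cmd = parse_shebang(first_line)
    return cmd[0].rpartition('/')[2] if cmd else ''
```

The resulting name (for example 'bash' or 'python3') is the kind of value that tags_from_interpreter accepts.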

If you only have the filename

>>> identify.tags_from_filename('file.py')
{'text', 'python'}

If you only have the interpreter

>>> identify.tags_from_interpreter('python3.5')
{'python', 'python3'}
>>> identify.tags_from_interpreter('bash')
{'shell', 'bash'}
>>> identify.tags_from_interpreter('some-unrecognized-thing')
set()
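
One way the results above could come about is to drop trailing version components until a known name is found. The sketch below assumes that behaviour; the INTERPRETERS table and the function name tags_for_interpreter are hypothetical stand-ins, not identify's data or API.

```python
# Hypothetical table for illustration; a real table would be much larger.
INTERPRETERS = {
    'bash': {'shell', 'bash'},
    'python': {'python'},
    'python3': {'python', 'python3'},
}


def tags_for_interpreter(interpreter: str) -> set:
    """Look up tags, retrying with the last dotted component removed."""
    # Keep only the basename so '/usr/bin/python3.5' works too.
    name = interpreter.rpartition('/')[2]
    # Try 'python3.5', then 'python3'; stop when nothing is left to drop.
    while name:
        if name in INTERPRETERS:
            return set(INTERPRETERS[name])
        name = name.rpartition('.')[0]
    return set()
```

An unrecognized name falls through the loop and yields an empty set, matching the last example above.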

As a CLI

$ identify-cli --help
usage: identify-cli [-h] [--filename-only] path

positional arguments:
  path

optional arguments:
  -h, --help       show this help message and exit
  --filename-only
$ identify-cli setup.py; echo $?
["file", "non-executable", "python", "text"]
0
$ identify-cli setup.py --filename-only; echo $?
["python", "text"]
0
$ identify-cli wat.wat; echo $?
wat.wat does not exist.
1
$ identify-cli wat.wat --filename-only; echo $?
1

Identifying LICENSE files

identify also has an API for determining what type of license is contained in a file. This routine is roughly based on the approaches used by licensee (the Ruby gem that GitHub uses to figure out the license for a repo).

The approach that identify uses is as follows:

  1. Strip the copyright line
  2. Normalize all whitespace
  3. Return any exact matches
  4. Return the closest by edit distance (where edit distance < 5%)
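
The four steps above can be sketched with the standard library. This is not identify's implementation: the function names are hypothetical, the copyright pattern is a simplification, and measuring the 5% threshold against the longer of the two normalized texts is an assumption.

```python
import re
from typing import Dict, Optional

# Simplified pattern: any line that starts with the word "copyright".
COPYRIGHT_RE = re.compile(r'^\s*copyright\b.*$', re.IGNORECASE | re.MULTILINE)


def normalize(text: str) -> str:
    """Steps 1 and 2: drop copyright lines, collapse whitespace runs."""
    text = COPYRIGHT_RE.sub('', text)
    return ' '.join(text.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = cur
    return prev[-1]


def guess_license(text: str, known: Dict[str, str],
                  threshold: float = 0.05) -> Optional[str]:
    """Return the key of the best match in `known`, or None."""
    norm = normalize(text)
    candidates = {key: normalize(body) for key, body in known.items()}
    # Step 3: exact match after normalization.
    for key, body in candidates.items():
        if norm == body:
            return key
    # Step 4: closest candidate, accepted only below the threshold.
    best_key, best_ratio = None, threshold
    for key, body in candidates.items():
        ratio = edit_distance(norm, body) / max(len(norm), len(body), 1)
        if ratio < best_ratio:
            best_key, best_ratio = key, ratio
    return best_key
```

Here `known` maps an identifier such as an SPDX id to a reference license text. The pure-Python edit distance is quadratic in text length, so a real implementation would likely rely on a compiled library for full-size licenses.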

To use the API, install via pip install identify[license]

>>> from identify import identify
>>> identify.license_id('LICENSE')
'MIT'

The return value of the license_id function is an SPDX id. Currently licenses are sourced from choosealicense.com.

How it works

A call to tags_from_path does this:

  1. What is the type: file, symlink, directory? If it's not a file, stop here.
  2. Is it executable? Add the appropriate tag.
  3. Do we recognize the file extension? If so, add the appropriate tags and stop here. These tags would include binary/text.
  4. Peek at the first X bytes of the file. Use these to determine whether it is binary or text, add the appropriate tag.
  5. If identified as text above, try to read and interpret the shebang, and add appropriate tags.

By design, this means we don't need to partially read files where we recognize the file extension.
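
A rough standard-library sketch of steps 1 to 4, to show why a recognized extension avoids opening the file. Everything here is an assumption made for illustration: the function name sketch_tags_from_path, the tiny EXTENSIONS table, the 1024-byte peek size, and the NUL-byte test for binary content are not taken from identify.

```python
import os
import stat

# Hypothetical, tiny extension table; a real one would be much larger.
EXTENSIONS = {
    'py': {'text', 'python'},
    'png': {'binary', 'image', 'png'},
}
PEEK_BYTES = 1024  # assumed size; the text above only says "X bytes"


def sketch_tags_from_path(path: str) -> set:
    # Step 1: lstat does not follow symlinks, so they can be reported.
    mode = os.lstat(path).st_mode
    if stat.S_ISDIR(mode):
        return {'directory'}
    if stat.S_ISLNK(mode):
        return {'symlink'}
    if stat.S_ISSOCK(mode):
        return {'socket'}

    # Step 2: executable bit.
    tags = {'file'}
    tags.add('executable' if os.access(path, os.X_OK) else 'non-executable')

    # Step 3: a known extension settles text/binary without any read.
    ext = os.path.splitext(path)[1].lstrip('.').lower()
    if ext in EXTENSIONS:
        return tags | EXTENSIONS[ext]

    # Step 4: peek at the start of the file.
    with open(path, 'rb') as f:
        head = f.read(PEEK_BYTES)
    # Crude heuristic (an assumption): a NUL byte means binary.
    tags.add('binary' if b'\0' in head else 'text')
    # Step 5 (shebang interpretation for text files) is omitted here.
    return tags
```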

2.6.15 Oct 02, 2025
2.6.14 Sep 06, 2025
2.6.13 Aug 09, 2025
2.6.12 May 23, 2025
2.6.11 May 23, 2025
2.6.10 Apr 19, 2025
2.6.9 Mar 08, 2025
2.6.8 Feb 22, 2025
2.6.7 Feb 08, 2025
2.6.6 Jan 20, 2025
2.6.5 Jan 04, 2025
2.6.4 Dec 29, 2024
2.6.3 Nov 25, 2024
2.6.2 Nov 09, 2024
2.6.1 Sep 14, 2024
2.6.0 Jul 07, 2024
2.5.36 Apr 20, 2024
2.5.35 Feb 18, 2024
2.5.34 Feb 10, 2024
2.5.33 Dec 07, 2023
2.5.32 Nov 18, 2023
2.5.31 Oct 28, 2023
2.5.30 Sep 30, 2023
2.5.29 Sep 15, 2023
2.5.28 Sep 11, 2023
2.5.27 Aug 21, 2023
2.5.26 Jul 22, 2023
2.5.25 Jul 19, 2023
2.5.24 May 03, 2023
2.5.23 Apr 25, 2023
2.5.22 Mar 24, 2023
2.5.21 Mar 16, 2023
2.5.20 Mar 11, 2023
2.5.19 Mar 08, 2023
2.5.18 Feb 13, 2023
2.5.17 Jan 30, 2023
2.5.16 Jan 28, 2023
2.5.15 Jan 23, 2023
2.5.13 Jan 11, 2023
2.5.12 Jan 03, 2023
2.5.11 Dec 19, 2022
2.5.10 Dec 15, 2022
2.5.9 Nov 18, 2022
2.5.8 Oct 27, 2022
2.5.7 Oct 25, 2022
2.5.6 Oct 03, 2022
2.5.5 Sep 06, 2022
2.5.4 Sep 05, 2022
2.5.3 Aug 03, 2022
2.5.2 Jul 20, 2022
2.5.1 May 19, 2022
2.5.0 Apr 27, 2022
2.4.12 Mar 16, 2022
2.4.11 Feb 22, 2022
2.4.10 Feb 14, 2022
2.4.9 Feb 09, 2022
2.4.8 Feb 03, 2022
2.4.7 Feb 01, 2022
2.4.6 Jan 27, 2022
2.4.5 Jan 23, 2022
2.4.4 Jan 13, 2022
2.4.3 Jan 11, 2022
2.4.2 Jan 06, 2022
2.4.1 Dec 28, 2021
2.4.0 Nov 19, 2021
2.3.7 Nov 18, 2021
2.3.6 Nov 16, 2021
2.3.5 Nov 08, 2021
2.3.4 Nov 05, 2021
2.3.3 Oct 31, 2021
2.3.2 Oct 30, 2021
2.3.1 Oct 22, 2021
2.3.0 Oct 02, 2021
2.2.15 Sep 19, 2021
2.2.14 Sep 09, 2021
2.2.13 Aug 06, 2021
2.2.12 Aug 04, 2021
2.2.11 Jul 09, 2021
2.2.10 Jun 07, 2021
2.2.9 Jun 05, 2021
2.2.8 Jun 03, 2021
2.2.7 May 31, 2021
2.2.6 May 25, 2021
2.2.5 May 21, 2021
2.2.4 Apr 21, 2021
2.2.3 Apr 09, 2021
2.2.2 Mar 28, 2021
2.2.1 Mar 25, 2021
2.2.0 Mar 20, 2021
2.1.4 Mar 18, 2021
2.1.3 Mar 14, 2021
2.1.2 Mar 12, 2021
2.1.1 Mar 09, 2021
2.1.0 Mar 03, 2021
2.0.0 Mar 01, 2021
1.6.2 Mar 01, 2021
1.6.1 Feb 27, 2021
1.6.0 Feb 27, 2021
1.5.14 Feb 20, 2021
1.5.13 Jan 17, 2021
1.5.12 Jan 09, 2021
1.5.11 Dec 31, 2020
1.5.10 Nov 23, 2020
1.5.9 Nov 03, 2020
1.5.8 Nov 03, 2020
1.5.7 Nov 02, 2020
1.5.6 Oct 10, 2020
1.5.5 Sep 24, 2020
1.5.4 Sep 21, 2020
1.5.3 Sep 17, 2020
1.5.2 Sep 13, 2020
1.5.1 Sep 12, 2020
1.5.0 Sep 05, 2020
1.4.30 Sep 01, 2020
1.4.29 Aug 24, 2020
1.4.28 Aug 16, 2020
1.4.27 Aug 16, 2020
1.4.26 Aug 14, 2020
1.4.25 Jul 21, 2020
1.4.24 Jul 19, 2020
1.4.23 Jul 10, 2020
1.4.22 Jul 10, 2020
1.4.21 Jul 01, 2020
1.4.20 Jun 21, 2020
1.4.19 Jun 04, 2020
1.4.18 May 30, 2020
1.4.17 May 26, 2020
1.4.16 May 21, 2020
1.4.15 Apr 23, 2020
1.4.14 Apr 03, 2020
1.4.13 Mar 24, 2020
1.4.12 Mar 21, 2020
1.4.11 Jan 25, 2020
1.4.10 Jan 14, 2020
1.4.9 Dec 23, 2019
1.4.8 Dec 04, 2019
1.4.7 Aug 27, 2019
1.4.6 Aug 12, 2019
1.4.5 Jun 15, 2019
1.4.4 Jun 11, 2019
1.4.3 May 10, 2019
1.4.2 Apr 28, 2019
1.4.1 Mar 27, 2019
1.4.0 Mar 01, 2019
1.3.0 Feb 23, 2019
1.2.2 Feb 14, 2019
1.2.1 Jan 26, 2019
1.2.0 Jan 18, 2019
1.1.8 Jan 02, 2019
1.1.7 Oct 05, 2018
1.1.6 Sep 16, 2018
1.1.5 Sep 07, 2018
1.1.4 Jul 25, 2018
1.1.3 Jul 11, 2018
1.1.2 Jul 09, 2018
1.1.1 Jul 05, 2018
1.1.0 Jun 09, 2018
1.0.18 May 24, 2018
1.0.17 May 23, 2018
1.0.16 May 08, 2018
1.0.15 May 05, 2018
1.0.14 May 03, 2018
1.0.13 Apr 16, 2018
1.0.12 Apr 14, 2018
1.0.11 Apr 11, 2018
1.0.10 Apr 11, 2018
1.0.9 Apr 04, 2018
1.0.8 Mar 11, 2018
1.0.7 Nov 13, 2017
1.0.6 Sep 21, 2017
1.0.5 Jul 28, 2017
1.0.4 Jul 28, 2017
1.0.3 Jul 06, 2017
1.0.2 Jul 03, 2017
1.0.1 Jul 03, 2017
1.0.0 Jul 02, 2017
0.0.4 Jul 02, 2017
0.0.3 Jul 02, 2017
0.0.2 Feb 17, 2017
0.0.1 Feb 16, 2017
