fugashi 1.5.1


pip install fugashi

Released: Jun 05, 2025

Meta
Author: Paul O'Leary McCann
Requires Python: >=3.9

Classifiers

Environment
  • Console

Intended Audience
  • Developers
  • Science/Research

Natural Language
  • Japanese

Operating System
  • POSIX :: Linux
  • MacOS :: MacOS X

Programming Language
  • Cython
  • Python :: 3

Topic
  • Text Processing :: Linguistic

fugashi

fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, macOS (Intel and Apple Silicon), and 64-bit Windows, and UniDic is easy to install.

Issues do not need to be written in English.

Check out the interactive demo, see the blog post for background on why fugashi exists and some of the design decisions, or see this guide for a basic introduction to Japanese tokenization.

If you are on a platform for which wheels are not provided, you'll need to install MeCab first. It's recommended you install from source. If you need to build from source on Windows, @chezou's fork is recommended; see issue #44 for an explanation of the problems with the official repo.

Known platforms without wheels:

  • musl-based distros like Alpine (see issue #77)
  • PowerPC
  • Windows 32bit
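
As a rough illustration of the list above, the following sketch guesses whether the current machine matches one of the wheel targets published for 1.5.1. The function name `likely_has_wheel` and the `WHEEL_TARGETS` table are hypothetical helpers, not part of fugashi, and the check ignores the Python version, so treat the result as a hint only.

```python
import platform
import struct

# (system, machine) pairs taken from the 1.5.1 wheel file names.
# This table is an illustrative assumption, not something fugashi exposes.
WHEEL_TARGETS = {
    ("Linux", "x86_64"),
    ("Linux", "aarch64"),
    ("Darwin", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "AMD64"),
}


def likely_has_wheel(system, machine, libc="glibc", pointer_bits=64):
    """Return True if the platform matches a known wheel target."""
    if pointer_bits != 64:
        return False  # e.g. 32-bit Windows
    if system == "Linux" and libc != "glibc":
        return False  # e.g. musl-based distros such as Alpine
    return (system, machine) in WHEEL_TARGETS


# Check the interpreter this script is running under.
result = likely_has_wheel(
    platform.system(),
    platform.machine(),
    libc=platform.libc_ver()[0] or "unknown",
    pointer_bits=struct.calcsize("P") * 8,
)
print("wheel likely available:", result)
```

If this prints `False`, installing MeCab first, as described above, is probably necessary.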

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
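
Because wakati output is a plain space-separated string, it can be post-processed with the standard library alone. The sketch below reuses the output string from the example above instead of calling fugashi, so it runs without MeCab or a dictionary installed; the variable names are illustrative.

```python
from collections import Counter

# Output string copied from the usage example above.
wakati = "麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。"

# Wakati mode separates tokens with single spaces.
tokens = wakati.split()

# Count how often each surface form occurs.
counts = Counter(tokens)

print(len(tokens))
print(counts.most_common(2))
```

For anything beyond surface forms, such as lemmas or parts of speech, iterate over `tagger(text)` as shown above instead of splitting the string.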

Installing a Dictionary

fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a slightly modified version of UniDic 2.1.2 (from 2013) that's relatively small
  • unidic, the latest UniDic 3.1.0, which is 770MB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either of these dictionaries, install the package directly with pip or use fugashi's extras:

pip install 'fugashi[unidic-lite]'

# The full version of UniDic requires a separate download step
pip install 'fugashi[unidic]'
python -m unidic download
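
After installing, a quick check like the following can confirm which dictionary package is importable. This is a minimal sketch: the import names `unidic_lite` and `unidic` are assumed to correspond to the pip packages of the same names, and `installed_dictionaries` is a hypothetical helper rather than a fugashi function.

```python
import importlib.util

# pip package name -> import name (assumed mapping).
DICTIONARY_MODULES = {"unidic-lite": "unidic_lite", "unidic": "unidic"}


def installed_dictionaries(modules=DICTIONARY_MODULES):
    """Return the pip names of dictionary packages that can be imported."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if importlib.util.find_spec(module_name) is not None
    ]


found = installed_dictionaries()
print(found or "no UniDic package found")
```

Note that this only detects the Python package. For the full unidic package, the dictionary data itself is still missing until the separate download step has been run.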

For more information on the different MeCab dictionaries available, see this article.

Dictionary Use

fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary besides UniDic you can use the GenericTagger like this:

from fugashi import GenericTagger

tagger = GenericTagger()
text = "麩菓子は、麩を主材料とした日本の菓子。"

# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
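
To make the idea of a feature wrapper concrete, here is a conceptual sketch using only the standard library. It shows one way a comma-separated feature string could be mapped onto a named tuple with missing fields defaulting to None. The helpers `make_wrapper` and `parse_features` and the sample feature string are hypothetical, and fugashi's actual implementation may differ.

```python
import csv
from collections import namedtuple


def make_wrapper(name, fields):
    """Build a named tuple type whose fields all default to None."""
    field_names = fields.split()
    return namedtuple(name, field_names, defaults=(None,) * len(field_names))


def parse_features(raw, wrapper):
    """Split a CSV feature string and wrap it in the given named tuple."""
    values = next(csv.reader([raw]))
    # Drop extra columns so the wrapper never receives too many values.
    values = values[: len(wrapper._fields)]
    return wrapper(*values)


CustomFeatures = make_wrapper("CustomFeatures", "alpha beta gamma")

# A made-up feature string with only two of the three fields present.
features = parse_features("名詞,普通名詞", CustomFeatures)
print(features.alpha, features.beta, features.gamma)
```

Named access keeps code readable when a dictionary has many feature columns, and the None defaults avoid errors for entries that carry fewer fields than the wrapper declares.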

Citation

If you use fugashi in research, it would be appreciated if you cite this paper. You can read it at the ACL Anthology or on arXiv.

@inproceedings{mccann-2020-fugashi,
    title = "fugashi, a Tool for Tokenizing {J}apanese in Python",
    author = "McCann, Paul",
    booktitle = "Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.nlposs-1.7",
    pages = "44--51",
    abstract = "Recent years have seen an increase in the number of large-scale multilingual NLP projects. However, even in such projects, languages with special processing requirements are often excluded. One such language is Japanese. Japanese is written without spaces, tokenization is non-trivial, and while high quality open source tokenizers exist they can be hard to use and lack English documentation. This paper introduces fugashi, a MeCab wrapper for Python, and gives an introduction to tokenizing Japanese.",
}

Alternatives

If you have a problem with fugashi feel free to open an issue. However, there are some cases where it might be better to use a different library.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try pymecab-ko or KoNLPy.

License and Copyright Notice

fugashi is released under the terms of the MIT license. Please copy it far and wide.

fugashi is a wrapper for MeCab, and fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.

Release history

1.5.2.dev0 Oct 20, 2025
1.5.1 Jun 05, 2025
1.5.1.dev0 Jun 05, 2025
1.5.0 Jun 04, 2025
1.5.0.dev0 Jun 04, 2025
1.4.4.dev2 Jun 04, 2025
1.4.4.dev1 Jun 03, 2025
1.4.4.dev0 Jun 03, 2025
1.4.3 May 26, 2025
1.4.3.dev0 May 26, 2025
1.4.2 May 26, 2025
1.4.1 May 26, 2025
1.4.1.dev0 May 24, 2025
1.4.0 Nov 11, 2024
1.4.0rc1 Nov 11, 2024
1.3.3 Oct 31, 2024
1.3.3.dev1 Oct 31, 2024
1.3.2 Apr 15, 2024
1.3.2.dev0 Apr 15, 2024
1.3.1 Mar 09, 2024
1.3.1.dev0 Mar 08, 2024
1.3.0 Aug 25, 2023
1.3.0.dev0 Aug 11, 2023
1.2.1 Dec 06, 2022
1.2.0 Sep 04, 2022
1.2.0.dev0 Sep 04, 2022
1.1.2 Feb 16, 2022
1.1.2a7 Feb 15, 2022
1.1.2a6 Feb 12, 2022
1.1.2a5 Dec 23, 2021
1.1.2a4 Dec 21, 2021
1.1.2a3 Nov 25, 2021
1.1.2a2 Nov 25, 2021
1.1.2a1 Nov 08, 2021
1.1.1 Jul 24, 2021
1.1.1a1 Jun 14, 2021
1.1.0 Jan 25, 2021
1.1.0a2 Dec 29, 2020
1.1.0a1 Dec 27, 2020
1.0.5 Oct 22, 2020
1.0.5a6 Oct 22, 2020
1.0.5a5 Oct 22, 2020
1.0.5a4 Oct 22, 2020
1.0.5a3 Oct 22, 2020
1.0.5a2 Oct 22, 2020
1.0.5a1 Oct 21, 2020
1.0.4 Aug 12, 2020
1.0.4a1 Aug 11, 2020
1.0.3 Aug 10, 2020
1.0.3a2 Aug 10, 2020
1.0.3a1 Aug 09, 2020
1.0.2 Jul 26, 2020
1.0.2a9 Jul 26, 2020
1.0.2a8 Jul 26, 2020
1.0.2a7 Jul 26, 2020
1.0.2a6 Jul 26, 2020
1.0.2a5 Jul 26, 2020
1.0.2a4 Jul 26, 2020
1.0.1 Jul 16, 2020
1.0.1rc1 Jul 16, 2020
1.0.0 Jun 28, 2020
0.2.3 Jun 04, 2020
0.2.2 May 25, 2020
0.2.2rc1 May 25, 2020
0.2.1 May 18, 2020
0.2.0 May 18, 2020
0.1.12 Apr 15, 2020
0.1.12rc5 Apr 10, 2020
0.1.12rc4 Apr 10, 2020
0.1.12rc3 Apr 07, 2020
0.1.12rc2 Apr 07, 2020
0.1.12rc1 Apr 07, 2020
0.1.11 Apr 02, 2020
0.1.10 Mar 23, 2020
0.1.10rc2 Feb 01, 2020
0.1.10rc1 Jan 10, 2020
0.1.9 Jan 07, 2020
0.1.9rc2 Jan 07, 2020
0.1.8 Dec 27, 2019
0.1.7 Dec 27, 2019
0.1.6 Dec 20, 2019
0.1.5 Nov 28, 2019
0.1.4 Nov 10, 2019
0.1.3 Nov 06, 2019
0.1.2 Nov 06, 2019
0.1.1 Oct 14, 2019
0.1.0 Oct 14, 2019
0.0.0 Jul 26, 2020

Wheel compatibility matrix

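Platform                 CPython 3.9  CPython 3.10  CPython 3.11  CPython 3.12  CPython 3.13  PyPy 3.9 (pp73)  PyPy 3.10 (pp73)  PyPy 3.11 (pp73)
macosx_10_13_universal2  -            -             -             yes           yes           -                -                 -
macosx_10_13_x86_64      -            -             -             yes           yes           -                -                 -
macosx_10_15_x86_64      -            -             -             -             -             yes              yes               yes
macosx_10_9_universal2   yes          yes           yes           -             -             -                -                 -
macosx_10_9_x86_64       yes          yes           yes           -             -             -                -                 -
macosx_11_0_arm64        yes          yes           yes           yes           yes           yes              yes               yes
manylinux2014_aarch64    yes          yes           yes           yes           yes           -                -                 -
manylinux2014_x86_64     yes          yes           yes           yes           yes           -                -                 -
manylinux_2_17_aarch64   yes          yes           yes           yes           yes           -                -                 -
manylinux_2_17_x86_64    yes          yes           yes           yes           yes           -                -                 -
win_amd64                yes          yes           yes           yes           yes           -                -                 -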

Files in release

fugashi-1.5.1-cp310-cp310-macosx_10_9_universal2.whl (548.8KiB)
fugashi-1.5.1-cp310-cp310-macosx_10_9_x86_64.whl (495.2KiB)
fugashi-1.5.1-cp310-cp310-macosx_11_0_arm64.whl (490.8KiB)
fugashi-1.5.1-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (638.5KiB)
fugashi-1.5.1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (656.3KiB)
fugashi-1.5.1-cp310-cp310-win_amd64.whl (501.1KiB)
fugashi-1.5.1-cp311-cp311-macosx_10_9_universal2.whl (549.2KiB)
fugashi-1.5.1-cp311-cp311-macosx_10_9_x86_64.whl (495.5KiB)
fugashi-1.5.1-cp311-cp311-macosx_11_0_arm64.whl (490.9KiB)
fugashi-1.5.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (665.4KiB)
fugashi-1.5.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (682.0KiB)
fugashi-1.5.1-cp311-cp311-win_amd64.whl (501.3KiB)
fugashi-1.5.1-cp312-cp312-macosx_10_13_universal2.whl (549.3KiB)
fugashi-1.5.1-cp312-cp312-macosx_10_13_x86_64.whl (495.6KiB)
fugashi-1.5.1-cp312-cp312-macosx_11_0_arm64.whl (491.6KiB)
fugashi-1.5.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (660.0KiB)
fugashi-1.5.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (681.5KiB)
fugashi-1.5.1-cp312-cp312-win_amd64.whl (501.3KiB)
fugashi-1.5.1-cp313-cp313-macosx_10_13_universal2.whl (547.7KiB)
fugashi-1.5.1-cp313-cp313-macosx_10_13_x86_64.whl (494.8KiB)
fugashi-1.5.1-cp313-cp313-macosx_11_0_arm64.whl (490.9KiB)
fugashi-1.5.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (656.6KiB)
fugashi-1.5.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (677.7KiB)
fugashi-1.5.1-cp313-cp313-win_amd64.whl (501.1KiB)
fugashi-1.5.1-cp39-cp39-macosx_10_9_universal2.whl (550.3KiB)
fugashi-1.5.1-cp39-cp39-macosx_10_9_x86_64.whl (496.0KiB)
fugashi-1.5.1-cp39-cp39-macosx_11_0_arm64.whl (491.4KiB)
fugashi-1.5.1-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (642.0KiB)
fugashi-1.5.1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (658.8KiB)
fugashi-1.5.1-cp39-cp39-win_amd64.whl (502.3KiB)
fugashi-1.5.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl (483.7KiB)
fugashi-1.5.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl (481.0KiB)
fugashi-1.5.1-pp311-pypy311_pp73-macosx_10_15_x86_64.whl (483.9KiB)
fugashi-1.5.1-pp311-pypy311_pp73-macosx_11_0_arm64.whl (481.2KiB)
fugashi-1.5.1-pp39-pypy39_pp73-macosx_10_15_x86_64.whl (483.6KiB)
fugashi-1.5.1-pp39-pypy39_pp73-macosx_11_0_arm64.whl (480.8KiB)
fugashi-1.5.1.tar.gz (331.8KiB)
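
The wheel file names above follow the standard pattern of name, version, Python tag, ABI tag, and platform tag separated by hyphens, with multiple platform tags joined by dots. The sketch below splits a file name into those parts; `parse_wheel_name` and `WheelTags` are hypothetical helpers written for illustration, and the optional build tag is ignored for simplicity.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python: str
    abi: str
    platforms: list


def parse_wheel_name(filename):
    """Split a wheel file name into its name, version, and tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    # The last three parts are always the python, ABI, and platform tags.
    python, abi, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python, abi, platform_tag.split("."))


tags = parse_wheel_name(
    "fugashi-1.5.1-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl"
)
print(tags.python, tags.abi, tags.platforms)
```

Reading the tags this way makes it easier to see which file pip would pick for a given interpreter and platform.
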