nagisa 0.2.12


pip install nagisa

Released: Feb 12, 2026

Meta
Author: Taishi Ikeda

Classifiers

License
  • OSI Approved :: MIT License

Natural Language
  • Japanese

Programming Language
  • Python :: 2.7
  • Python :: 3.5
  • Python :: 3.6
  • Python :: 3.7
  • Python :: 3.8
  • Python :: 3.9
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13
  • Python :: 3.14

Operating System
  • Unix
  • MacOS :: MacOS X
  • Microsoft :: Windows

Topic
  • Text Processing :: Linguistic
  • Software Development :: Libraries :: Python Modules


Nagisa is a Python module for Japanese word segmentation/POS-tagging.

It is designed to be a simple and easy-to-use tool.

This tool has the following features.

  • Based on recurrent neural networks.
  • The word segmentation model uses character- and word-level features [池田+].
  • The POS-tagging model uses tag dictionary information [Inoue+].

For more details, refer to the following links.

  • The documentation is available here.
  • The article in Japanese is available here.
  • The presentation slides from PyCon JP (2022) are available here.

Installation

You can install nagisa using pip:

pip install nagisa

Supported Platforms:

  • 🐧 Linux: Python 3.6 - 3.14
  • 🍎 macOS: Python 3.9 - 3.14
  • 🪟 Windows: Python 3.9 - 3.14
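
The ranges above can be checked before installing. The following is a minimal sketch that only encodes the list as written here; the `SUPPORTED` table and the `is_supported` helper are hypothetical names for illustration and are not part of nagisa.

```python
import sys

# Inclusive (min, max) Python versions, copied from the list above.
# Keys are prefixes of sys.platform values.
SUPPORTED = {
    "linux": ((3, 6), (3, 14)),
    "darwin": ((3, 9), (3, 14)),  # macOS
    "win32": ((3, 9), (3, 14)),   # Windows
}


def is_supported(platform=None, version=None):
    """Return True if (platform, version) falls inside the listed ranges.

    Defaults to the running interpreter when arguments are omitted.
    """
    platform = sys.platform if platform is None else platform
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    for prefix, (low, high) in SUPPORTED.items():
        if platform.startswith(prefix):
            return low <= version <= high
    return False  # platform not in the list


print(is_supported())  # result depends on the interpreter running this
```

The wheel list at the end of this page also includes builds for some older interpreters, so treat this check as a reading of the list above rather than a statement of which wheels exist.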

Basic usage

The following sample performs word segmentation and POS-tagging on Japanese text. The output tokens are normalized using Unicode NFKC normalization.

import nagisa

text = 'Pythonで簡単に使えるツールです'
words = nagisa.tagging(text)
print(words)
#=> Python/名詞 で/助詞 簡単/形状詞 に/助動詞 使える/動詞 ツール/名詞 です/助動詞

# Get a list of words
print(words.words)
#=> ['Python', 'で', '簡単', 'に', '使える', 'ツール', 'です']

# Get a list of POS-tags
print(words.postags)
#=> ['名詞', '助詞', '形状詞', '助動詞', '動詞', '名詞', '助動詞']
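
To see what NFKC normalization does to input text, the standard library's unicodedata module can be used on its own. This is only an illustration of the normalization form, not nagisa's internal code; the `nfkc` helper is a hypothetical name.

```python
import unicodedata


def nfkc(text):
    """Return text normalized with Unicode NFKC."""
    return unicodedata.normalize("NFKC", text)


# Full-width Latin letters become ASCII letters.
print(nfkc("Ｐｙｔｈｏｎ"))
# Half-width katakana become regular (full-width) katakana.
print(nfkc("ﾂｰﾙ"))
```

Because of this normalization, the tokens returned by the tagger may differ character-for-character from the input string when the input contains full-width alphanumerics or half-width katakana.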

Post-processing functions

Filter and extract words by specific POS tags.

import nagisa

text = 'Pythonで簡単に使えるツールです'

# Filter out words with the specified POS tags.
words = nagisa.filter(text, filter_postags=['助詞', '助動詞'])
print(words)
#=> Python/名詞 簡単/形状詞 使える/動詞 ツール/名詞

# Extract only nouns.
words = nagisa.extract(text, extract_postags=['名詞'])
print(words)
#=> Python/名詞 ツール/名詞

# This is a list of available POS-tags in nagisa.
print(nagisa.tagger.postags)
#=> ['補助記号', '名詞', ... , 'URL']
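
The effect of filtering and extracting can be reproduced on plain lists of words and POS tags. The sketch below works on the `words` and `postags` lists from the basic usage sample; `filter_by_postags` and `extract_by_postags` are hypothetical helpers written for illustration and are not how nagisa implements `filter` and `extract`.

```python
def filter_by_postags(words, postags, filter_postags):
    """Drop every (word, tag) pair whose tag is in filter_postags."""
    drop = set(filter_postags)
    return [(w, t) for w, t in zip(words, postags) if t not in drop]


def extract_by_postags(words, postags, extract_postags):
    """Keep only the (word, tag) pairs whose tag is in extract_postags."""
    keep = set(extract_postags)
    return [(w, t) for w, t in zip(words, postags) if t in keep]


words = ['Python', 'で', '簡単', 'に', '使える', 'ツール', 'です']
postags = ['名詞', '助詞', '形状詞', '助動詞', '動詞', '名詞', '助動詞']

filtered = filter_by_postags(words, postags, ['助詞', '助動詞'])
print(' '.join(f'{w}/{t}' for w, t in filtered))

nouns = extract_by_postags(words, postags, ['名詞'])
print(' '.join(f'{w}/{t}' for w, t in nouns))
```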

A user dictionary can be added easily.

import nagisa

# default
text = "3月に見た「3月のライオン」"
print(nagisa.tagging(text))
#=> 3/名詞 月/名詞 に/助詞 見/動詞 た/助動詞 「/補助記号 3/名詞 月/名詞 の/助詞 ライオン/名詞 」/補助記号

# If a word ("3月のライオン") is included in the single_word_list, it is recognized as a single word.
new_tagger = nagisa.Tagger(single_word_list=['3月のライオン'])
print(new_tagger.tagging(text))
#=> 3/名詞 月/名詞 に/助詞 見/動詞 た/助動詞 「/補助記号 3月のライオン/名詞 」/補助記号

Nagisa provides a built-in Japanese stopwords list.

import nagisa

# default
text = "日本語のストップワードを簡単に利用できます。"
tokens = nagisa.tagging(text)
print(tokens.words)
#=> ['日本', '語', 'の', 'ストップ', 'ワード', 'を', '簡単', 'に', '利用', 'でき', 'ます', '。']

# Filter out stopwords from the tokenized result
words = [word for word in tokens.words if word not in nagisa.stopwords]
print(words)
#=> ['日本', '語', 'ストップ', 'ワード', '簡単', '利用', '。']

Train a model

Nagisa provides a simple training method for a joint word segmentation and sequence labeling (e.g., POS-tagging, NER) model.

The train/dev/test files are in TSV format. Each line holds one word and its tag, separated by a tab (word \t tag). Put EOS on its own line between sentences. Refer to the sample datasets and the tutorial (Train a model for Universal Dependencies).

$ cat sample.train
唯一	NOUN
の	ADP
趣味	NOUN
は	ADP
料理	NOUN
EOS
とても	ADV
おいしかっ	ADJ
た	AUX
です	AUX
。	PUNCT
EOS
ドル	NOUN
は	ADP
主要	ADJ
通貨	NOUN
EOS

import nagisa

# After training finishes, the three model files (*.vocabs, *.params, *.hp) are saved.
nagisa.fit(train_file="sample.train", dev_file="sample.dev", test_file="sample.test", model_name="sample")

# Build the tagger by loading the trained model files.
sample_tagger = nagisa.Tagger(vocabs='sample.vocabs', params='sample.params', hp='sample.hp')

text = "福岡・博多の観光情報"
words = sample_tagger.tagging(text)
print(words)
#=> 福岡/PROPN ・/SYM 博多/PROPN の/ADP 観光/NOUN 情報/NOUN
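
The train file format described above (one word and tag per line, separated by a tab, with EOS closing each sentence) can be read and written with a few lines of plain Python. This is a minimal sketch for preparing or checking data files; `read_tagged_sentences` and `format_tagged_sentences` are hypothetical helpers and are not part of nagisa.

```python
def read_tagged_sentences(lines):
    """Parse 'word<TAB>tag' lines into sentences; 'EOS' closes a sentence."""
    sentences, current = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line == "EOS":
            sentences.append(current)
            current = []
            continue
        word, tag = line.split("\t")  # raises ValueError on malformed lines
        current.append((word, tag))
    if current:  # tolerate a missing final EOS
        sentences.append(current)
    return sentences


def format_tagged_sentences(sentences):
    """Serialize sentences back into the word<TAB>tag / EOS format."""
    out = []
    for sentence in sentences:
        out.extend(f"{word}\t{tag}" for word, tag in sentence)
        out.append("EOS")
    return "\n".join(out) + "\n"


sample = "唯一\tNOUN\nの\tADP\n趣味\tNOUN\nは\tADP\n料理\tNOUN\nEOS\nドル\tNOUN\nは\tADP\nEOS\n"
sentences = read_tagged_sentences(sample.splitlines())
print(len(sentences), "sentences")
print(format_tagged_sentences(sentences) == sample)
```

Splitting on the tab raises a ValueError for a line that lacks a tag or contains extra tabs, which is a cheap way to catch malformed rows before starting a training run.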

Wheel compatibility matrix

Platform CPython 2.7 CPython 3.5 CPython 3.6 CPython 3.7 CPython 3.8 CPython 3.9 CPython 3.10 CPython 3.11 CPython 3.12 CPython 3.13 CPython 3.14 CPython (additional flags: t) 3.14 CPython (wide) 2.7
macosx_10_13_x86_64
macosx_10_15_x86_64
macosx_10_9_x86_64
macosx_11_0_arm64
manylinux1_i686
manylinux1_x86_64
manylinux2010_i686
manylinux2010_x86_64
manylinux2014_aarch64
manylinux2014_i686
manylinux2014_x86_64
manylinux_2_17_aarch64
manylinux_2_17_i686
manylinux_2_17_x86_64
manylinux_2_28_aarch64
manylinux_2_28_i686
manylinux_2_28_x86_64
manylinux_2_5_i686
manylinux_2_5_x86_64
musllinux_1_2_aarch64
musllinux_1_2_i686
musllinux_1_2_x86_64
win_amd64
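
Each row and column of the matrix corresponds to tags embedded in the wheel filenames listed under "Files in release". As a rough sketch, assuming the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag), a filename can be split as follows; `parse_wheel_filename` is a hypothetical helper written for illustration.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 6:  # an optional build tag sits after the version
        name, version, _build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # Several platform tags may be joined with dots in one filename.
        "platforms": platform_tag.split("."),
    }


info = parse_wheel_filename(
    "nagisa-0.2.12-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl"
)
print(info["python"], info["abi"], info["platforms"])
```

In this reading, the Python tag (for example cp310) selects the matrix column and each entry in `platforms` selects a row.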

Files in release

nagisa-0.2.12-cp27-cp27m-manylinux1_i686.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27m-manylinux1_x86_64.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27m-manylinux2010_i686.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27m-manylinux2010_x86_64.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27mu-manylinux1_i686.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27mu-manylinux1_x86_64.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27mu-manylinux2010_i686.whl (20.5MiB)
nagisa-0.2.12-cp27-cp27mu-manylinux2010_x86_64.whl (20.5MiB)
nagisa-0.2.12-cp310-cp310-macosx_10_9_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp310-cp310-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp310-cp310-win_amd64.whl (20.4MiB)
nagisa-0.2.12-cp311-cp311-macosx_10_9_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp311-cp311-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp311-cp311-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp311-cp311-win_amd64.whl (20.4MiB)
nagisa-0.2.12-cp312-cp312-macosx_10_13_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp312-cp312-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp312-cp312-win_amd64.whl (20.4MiB)
nagisa-0.2.12-cp313-cp313-macosx_10_13_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp313-cp313-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp313-cp313-win_amd64.whl (20.4MiB)
nagisa-0.2.12-cp314-cp314-macosx_10_15_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp314-cp314-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp314-cp314-win_amd64.whl (20.6MiB)
nagisa-0.2.12-cp314-cp314t-macosx_10_15_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp35-cp35m-manylinux1_i686.whl (20.5MiB)
nagisa-0.2.12-cp35-cp35m-manylinux1_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp35-cp35m-manylinux2010_i686.whl (20.5MiB)
nagisa-0.2.12-cp35-cp35m-manylinux2010_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp35-cp35m-manylinux2014_aarch64.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-musllinux_1_2_aarch64.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-musllinux_1_2_i686.whl (20.6MiB)
nagisa-0.2.12-cp36-cp36m-musllinux_1_2_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-musllinux_1_2_aarch64.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-musllinux_1_2_i686.whl (20.6MiB)
nagisa-0.2.12-cp37-cp37m-musllinux_1_2_x86_64.whl (20.6MiB)
nagisa-0.2.12-cp38-cp38-macosx_10_9_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp38-cp38-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp38-cp38-win_amd64.whl (20.4MiB)
nagisa-0.2.12-cp39-cp39-macosx_10_9_x86_64.whl (20.4MiB)
nagisa-0.2.12-cp39-cp39-macosx_11_0_arm64.whl (20.4MiB)
nagisa-0.2.12-cp39-cp39-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-musllinux_1_2_aarch64.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-musllinux_1_2_i686.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-musllinux_1_2_x86_64.whl (20.7MiB)
nagisa-0.2.12-cp39-cp39-win_amd64.whl (20.4MiB)
nagisa-0.2.12.tar.gz (20.0MiB)
Extras: None
Dependencies:
six
numpy
DyNet