!date
!python --version
Thu Jun  6 02:11:57 AM UTC 2024
Python 3.10.12

22. 係り受けを用いた目的語抽出+ネットワーク描画の例#

自然言語処理における代表的な情報抽出には、(1) 表面的な文字そのものを指定する方法、(2) 品詞を指定する方法、(3) 係り受け関係を指定する方法、(4) 分散表現による距離を参照する方法がある。

ここでは係り受け関係の例として「目的語 => 係り受け先」という関係を抽出してみよう。例えば「例題を取り入れて理解しやすくしてほしい。」という文においては「例題 => 取り入れ」がこの関係に相当する。この例では「例題」が目的語(UDではtoken.dep_ == obj)となり、「取り入れ」が係り受け先(UDではtoken.head)に相当する。

  • NOTE

    • ネットワーク描画のために networkx, pyviz を利用している。これらは pip install でインストール可能。

    • networkx はネットワーク描画や分析等に使われるライブラリ。本来ならこれだけで済むことが多いが、日本語には未対応のため pyviz (グラフ描画ライブラリ)も利用している。

    • pyvis注意点。

      • Google Colab ではやや取り扱いに難がある。具体的には pip install 後に一旦カーネルに再接続し直す必要がある。

      • Pythonスクリプトとして実行するか、ノートブック(ipynb)として実行するかによりコードがやや異なる。具体的にはコード内コメントを参照。

  • 参考

# spacy, ginzaインストール
!pip install -U ginza ja_ginza

# matplotlib で日本語フォントを使うための環境構築
!pip install japanize-matplotlib
!apt-get -y install fonts-ipafont-gothic # 日本語フォント

# ネットワーク分析
!pip install networkx
!pip install pyvis    # グラフ描画ライブラリ。networkxと組み合わせて日本語化対応用。
Collecting ginza
  Downloading ginza-5.2.0-py3-none-any.whl (21 kB)
Collecting ja_ginza
  Downloading ja_ginza-5.2.0-py3-none-any.whl (59.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.1/59.1 MB 12.5 MB/s eta 0:00:00
?25hRequirement already satisfied: spacy<4.0.0,>=3.4.4 in /usr/local/lib/python3.10/dist-packages (from ginza) (3.7.4)
Collecting plac>=1.3.3 (from ginza)
  Downloading plac-1.4.3-py2.py3-none-any.whl (22 kB)
Collecting SudachiPy<0.7.0,>=0.6.2 (from ginza)
  Downloading SudachiPy-0.6.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 24.3 MB/s eta 0:00:00
?25hCollecting SudachiDict-core>=20210802 (from ginza)
  Downloading SudachiDict_core-20240409-py3-none-any.whl (72.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.0/72.0 MB 6.2 MB/s eta 0:00:00
?25hRequirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (8.2.3)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (1.1.3)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (0.3.4)
Requirement already satisfied: typer<0.10.0,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (0.9.4)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (6.4.0)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (4.66.4)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (2.31.0)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (2.7.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (3.1.4)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (67.7.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (24.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (3.4.0)
Requirement already satisfied: numpy>=1.19.0 in /usr/local/lib/python3.10/dist-packages (from spacy<4.0.0,>=3.4.4->ginza) (1.25.2)
Requirement already satisfied: language-data>=1.2 in /usr/local/lib/python3.10/dist-packages (from langcodes<4.0.0,>=3.2.0->spacy<4.0.0,>=3.4.4->ginza) (1.2.0)
Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<4.0.0,>=3.4.4->ginza) (0.7.0)
Requirement already satisfied: pydantic-core==2.18.4 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<4.0.0,>=3.4.4->ginza) (2.18.4)
Requirement already satisfied: typing-extensions>=4.6.1 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<4.0.0,>=3.4.4->ginza) (4.12.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<4.0.0,>=3.4.4->ginza) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<4.0.0,>=3.4.4->ginza) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<4.0.0,>=3.4.4->ginza) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<4.0.0,>=3.4.4->ginza) (2024.6.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy<4.0.0,>=3.4.4->ginza) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy<4.0.0,>=3.4.4->ginza) (0.1.5)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /usr/local/lib/python3.10/dist-packages (from typer<0.10.0,>=0.3.0->spacy<4.0.0,>=3.4.4->ginza) (8.1.7)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from weasel<0.4.0,>=0.1.0->spacy<4.0.0,>=3.4.4->ginza) (0.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->spacy<4.0.0,>=3.4.4->ginza) (2.1.5)
Requirement already satisfied: marisa-trie>=0.7.7 in /usr/local/lib/python3.10/dist-packages (from language-data>=1.2->langcodes<4.0.0,>=3.2.0->spacy<4.0.0,>=3.4.4->ginza) (1.1.1)
Installing collected packages: SudachiPy, plac, SudachiDict-core, ginza, ja_ginza
Successfully installed SudachiDict-core-20240409 SudachiPy-0.6.8 ginza-5.2.0 ja_ginza-5.2.0 plac-1.4.3
Collecting japanize-matplotlib
  Downloading japanize-matplotlib-1.1.3.tar.gz (4.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 28.8 MB/s eta 0:00:00
?25h  Preparing metadata (setup.py) ... ?25l?25hdone
Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (from japanize-matplotlib) (3.7.1)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (1.2.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (4.53.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (1.4.5)
Requirement already satisfied: numpy>=1.20 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (24.0)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (3.1.2)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib->japanize-matplotlib) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib->japanize-matplotlib) (1.16.0)
Building wheels for collected packages: japanize-matplotlib
  Building wheel for japanize-matplotlib (setup.py) ... ?25l?25hdone
  Created wheel for japanize-matplotlib: filename=japanize_matplotlib-1.1.3-py3-none-any.whl size=4120257 sha256=7ff9483c41dc806854d65678a7cbf1c1f67330bd3a9b0067678465822372ecf4
  Stored in directory: /root/.cache/pip/wheels/61/7a/6b/df1f79be9c59862525070e157e62b08eab8ece27c1b68fbb94
Successfully built japanize-matplotlib
Installing collected packages: japanize-matplotlib
Successfully installed japanize-matplotlib-1.1.3
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  fonts-ipafont-mincho
The following NEW packages will be installed:
  fonts-ipafont-gothic fonts-ipafont-mincho
0 upgraded, 2 newly installed, 0 to remove and 45 not upgraded.
Need to get 8,237 kB of archives.
After this operation, 28.7 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-gothic all 00303-21ubuntu1 [3,513 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 fonts-ipafont-mincho all 00303-21ubuntu1 [4,724 kB]
Fetched 8,237 kB in 1s (8,699 kB/s)
Selecting previously unselected package fonts-ipafont-gothic.
(Reading database ... 121918 files and directories currently installed.)
Preparing to unpack .../fonts-ipafont-gothic_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-gothic (00303-21ubuntu1) ...
Selecting previously unselected package fonts-ipafont-mincho.
Preparing to unpack .../fonts-ipafont-mincho_00303-21ubuntu1_all.deb ...
Unpacking fonts-ipafont-mincho (00303-21ubuntu1) ...
Setting up fonts-ipafont-mincho (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-mincho/ipam.ttf to provide /usr/share/fonts/truetype/fonts-japanese-mincho.ttf (fonts-japanese-mincho.ttf) in auto mode
Setting up fonts-ipafont-gothic (00303-21ubuntu1) ...
update-alternatives: using /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf to provide /usr/share/fonts/truetype/fonts-japanese-gothic.ttf (fonts-japanese-gothic.ttf) in auto mode
Processing triggers for fontconfig (2.13.1-4.2ubuntu5) ...
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (3.3)
Collecting pyvis
  Downloading pyvis-0.3.2-py3-none-any.whl (756 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 756.0/756.0 kB 9.7 MB/s eta 0:00:00
?25hRequirement already satisfied: ipython>=5.3.0 in /usr/local/lib/python3.10/dist-packages (from pyvis) (7.34.0)
Requirement already satisfied: jinja2>=2.9.6 in /usr/local/lib/python3.10/dist-packages (from pyvis) (3.1.4)
Requirement already satisfied: jsonpickle>=1.4.1 in /usr/local/lib/python3.10/dist-packages (from pyvis) (3.0.4)
Requirement already satisfied: networkx>=1.11 in /usr/local/lib/python3.10/dist-packages (from pyvis) (3.3)
Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (67.7.2)
Collecting jedi>=0.16 (from ipython>=5.3.0->pyvis)
  Downloading jedi-0.19.1-py2.py3-none-any.whl (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 57.5 MB/s eta 0:00:00
?25hRequirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (4.4.2)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (0.7.5)
Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (5.7.1)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (3.0.45)
Requirement already satisfied: pygments in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (2.16.1)
Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (0.2.0)
Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (0.1.7)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython>=5.3.0->pyvis) (4.9.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2>=2.9.6->pyvis) (2.1.5)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython>=5.3.0->pyvis) (0.8.4)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython>=5.3.0->pyvis) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=5.3.0->pyvis) (0.2.13)
Installing collected packages: jedi, pyvis
Successfully installed jedi-0.19.1 pyvis-0.3.2

22.1. 利用ライブラリの用意、データセット準備#

事前に、load_r_assesment.ipynb でデータセットを作成し、pkl形式でファイル保存(r_assesment.pkl)しておく。今回は作成済みファイルをダウンロードして利用することにする。

r_assesment.pklは授業評価アンケートの自由記述欄をpd.DataFrame形式で保存したもので、授業名(title)、学年(grade)、必修か否か(required)、質問番号(q_id)、コメント(comment)で構成される。

!curl -O https://ie.u-ryukyu.ac.jp/~tnal/2022/dm/static/r_assesment.pkl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 34834  100 34834    0     0  11591      0  0:00:03  0:00:03 --:--:-- 11595
import collections
import numpy as np
import pandas as pd
import spacy

import networkx as nx
import matplotlib.pyplot as plt
import japanize_matplotlib

nlp = spacy.load("ja_ginza")

assesment_df = pd.read_pickle('r_assesment.pkl')
assesment_df.head()
title grade required q_id comment
0 工業数学Ⅰ 1 True Q21 (1) 特になし
1 工業数学Ⅰ 1 True Q21 (2) 正直わかりずらい。むだに間があるし。
2 工業数学Ⅰ 1 True Q21 (2) 例題を取り入れて理解しやすくしてほしい。
3 工業数学Ⅰ 1 True Q21 (2) 特になし
4 工業数学Ⅰ 1 True Q21 (2) スライドに書く文字をもう少しわかりやすくして欲しいです。
from spacy import displacy

sample = assesment_df['comment'][2]
print('sample = ', sample)

doc = nlp(sample)
displacy.render(doc, style="dep", options={"compact":True})
sample =  例題を取り入れて理解しやすくしてほしい。
例題 NOUN ADP 取り入れ VERB SCONJ 理解 NOUN AUX やすく AUX VERB SCONJ ほしい。 AUX obj case advcl mark obl aux aux mark fixed
from spacy.symbols import obj

for token in doc:
    if token.dep == obj:
        print(token.text, token.head.text)
例題 取り入れ

22.2. 目的語とその係り受け先を抽出してみよう#

{‘例題_取り入れ’:1], ,,,}のように obj と token.head をアンダースコアで結んだ文字列を作り、その文字列が出現した回数も同時にカウントしてみる。

from spacy.symbols import obj

result = {}
for text in assesment_df['comment']:
    doc = nlp(text)
    for token in doc:
        if token.dep == obj:
            target = token.lemma_ + '_' + token.head.lemma_
            if target not in result:
                result[target] = 1
            else:
                result[target] += 1

print(collections.Counter(result))
Counter({'課題_出す': 3, '講義_受ける': 3, 'こと_学ぶ': 2, '講義_通す': 2, '話_聞く': 2, '予習_する': 2, '解答_出す': 2, '単位_落とす': 2, '知識_獲得': 2, '仮説_立てる': 2, '手法_使う': 2, '例題_取り入れる': 1, '文字_する': 1, '行列_扱う': 1, '演習_出す': 1, '時間_おく': 1, '何_話す': 1, '道徳_学ぶ': 1, '想い_考える': 1, '品性_伸ばす': 1, 'こと_学べる': 1, 'こと_知れる': 1, '考え方_持てる': 1, '興味_持つ': 1, '心_学べる': 1, '側面_見れる': 1, 'これ_考える': 1, '計画_立てる': 1, '将来_想像': 1, 'アドバイス_もらえる': 1, 'マナー_学べる': 1, 'ある_知る': 1, 'ニーズ_聞ける': 1, '友達_作る': 1, '作成_通す': 1, '将来_見通す': 1, '機会_設ける': 1, '対面_増やす': 1, '課題_変更': 1, 'プログラミング_学べる': 1, 'それ_完成': 1, '復習_こなせる': 1, '基礎_学ぶ': 1, '課題_見る': 1, 'コード_良い': 1, 'プログラミング_触る': 1, '予想_上回る': 1, '復習_する': 1, '言語_学ぶ': 1, '課題_やる': 1, 'プログラミング_学ぶ': 1, 'こと_覚える': 1, '知識_生かす': 1, '演習_通す': 1, '基礎_定着': 1, 'デンチュウ_動かす': 1, 'denchu_通す': 1, '方_知る': 1, 'コード_書く': 1, 'denchu_動かす': 1, 'プログラミング_楽しい': 1, 'こと_実践': 1, '事_落とし込む': 1, 'メモ書き_消す': 1, 'こと_書く': 1, 'こと_聞く': 1, '環境_作る': 1, '点数_教える': 1, '提出_行う': 1, 'コマンド_打つ': 1, '方_招待': 1, '資料_理解': 1, '授業_理解': 1, 'テスト_延期': 1, '試験_続ける': 1, 'まとめ方_する': 1, '内容_補強': 1, '不便_感ずる': 1, 'メール_読む': 1, 'チャンネル_開設': 1, '内容_アウトプット': 1, 'ペン_使う': 1, '説明_聞く': 1, '何_言う': 1, '点数_公開': 1, '点_送る': 1, '採点_願う': 1, '試験_実施': 1, '試験_行なう': 1, 'ら_用いる': 1, '勉強_する': 1, 'ところ_調べる': 1, '採点_使う': 1, 'プログラム_作る': 1, '内容_網羅': 1, '提出期間_過ぎる': 1, '試験_する': 1, '解決策_考える': 1, '問題_解決': 1, '試験_受ける': 1, '環境_整える': 1, '回答_出す': 1, '理解_測る': 1, 'こと_する': 1, '試験_もつ': 1, '間違い_指摘': 1, '授業_受ける': 1, '説明_書く': 1, '難易度_する': 1, '過去問_配布': 1, '要点_押さえる': 1, '不満_書く': 1, '部分_ミス': 1, 'こと_望む': 1, '自学_増やす': 1, '質問_する': 1, '説明_する': 1, '動き_理解': 1, '欲_言う': 1, '緊張感_持てる': 1, '課題_再提出': 1, '間隔_空ける': 1, '時間_作る': 1, '知識_利用': 1, '知識_使う': 1, '研究_踏まえる': 1, '講義_する': 1, '録画_残す': 1, '大切_学ぶ': 1, '成績_決定': 1, '資料_写す': 1, '使う_学ぶ': 1, '配分_間違える': 1, '結果_出す': 1, 'こと_使う': 1, 'こと_実感': 1, 'アイディア_出す': 1, '物事_伝える': 1, 'スキル_向上': 1, '書_紹介': 1, '力_活かす': 1, '情報_得る': 1, '時間_とる': 1, '顔_合わせる': 1, '説明_行く': 1, '手_焼く': 1, '報告会_行う': 1, 'javascript_用いる': 1, 'アプリケーション_開発': 1, 'コーディング_教える': 1, 'フレームワーク_試す': 1})

22.3. ネットワーク描画してみる#

「目的語=>係り受け先単語」の関係を「単語間動詞を接続したエッジ」として可視化してみよう。まずはpd.DataFrame形式で「係り受け元単語、係り受け先単語、出現回数」として保存し直す。

columns = ['from', 'to', 'weight']
df = pd.DataFrame(columns=columns)

index = 0
nodes = []
for key, value in result.items():
    node1, node2 = key.split('_')
    if node1 not in nodes:
        nodes.append(node1)
    if node2 not in nodes:
        nodes.append(node2)
    df.loc[index] = [node1, node2, value]
    index += 1

df
from to weight
0 例題 取り入れる 1
1 文字 する 1
2 行列 扱う 1
3 演習 出す 1
4 時間 おく 1
... ... ... ...
146 報告会 行う 1
147 javascript 用いる 1
148 アプリケーション 開発 1
149 コーディング 教える 1
150 フレームワーク 試す 1

151 rows × 3 columns

22.4. networkx + pyvizで描画#

  • networkxはネットワークの描画・操作・解析などに使われるライブラリ。

  • pyvizはグラフ描画ライブラリ。

  • グラフ描画するだけなら networkx + matplotlib でも可能だが、日本語には未対応で文字化けしてしまうため、ここではpyvizを採用している。

import networkx as nx
from pyvis.network import Network

G = nx.from_pandas_edgelist(df, source='from', target='to', edge_attr='weight')

pyvis_G = Network(notebook=True, cdn_resources='in_line') # .pyで実行するならFalse
pyvis_G.from_nx(G)
pyvis_G.save_graph("mygraph.html")
#pyvis_G.show('mygraph.html')

22.5. グラフ参照する手順#

  • “mygraph.html” をダウンロード

  • ブラウザで新規タブを作成

  • mygraph.html をダブルクリック。もしくはブラウザ新規タブにドラッグ&ドロップ。