2017-06-07

How playground.tensorflow.org works

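A tool called playground, which lets you learn about machine learning through a web UI, is publicly available, and tutorials built around it can be found here and there.
Here I looked at how the software is put together. The source code is at GitHub - tensorflow/playground: Play with neural networks!.
It works as follows.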

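The html/css/js and related files are placed under the dist directory.
An HTTP server launched via npm serves the files under dist.
The served html/css/js files run in the browser, so the computation puts no load on the server.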

The neural network computation itself is done in src/nn.ts.

Other related files

npm (package manager for JavaScript)
1. package.json (npm configuration file; lists dependencies and so on)
  1. npm-package.json | npm Documentation
TypeScript (a typed superset of JavaScript developed by Microsoft)
1. Documentation · TypeScript
2. tsconfig.json (TypeScript configuration file)
  1. http://www.typescriptlang.org/docs/handbook/tsconfig-json.html
3. tslint.json (configuration file for TSLint, a lint tool for TypeScript)
  1. TSLint
4. typings.json (configuration file for Typings, a TypeScript type definition manager)
  1. GitHub - typings/typings: *DEPRECATED* The TypeScript Definition Manager

2016-11-02

About Python's Global Interpreter Lock (GIL)

Software development

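In Python (CPython), only one thread can execute Python bytecode in the interpreter at any given moment, even when multiple threads are running. The thread synchronization mechanism that enforces this is called the Global Interpreter Lock (GIL). The GIL itself had remained essentially unchanged since its initial implementation in 1992, but the way the lock is handed over between threads was changed in Python 3.2. This reduces the performance degradation seen when multiple CPU-bound threads run on multiple cores.
Here I explain how 2.7 and 3.5 differ. In Python 2.7, the mechanism consists of interpreter_lock (the GIL) and _Py_Ticker (the GIL check interval, by default roughly 100 instructions). The interval can be set with sys.setcheckinterval. In this scheme, threads compete for the lock in order to continue executing. When the threads run on different cores, signaling the release of the lock to the other core takes time, and the running thread tends to reacquire the lock before the waiting thread wakes up. As a result, context switches between CPU-bound threads almost never take place, and overall processing efficiency drops.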
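In Python 3.5, the mechanism consists of gil_locked (the GIL), gil_drop_request (a request for the GIL), and gil_interval (the GIL check interval, 5000 microseconds (5 ms) by default). The interval can be set with sys.setswitchinterval. Here gil_drop_request serves as a flag between threads: a waiting thread that has not obtained the GIL within the interval sets the flag, and the running thread, on seeing the flag, releases the lock and yields execution to the waiting thread. This allows context switches to happen efficiently even on multiple cores.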
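As a small illustration of the setting mentioned above, the following sketch reads and changes the switch interval on Python 3.2 or later. The value of 1 ms is an arbitrary choice for the example, not a recommendation.

```python
import sys

# Current GIL switch interval in seconds (CPython's default is 0.005).
original = sys.getswitchinterval()

# Request more frequent hand-over between threads (1 ms, arbitrary value).
sys.setswitchinterval(0.001)
shortened = sys.getswitchinterval()

# Put the original value back so the rest of the program is unaffected.
sys.setswitchinterval(original)

print("original:", original, "shortened:", shortened)
```

A shorter interval makes threads take turns more often at the cost of more switching overhead, so the right value depends on the workload.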
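That said, there are cases where moving to Python 3.2 or later is not easy (for example, the code base is Python 2.x and cannot be migrated to Python 3.x). In such cases a workaround is needed, such as pinning the Python process to a specific core. For example, with six CPU-bound processes, each could be assigned to its own one of six cores.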
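The pinning idea can be sketched as follows. This assumes Linux and Python 3.3 or later, where os.sched_setaffinity is available; the helper name pin_process is made up for this example. Since the call takes a pid, a small launcher like this could also pin a separately running Python 2.x process, and the same effect can be had from the shell with a tool such as taskset.

```python
import os


def pin_process(pid, core):
    """Restrict process `pid` (0 means the caller) to a single core.

    Returns the resulting affinity set, or None on platforms that do
    not provide os.sched_setaffinity (it is Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)


if hasattr(os, "sched_getaffinity"):
    before = os.sched_getaffinity(0)   # cores we are currently allowed on
    first_core = min(before)           # pick one that is certainly allowed
    after = pin_process(0, first_core)
    os.sched_setaffinity(0, before)    # undo the pinning for this demo
else:
    before = after = first_core = None

print("allowed before:", before, "after pinning:", after)
```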

Manuals

Python/C API Reference Manual — Python 3.7.3 documentation
- The Very High Level Layer — Python 3.7.3 documentation
- Initialization, Finalization, and Threads — Python 3.7.3 documentation

Source code

3.5
- gil_locked, gil_interval (microseconds), INTERVAL, etc.; functions include take_gil
  - cpython/ceval_gil.h at 3.5 · python/cpython · GitHub
- PyInterpreterState, PyThreadState, etc.
  - cpython/pystate.h at 3.5 · python/cpython · GitHub
- PyEval_EvalFrameEx (the evaluation loop, inside which thread switching happens)
  - cpython/ceval.c at 3.5 · python/cpython · GitHub
- Context-switch portion
  - cpython/ceval.c at 3.5 · python/cpython · GitHub
- Py_Main
  - cpython/main.c at 3.5 · python/cpython · GitHub
2.7
- Context-switch portion
  - cpython/ceval.c at 2.7 · python/cpython · GitHub

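Other references

GIL update in 3.2 (describes the performance degradation on multiple cores)
- http://www.dabeaz.com/python/NewGIL.pdf
Performance comparison before and after 3.2, etc.
- Pycon11: Python threads: Dive into GIL!
Example of trouble with the old GIL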
- linux - Why is my Python app stalled with 'system' / kernel CPU time - Stack Overflow

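2016-10-20

From Python source to bytecode

Software development

I traced what happens between a Python script and Python bytecode.
Specifically, this covers the path from the Python script up to the point where the pyc file is saved.
Run from the command line or from a script, this corresponds to the following.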

# python -m compileall file.py

import py_compile
py_compile.compile("file.py")

A Python script is parsed into an AST (abstract syntax tree), which is turned into a CFG (control flow graph), from which bytecode is generated. The bytecode is produced from the CFG by assemble().
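The visible ends of this pipeline can be observed from Python itself. The sketch below uses only the standard ast and dis modules; the CFG stage is internal to the C compiler and is not exposed, so only the AST and the resulting bytecode are shown. The one-line source string is just an example.

```python
import ast
import dis

source = "x = 1 + 2\n"

# Source text -> AST
tree = ast.parse(source, filename="<demo>")
print(ast.dump(tree))

# AST -> code object (the CFG and assemble() steps happen inside compile)
code = compile(tree, filename="<demo>", mode="exec")

# Human-readable listing of the generated bytecode
dis.dis(code)

# Executing the code object runs the bytecode on the VM
namespace = {}
exec(code, namespace)
print(namespace["x"])
```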
The pyc format consists of the following fields (see the _code_to_bytecode() function).

MAGIC_NUMBER
mtime
size
code

Incidentally, code is serialized with marshal.
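To make the layout concrete, here is a minimal sketch that compiles a throwaway file and splits the resulting pyc into its fields. One caveat: it assumes Python 3.7 or later, where a 4-byte flags field sits between the magic number and mtime, so the header is 16 bytes. In 3.5, as described above, the header is 12 bytes with no flags field. The helper name read_pyc and the file contents are invented for the example.

```python
import importlib.util
import marshal
import os
import py_compile
import struct
import tempfile
import types


def read_pyc(path):
    """Split a timestamp-based .pyc (Python 3.7+ layout) into its fields."""
    with open(path, "rb") as f:
        data = f.read()
    magic = data[:4]
    # Three little-endian 32-bit fields follow the magic number.
    flags, mtime, size = struct.unpack("<III", data[4:16])
    code = marshal.loads(data[16:])  # the rest is the marshalled code object
    return magic, flags, mtime, size, code


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "file.py")
    with open(src, "w") as f:
        f.write("answer = 6 * 7\n")
    pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "file.pyc"),
        doraise=True,
        # Force the mtime/size header even if SOURCE_DATE_EPOCH is set.
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    magic, flags, mtime, size, code = read_pyc(pyc)
    source_size = os.path.getsize(src)

print(magic == importlib.util.MAGIC_NUMBER, flags, mtime, size, code)
```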

Source code
- Scripts for compiling
  - cpython/py_compile.py at 3.5 · python/cpython · GitHub
  - The compilation part of the above
    - cpython/_bootstrap_external.py at 3.5 · python/cpython · GitHub
    - cpython/_bootstrap.py at 3.5 · python/cpython · GitHub
- Writing the pyc
  - cpython/_bootstrap_external.py at 3.5 · python/cpython · GitHub
- Code that executes a pyc
  - cpython/pythonrun.c at 3.5 · python/cpython · GitHub
- MAGIC_NUMBER
  - cpython/_bootstrap_external.py at 3.5 · python/cpython · GitHub
- Data written to the pyc (_code_to_bytecode)
  - cpython/_bootstrap_external.py at 3.5 · python/cpython · GitHub
- Main code involved from compilation to bytecode through execution
  - cpython/ceval.c at 3.5 · python/cpython · GitHub(VM engine)
  - cpython/ceval.h at 3.5 · python/cpython · GitHub
  - cpython/compile.c at 3.5 · python/cpython · GitHub(Bytecode compiler)
  - cpython/compile.h at 3.5 · python/cpython · GitHub
  - cpython/frameobject.c at 3.5 · python/cpython · GitHub(execution frames)
  - cpython/frameobject.h at 3.5 · python/cpython · GitHub
  - cpython/opcode.h at 3.5 · python/cpython · GitHub(bytecodes)
  - cpython/code.h at 3.5 · python/cpython · GitHub(PyCodeObject)
  - cpython/pystate.c at 3.5 · python/cpython · GitHub(interpreter state)
  - cpython/pystate.h at 3.5 · python/cpython · GitHub
  - cpython/pythonrun.c at 3.5 · python/cpython · GitHub(entry point)
  - cpython/pythonrun.h at 3.5 · python/cpython · GitHub
Documentation
- DevGuide
  - 25. Design of CPython’s Compiler — Python Developer's Guide
- Library
Other
- CPythonVmInternals - Python Wiki
- A course on Python internals (10 hours)
  - Philip Guo - CPython internals: A ten-hour codewalk through the Python interpreter source code
- Explains how the Python VM (ceval.c) works (behavior of PyFrameObject and so on)
  - dis/inspect モジュールを使った Python のハッキング (in Japanese: "Hacking Python with the dis/inspect modules")
- Explains where dynamic information is kept (PyFrameObject => PyCodeObject)
  - http://jasonleaster.github.io/blog/2016/02/21/architecture-of-python-virtual-machine/
- For Python 2.7
  - Welcome to Python VM Internals Tutorial’s documentation! — Python VM Internals Tutorials 1.0.0 documentation (a brief explanation up to pyc)
  - Understand .pyc files - My notes - Quora
- https://www.usenix.org/legacy/event/woot08/tech/full_papers/portnoy/portnoy.pdf
- For Python 2.5
  - Python internals: Working with Python ASTs - Eli Bendersky's website
  - https://troeger.eu/files/teaching/pythonvm08.pdf#page=11 (a careful explanation that also refers to the source code)
- 500 Lines or Less | A Python Interpreter Written in Python

2016-09-05

Heat/Autoscaling

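Software development

Heat's AutoScaling scales using load signals (server load and so on) from Ceilometer (Aodh from Newton onward).
The flow is as follows.

Heat Engine registers the signal with Ceilometer.
Ceilometer notifies Heat_API_CFN of a change.
Heat_API_CFN notifies Heat Engine of the change (handle_signal runs).
Heat Engine scales in or out (starting or stopping servers).

Whether a scaling change is allowed is decided by the Cooldown class.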
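The role of the cooldown in this flow can be pictured with a toy model. The sketch below is not Heat's code: the class CooldownGate, the function handle_signal as written here, and the 60-second value are all invented for illustration, under the assumption that a signal is simply ignored while the cooldown period since the last scaling action has not elapsed.

```python
import time


class CooldownGate:
    """Hypothetical stand-in for a cooldown check (not Heat's class)."""

    def __init__(self, cooldown_seconds, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_scaled = None  # time of the last accepted scaling action

    def try_scale(self):
        """Return True and record the time if scaling is allowed now."""
        now = self._clock()
        if (self._last_scaled is not None
                and now - self._last_scaled < self.cooldown_seconds):
            return False
        self._last_scaled = now
        return True


def handle_signal(gate, capacity, adjustment):
    """Apply a scaling adjustment unless the cooldown rejects it."""
    if gate.try_scale():
        return capacity + adjustment
    return capacity


# Drive the model with a fake clock so the outcome does not depend on timing.
fake_now = [0.0]
gate = CooldownGate(60, clock=lambda: fake_now[0])

after_first = handle_signal(gate, 1, +1)             # accepted
fake_now[0] = 10.0
after_second = handle_signal(gate, after_first, +1)  # inside cooldown, ignored
fake_now[0] = 70.0
after_third = handle_signal(gate, after_second, +1)  # cooldown over, accepted

print(after_first, after_second, after_third)
```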

References

Splitting of AutoScale
- Split scaling policy into separate files · openstack/heat@91300e4 · GitHub
AutoScale template
- heat-templates/autoscaling.yaml at master · openstack/heat-templates · GitHub
spec
- Reorganize the code structure of resources folder — heat-specs 0.0.1.dev342 documentation
- Reorg AutoScalingGroup Implementation — heat-specs 0.0.1.dev342 documentation
Developer documentation
- OpenStack Docs: Heat Resource Plug-in Development Guide
Explanatory material (interaction between Heat Engine/API_CFN and Ceilometer, etc.)
- Openstack heat & How Autoscaling works
- http://cs.utdallas.edu/wp-content/uploads/2015/09/AutoScaling.pdf

Commands

deployment-create
- OpenStack Docs: OpenStackClient

Source code

handle_signal
heat.api.cfn.v1
- https://github.com/openstack/heat/blob/stable/mitaka/heat/api/cfn/v1/__init__.py
- signal(SignalController)
  - https://github.com/openstack/heat/blob/stable/mitaka/heat/api/cfn/v1/signal.py#L20
Other related code
- SIGNAL_TYPES
  - https://github.com/openstack/heat/blob/stable/mitaka/heat/engine/resources/signal_responder.py#L30
- Attributes (FnAttr)
  - https://github.com/openstack/heat/blob/stable/mitaka/heat/engine/attributes.py#L128
- default_deployment_signal_transport
  - https://github.com/openstack/heat/blob/stable/mitaka/heat/common/config.py#L202

2016-09-03

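Differences between Python 2 and 3 (six): newline handling in print

Software development

With the move from Python 2 to 3, print changed from a statement to a function. Probably as a consequence, the way to control the trailing newline changed as well.
To stay compatible across versions, the code needs to be fixed to work through six.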
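For reference, Python 2 suppresses the newline with a trailing comma (print "a",), whereas Python 3 uses the end keyword (print("a", end="")). A minimal sketch of the six-based form follows. It assumes six is installed for the two-version case; the fallback branch is only there so the sketch also runs on a Python 3 without six, where print_ is simply the built-in print function.

```python
import io

try:
    from six import print_   # third-party; same call on Python 2 and 3
except ImportError:
    import builtins          # fallback for this sketch, Python 3 only
    print_ = builtins.print

buf = io.StringIO()

# end=u"" suppresses the trailing newline on both major versions.
print_(u"no newline", end=u"", file=buf)
# Without end, a newline is appended as usual.
print_(u" | with newline", file=buf)

output = buf.getvalue()
print(repr(output))
```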

2016-08-19

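oslo_service/eventlet/greenlet in OpenStack

Software development

Looking at how each OpenStack component implements WSGI (Web Server Gateway Interface), many of them use eventlet.wsgi.
eventlet.wsgi in turn relies on Greenlet to run efficiently. Greenlet is a green-thread implementation for Python provided as a C extension module, so switching between threads is faster than it would be in an implementation written in Python.
Some components also call eventlet indirectly through oslo_service.
In the code as of August 20, 2016 (before the Newton release), use of eventlet's WSGI framework is as follows.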

Component	Uses eventlet?	Uses it via oslo_service?
cinder	Yes	Yes
glance	Yes	No
heat	Yes	No
ironic	Yes	Yes
keystone	No	No
neutron	Yes	Yes
nova	Yes	No
References

greenlet
eventlet
- Design patterns (OpenStack / Python in general)
  - Design Patterns — Eventlet 0.24.1 documentation
  - Eventlet Best Practices — openstack-specs 0.0.1.dev43 documentation
- API
  - Basic Usage — Eventlet 0.24.1 documentation
  - greenpool – Green Thread Pools — Eventlet 0.24.1 documentation
- History
  - https://blog.eventlet.net/2010/08/05/0-9-10-out/
- Other
  - eventlet.zipkin was apparently not merged as of 0.19 and is planned to be added in 0.20.
    - http://events.linuxfoundation.org/sites/events/files/slides/linuxcon15_bando.pdf
    - New feature: Add zipkin tracing to eventlet by yuichib · Pull Request #218 · eventlet/eventlet · GitHub
oslo_service
- Wraps eventlet's wsgi for use in OpenStack.
  - oslo.service/wsgi.py at 1.15.0 · openstack/oslo.service · GitHub

General
- Cinder
  - (From Mitaka) wsgi was switched from eventlet to oslo_service.
    - Remove eventlet WSGI functionality · openstack/cinder@b4c8bb3 · GitHub
    - Move wsgi to oslo_service.wsgi · openstack/cinder@082235d · GitHub
- Keystone
  - eventlet became non-default as of Kilo (2015.1.0) and was removed as of Mitaka, leaving paste as the only web server.
    - Gerrit Code Review
  - wsgiref is used for WSGI.
    - wsgiref — WSGI Utilities and Reference Implementation — Python 3.7.3 documentation
- Neutron
  - Either eventlet (legacy) or pecan (pecan) can be selected, via the web_framework option.
  - eventlet 0.18.3 apparently does not work.
    - neutron/requirements.txt at 8.0.0 · openstack/neutron · GitHub
  - Because the agent also runs on Windows, the monkey_patch settings are adjusted.
    - neutron/eventlet_utils.py at 7.0.1 · openstack/neutron · GitHub

2016-08-13

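Checking Python coding conventions in OpenStack

Software development

OpenStack checks Python coding conventions with the hacking module. Project-specific conventions, however, are described in each project's HACKING.rst, and the scripts that implement the project-specific checks are placed in that project's hacking directory. The error codes reported fall into the PEP8-based E/W series and the hacking-based H series. In addition, error codes defined per component are reported. Notably, the N prefix is used by both Nova and Neutron, so their code numbers overlap.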
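To give a feel for what such a check looks like, here is a minimal sketch in the pycodestyle plugin style, where a check function receives a logical_line and yields (offset, message) pairs. The function name, the rule itself, and the code X999 are made up for this example and do not correspond to any real project's check.

```python
import re

# Matches a logical line that starts with a call to print(...).
_PRINT_CALL = re.compile(r"^\s*print\s*\(")


def check_no_print(logical_line):
    """X999 (hypothetical code): do not call print() directly."""
    match = _PRINT_CALL.match(logical_line)
    if match:
        yield match.start(), "X999: use logging instead of print()"


# Calling the check by hand on two sample lines.
hits = list(check_no_print("print('debug')"))
clean = list(check_no_print("LOG.debug('debug')"))
print(hits, clean)
```

In a real project the function would be registered with the style checker through the project's configuration rather than called by hand as done here.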

References
- Individual components
  - Describes the N-series (Nova) errors
    - nova/HACKING.rst at master · openstack/nova · GitHub
  - Describes the Heat errors
    - heat/HACKING.rst at master · openstack/heat · GitHub
  - Describes the N-series and C-series (Cinder) errors (the N series is present because Cinder originally branched from Nova)
    - cinder/HACKING.rst at master · openstack/cinder · GitHub
  - Describes the G-series (Glance) errors
    - glance/HACKING.rst at master · openstack/glance · GitHub
  - keystone/HACKING.rst at master · openstack/keystone · GitHub
  - Describes the N-series (Neutron) errors
    - neutron/HACKING.rst at master · openstack/neutron · GitHub
  - Swift has no project-specific checks in HACKING.rst.
- Describes the E-series and W-series errors (PEP8)
  - PEP 8 -- Style Guide for Python Code | Python.org
  - Introduction — pycodestyle 2.5.0 documentation
- Describes the H-series errors (OpenStack hacking)
  - OpenStack Docs: hacking: OpenStack Hacking Guideline Enforcement
  - GitHub - openstack/hacking: OpenStack Hacking Style Checks
- Explanations in Japanese