Plotting a histogram (`pandas.DataFrame.plot.hist`)

# n_bins and "column_name" are placeholders for your own values
data.plot(kind="hist", bins=n_bins, title="Histogram")
data.plot(kind="hist", bins=n_bins, stacked=True, title="Stacked histogram")
data.plot.hist(by=["column_name"], bins=n_bins)

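Use `pandas.DataFrame.plot.hist` to draw a histogram of numeric data. Pass the column name(s) to group by in the `by` option (grouping takes effect in pandas 1.4 and later). The `bins` option changes the number of bins; the default is 10. Other keyword arguments are passed through, so many of the options documented for `pandas.DataFrame.hist` and `matplotlib.pyplot.hist` can be used as well.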
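As a minimal sketch of the three calls above, the following script runs them on made-up data. The column names `a`, `b`, and `group` and the random values are assumptions for illustration only, and the `Agg` backend is selected so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns and one label column
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "a": rng.normal(50, 10, 200),
    "b": rng.normal(60, 5, 200),
    "group": ["x", "y"] * 100,
})
numeric = data[["a", "b"]]

# Plain histogram: both columns share one set of 20 bins
ax = numeric.plot(kind="hist", bins=20, title="Histogram")

# Stacked histogram of the same columns
ax_stacked = numeric.plot(kind="hist", bins=20, stacked=True, title="Stacked histogram")

# One subplot per value of "group" (grouping via `by` needs pandas 1.4 or later)
axes_by_group = data.plot.hist(by=["group"], bins=20)
```

Each call returns matplotlib Axes (an array of them when `by` is used), so the result can be customized further or saved with `ax.figure.savefig(...)`.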

Note

If a column holds `str` data, this method cannot be used (as far as I can tell). Instead, you probably need to aggregate the data frame first and draw a bar chart with `pandas.DataFrame.plot.bar`.
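The workaround in this note can be sketched as follows, assuming a hypothetical string column named `species`. The rows are counted per category with `value_counts`, which returns a Series, so the chart is drawn with `Series.plot.bar`, the Series counterpart of the DataFrame method.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import pandas as pd

# Hypothetical string-valued column
data = pd.DataFrame({"species": ["cat", "dog", "cat", "bird", "dog", "cat"]})

# Aggregate first: number of rows per category, sorted by label
counts = data["species"].value_counts().sort_index()

# Then draw the aggregated counts as a bar chart
ax = counts.plot.bar(title="Counts per category")
```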

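Important

A histogram reveals a lot about your data at a glance, such as its range, peaks, and outliers. When you have data measured in an experiment, plot a histogram first and check the distribution.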

Changing the binning (`bins`)

bins = [0, 10, 20, 30, 40, 50, 60, 70, 80]
data.plot.hist(bins=bins)

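Use the `bins` option to change the binning. The default is `bins=10`, which splits the range between the smallest and largest values into 10 equal-width bins. Pass a list of bin edges instead to bin at arbitrary intervals.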
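To illustrate arbitrary intervals, here is a small sketch with unequal-width bins. The column name `value` and the numbers are made up for illustration. Values outside the outermost edges are simply not counted, which the cross-check with `numpy.histogram` makes visible.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import numpy as np
import pandas as pd

# Hypothetical measurements; -3 and 95 fall outside the bin edges below
data = pd.DataFrame({"value": [-3, 2, 5, 12, 18, 25, 33, 39, 41, 60, 79, 80, 95]})

# Unequal-width bins: [0, 10), [10, 20), [20, 40), [40, 80]
# (the last bin includes its right edge)
bins = [0, 10, 20, 40, 80]
ax = data.plot.hist(bins=bins)

# Cross-check the bar heights against numpy's histogram with the same edges
counts, edges = np.histogram(data["value"], bins=bins)  # counts -> [2, 2, 3, 4]
heights = [p.get_height() for p in ax.patches]
widths = [p.get_width() for p in ax.patches]
```

Only 11 of the 13 values end up in the plot, so when the range is narrower than the data it is worth tracking the excluded values separately.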

Computing summary statistics automatically

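def hbar(data, x, bins, xmin, xmax, **kwargs):
    """Plot a histogram of column x and return (plot, stats)."""

    # Copy the column specified by x
    copied = data[[x]].copy()

    # Entries: total number of rows, including out-of-range values
    entries = len(copied)

    # Underflow: values below xmin
    # (backticks allow column names with spaces; @ refers to local variables)
    uf = copied.query(f"`{x}` < @xmin").count().iloc[0]

    # Overflow: values above xmax
    of = copied.query(f"`{x}` > @xmax").count().iloc[0]

    # Valid: values within [xmin, xmax]
    v = copied.query(f"`{x}` >= @xmin and `{x}` <= @xmax")
    mean = v.mean().iloc[0]
    # Reported as "rms"; this is the sample standard deviation (ddof=1)
    rms = v.std().iloc[0]

    stats = {
        "entries": int(entries),
        "underflow": int(uf),
        "overflow": int(of),
        "mean": mean,
        "rms": rms,
    }
    plot = v.plot.hist(bins=bins, **kwargs)
    return plot, stats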

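This function is my attempt at imitating ROOT's `TH1` class: it draws the histogram for the values inside the given range and returns the number of entries, the underflow and overflow counts, the mean, and the RMS.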
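As a cross-check on the numbers such a function returns, the same statistics can be computed directly with boolean masks. This is a sketch on made-up values, not part of the original function; the column name `x` and the range 0 to 80 are assumptions. It also shows that pandas' `std` defaults to the sample standard deviation (`ddof=1`), while `ddof=0` gives the population value, which, as far as I know, is the convention ROOT's `TH1` follows.

```python
import pandas as pd

# Hypothetical measurements and histogram range
data = pd.DataFrame({"x": [-5, 0, 12, 47, 80, 95, 33]})
xmin, xmax = 0, 80
values = data["x"]

entries = len(values)                    # all rows, in range or not
underflow = int((values < xmin).sum())   # rows below the range
overflow = int((values > xmax).sum())    # rows above the range
valid = values[(values >= xmin) & (values <= xmax)]

mean = valid.mean()
std_sample = valid.std()            # pandas default, divides by N - 1
std_population = valid.std(ddof=0)  # divides by N
```

For small samples the two standard deviations differ noticeably, so it helps to state which one a plot's statistics box is showing.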
