Trajectory Averaging Tutorial¶
This notebook demonstrates frame-averaged structural analysis in amorphouspy using a 101-frame sodium borosilicate glass trajectory (NaBSiO_glass_trajectory.extxyz).
Topics covered:
- Loading a multi-frame trajectory
- Individual analysis functions — RDF, S(q), Qn, bond angles, rings — each with error bands
- Effect of frame count on statistical uncertainty
from collections import Counter
import matplotlib.pyplot as plt
import numpy as np
from ase.io import read
import amorphouspy as am
1. Load the trajectory¶
ase.io.read with index=":" returns a list[Atoms] — one entry per frame. The 101 frames here come from an equilibrated NVT run at 300 K after a melt-quench cycle.
frames = read("data/NaBSiO_glass_trajectory.extxyz", index=":")
print(f"Frames loaded : {len(frames)}")
print(f"Atoms per frame: {len(frames[0])}")
counts = Counter(frames[0].get_chemical_symbols())
total = sum(counts.values())
print("Composition :", {k: f"{v:.1f}" for k, v in sorted(counts.items())})
Frames loaded : 101
Atoms per frame: 999
Composition : {'B': '90.0', 'Na': '92.0', 'O': '605.0', 'Si': '212.0'}
2. Individual analysis functions with error bands¶
Each analysis function operates on a single Atoms frame. To average over multiple frames, use average_over_frames from amorphouspy.analysis.averaging. It calls the function on each frame, returns (means_tuple, sems_tuple) with element-wise means and standard errors of the mean.
2.1 Radial distribution functions (RDF)¶
compute_rdf returns (r, rdfs, cumcn). average_over_frames averages all positions:
(r, rdfs_mean, cumcn_mean), (_, rdfs_sem, cumcn_sem) = average_over_frames(compute_rdf, frames, ...)
from amorphouspy.properties.structural.averaging import average_over_frames
(r, rdfs_mean, cumcn_mean), (_, rdfs_sem, cumcn_sem) = average_over_frames(
am.compute_rdf,
frames,
r_max=8.0,
n_bins=500,
)
# Show which pairs are available
print("Pairs computed:", list(rdfs_mean.keys()))
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Pairs computed: [(5, 5), (5, 8), (5, 11), (5, 14), (8, 8), (8, 11), (8, 14), (11, 11), (11, 14), (14, 14)]
# Keys use sorted atomic numbers (lower Z first): Si=14, O=8, B=5, Na=11
pairs_to_plot = [(8, 14), (5, 8), (8, 11)] # Si-O, B-O, Na-O
pair_labels = {(8, 14): "Si-O", (5, 8): "B-O", (8, 11): "Na-O"}
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
for pair in pairs_to_plot:
if pair not in rdfs_mean:
continue
g_mean = np.array(rdfs_mean[pair])
g_sem = np.array(rdfs_sem[pair]) * np.sqrt(len(frames)) # SEM = std/sqrt(N), this show the std not the SEM
label = pair_labels[pair]
(line,) = ax.plot(r, g_mean, label=label)
ax.fill_between(r, g_mean - g_sem, g_mean + g_sem, alpha=0.25, color=line.get_color())
ax.set_xlim(0, 5)
ax.set_ylim(0, None)
ax.set_xlabel("r (Å)")
ax.set_ylabel("g(r)")
ax.legend()
plt.tight_layout()
plt.show()
2.2 Structure factor S(q)¶
compute_structure_factor returns (q, sq, partials). average_over_frames unpacks as:
(q, sq_mean, partials_mean), (_, sq_sem, partials_sem) = average_over_frames(compute_structure_factor, frames, ...)
(q, sq_mean, partials_mean), (_, sq_sem, partials_sem) = average_over_frames(
am.compute_structure_factor,
frames,
q_min=0.5,
q_max=10.0,
r_max=15.0,
n_q=500,
)
sq_mean_arr = np.array(sq_mean)
sq_sem_arr = np.array(sq_sem) * np.sqrt(len(frames)) # SEM = std/sqrt(N), this shows the std not the SEM
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
ax.plot(q, sq_mean_arr, label="S(q)")
ax.fill_between(q, sq_mean_arr - sq_sem_arr, sq_mean_arr + sq_sem_arr, alpha=0.25)
ax.set_xlabel("q (Å⁻¹)")
ax.set_ylabel("S(q)")
ax.legend()
plt.tight_layout()
plt.show()
/Users/achrafatila/Documents/Workflows/amorphouspy/amorphouspy/src/amorphouspy/analysis/structure_factor.py:319: UserWarning: r_max=15.0000 Å exceeds half the smallest perpendicular cell height (13.2294 Å). r_max has been automatically adjusted to 13.0 Å (largest integer not exceeding the limit). r, rdfs, _ = compute_rdf(structure, r_max=r_max, n_bins=n_bins, type_pairs=type_pairs)
2.3 Qn speciation¶
average_over_frames unpacks as:
(total_qn_mean, partial_qn_mean), (total_qn_sem, partial_qn_sem) = average_over_frames(compute_qn, frames, ...)
The cutoff is the Si-O first-shell cutoff read from the RDF minimum.
# Determine Si-O cutoff from the averaged RDF (pair key: lower Z first, so O=8 < Si=14)
si_o_rdf = np.array(rdfs_mean.get((8, 14), []))
si_o_cutoff = am.find_rdf_minimum(r, si_o_rdf) if len(si_o_rdf) else 2.0
print(f"Si-O cutoff: {si_o_cutoff:.3f} Å")
(total_qn_mean, partial_qn_mean), (total_qn_sem, partial_qn_sem) = average_over_frames(
am.compute_qn,
frames,
cutoff=si_o_cutoff,
former_types=[14], # Si
o_type=8,
)
print("\nFormer Qn distribution (mean ± SEM):")
for n in sorted(total_qn_mean):
print(f" Q{n}: {total_qn_mean[n]:.1f} ± {total_qn_sem[n]:.1f}")
Si-O cutoff: 1.816 Å Former Qn distribution (mean ± SEM): Q0: 2.0 ± 0.0 Q1: 14.0 ± 0.0 Q2: 60.2 ± 0.1 Q3: 90.2 ± 0.1 Q4: 44.1 ± 0.1 Q5: 1.5 ± 0.1 Q6: 0.0 ± 0.0
ns = sorted(total_qn_mean.keys())
means = [total_qn_mean[n] for n in ns]
errors = [total_qn_sem[n] for n in ns]
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
ax.bar([f"Q{n}" for n in ns], means, yerr=errors, capsize=4, color="steelblue")
ax.set_ylabel("Fraction")
plt.tight_layout()
plt.show()
2.4 Bond angle distribution¶
compute_angles returns (bin_centers, hist). average_over_frames unpacks as:
(bin_centers, hist_mean), (_, hist_sem) = average_over_frames(compute_angles, frames, ...)
Here we compute the O-Si-O intra-tetrahedral angle (center_type=14, neighbor_type=8), which should peak near 109.5°.
# O-Si-O angle: center atom is Si (14), neighbor atoms are O (8)
(bin_centers, hist_mean), (_, hist_sem) = average_over_frames(
am.compute_angles,
frames,
center_type=14,
neighbor_type=8,
cutoff=si_o_cutoff,
)
hist_mean_arr = np.array(hist_mean)
hist_sem_arr = np.array(hist_sem) * np.sqrt(len(frames)) # SEM = std/sqrt(N), this shows the std not the SEM
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
ax.plot(bin_centers, hist_mean_arr, label="O-Si-O")
ax.fill_between(bin_centers, hist_mean_arr - hist_sem_arr, hist_mean_arr + hist_sem_arr, alpha=0.25)
ax.set_xlim(0, 180)
ax.set_ylim(0, None)
ax.set_xlabel("Angle (°)")
ax.set_ylabel("Frequency")
ax.legend()
plt.tight_layout()
plt.show()
2.5 Ring statistics¶
average_over_frames unpacks as:
(histogram_mean, mean_size_mean), (histogram_sem, mean_size_sem) = average_over_frames(compute_guttmann_rings, frames, ...)
bond_lengths = am.generate_bond_length_dict(
frames[0], specific_cutoffs={("Si", "O"): 2.0, ("B", "O"): 1.8}, default_cutoff=0.0
)
(histogram_mean, mean_size_mean), (histogram_sem, mean_size_sem) = average_over_frames(
am.compute_guttmann_rings,
frames,
bond_lengths=bond_lengths,
)
print(f"Mean ring size: {mean_size_mean:.2f} ± {mean_size_sem:.2f} members")
ring_sizes = sorted(histogram_mean.keys())
h_mean = [histogram_mean[s] for s in ring_sizes]
h_sem = [histogram_sem.get(s, 0.0) for s in ring_sizes]
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
ax.bar(ring_sizes, h_mean, yerr=h_sem, capsize=4, color="steelblue")
ax.set_xlabel("Ring size")
ax.set_ylabel("Count (mean)")
plt.tight_layout()
plt.show()
Finding rings: 100%|██████████| 553/553 [00:00<00:00, 6139.52edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 7253.05edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 7293.08edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7299.46edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7227.80edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7290.13edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7387.88edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7383.41edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7408.11edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7297.20edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7336.98edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7257.11edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7327.70edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7286.85edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7372.33edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7298.70edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7346.17edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7345.33edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7295.76edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7323.81edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7296.18edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7333.21edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7175.63edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 6356.44edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7360.31edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7433.73edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7263.07edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7348.67edge/s] Finding rings: 100%|██████████| 552/552 [00:00<00:00, 7323.79edge/s] Finding rings: 100%|██████████| 552/552 [00:00<00:00, 7275.03edge/s] Finding rings: 100%|██████████| 552/552 [00:00<00:00, 7217.83edge/s] Finding rings: 100%|██████████| 552/552 [00:00<00:00, 7348.74edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 6856.76edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 6636.98edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 6715.26edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6709.06edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6866.92edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 6927.27edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7117.99edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 6904.32edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6515.45edge/s] Finding rings: 100%|██████████| 553/553 [00:00<00:00, 6860.37edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7160.14edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 6974.57edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6735.29edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6985.67edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 7042.33edge/s] Finding rings: 100%|██████████| 552/552 [00:00<00:00, 6177.24edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7217.32edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7128.95edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7183.53edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 6868.48edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 6782.46edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 6837.25edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 6684.12edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 6959.60edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 7184.39edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7142.10edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7116.43edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 6058.94edge/s] Finding rings: 100%|██████████| 551/551 [00:00<00:00, 7192.35edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7266.78edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7211.06edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7160.63edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7208.16edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7166.70edge/s] Finding rings: 100%|██████████| 546/546 [00:00<00:00, 7174.02edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7109.63edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7121.79edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7193.94edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7195.94edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7165.60edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7263.50edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7161.38edge/s] Finding rings: 100%|██████████| 546/546 [00:00<00:00, 7224.92edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7207.12edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7211.42edge/s] Finding rings: 100%|██████████| 545/545 [00:00<00:00, 7136.49edge/s] Finding rings: 100%|██████████| 546/546 [00:00<00:00, 7237.34edge/s] Finding rings: 100%|██████████| 553/553 [00:00<00:00, 7131.50edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7188.95edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7171.60edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7154.14edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 6297.10edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7120.24edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7244.67edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7275.47edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7115.52edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7187.63edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7121.52edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7205.24edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7200.45edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7116.39edge/s] Finding rings: 100%|██████████| 549/549 [00:00<00:00, 7283.69edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7144.90edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7236.34edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7287.62edge/s] Finding rings: 100%|██████████| 547/547 [00:00<00:00, 7275.77edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7137.84edge/s] Finding rings: 100%|██████████| 550/550 [00:00<00:00, 7161.00edge/s] Finding rings: 100%|██████████| 548/548 [00:00<00:00, 7199.53edge/s]
Mean ring size: 4.93 ± 0.00 members
3. Effect of frame count on uncertainty¶
The standard error of the mean shrinks as $1/\sqrt{N}$. We can verify this empirically by computing the total S(q) standard error of the mean for increasing numbers of frames.
frame_counts = [5, 10, 20, 50, 101]
mean_sems = []
for n in frame_counts:
(_, sq_mean_n, _), (_, sq_sem_n, _) = average_over_frames(
am.compute_structure_factor,
frames[:n],
q_min=0.5,
q_max=15.0,
n_q=500,
)
mean_sems.append(float(np.mean(np.abs(sq_sem_n))))
print(f" N={n:>3d} mean |SEM| = {mean_sems[-1]:.5f}")
N= 5 mean |SEM| = 0.00204 N= 10 mean |SEM| = 0.00141 N= 20 mean |SEM| = 0.00099 N= 50 mean |SEM| = 0.00067 N=101 mean |SEM| = 0.00050
ref_scale = mean_sems[0] * np.sqrt(frame_counts[0])
ref_x = np.linspace(frame_counts[0], frame_counts[-1], 200)
fig, ax = plt.subplots(figsize=(3.5, 3.5 / 1.333), dpi=300)
ax.plot(frame_counts, np.asarray(mean_sems) * 100, "o-", label="Measured SEM")
ax.plot(ref_x, (ref_scale / np.sqrt(ref_x)) * 100, "--", label="1/√N reference")
ax.set_xlabel("Number of frames (N)")
ax.set_ylabel("Mean |SEM| * 100 of S(q)")
ax.legend()
plt.tight_layout()
plt.show()
The measured standard error of the mean closely follows the $1/\sqrt{N}$ curve, confirming that frames are effectively decorrelated and that the averaging is working as expected.
Summary¶
| What | How |
|---|---|
| Load trajectory | ase.io.read(path, index=":") → list[Atoms] |
| Average any analysis function | average_over_frames(fn, frames, **kwargs) → (means, sems) |
| Plot mean ± SEM ribbon | ax.fill_between(x, mean - sem, mean + sem, alpha=0.25) |
| Plot mean ± std ribbon | ax.fill_between(x, mean - sem*√N, mean + sem*√N, alpha=0.25) |