pmcpy: A Python package for PartMC post-processing
Author: Dr. Zhonghua Zheng
About
Introduction
This package is used to post-process PartMC data (see the reference here). Additionally, a mixing state calculator can be used to calculate the mixing state index.
You can play with the mixing state calculator HERE
Relevant Publications
Zheng, Z., Curtis, J. H., Yao, Y., Gasparik, J. T., Anantharaj, V. G., Zhao, L., et al. (2021). Estimating submicron aerosol mixing state at the global scale with machine learning and Earth system modeling. Earth and Space Science, 8, e2020EA001500. https://doi.org/10.1029/2020EA001500
Zheng, Z., West, M., Zhao, L., Ma, P.-L., Liu, X., and Riemer, N.: Quantifying the structural uncertainty of the aerosol mixing state representation in a modal model, Atmos. Chem. Phys., 21, 17727–17741, https://doi.org/10.5194/acp-21-17727-2021, 2021.
Riemer, N. and West, M.: Quantifying aerosol mixing state with entropy and diversity measures, Atmos. Chem. Phys., 13, 11423–11439, https://doi.org/10.5194/acp-13-11423-2013, 2013.
Installation
Step 1: create a conda environment
$ conda create -n pmcpy python=3.8
$ conda activate pmcpy
$ conda install -c conda-forge numpy pandas xarray netcdf4
Step 2: install using pip
$ pip install pmcpy
(optional) install from source::
$ git clone https://github.com/zzheng93/pmcpy.git
$ cd pmcpy
$ python setup.py install
pmcpy quickstart
[1]:
# import necessary package
import pmcpy
import xarray as xr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
load the raw output file from PartMC
[2]:
# define the path to data
p = "../data/urban_plume_0001_00000002.nc"
pmc = pmcpy.load_pmc(p)
number concentration
[3]:
print("overall number conc.:", pmc.get_num_conc(), "# m^{-3}")
print("diameter<=2.5um:", pmc.get_num_conc(pmc.get_aero_particle_diameter()<=2.5e-6), "# m^{-3}")
print("diameter>=2.5um:", pmc.get_num_conc(pmc.get_aero_particle_diameter()>=2.5e-6), "# m^{-3}")
overall number conc.: 8854807162.459946 # m^{-3}
diameter<=2.5um: 8854806282.274422 # m^{-3}
diameter>=2.5um: 880.1855249722205 # m^{-3}
aerosol mass concentration of selected species
[4]:
# consider BC and OC only
aero_species_ls = ["BC","OC"]
print("BC and OC mass conc,:",
pmc.get_aero_mass_conc(aero_species_ls),
"kg m^{-3}")
# consider all species and dry species
all_aero_species = pmc.aero_species.copy()
dry_aero_species = all_aero_species.copy()
dry_aero_species.remove("H2O")
print("overall mass conc.:",
pmc.get_aero_mass_conc(all_aero_species).sum(),
"kg m^{-3}")
print("overall dry mass conc.:",
pmc.get_aero_mass_conc(dry_aero_species).sum(),
"kg m^{-3}")
print("PM2.5 mass conc. (with water):",
pmc.get_aero_mass_conc(all_aero_species,
part_cond=(pmc.get_aero_particle_diameter()<=2.5e-6)).sum(),
"kg m^{-3}")
BC and OC mass conc,: [6.53030409e-10 5.73984931e-09] kg m^{-3}
overall mass conc.: 2.2186891469474523e-08 kg m^{-3}
overall dry mass conc.: 1.316284475347949e-08 kg m^{-3}
PM2.5 mass conc. (with water): 2.2164289553077785e-08 kg m^{-3}
overall mass concentration
[5]:
print("overall mass conc.:",
pmc.get_mass_conc(dry=False), "kg m^{-3}")
print("overall dry mass conc.:",
pmc.get_mass_conc(dry=True), "kg m^{-3}")
print("PM2.5 mass conc. (with water):",
pmc.get_mass_conc(dry=False, part_cond=pmc.get_aero_particle_diameter()<=2.5e-6), "kg m^{-3}")
overall mass conc.: 2.2186891469474523e-08 kg m^{-3}
overall dry mass conc.: 1.316284475347949e-08 kg m^{-3}
PM2.5 mass conc. (with water): 2.2164289553077805e-08 kg m^{-3}
aerosol density
[6]:
print("overall density.:",
pmc.get_aero_density(dry=False), "kg m^{-3}")
print("overall dry density:",
pmc.get_aero_density(dry=True), "kg m^{-3}")
print("PM2.5 density (with water):",
pmc.get_aero_density(dry=False, part_cond=pmc.get_aero_particle_diameter()<=2.5e-6), "kg m^{-3}")
overall density.: 1179.020687087021 kg m^{-3}
overall dry density: 2265.3505453667535 kg m^{-3}
PM2.5 density (with water): 1178.3768174927013 kg m^{-3}
gas mixing ratio
[7]:
gas_list = ['CO','O3']
pmc.get_gas_mixing_ratio(gas_list)
[7]:
<xarray.DataArray 'gas_mixing_ratio' (gas_species: 2)> array([354.311384, 43.069374]) Coordinates: * gas_species (gas_species) int32 17 11 Attributes: unit: ppb long_name: mixing ratios of gas species
- gas_species: 2
- 354.3 43.07
array([354.311384, 43.069374])
- gas_species(gas_species)int3217 11
- names :
- H2SO4,HNO3,HCl,NH3,NO,NO2,NO3,N2O5,HONO,HNO4,O3,O1D,O3P,OH,HO2,H2O2,CO,SO2,CH4,C2H6,CH3O2,ETHP,HCHO,CH3OH,ANOL,CH3OOH,ETHOOH,ALD2,HCOOH,RCOOH,C2O3,PAN,ARO1,ARO2,ALK1,OLE1,API1,API2,LIM1,LIM2,PAR,AONE,MGLY,ETH,OLET,OLEI,TOL,XYL,CRES,TO2,CRO,OPEN,ONIT,ROOH,RO2,ANO2,NAP,XO2,XPAR,ISOP,ISOPRD,ISOPP,ISOPN,ISOPO2,API,LIM,DMS,MSA,DMSO,DMSO2,CH3SO2H,CH3SCH2OO,CH3SO2,CH3SO3,CH3SO2OO,CH3SO2CH2OO,SULFHOX
- description :
- dummy dimension variable (no useful value) - read species names as comma-separated values from the 'names' attribute
array([17, 11], dtype=int32)
- unit :
- ppb
- long_name :
- mixing ratios of gas species
mixing state index calculation
based on group_list
[8]:
group_list = [['SO4','Cl','ARO1','ARO2','ALK1','OLE1',
'API1','API2','LIM1','LIM2','Na'],
['BC','OC','OIN']]
pmc.get_mixing_state_index(group_list=group_list, diversity=False)
[8]:
0.838104186813985
based on all species (without water
)
[9]:
pmc.get_mixing_state_index(drop_list=["H2O"], diversity=False)
[9]:
0.6471920046172537
specific range of diameters
and display diversity
(\(D_\alpha\), \(D_\gamma\), \(\chi\))
[10]:
pmc.get_mixing_state_index(drop_list=["H2O"],
part_cond=pmc.get_aero_particle_diameter()<=2.5e-6,
diversity=True)
[10]:
(3.1876143541221276, 4.355134759609095, 0.6520198176412458)
particle size distribution visualization (number concentration v.s. diameter)
overall number concentration
[11]:
# get number distribution
aero_diameter = pmc.get_aero_particle_diameter()
num_conc_per_particle = pmc.ds["aero_num_conc"]
# setup the 111 bins ranged from 10^-8 to 10^-6
bins = np.logspace(-8,-6,2*50+1)
# plot the number distribution
plt.hist(aero_diameter, bins=bins, weights=num_conc_per_particle)
plt.xscale('log')
plt.xlabel('diameter [m]')
plt.ylabel(r'num conc. [# m$^{-3}$]')
plt.show()

number concentration for particles with diameters between 1\(\mu\)m and 2.5\(\mu\)m
[12]:
# get the diameters of all particles
aero_diameter = pmc.get_aero_particle_diameter()
# select particles with diameters between 1um and 2.5um
part_cond = (aero_diameter<=2.5e-6) & (aero_diameter>=1.0e-6)
aero_diameter_cond = aero_diameter[part_cond]
num_conc_per_particle_cond = pmc.ds["aero_num_conc"][part_cond]
# setup the 111 bins ranged from 10^-8 to 10^-6
bins = np.logspace(-7,-5,2*50+1)
# plot the number distribution
plt.hist(aero_diameter_cond, bins=bins, weights=num_conc_per_particle_cond)
plt.xscale('log')
plt.xlabel('diameter [m]')
plt.ylabel(r'num conc. [# m$^{-3}$]')
plt.show()

mixing state calculator
row
is a particle
, each column
is the mass
of a species
)file_name
and group_list
belowsurrogate species
Run
button on the top of your jupyter notebookPlease edit the “file_name” and “group_list”, then run the code below
[1]:
file_name = "sample_data.csv"
group_list = [["BC","POM"],
["DUST"],
["SS"]]
# ====== please don't change ======
import numpy as np
import pandas as pd
import pmcpy
# load data
df = pd.read_csv(file_name)
# get matrix for calculation
da = np.concatenate([df[group].values.sum(axis=1).reshape(-1,1) for group in group_list],axis=1)
# calculate mixing state index
D_alpha, D_gamma, chi = pmcpy.get_chi(da)
print("mixing state index:", chi)
print("average particle (alpha) species diversity:", D_alpha)
print("bulk population (gamma) species diversity:", D_gamma)
mixing state index: 0.9408323695414793
average particle (alpha) species diversity: 2.838147712915256
bulk population (gamma) species diversity: 2.9537462489849164
benchmarking
This script is used to compare the pmcpy results with urban_plume_process.F90
[1]:
# import necessary package
import pmcpy
import xarray as xr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# define the path to data
p = "../data/"
results from ``pmcpy``
[2]:
# define a dictionary to save the results
d = {"tot_num_conc": [], "tot_mass_conc":[],
"chi":[], "d_alpha":[], "d_gamma":[],
"chi_a":[], "d_alpha_a":[], "d_gamma_a":[]}
# define the surrogate groups for mixing state calculation
group_list = [["OC","BC"],
["API1","API2","LIM1","LIM2"],
["SO4","NO3","NH4"]]
# loop across scenarios using pmcpy
for i in range(1,26):
pmc = pmcpy.load_pmc(p+"/urban_plume_0001_000000"+str(i).zfill(2)+".nc")
d["tot_num_conc"].append(pmc.get_num_conc())
d["tot_mass_conc"].append(pmc.get_mass_conc(dry=False))
# calculate mixing state for grouped species with water
D_alpha, D_gamma, chi = pmc.get_mixing_state_index(group_list, diversity=True)
d["d_alpha"].append(D_alpha)
d["d_gamma"].append(D_gamma)
d["chi"].append(chi)
# calculate mixing state for all species (without water)
D_alpha_a, D_gamma_a, chi_a = pmc.get_mixing_state_index(drop_list=["H2O"], diversity=True)
d["d_alpha_a"].append(D_alpha_a)
d["d_gamma_a"].append(D_gamma_a)
d["chi_a"].append(chi_a)
compare the results from Fortran postprocessing ``urban_plume_process.F90``
[3]:
ds_b = xr.open_dataset(p+"urban_plume_process.nc")
# ======= comparison =======
for k in d:
print("######",k,"######")
fig = plt.figure(figsize=(9,2))
ax1 = fig.add_subplot(131)
pd.Series(ds_b[k].values).plot(ax=ax1)
ax1.set_title("Fortran")
ax2 = fig.add_subplot(132)
pd.Series(d[k]).plot(ax=ax2)
ax2.set_title("pmcpy (Python)")
ax3 = fig.add_subplot(133)
pd.Series(ds_b[k].values-np.array(d[k])).plot(ax=ax3)
ax3.set_title("Difference")
plt.tight_layout()
plt.show()
###### tot_num_conc ######

###### tot_mass_conc ######

###### chi ######

###### d_alpha ######

###### d_gamma ######

###### chi_a ######

###### d_alpha_a ######

###### d_gamma_a ######

format of PartMC raw output
Here is a sample of PartMC raw output
[1]:
import xarray as xr
p = "../data/urban_plume_0001_00000002.nc"
ds = xr.open_dataset(p)
ds
[1]:
<xarray.Dataset> Dimensions: (gas_species: 77, aero_species: 20, aero_source: 8, aero_weight_group: 2, aero_weight_class: 8, aero_particle: 1246, aero_removed: 4636) Coordinates: * gas_species (gas_species) int32 1 2 3 4 5 ... 73 74 75 76 77 * aero_species (aero_species) int32 1 2 3 4 5 ... 17 18 19 20 * aero_source (aero_source) int32 1 2 3 4 5 6 7 8 * aero_weight_group (aero_weight_group) int32 1 2 * aero_weight_class (aero_weight_class) int32 1 2 3 4 5 6 7 8 * aero_particle (aero_particle) int32 1 2 3 4 ... 1244 1245 1246 * aero_removed (aero_removed) int32 1 2 3 4 ... 4634 4635 4636 Data variables: (12/48) time float64 3.6e+03 timestep float64 60.0 timestep_index int32 2 repeat int32 1 temperature float64 292.5 relative_humidity float64 0.8126 ... ... aero_refract_core_real (aero_particle) float64 0.0 0.0 0.0 ... 0.0 0.0 aero_refract_core_imag (aero_particle) float64 0.0 0.0 0.0 ... 0.0 0.0 aero_core_vol (aero_particle) float64 0.0 0.0 0.0 ... 0.0 0.0 aero_removed_id (aero_removed) int32 134 602 715 ... 5534 5783 aero_removed_action (aero_removed) int32 1 1 1 4 3 3 ... 1 1 1 1 1 4 aero_removed_other_id (aero_removed) int32 32767 32767 ... 32767 0 Attributes: title: PartMC version 2.5.0 output file source: PartMC version 2.5.0 UUID: A96876F0-F62E-45FB-AC47-2FB221548641 history: 2021-09-22T15:54:26.773-05:00 created by PartMC version 2.5.0 Conventions: CF-1.4
particles visualization
This script is used for generating and plotting particles
[1]:
import matplotlib.pyplot as plt
import numpy as np
[2]:
mat = np.random.lognormal(size=(100,6))
radius = np.random.normal(size=(100,1))
colors = ["#000000","#FF7F0E","#1F77B4","#7efc3f","#e027e6","#D62728"]
[3]:
fig, axes = plt.subplots(10, 10, figsize=(12,8))
axes=axes.ravel()
# plot each pie chart in a separate subplot
for i in range(100):
axes[i].pie(mat[i,],radius=radius[i], colors=colors)
plt.tight_layout()
plt.show()

How to ask for help
The GitHub issue tracker is the primary place for bug reports.