Intro to the collection interface
The base.LabeledCollection interface is one of the fundamental structures in PlesioGeostroPy.
It is a class designed for convenient manipulation (esp. symbolic manipulation) of a physically meaningful collection of variables / fields / equations.
To understand why this is necessary, we just need to compare the PG model with its 3D counterpart, i.e. the MHD equations. In the original MHD equations, the unknown fields are the vector velocity field \(\mathbf{U}\), the vector magnetic field \(\mathbf{B}\), and perhaps the scalar temperature field \(\Theta\). If the Mie decomposition is used, then these are further reduced to five scalars \(T_u\), \(P_u\), \(T_b\), \(P_b\) and \(\Theta\). Either way, the set of unknown variables is small, well defined, and well organized by physical meaning, and so are the governing equations. Manipulating these variables can easily be done by writing code for each equation / variable.
Things become much more cumbersome when dealing with PG equations and variables. First, we are now dealing with at least 15 variables, and it is no longer desirable to write a code snippet for every single variable / equation. Second, although all the variables have a clear physical meaning, their conglomeration does not. The velocity field occupies merely one variable, while the magnetic field occupies eight, among which there are both symmetric integrals and anti-symmetric integrals. The existence of boundary quantities poses another complication.
The design of the class base.LabeledCollection is hence directed at fulfilling two purposes. First, it allows batch manipulation of all variables within the collection, without the need to write code for each component. Second, when field-specific manipulation is necessary, the class admits extracting one or more fields by calling their “names”. Underneath, the class is really just a class whose attributes can be traversed with an iterator, but it provides the flexibility to support both batch and specific manipulation.
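To make the idea of "a class whose attributes can be traversed with an iterator" concrete, here is a minimal sketch. The class MiniLabeledCollection is a hypothetical stand-in written for this illustration only; it is not the PlesioGeostroPy implementation and may differ from it in many details.

```python
class MiniLabeledCollection:
    """Toy stand-in for a labeled collection (hypothetical, not the library code)."""

    def __init__(self, names, **fields):
        # The order of the names defines the integer index of each field
        self._field_names = list(names)
        for name in self._field_names:
            # Fields that are not supplied default to None
            setattr(self, name, fields.get(name))

    def __getitem__(self, key):
        # Integer keys are translated to names; string keys are used directly
        if isinstance(key, int):
            key = self._field_names[key]
        return getattr(self, key)

    def __iter__(self):
        # Traverse the field values in the order of the names
        return (getattr(self, name) for name in self._field_names)


fields = MiniLabeledCollection(["Psi", "Mss", "Mpp"], Psi=1.0, Mss=2.0, Mpp=3.0)
doubled = [2 * value for value in fields]  # batch manipulation of all fields
single = fields["Mss"]                     # field-specific access by name
```

Storing the fields as ordinary attributes keeps attribute access free, while the ordered list of names is all that is needed to support both integer indexing and iteration.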
import os, sys
root_dir = "."
# The following 2 lines are a hack to import from parent directory
# If the notebook is run in the root directory, comment out these 2 lines
sys.path.append(os.path.dirname(os.getcwd()))
root_dir = ".."
from pg_utils.pg_model import base
LabeledCollection: initialization and operations
Now let us look at an example of how this class can be used. Granted, there are not many circumstances where one wants a collection to be both iterable and indexable by name. The following example may not be the most compelling, but it gives an idea of how this interface functions.
Let us consider the meta-data of an article, or any published material really. It has fields such as “title”, “DOI”, etc. We can build a LabeledCollection around this structure.
The construction of the base.LabeledCollection class starts with a list of field names, followed by the values of these fields. For the example, we can construct an entry using:
import datetime
article_meta = base.LabeledCollection(
    ["Title", "Abstract", "Authors", "Date", "AccessDate", "DOI", "Journal"],
    Title="The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly",
    Abstract=("We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field "
              "between 1999 and 2020 based on magnetic field observations "
              "collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, "
              "and on annual differences of monthly means of ground observatory measurements."),
    Authors=["C. C. Finlay", "C. Kloss", "N. Olsen", "M. D. Hammer", "L. Tøffner-Clausen", "A. Grayver", "A. Kuvshinov"],
    Date=datetime.datetime(2020, 10, 20),
    AccessDate=datetime.datetime(2023, 3, 7),
    DOI="10.1186/s40623-020-01252-9",
    Journal="Earth, Planets and Space"
)
The field names can be accessed via
article_meta._field_names
['Title', 'Abstract', 'Authors', 'Date', 'AccessDate', 'DOI', 'Journal']
The fields within this data structure are arranged according to this list, which is the first parameter passed in. For instance, the first field is the title
article_meta[0]
'The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly'
and the last field will be the name of the journal
article_meta[-1]
'Earth, Planets and Space'
The same field can also be accessed by the name of the field, i.e. “Journal”:
article_meta["Journal"]
'Earth, Planets and Space'
At this point one might think that a dictionary is sufficient as a container: the values can be accessed via the key (“call me by my name”, if you will), and can also be accessed via the index, say using dict[list(dict.keys())[index]] or list(dict.values())[index]. However, there are two problems. First, it is generally unsafe to assume that the ordering within a dictionary/map is stable; Python dictionaries, for example, only guarantee insertion order from version 3.7 onwards. Second, designing a dedicated data structure gives you more control over how it can be used. For instance, we can also access the value as an attribute:
article_meta.Authors
['C. C. Finlay',
'C. Kloss',
'N. Olsen',
'M. D. Hammer',
'L. Tøffner-Clausen',
'A. Grayver',
'A. Kuvshinov']
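For comparison, the plain-dictionary alternative discussed above can be sketched as follows. The dictionary meta is a reduced copy of the article metadata made up for this illustration, and the positional access relies on dictionaries preserving insertion order (Python 3.7 or later).

```python
meta = {
    "Title": "The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly",
    "DOI": "10.1186/s40623-020-01252-9",
    "Journal": "Earth, Planets and Space",
}

# Access by name is direct
journal_by_name = meta["Journal"]

# Access by position requires building a temporary list first
journal_by_index = list(meta.values())[-1]
journal_by_key_index = meta[list(meta.keys())[-1]]
```

All three expressions return the same value, but the positional forms are more verbose, and a plain dictionary offers no attribute-style access such as meta.Journal.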
Iterator: traversing the collection
One of the key things enabled by assigning an index to each field is that you can iterate through the collection. You can do this by iterating through the field names:
for field_name in article_meta._field_names:
    print("{:12s}: {:s}".format(field_name, str(article_meta[field_name])))
Title : The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly
Abstract : We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.
Authors : ['C. C. Finlay', 'C. Kloss', 'N. Olsen', 'M. D. Hammer', 'L. Tøffner-Clausen', 'A. Grayver', 'A. Kuvshinov']
Date : 2020-10-20 00:00:00
AccessDate : 2023-03-07 00:00:00
DOI : 10.1186/s40623-020-01252-9
Journal : Earth, Planets and Space
or simply iterating through the field values. Your call.
for field_value in article_meta:
    print(str(field_value))
The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly
We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.
['C. C. Finlay', 'C. Kloss', 'N. Olsen', 'M. D. Hammer', 'L. Tøffner-Clausen', 'A. Grayver', 'A. Kuvshinov']
2020-10-20 00:00:00
2023-03-07 00:00:00
10.1186/s40623-020-01252-9
Earth, Planets and Space
There is also some built-in syntactic sugar for simpler manipulation of a collection.
One example is the apply function, which applies a processing method to all of the fields.
Say you (for some reason) want to make a 1950s Swiss style poster of this article, and you want all letters to be in lowercase (see examples here).
def string_lower(field_value):
    # Lowercase strings and lists of strings; leave everything else untouched
    if isinstance(field_value, str):
        return field_value.lower()
    elif isinstance(field_value, list):
        return [item.lower() for item in field_value]
    else:
        return field_value
article_meta_swiss_style = article_meta.apply(string_lower)
for field_name in article_meta_swiss_style._field_names:
    print("{:12s}: {:s}".format(field_name, str(article_meta_swiss_style[field_name])))
Title : the chaos-7 geomagnetic field model and observed changes in the south atlantic anomaly
Abstract : we present the chaos-7 model of the time-dependent near-earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-earth orbit satellites swarm, cryosat-2, champ, sac-c and ørsted, and on annual differences of monthly means of ground observatory measurements.
Authors : ['c. c. finlay', 'c. kloss', 'n. olsen', 'm. d. hammer', 'l. tøffner-clausen', 'a. grayver', 'a. kuvshinov']
Date : 2020-10-20 00:00:00
AccessDate : 2023-03-07 00:00:00
DOI : 10.1186/s40623-020-01252-9
Journal : earth, planets and space
You can also optionally pass in the name of the field. For instance, say you want the dates to be expressed in days since 2000:
def date_ref_2000(field_name, field_value):
    # Convert fields whose name ends in "Date" to days since 2000-01-01
    if field_name[-4:] == "Date":
        return (field_value - datetime.datetime(2000, 1, 1)) / datetime.timedelta(days=1)
    else:
        return field_value
article_meta_ref_2000 = article_meta.apply(date_ref_2000, metadata=True)
for field_name in article_meta_ref_2000._field_names:
    print("{:12s}: {:s}".format(field_name, str(article_meta_ref_2000[field_name])))
Title : The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly
Abstract : We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.
Authors : ['C. C. Finlay', 'C. Kloss', 'N. Olsen', 'M. D. Hammer', 'L. Tøffner-Clausen', 'A. Grayver', 'A. Kuvshinov']
Date : 7598.0
AccessDate : 8466.0
DOI : 10.1186/s40623-020-01252-9
Journal : Earth, Planets and Space
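The behaviour of apply with and without the field name can be mimicked by a small standalone function. The helper apply_to_fields below is hypothetical and written only for this illustration; it is a sketch of the behaviour shown above, not the library's own code, and it works on plain lists rather than on a LabeledCollection.

```python
import datetime


def apply_to_fields(names, values, func, metadata=False):
    """Hypothetical helper: apply func to every field and return the new values."""
    if metadata:
        # Pass both the field name and the field value
        return [func(name, value) for name, value in zip(names, values)]
    # Pass the field value only
    return [func(value) for value in values]


def date_ref_2000(field_name, field_value):
    # Convert fields whose name ends in "Date" to days since 2000-01-01
    if field_name.endswith("Date"):
        return (field_value - datetime.datetime(2000, 1, 1)) / datetime.timedelta(days=1)
    return field_value


names = ["Date", "AccessDate", "DOI"]
values = [
    datetime.datetime(2020, 10, 20),
    datetime.datetime(2023, 3, 7),
    "10.1186/s40623-020-01252-9",
]
converted = apply_to_fields(names, values, date_ref_2000, metadata=True)
print(converted)  # [7598.0, 8466.0, '10.1186/s40623-020-01252-9']
```

Dividing one timedelta by another yields a float, which is why the dates come out as 7598.0 and 8466.0 rather than as integers.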
Serialization
The last useful function is to serialize the collection. This can be particularly useful when you want to export the collection.
Serialization of a collection converts the collection into a list of tuples. By default, all of the items are converted to strings.
article_meta.serialize()
[('Title',
'The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly'),
('Abstract',
'We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.'),
('Authors',
"['C. C. Finlay', 'C. Kloss', 'N. Olsen', 'M. D. Hammer', 'L. Tøffner-Clausen', 'A. Grayver', 'A. Kuvshinov']"),
('Date', '2020-10-20 00:00:00'),
('AccessDate', '2023-03-07 00:00:00'),
('DOI', '10.1186/s40623-020-01252-9'),
('Journal', 'Earth, Planets and Space')]
You can customize how field values are converted by passing the serializer parameter.
For instance, suppose you want everything to remain the same, without any conversion:
article_meta.serialize(serializer=lambda x: x)
[('Title',
'The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly'),
('Abstract',
'We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.'),
('Authors',
['C. C. Finlay',
'C. Kloss',
'N. Olsen',
'M. D. Hammer',
'L. Tøffner-Clausen',
'A. Grayver',
'A. Kuvshinov']),
('Date', datetime.datetime(2020, 10, 20, 0, 0)),
('AccessDate', datetime.datetime(2023, 3, 7, 0, 0)),
('DOI', '10.1186/s40623-020-01252-9'),
('Journal', 'Earth, Planets and Space')]
This way the list remains a list, and the datetimes remain datetimes.
The serialized object contains all the information of the collection, and can be used to reconstruct a collection:
article_meta_reconstructed = base.LabeledCollection.deserialize(article_meta.serialize(serializer=lambda x: x))
for field_name in article_meta_reconstructed._field_names:
    print("{:12s}: {:s}".format(field_name, str(article_meta_reconstructed[field_name])))
Title : The CHAOS-7 geomagnetic field model and observed changes in the South Atlantic Anomaly
Abstract : We present the CHAOS-7 model of the time-dependent near-Earth geomagnetic field between 1999 and 2020 based on magnetic field observations collected by the low-Earth orbit satellites Swarm, CryoSat-2, CHAMP, SAC-C and Ørsted, and on annual differences of monthly means of ground observatory measurements.
Authors : ['C. C. Finlay', 'C. Kloss', 'N. Olsen', 'M. D. Hammer', 'L. Tøffner-Clausen', 'A. Grayver', 'A. Kuvshinov']
Date : 2020-10-20 00:00:00
AccessDate : 2023-03-07 00:00:00
DOI : 10.1186/s40623-020-01252-9
Journal : Earth, Planets and Space
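Because the default serialization contains only strings, a list of (name, value) tuples of that shape can be exported with the standard json module. The snippet below is a sketch of one possible export route, assumed for illustration rather than taken from PlesioGeostroPy; the list serialized is a shortened, hand-written example of the output shape shown above.

```python
import json

# A short list of (name, string) pairs in the shape of the default serialized output
serialized = [
    ("DOI", "10.1186/s40623-020-01252-9"),
    ("Journal", "Earth, Planets and Space"),
]

# JSON has no tuple type, so the pairs are written as two-element lists
text = json.dumps(serialized)

# Convert the inner lists back to tuples after loading
restored = [tuple(pair) for pair in json.loads(text)]
```

Keeping the data as an ordered list of pairs, rather than as a JSON object, preserves the field order explicitly.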
Specializations of collections
There are two major specializations of collections in the PG model: base.CollectionPG and base.CollectionConjugate.
Both inherit from the base class base.LabeledCollection and have practically the same functionality, except that the field names are pre-defined.
CollectionPG
For base.CollectionPG, the fields correspond to the PG variables:
base.CollectionPG.pg_field_names
['Psi',
'Mss',
'Mpp',
'Msp',
'Msz',
'Mpz',
'zMss',
'zMpp',
'zMsp',
'Bs_e',
'Bp_e',
'Bz_e',
'dBs_dz_e',
'dBp_dz_e',
'Br_b',
'Bs_p',
'Bp_p',
'Bz_p',
'Bs_m',
'Bp_m',
'Bz_m']
Several key variables are of this type. For instance, the collection of the PG variables:
from pg_utils.pg_model import core
type(core.pgvar)
pg_utils.pg_model.base.CollectionPG
These field values are sympy.Function objects that represent the corresponding PG fields:
for field_name in core.pgvar._field_names:
    print("{:s}:".format(field_name))
    display(core.pgvar[field_name])
Psi:
Mss:
Mpp:
Msp:
Msz:
Mpz:
zMss:
zMpp:
zMsp:
Bs_e:
Bp_e:
Bz_e:
dBs_dz_e:
dBp_dz_e:
Br_b:
Bs_p:
Bp_p:
Bz_p:
Bs_m:
Bp_m:
Bz_m:
CollectionConjugate
Similar variables exist for the conjugate/transformed fields:
base.CollectionConjugate.cg_field_names
['Psi',
'M_1',
'M_p',
'M_m',
'M_zp',
'M_zm',
'zM_1',
'zM_p',
'zM_m',
'B_ep',
'B_em',
'Bz_e',
'dB_dz_ep',
'dB_dz_em',
'Br_b',
'B_pp',
'B_pm',
'Bz_p',
'B_mp',
'B_mm',
'Bz_m']
type(core.cgvar)
pg_utils.pg_model.base.CollectionConjugate
for field_name in core.cgvar._field_names:
    print("{:s}:".format(field_name))
    display(core.cgvar[field_name])
Psi:
M_1:
M_p:
M_m:
M_zp:
M_zm:
zM_1:
zM_p:
zM_m:
B_ep:
B_em:
Bz_e:
dB_dz_ep:
dB_dz_em:
Br_b:
B_pp:
B_pm:
Bz_p:
B_mp:
B_mm:
Bz_m:
For more detailed information on the variables, please refer to Demo_Variables.
Transformation between PG and transformed variables
The conjugate / transformed variables form another set of variables that is mathematically equivalent to the PG variables, but is preferable in that it admits simpler spectral expansions that fulfill the regularity conditions. If you are unsure what this means, please refer to the formulation PDF.
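To illustrate what "mathematically equivalent" means for two sets of variables related by an invertible linear transform, here is a toy sketch. The pair (a, b) and the combinations a + ib and a - ib are assumptions chosen purely for illustration; they are not the actual definitions of the PG or conjugate variables, for which the formulation PDF remains the reference.

```python
def to_conjugate(a, b):
    """Toy forward transform (an assumption for illustration, not the PG definition)."""
    return a + 1j * b, a - 1j * b


def from_conjugate(p, m):
    """Inverse of the toy transform above."""
    return (p + m) / 2, (p - m) / 2j


a, b = 0.5, -2.0
p, m = to_conjugate(a, b)
a_back, b_back = from_conjugate(p, m)  # recovers a and b, as complex numbers
```

Since the transform can be inverted exactly, either set of variables carries the same information, and the choice between them can be made on the basis of convenience.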
PlesioGeostroPy provides the interface for linearly transforming one set of variables to another.
For instance, core.PG_to_conjugate gives the linear transform from PG to conjugate / transformed variables:
import sympy
cgvar_in_pgvar = core.PG_to_conjugate(core.pgvar)
for field_name in core.cgvar._field_names:
    display(sympy.Eq(core.cgvar[field_name], cgvar_in_pgvar[field_name].expand(), evaluate=False))
Similarly, core.conjugate_to_PG converts transformed / conjugate variables to their PG counterparts.
import sympy
pgvar_in_cgvar = core.conjugate_to_PG(core.cgvar)
for field_name in core.pgvar._field_names:
    display(sympy.Eq(core.pgvar[field_name], pgvar_in_cgvar[field_name].expand(), evaluate=False))
The transforms core.conjugate_to_PG and core.PG_to_conjugate have several applications.
They are used in the module core to construct mappings from transformed variables to PG variables, and vice versa;
They are used in the module equations to derive the transformed equations from the original PG equations;
They also work on numerical arrays, and can be used in post-processing to construct PG fields from transformed quantities.