pysep¶
Submodules¶
Attributes¶
Classes¶
Functions¶
Interactive/scripting function to run PySep and return quality controlled,
Specfem3D outputs seismograms to ASCII (.sem? or .sem.ascii) files.
Convert a SPECFEM STATIONS file into an ObsPy Inventory object.
Addition to the base ObsPy.read_events() function that, in addition to the
Package Contents¶
- class pysep.Pysep(client='IRIS', minlatitude=None, minlongitude=None, maxlatitude=None, maxlongitude=None, user=None, password=None, use_mass_download=False, client_debug=False, timeout=600, llnl_db_path=None, event_selection='default', origin_time=None, event_latitude=None, event_longitude=None, event_depth_km=None, event_magnitude=None, seconds_before_event=20, seconds_after_event=20, reference_time=None, seconds_before_ref=100, seconds_after_ref=300, extra_download_pct=0.005, networks='*', stations='*', locations='*', channels='*', station_ids=None, mindistance_km=0, maxdistance_km=20000.0, minazimuth=0, maxazimuth=360, remove_clipped=True, remove_insufficient_length=True, remove_masked_data=True, fill_data_gaps=False, gap_fraction=1.0, detrend=True, demean=True, taper_percentage=0.0, rotate=None, resample_freq=None, scale_factor=1, remove_response=True, output_unit='VEL', water_level=60, pre_filt='default', phase_list=None, taup_model='ak135', config_file=None, log_level='DEBUG', legacy_naming=False, overwrite_event_tag=None, write_files='inv,event,stream,sac,config_file,station_list', plot_files='all', output_dir=None, overwrite=False, **kwargs)[source]¶
Download, preprocess, and save waveform data using ObsPy
Note
Parameters for general data gathering control
- Parameters:
client (str) – ObsPy FDSN client to query data from, e.g., IRIS, LLNL, NCEDC or any FDSN clients accepted by ObsPy. Defaults to ‘IRIS’
minlatitude (float) – for event, station and waveform retrieval. Defines the minimum latitude for a rectangular bounding box that is used to search for data. Only used for events if `event_selection`==’search’
maxlatitude (float) – for event, station and waveform retrieval. Defines the maximum latitude for a rectangular bounding box that is used to search for data. Only used for events if `event_selection`==’search’
minlongitude (float) – for event, station and waveform retrieval. Defines the minimum longitude for a rectangular bounding box that is used to search for data. Only used for events if `event_selection`==’search’
maxlongitude (float) – for event, station and waveform retrieval. Defines the maximum longitude for a rectangular bounding box that is used to search for data. Only used for events if `event_selection`==’search’
user (str) – User ID if IRIS embargoes data behind passwords. This is passed into the instantiation of client.
password (str) – Password if IRIS embargoes data behind passwords. This is passed into the instantiation of ‘client’
use_mass_download (bool) – Use ObsPy’s mass download option to download all available stations in the region regardless of data provider.
client_debug (bool) – turn on DEBUG mode for the ObsPy FDSN client, which outputs information-rich log messages to std out. Use for debugging when FDSN fails mysteriously.
timeout (float) – time out time in units of seconds, passed to the client to determine how long to wait for return data before exiting. Defaults to 600s.
llnl_db_path (str) – If `client`==’LLNL’, PySEP assumes we are accessing data from the LLNL waveform database (which must be stored locally). Points to the path where this database is saved.
Note
Event selection parameters
- Parameters:
event_selection (str) – How to define the Event, which sets the event origin time and hypocentral location. Options are:
- ‘default’: the user defines the Event with origin_time, event_latitude and event_longitude
- ‘search’: PySEP uses client to search for a Catalog event matching origin_time, event_magnitude and event_depth_km. The buffer time around origin_time is set by seconds_before_event and seconds_after_event.
origin_time (str) – the event origin time used as a central reference point for data gathering. Must be in a string format that is recognized by ObsPy UTCDateTime. For example ‘2000-01-01T00:00:00’.
event_latitude (float) – latitude of the event in units of degrees. used for defining the event hypocenter and for removing stations based on distance from the event.
event_longitude (float) – longitude of the event in units of degrees. used for defining the event hypocenter and for removing stations based on distance from the event.
event_depth_km (float or NoneType) –
depth of the event in units of kilometers, with positive values for deeper depths. Used for:
1) `event_selection`==’search’
2) estimating phase arrivals with TauP
3) plotting events and title on source receiver maps
If set to None, (2) and (3) will fail. Best guesses are acceptable.
event_magnitude (float or NoneType) – event magnitude in Mw used for `event_selection`==’search’ and source receiver map plotting. If provided as None, map plotting will fail.
seconds_before_event (float) – For event selection only, only used if `event_selection`==’search’. Time [s] before the given origin_time to search for a matching catalog event from the given client
seconds_after_event (float) – For event selection only, only used if `event_selection`==’search’. Time [s] after the given origin_time to search for a matching catalog event from the given client
Note
Waveform and station metadata gathering parameters
- Parameters:
reference_time (str) – Waveform origin time. If not given, defaults to the event origin time. This allows for a static time shift from the event origin time, e.g., if there are timing errors with relation to the origin_time. Defaults to NoneType (origin_time).
seconds_before_ref (float) – For waveform fetching. Defines the time before reference_time to fetch waveform data. Units [s]
seconds_after_ref (float) – For waveform fetching. Defines the time after reference_time to fetch waveform data. Units [s]
extra_download_pct (float) – extra download percentage. Adds a buffer around the requested time window (set by seconds_before_ref and seconds_after_ref), which gathers a bit of extra data that is later trimmed away. Used because gathering data exactly at the requested time limits may lead to waveforms that are shorter than expected after resampling or preprocessing. Given as a fraction in [0, 1]; defaults to 0.005 (0.5%).
networks (str) – name or names of networks to query for, if names plural, must be a comma-separated list, i.e., ‘AK,AT,AV’. Wildcards okay, defaults to ‘*’.
stations (str) – station name or names to query for. If multiple stations, input as a list of comma-separated values, e.g., ‘STA01,STA02,STA03’. Wildcards are acceptable. If using wildcards, use a ‘-’ to exclude stations (e.g., ‘*,-STA01’ will gather all stations available, except STA01). Defaults to ‘*’.
locations (str) – locations name or names to query for, wildcard okay. See stations for inputting multiple location values. Default ‘*’.
channels (str) – channel name or names to query for. If multiple channels, input as a list of comma-separated values, e.g., ‘HH?,BH?’. Wildcards acceptable. Defaults to ‘*’.
station_ids (list of str) – an alternative to gathering based on individual codes. Allows the user to input a direct list of trace IDs, which will be broken up and used to gather waveforms and metadata. NOTE: OVERRIDES networks, stations, locations, and channels; these parameters will NOT be used. Station IDs should be provided as: [‘NN.SSS.LL.CCC’, …]
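The padding described for extra_download_pct can be sketched as follows. This is a minimal illustration and not PySEP's implementation: the helper name `padded_window` is hypothetical, standard-library `datetime` objects stand in for ObsPy's UTCDateTime, and the pad is assumed to be the given fraction of the full window length applied to both ends.

```python
from datetime import datetime, timedelta


def padded_window(reference_time, seconds_before_ref=100,
                  seconds_after_ref=300, extra_download_pct=0.005):
    """Return (start, end) of a download window padded on both ends.

    Assumption: the pad equals `extra_download_pct` times the total
    requested window length. PySEP's exact rule may differ.
    """
    window_s = seconds_before_ref + seconds_after_ref
    pad = timedelta(seconds=extra_download_pct * window_s)
    start = reference_time - timedelta(seconds=seconds_before_ref) - pad
    end = reference_time + timedelta(seconds=seconds_after_ref) + pad
    return start, end


# Hypothetical reference time; defaults give a 400 s window and a 2 s pad
start, end = padded_window(datetime(2000, 1, 1))
print(start.isoformat(), end.isoformat())
```

The extra samples would then be trimmed away after resampling so that the final traces span exactly the requested window.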
Note
Station removal and curtailing parameters
- Parameters:
mindistance_km (float) –
Used for removing stations and for the mass download option:
- Removing stations: remove any stations that are closer to the event than the given minimum distance (units: km). Always applied
- Mass Download: if use_mass_download is True and `domain_type`==’circular’, defines the minimum radius around the event hypocenter to gather waveform data and station metadata
maxdistance_km (float) –
Used for removing stations and for the mass download option:
- Removing stations: remove any stations that are farther from the event than the given maximum distance (units: km). Always applied
- Mass Download: if use_mass_download is True and `domain_type`==’circular’, defines the maximum radius around the event hypocenter to gather waveform data and station metadata
minazimuth (float) – for station removal. Stations whose azimuth relative to the event hypocenter does not fall within the bounds [minazimuth, maxazimuth] are removed from the final list. Defaults to 0 degrees.
maxazimuth (float) – for station removal. Stations whose azimuth relative to the event hypocenter does not fall within the bounds [minazimuth, maxazimuth] are removed from the final list. Defaults to 360 degrees.
remove_clipped (bool) – remove any clipped stations from gathered stations. Checks the maximum amplitude of each trace against the maximum value expected for a 24-bit signal. Defaults to True
remove_insufficient_length (bool) – remove waveforms whose trace length does not match the average (mode) trace length in the stream. Defaults to True
remove_masked_data (bool) – If fill_data_gaps is False or None, data with gaps that go through the merge process will contain masked arrays (essentially retaining gaps). By default, PySEP will remove these data during processing. To keep these data, set remove_masked_data == False.
fill_data_gaps (str or int or float or bool) –
How to deal with data gaps (missing sections of waveform over a continuous time span). False by default, which means data with gaps are removed completely. Users who want access to data with gaps must choose how gaps are filled. See the API for ObsPy.core.stream.Stream.merge() for how the merge is handled.
Options include:
- ‘mean’: fill with the mean of all data values in the gappy data
- <int or float>: fill with a constant, user-defined value, e.g., 0 or 1.23 or 9.999
- ‘interpolate’: linearly interpolate from the last value pre-gap to the first value post-gap
- ‘latest’: fill with the last value of pre-gap data
- False: do not fill data gaps, which will lead to stations w/ data gaps being removed.
NOTE: Be careful about data types, as there are no checks that the fill value matches the internal data types. This may cause unexpected errors.
gap_fraction (float) – if fill_data_gaps is set, determines the maximum allowable fraction of data that gaps can comprise. For example, a value of 0.3 means that 30% of the data (in samples) can be gaps that will be filled by fill_data_gaps. Traces with gap fractions that exceed this value will be removed. Defaults to 1.0, meaning 100% of the data can be gaps.
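The interaction between gap_fraction and a constant fill value can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, `None` stands in for a masked (gap) sample, and PySEP itself works on ObsPy traces with masked arrays rather than lists.

```python
def gap_fraction_ok(samples, gap_fraction=1.0):
    """Return True if the share of gap samples does not exceed gap_fraction.

    `samples` is a list in which None marks a missing (gap) sample.
    """
    n_gaps = sum(1 for value in samples if value is None)
    return n_gaps / len(samples) <= gap_fraction


def fill_gaps(samples, fill_value):
    """Replace every gap sample with a constant, user-defined value."""
    return [fill_value if value is None else value for value in samples]


# Hypothetical trace with 2 of 4 samples missing (gap fraction 0.5)
trace = [1.0, None, None, 4.0]
if gap_fraction_ok(trace, gap_fraction=0.5):
    filled = fill_gaps(trace, fill_value=0.0)
    print(filled)
```

With gap_fraction=0.3 the same trace would be rejected, because half of its samples are gaps.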
Note
Data processing parameters
- Parameters:
detrend (bool) – apply simple linear detrend as the first preprocessing step
demean (bool) – apply demeaning to data during instrument response removal. Only applied if remove_response == True.
taper_percentage (float) – apply a taper to the waveform with ObsPy taper, given as a fraction between 0 and 1 of the waveform to be tapered. Generally used when data are noisy, e.g., HutchisonGhosh2016. Note: To get the same results as the default taper in SAC, use max_percentage=0.05 and leave type as hann. Tapering also happens while resampling (see util_write_cap.py). Only applied if remove_response == True.
rotate (list of str or NoneType) –
choose how to rotate the waveform data. pre-rotation processing will be applied. Can include the following options (order insensitive):
ZNE: Rotate from arbitrary components to North, East, Up
RTZ: Rotate from ZNE to Radial, Transverse, Up
UVW: Rotate from ZNE to orthogonal UVW orientation
If set to None, no rotation processing will take place.
resample_freq (float) – frequency to resample data in units Hz. If not given, no data resampling will take place. Defaults to NoneType
scale_factor (float) – scale all data by a constant factor. Note: for CAP use 10**2 (to convert m/s to cm/s). Defaults to 1 (no scaling applied)
Note
Instrument response removal parameters
- Parameters:
remove_response (bool) – remove instrument response using station response information gathered from client. Defaults to True.
output_unit (str) – the output format of the waveforms if instrument response removal is applied. Only relevant if `remove_response`==True. See ObsPy.core.trace.Trace.remove_response for acceptable values. Typical values are: ‘DISP’, ‘VEL’, ‘ACC’ (displacement [m], velocity [m/s], acceleration [m/s^2]).
water_level (float or None) – a water level threshold to apply during filtering for small values. Passed to Obspy.core.trace.Trace.remove_response
pre_filt (str, tuple or NoneType) –
apply a pre-filter to the waveforms before deconvolving instrument response. Options are:
- ‘default’: automatically calculate (f0, f1, f2, f3) based on the length of the waveform (dictating longest allowable period) and the sampling rate (dictating shortest allowable period). This is the default behavior.
- NoneType: do not apply any pre-filtering
- tuple of float: (f0, f1, f2, f3) define the corners of your pre-filter in units of frequency (Hz)
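One way to derive pre-filter corners from waveform length and sampling rate, as the ‘default’ option describes, is sketched below. The function name and the two constants (`fcut1_par`, `fcut2_par`) are assumptions chosen for illustration and are not confirmed by this page; only the idea that the duration sets the low corners and the Nyquist frequency sets the high corners comes from the description above.

```python
def estimate_prefilter_corners(npts, sampling_rate,
                               fcut1_par=4.0, fcut2_par=0.5):
    """Return (f0, f1, f2, f3) in Hz for a trapezoidal pre-filter.

    Assumptions (illustrative only):
    - f1 is `fcut1_par` cycles per trace duration (longest usable period)
    - f2 is `fcut2_par` times the Nyquist frequency (shortest period)
    - outer corners are a factor of two beyond the inner corners
    """
    duration_s = (npts - 1) / sampling_rate
    f_nyquist = sampling_rate / 2.0
    f1 = fcut1_par / duration_s
    f2 = fcut2_par * f_nyquist
    f0 = 0.5 * f1
    f3 = 2.0 * f2
    return f0, f1, f2, f3


# Hypothetical 400 s trace sampled at 10 Hz
corners = estimate_prefilter_corners(npts=4001, sampling_rate=10.0)
print(corners)
```

A tuple of this form could be passed as pre_filt when explicit corners are preferred over the automatic choice.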
Note
SAC header control parameters
- Parameters:
phase_list (list of str) – phase names to get ray information from TauP with. Defaults to ‘ttall’, which is ObsPy’s default for getting all phase arrivals. Must match Phases expected by TauP (see ObsPy TauP documentation for acceptable phases). Earliest P and S phase arrivals will be added to SAC headers, the remainder will be discarded.
taup_model (str) – name of the TauP model used to calculate phase arrivals. See also phase_list, which defines the phases to grab arrival data for. Defaults to ‘ak135’. See ObsPy TauP documentation for available models.
Note
PySEP Configuration parameters
- Parameters:
config_file (str) – path to YAML configuration file which will be used to overwrite internal default parameters. Used for command-line version of PySEP
log_level (str) – Level of verbosity for the internal PySEP logger. In decreasing order of verbosity: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘CRITICAL’
legacy_naming (bool) – if True, revert to old PySEP naming schema for event tags, which is responsible for naming the output directory and SAC files. Legacy filenames look something like ‘20000101000000.NN.SSS.LL.CC.c’ (event origin time, network, station, location, channel, component). Default to False
overwrite_event_tag (str or bool) –
option to allow the user to set their own event tag, rather than the automatically generated one.
- NoneType (default): use automatically generated event tag which
consists of event origin time and Flinn-Engdahl region
- ’’: empty string will dump ALL files into output_dir, no new
directories will be made
- str: User-defined event tag which will be created in output_dir,
all files will be stored in {output_dir}/{overwrite_event_tag}/*
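The three overwrite_event_tag cases listed above can be sketched as a small path-resolution helper. The function name and the placeholder automatic tag are hypothetical; the real automatic tag is built from the event origin time and Flinn-Engdahl region, which this sketch does not attempt to reproduce.

```python
import os


def resolve_event_dir(output_dir, overwrite_event_tag=None,
                      auto_tag="2000-01-01T000000_EXAMPLE_REGION"):
    """Return the directory in which event files would be stored.

    `auto_tag` is a made-up placeholder for the automatically
    generated event tag.
    """
    if overwrite_event_tag is None:
        tag = auto_tag             # default: automatically generated tag
    else:
        tag = overwrite_event_tag  # user-defined tag, possibly empty
    # An empty tag dumps all files directly into output_dir
    return os.path.join(output_dir, tag) if tag else output_dir


print(resolve_event_dir("out"))
print(resolve_event_dir("out", overwrite_event_tag=""))
print(resolve_event_dir("out", overwrite_event_tag="my_event"))
```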
Note
Output file and figure control
- Parameters:
write_files (str or NoneType) –
Which files to write out after data gathering.
1) User-defined comma-separated list of the following:
weights_az: write out CAP weight file sorted by azimuth
weights_dist: write out CAP weight file sorted by distance
weights_code: write out CAP weight file sorted by station code
station_list: write out a text file with station information
inv: save a StationXML (.xml) file (ObsPy inventory)
event: save a QuakeML (.xml) file (ObsPy Catalog)
stream: save an ObsPy stream in Mseed (.ms) (ObsPy Stream)
config_file: save YAML config file w/ all input parameters
sac: save all waveforms as SAC (.sac) files w/ correct headers
sac_raw: save raw waveforms. These are straight from the data center with no quality check and no SAC headers
sac_zne: save only ZNE channel SAC files
sac_rtz: save only RTZ channel SAC files
sac_uvw: save only UVW channel SAC files
Example input: `write_files`==’inv,event,stream,sac’. By default: ‘inv,event,stream,sac,config_file,station_list’
2) If NoneType or an empty string, no files will be written.
3) If ‘all’, write all files listed in (1)
plot_files (str or NoneType) –
What to plot after data gathering. Should be a comma-separated list of the following:
map: plot a source-receiver map with event and all stations
record_section: plot a record section with default parameters
all: plot all of the above (default value)
If None, no files will be plotted.
output_dir (str) – path to output directory where all the files and figures defined by write_files and plot_files will be stored. Defaults to the current working directory.
overwrite (bool) – If True, overwrite an existing PySEP event directory. If False (default), an existing event directory is left untouched, which prevents users from re-downloading data.
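The write_files rules (comma-separated list, None or empty string, ‘all’) could be interpreted along these lines. This is a sketch of the documented behavior, not PySEP's own parser: the helper name is hypothetical and the list of options is copied from the parameter description above.

```python
ALL_FILES = [
    "weights_az", "weights_dist", "weights_code", "station_list", "inv",
    "event", "stream", "config_file", "sac", "sac_raw", "sac_zne",
    "sac_rtz", "sac_uvw",
]


def parse_write_files(write_files):
    """Turn the `write_files` string into a list of file types to write."""
    if not write_files:          # None or empty string: write nothing
        return []
    if write_files == "all":     # shorthand for every known option
        return list(ALL_FILES)
    requested = [name.strip() for name in write_files.split(",")]
    unknown = [name for name in requested if name not in ALL_FILES]
    if unknown:
        raise ValueError(f"unknown write_files options: {unknown}")
    return requested


print(parse_write_files("inv,event,stream,sac"))
```

Rejecting unknown names up front is a design choice of this sketch; it surfaces typos before any data are downloaded.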
- event_tag = None¶
- config_file = None¶
- client = 'IRIS'¶
- client_debug = False¶
- timeout = 600¶
- _user = None¶
- _password = None¶
- taup_model = 'ak135'¶
- use_mass_download = False¶
- _extra_download_pct = 0.005¶
- event_selection = 'default'¶
- seconds_before_event = 20¶
- seconds_after_event = 20¶
- event_latitude = None¶
- event_longitude = None¶
- event_depth_km = None¶
- event_magnitude = None¶
- networks = '*'¶
- stations = '*'¶
- channels = '*'¶
- locations = '*'¶
- station_ids = None¶
- reference_time¶
- seconds_before_ref = 100¶
- seconds_after_ref = 300¶
- llnl_db_path = '/store/raw/LLNL/UCRL-MI-222502/westernus.wfdisc'¶
- mindistance_km = 0¶
- maxdistance_km = 20000.0¶
- minazimuth = 0¶
- maxazimuth = 360¶
- minlatitude = None¶
- maxlatitude = None¶
- minlongitude = None¶
- maxlongitude = None¶
- demean = True¶
- detrend = True¶
- taper_percentage = 0.0¶
- rotate = None¶
- remove_response = True¶
- output_unit = 'VEL'¶
- water_level = 60¶
- pre_filt = 'default'¶
- scale_factor = 1¶
- resample_freq = None¶
- remove_clipped = True¶
- remove_insufficient_length = True¶
- remove_masked_data = True¶
- fill_data_gaps = False¶
- gap_fraction = 1.0¶
- _output_dir¶
- output_dir = None¶
- write_files = 'inv,event,stream,sac,config_file,station_list'¶
- plot_files = 'all'¶
- log_level = 'DEBUG'¶
- legacy_naming = False¶
- _overwrite = False¶
- _overwrite_event_tag = None¶
- c = None¶
- st = None¶
- inv = None¶
- event = None¶
- st_raw = None¶
- kwargs¶
- get_client()[source]¶
Options to choose different Clients based on attribute client which will be used to gather waveforms and metadata
- Return type:
obspy.clients.fdsn.client.Client
- Returns:
Client used to gather waveforms and metadata
- load(config_file=None, overwrite_event=True)[source]¶
Overwrite default parameters using a YAML config file
- Parameters:
config_file (str) – YAML configuration file to load from
overwrite_event (bool) – overwrite event search parameters (origin time, lat, lon etc.) from the YAML config file. Defaults to True
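The effect of overwrite_event when loading a config can be sketched with plain dictionaries. The helper name, the use of dicts in place of Pysep attributes, and the exact set of event keys are assumptions (the description above only says "origin time, lat, lon etc."); reading the YAML file itself is omitted.

```python
# Assumed set of event search parameters protected by overwrite_event=False
EVENT_KEYS = {
    "origin_time", "event_latitude", "event_longitude",
    "event_depth_km", "event_magnitude",
}


def apply_config(params, config, overwrite_event=True):
    """Return a copy of `params` updated with values from `config`.

    If overwrite_event is False, event search parameters already in
    `params` are kept and the config values for them are ignored.
    """
    updated = dict(params)
    for key, value in config.items():
        if key in EVENT_KEYS and not overwrite_event:
            continue
        updated[key] = value
    return updated


# Hypothetical current parameters and config file contents
params = {"client": "IRIS", "origin_time": "2000-01-01T00:00:00"}
config = {"client": "NCEDC", "origin_time": "2010-01-01T00:00:00"}
print(apply_config(params, config, overwrite_event=False))
```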
- get_event()[source]¶
Exposed API for grabbing event metadata depending on the event_selection choice.
- Options for event_selection are:
- ‘search’: query FDSN with event parameters
- ‘default’: create an event from scratch using user parameters, or if ‘client’==’LLNL’, grab the event from the internal database
- Return type:
obspy.core.event.Event
- Returns:
Matching event given event criteria. If multiple events are returned with the query, returns the first in the catalog
- _query_event_from_client(magnitude_buffer=0.1, depth_buffer_km=1.0)[source]¶
Retrieve an event catalog using ObsPy Client.get_events(). Searches Client for a given origin time, location, depth (optional) and magnitude (optional).
To use this, set attribute `event_selection`==’search’
- Parameters:
magnitude_buffer (float) – if attribute event_magnitude is given, will search events for events with magnitude: event_magnitude +/- magnitude_buffer
depth_buffer_km (float) – if attribute event_depth_km is given, will search events for events with depth: event_depth_km +/- depth_buffer_km
- Return type:
obspy.core.event.Event
- Returns:
Matching event given event criteria. If multiple events are returned with the query, returns the first in the catalog
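The search window implied by the time, magnitude and depth buffers can be sketched as a dictionary of query bounds. The helper name is hypothetical and `datetime` stands in for ObsPy's UTCDateTime; the key names follow the keyword arguments of ObsPy's FDSN `Client.get_events()`, but this is not a copy of PySEP's internal query.

```python
from datetime import datetime, timedelta


def build_event_query(origin_time, seconds_before_event=20,
                      seconds_after_event=20, event_magnitude=None,
                      event_depth_km=None, magnitude_buffer=0.1,
                      depth_buffer_km=1.0):
    """Assemble search bounds around a target origin time."""
    query = {
        "starttime": origin_time - timedelta(seconds=seconds_before_event),
        "endtime": origin_time + timedelta(seconds=seconds_after_event),
    }
    # Magnitude and depth are optional and only constrain the search if given
    if event_magnitude is not None:
        query["minmagnitude"] = event_magnitude - magnitude_buffer
        query["maxmagnitude"] = event_magnitude + magnitude_buffer
    if event_depth_km is not None:
        query["mindepth"] = event_depth_km - depth_buffer_km
        query["maxdepth"] = event_depth_km + depth_buffer_km
    return query


# Hypothetical target event: magnitude 6.0, unknown depth
query = build_event_query(datetime(2000, 1, 1), event_magnitude=6.0)
print(query)
```

If several events fall inside these bounds, the description above states that the first event in the returned catalog is used.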
- _create_event_from_scratch()[source]¶
Make a barebones event object based on user-defined parameters which will then be used to query for waveforms and StationXML data
- Return type:
obspy.core.event.Event
- Returns:
Event object with origin and magnitude information appended
- _get_event_from_llnl_catalog()[source]¶
Special getter function for Lawrence Livermore National Lab data. The LLNL database has a special client.
TODO Do we need more filtering in the catalog?
- Return type:
obspy.core.event.Event
- Returns:
Event information queried from LLNL database
- get_stations()[source]¶
Exposed API for grabbing station metadata from client. Download station metadata using ObsPy get_stations() with a user-defined bounding box and for user-defined networks, stations etc.
- Return type:
obspy.core.inventory.Inventory
- Returns:
Station metadata queried from Client
- get_waveforms()[source]¶
Exposed API for grabbing waveforms from client. Internal logic determines how waveforms are queried, but mainly it is controlled by the internal inv attribute detailing station information, and reference times for start and end times.
Note
We do not use the minimumlength variable so that we can figure out which stations have data gaps
- Return type:
obspy.core.stream.Stream
- Returns:
Stream of channel-separated waveforms
- _bulk_query_waveforms_from_client()[source]¶
Make a bulk request query to the Client based on the internal inv attribute defining the available station metadata.
- Return type:
obspy.core.stream.Stream
- Returns:
Stream of channel-separated waveforms
- mass_download()[source]¶
Use ObsPy Mass downloader to grab events from a pre-determined region
- Keyword Arguments:
domain_type (str) –
How to define the search region domain:
- rectangular: rectangular bounding box defined by min/max latitude/longitude
- circular: circular domain defined by the event’s latitude and longitude, with radii defined by mindistance_km and maxdistance_km
delete_tmpdir (bool) – Remove the temporary directories that store the MSEED and StationXML files downloaded by the mass downloader. This saves space, but if anything fails before the data are saved, the downloaded data will be lost. Defaults to True.
- curtail_stations()[source]¶
Remove stations from inv based on station distance, azimuth, etc.
Note
One-function function currently, but we can expand curtailing here if need be
- Return type:
obspy.core.inventory.Inventory
- Returns:
station metadata that has been curtailed based on acceptable parameters
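Distance and azimuth curtailing can be sketched with a spherical-Earth calculation. This is an approximation for illustration only: the helper names are hypothetical, and a production workflow would more likely use an ellipsoidal routine such as ObsPy's `gps2dist_azimuth`.

```python
import math


def distance_and_azimuth(ev_lat, ev_lon, sta_lat, sta_lon, radius_km=6371.0):
    """Great-circle distance [km] and event-to-station azimuth [deg]."""
    phi1, phi2 = math.radians(ev_lat), math.radians(sta_lat)
    dlam = math.radians(sta_lon - ev_lon)
    # Haversine formula for the central angle
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    dist_km = 2 * radius_km * math.asin(math.sqrt(a))
    # Initial bearing from event to station, clockwise from north
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    azimuth = math.degrees(math.atan2(y, x)) % 360
    return dist_km, azimuth


def keep_station(dist_km, azimuth, mindistance_km=0, maxdistance_km=20000.0,
                 minazimuth=0, maxazimuth=360):
    """True if the station lies inside the distance and azimuth bounds."""
    return (mindistance_km <= dist_km <= maxdistance_km
            and minazimuth <= azimuth <= maxazimuth)


# Hypothetical event at (0, 0) and station one degree to the east
dist_km, azimuth = distance_and_azimuth(0.0, 0.0, 0.0, 1.0)
print(round(dist_km, 1), round(azimuth, 1),
      keep_station(dist_km, azimuth, maxdistance_km=100))
```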
- preprocess()[source]¶
Very simple preprocessing to remove response and apply a prefilter, scale waveforms (if necessary) and clean up waveform time series
- Return type:
obspy.core.stream.Stream
- Returns:
a preprocessed stream with response removed, amplitude scaled (optional), and time series standardized
- _remove_response_llnl(st)[source]¶
Remove response information from LLNL stations. This requires using the custom LLNL DB client. There are also some internal checks that need to be bypassed else they cause the program to crash
- rotate_streams()[source]¶
Rotate arbitrary three-component seismograms to desired orientation ‘ZNE’, ‘RTZ’ or ‘UVW’.
Warning
This function combines all traces, both rotated and non-rotated components (ZNE, RTZ, UVW, but not raw, e.g., 12Z), into a single stream. This is deemed okay because we don’t do any component-specific operations after rotation.
- Return type:
obspy.core.stream.Stream
- Returns:
a stream that has been rotated to the desired coordinate system with SAC headers that have been adjusted for the rotation, as well as non-rotated streams which are saved in case the user needs access to other components
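The horizontal part of the ZNE to RTZ rotation can be sketched with the usual back-azimuth convention. This is a plain-Python illustration of the geometry and not the rotate_streams() implementation, which works on ObsPy streams; the helper name is hypothetical, and the sign convention is the one commonly used in seismology (radial pointing away from the source).

```python
import math


def rotate_ne_to_rt(north, east, back_azimuth_deg):
    """Rotate North/East samples to Radial/Transverse.

    `back_azimuth_deg` is the station-to-event azimuth in degrees.
    """
    ba = math.radians(back_azimuth_deg)
    radial = [-e * math.sin(ba) - n * math.cos(ba)
              for n, e in zip(north, east)]
    transverse = [-e * math.cos(ba) + n * math.sin(ba)
                  for n, e in zip(north, east)]
    return radial, transverse


# Event due south of the station (back azimuth 180 degrees):
# radial then coincides with north, transverse with east
radial, transverse = rotate_ne_to_rt([1.0, 0.0], [0.0, 1.0], 180.0)
print(radial, transverse)
```

The vertical component is unchanged by this rotation, which is why only the two horizontal components appear here.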
- write(write_files=None, _return_filenames=False, _subset=None, **kwargs)[source]¶
Write out various files specifying information about the collected stations and waveforms.
Options are:
config_file: write the current configuration as a YAML file
station_list: write a text file with station information
inv: write the inventory as a StationXML file
event: write the event as a QuakeML file
stream: write the stream as a single MSEED file
sac_zne: write the stream as individual (per-channel) SAC files for ZNE components with the appropriate SAC header
sac_rtz: write out per-channel SAC files for RTZ components
sac_uvw: write out per-channel SAC files for UVW components
weights_dist: write out CAP ‘weights.dat’ file sorted by distance
weights_az: write out CAP ‘weights.dat’ file sorted by azimuth
weights_code: write out CAP ‘weights.dat’ file sorted by sta code
- Parameters:
write_files (list of str) – list of files that should be written out, must match the acceptable list defined in the function or here in the docstring. If not given, defaults to internal list of files
_return_filenames (bool) – internal flag to not actually write anything but just return a list of acceptable filenames. This keeps all the file naming definitions in one function. This is only required by the check() function.
_subset (list) – internal parameter used for intermediate file saving. PySEP will attempt to save files once they have been collected however if the files it tries to save do not match against the User-defined file list, they will be ignored.
- Keyword Arguments:
order_station_list_by (str) – how to order the station list. Available options are: network, station, latitude, longitude, elevation, burial.
config_fid (str) – optional name for the configuration file name defaults to ‘pysep_config.yaml’
station_fid (str) – optional name for the stations list file name defaults to ‘station_list.txt’
inv_fid (str) – optional name for saved ObsPy inventory object, defaults to ‘inv.xml’
event_fid (str) – optional name for saved ObsPy Event object, defaults to ‘event.xml’
stream_fid (str) – optional name for saved ObsPy Stream miniseed object, defaults to ‘stream.ms’
sac_subdir (str) – sub-directory within output directory and event directory to save SAC files. Defaults to SAC/. Use an empty string to dump files directly into the event directory
- _write_sac(st, output_dir=os.getcwd(), components=None)[source]¶
Write SAC files with a specific naming schema, which allows for both legacy (old PySEP) or non-legacy (new PySEP) naming.
- Parameters:
st (obspy.core.stream.Stream) – Stream to be written
output_dir (str) – where to save the SAC files, defaults to the current working directory
components (str) – acceptable component values for saving files, allows only saving subsets of the Stream. Example ‘RTZNE’ or just ‘R’. Must match against Trace.stats.component
- write_config(fid=None, overwrite=False)[source]¶
Write a YAML config file based on the internal Pysep attributes. Remove a few internal attributes (those containing data) before writing and also change types on a few to keep the output file simple but also re-usable for repeat queries.
- Parameters:
fid (str) – name of the file to write. defaults to config.yaml
overwrite (bool) – if True and fid already exists, save a new config file with the same name, overwriting the old file. if False (default), throws a warning if encountering existing fid and does not write config file
- plot()[source]¶
Plot map and record section if requested. Allow general error catching for mapping and record section plotting because we don’t want these auxiliary steps to crash the entire workflow since they are not critical.
- _event_tag_and_output_dir()[source]¶
Convenience function to establish a naming schema for files and directories. Also takes care of making empty directories.
- Return type:
tuple of str
- Returns:
(unique event tag, path to output directory)
- _set_log_file(mode)[source]¶
Write logger to file as well as stdout, with the same format as the stdout logger. Need mode==1 to move the log file after everything is done because we don’t know the event tag prior to starting the logs
- Parameters:
mode (int) – Two options for using this function: 0: set the logger to write to a temporary file ‘pysep.log’; 1: move the log file from the temporary location into the final output directory
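The two-mode logging pattern (log to a temporary file first, move it once the output directory is known) can be sketched with the standard library. The logger name, function signature and directory names are hypothetical; only the two-step idea comes from the description above.

```python
import logging
import os
import shutil
import tempfile

logger = logging.getLogger("pysep_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)


def set_log_file(mode, tmp_path, final_dir=None):
    """mode 0: start logging to tmp_path; mode 1: move it to final_dir."""
    if mode == 0:
        handler = logging.FileHandler(tmp_path, mode="w")
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
        return tmp_path
    if mode == 1:
        # Close file handlers first so the log is flushed before moving
        for handler in list(logger.handlers):
            if isinstance(handler, logging.FileHandler):
                handler.close()
                logger.removeHandler(handler)
        os.makedirs(final_dir, exist_ok=True)
        destination = os.path.join(final_dir, os.path.basename(tmp_path))
        return shutil.move(tmp_path, destination)
    raise ValueError("mode must be 0 or 1")


workdir = tempfile.mkdtemp()
tmp_log = os.path.join(workdir, "pysep.log")
set_log_file(0, tmp_log)
logger.info("event tag not known yet")
final_log = set_log_file(1, tmp_log, final_dir=os.path.join(workdir, "event"))
print(final_log)
```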
- run(event=None, inv=None, st=None, **kwargs)[source]¶
Run PySEP: Seismogram Extraction and Processing. Steps in order are:
Set default parameters or load from config file
Check parameter validity, exit if unexpected values
Get data and metadata (QuakeML, StationXML, waveforms)
Remove unacceptable stations based on user-defined criteria
Remove unacceptable waveforms based on user-defined criteria
Generate some new metadata for tagging and output
Pre-process waveforms and standardize for general use
Generate output files and figures as end-product
- Parameters:
event (obspy.core.event.Event) – optional user-provided event object which will force a skip over QuakeML/event searching
inv (obspy.core.inventory.Inventory) – optional user-provided inventory object which will force a skip over StationXML/inventory searching
st (obspy.core.stream.Stream) – optional user-provided stream object which will force a skip over waveform searching
- pysep.get_data(config_file=None, event=None, inv=None, st=None, write_files=None, plot_files=None, log_level=None, *args, **kwargs)[source]¶
Interactive/scripting function to run PySep and return quality controlled, SAC-headed stream object which can then be used for other processes.
Note
By default turns file writing and plotting OFF so that this function acts solely as a data collection/processing call.
Note
args and kwargs are passed directly to Pysep.__init__() so you can define all your parameters in this call, or through a config file
>>> from pysep import get_data
>>> event, inv, st = get_data(config_file='config.yaml')
- Parameters:
config_file (str) – path to YAML config file which will overload any default configs
event (obspy.core.event.Event) – optional user-provided event object which will force a skip over QuakeML/event searching
inv (obspy.core.inventory.Inventory) – optional user-provided inventory object which will force a skip over StationXML/inventory searching
st (obspy.core.stream.Stream) – optional user-provided stream object which will force a skip over waveform searching
write_files (list or None) – list of files to write, acceptable options defined in write(). Defaults to None, no files will be written
plot_files (list or None) – list of files to plot, acceptable options defined in plot(). Defaults to None, no figures will be made
log_level (str or None) – verbosity of logger. Defaults to no logging to mimic a standard function call rather than a standalone package
- Return type:
tuple of (obspy.core.event.Event, obspy.core.inventory.Inventory, obspy.core.stream.Stream)
- Returns:
returns obspy objects defining data and metadata that have been collected by PySEP
- class pysep.Declust(cat, inv=None, data_avail=None, min_lat=None, max_lat=None, min_lon=None, max_lon=None)[source]¶
Declustering class in charge of declustering and source receiver weighting
User-input parameters to determine algorithm behavior
- Parameters:
cat (obspy.core.catalog.Catalog) – Catalog of events to consider. Events must include origin information latitude and longitude
inv (obspy.core.inventory.Inventory) – Inventory of stations to consider
data_avail (dict) – If None, Declust assumes that all events in cat were recorded by all stations in inv. This is typically not the case however, so this dict allows the user to tell Declust about data availability. Keys of data_avail must match resource IDs, and values must be lists of station names (NN.SSSS)
min_lat (float) – optional, minimum latitude for bounding box defining the region of interest. If not given, will use the minimum latitude in the catalog of events
max_lat (float) – optional, maximum latitude for bounding box defining the region of interest. If not given, will use the maximum latitude in the catalog of events
min_lon (float) – optional, minimum longitude for bounding box defining the region of interest. If not given, will use the minimum longitude in the catalog of events
max_lon (float) – optional, maximum longitude for bounding box defining the region of interest. If not given, will use the maximum longitude in the catalog of events
- cat¶
- inv = None¶
- _user_min_lat = None¶
- _user_max_lat = None¶
- _user_min_lon = None¶
- _user_max_lon = None¶
- _user_data_avail = None¶
- evlats = None¶
- evlons = None¶
- evids = None¶
- stalats = None¶
- stalons = None¶
- staids = None¶
- min_lat = None¶
- max_lat = None¶
- min_lon = None¶
- max_lon = None¶
- depths = None¶
- mags = None¶
- data_avail = None¶
- update_metadata(cat=None, inv=None)[source]¶
Get metadata such as location, event depth, and magnitude from a given Catalog and Inventory and set them as internal attributes. This needs to be a separate function because threshold_catalog cuts down the internal catalog representation, after which this function must be re-run
- threshold_catalog(zedges=None, min_mags=None, min_data=None)[source]¶
Kick out events that fall below a given magnitude range or a given data availability range. Allow this to be done for various depth ranges or for the entire volume at once.
Note
Updates internal cat Catalog object and metadata in place
- Parameters:
zedges (list of float) – depth [km] slices to partition domain into when thresholding data
min_mags (int or list) – a list of minimum magnitude thresholds for each depth slice. If zedges is None, should be a list of length 1, which provides the minimum magnitude for the entire catalog. If zedges is given, should be a list of length len(zedges)-1, which defines the minimum magnitude for each depth bin. For example, if zedges=[0, 35, 400], then one option is min_mags=[4, 6], meaning that between 0-34km the minimum magnitude is 4, and between 35-400km the minimum magnitude is 6.
min_data (int or list) – an integer or list of length len(zedges)-1 that defines the minimum number of stations that were on at a given event's origin time, which allows the user to prioritize stations with available data
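The relationship between zedges and min_mags can be illustrated with a short standalone sketch. This is not the PySEP implementation: the helper names are hypothetical, and the half-open bin convention (lower edge included, upper edge excluded) is an assumption chosen to match the 0-34km / 35-400km example above.

```python
from bisect import bisect_right


def min_magnitude_for_depth(depth_km, zedges, min_mags):
    """Return the magnitude threshold of the depth bin containing depth_km,
    or None if the depth falls outside all bins (hypothetical helper)."""
    if len(min_mags) != len(zedges) - 1:
        raise ValueError("min_mags must have length len(zedges) - 1")
    # ASSUMPTION: bins are half-open, i.e. zedges[i] <= depth < zedges[i + 1]
    if not zedges[0] <= depth_km < zedges[-1]:
        return None
    return min_mags[bisect_right(zedges, depth_km) - 1]


def keep_event(depth_km, magnitude, zedges, min_mags):
    """True if the event meets the minimum magnitude of its depth bin."""
    threshold = min_magnitude_for_depth(depth_km, zedges, min_mags)
    return threshold is not None and magnitude >= threshold


zedges = [0, 35, 400]
min_mags = [4, 6]
print(keep_event(10.0, 4.5, zedges, min_mags))   # shallow bin, threshold 4
print(keep_event(100.0, 5.0, zedges, min_mags))  # deep bin, threshold 6
```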
- calculate_srcrcv_weights(cat=None, inv=None, write='weights.txt', plot=False, show=False, save='srcrcvwght.png')[source]¶
Calculate event and station specific weights based on event geographic weights, and event-dependent station geographic weights which take into account data availability for each event.
Uses the internal cat and inv attributes of the class. If declustering was run to thin out the catalog, run update_metadata first to update internal attributes which are used for defining weights.
- Parameters:
cat (obspy.core.catalog.Catalog) – Optional, catalog of events to consider. If none given, will use whatever internal catalog is available
inv (obspy.core.inventory.Inventory) – Inventory of stations to consider
plot (bool) – plot source weights and an average of station weights on a single figure.
write (str) – filename used to write station weights to a text file. If None, will not write
show (bool) – show figure in GUI
save (str) – if given, file ID for the name of the output figure to save. If not given, will not save figure
- Return type:
np.array
- Returns:
a 2D array where each row corresponds to a station and each column corresponds to an event. The value for a given row and column is the weight of that station w.r.t all other available stations. Stations that were not available (not on) are given a weight of 0
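To make the orientation of the returned array concrete, the sketch below builds an availability mask with the same layout: one row per station, one column per event, and 0 where a station was not on. The station and event IDs are hypothetical, and the nonzero entries are placeholders of 1.0; the actual weights are geographic values, which this sketch does not attempt to reproduce.

```python
# Hypothetical station and event IDs, for illustration only
staids = ["AK.ATKA", "AK.COLA", "AK.FIB"]
evids = ["smi:local/event/0001", "smi:local/event/0002"]
data_avail = {
    "smi:local/event/0001": ["AK.ATKA", "AK.COLA"],
    "smi:local/event/0002": ["AK.COLA", "AK.FIB"],
}

# Rows are stations, columns are events. 1.0 marks "station was on" and
# stands in for a real weight; 0.0 marks "station not available".
mask = [
    [1.0 if sta in data_avail[ev] else 0.0 for ev in evids]
    for sta in staids
]

for sta, row in zip(staids, mask):
    print(sta, row)
```

With this layout, the entry at row i and column j refers to station i for event j.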
- _get_weights(lons, lats, norm=None, plot=False, save='reference_distance_scan.png')[source]¶
Given a set of coordinates (either stations or earthquakes), calculate a reference distance and a set of weights for each coordinate. Reference distance is chosen as one-third the largest possible value for all possible reference distances
- Parameters:
lons (np.array) – array of longitude values
lats (np.array) – array of latitude values
plot (bool) – plot a simple scatterplot showing the weights for each station
norm (str) – how to normalize the weights - None: don’t normalize, provide raw weights - ‘max’: normalize by the maximum weight - ‘len’: normalize by the length of the array - ‘avg’: normalize by the mean weight value
- Return type:
np.array
- Returns:
relative, normalized weights for each lon/lat pair
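The four norm options can be illustrated with a minimal standalone sketch. The function name is hypothetical and the arithmetic is a literal reading of the option descriptions above (divide by the maximum, the array length, or the mean); it is not taken from the PySEP source.

```python
def normalize_weights(weights, norm=None):
    """Normalize a list of raw weights (hypothetical helper)."""
    if norm is None:
        return list(weights)  # raw weights, unchanged
    if norm == "max":
        factor = max(weights)
    elif norm == "len":
        factor = len(weights)
    elif norm == "avg":
        factor = sum(weights) / len(weights)
    else:
        raise ValueError(f"unknown norm: {norm!r}")
    return [w / factor for w in weights]


raw = [1.0, 2.0, 3.0, 2.0]
for option in (None, "max", "len", "avg"):
    print(option, normalize_weights(raw, option))
```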
- static _covert_dist_to_weight(dists, ref_dist)[source]¶
Calculate distance weights from a matrix of distances and a reference distance
- Parameters:
dists (np.array) – array of inter-source distances
ref_dist (float) – user-defined reference distance in units of km. Larger values of the reference distance increase the sensitivity to inter-station distances. Lower values tend to reduce scatter.
- decluster_events(cat=None, inv=None, choice='cartesian', zedges=None, min_mags=None, nkeep=1, select_by='magnitude', **kwargs)[source]¶
Main logic function for choosing how to decluster events. Allow for both cartesian and polar binning of the domain.
See _decluster_events_cartesian and _decluster_events_polar for specific input parameters to control declustering.
- Parameters:
cat (obspy.core.catalog.Catalog) – Optional, catalog of events to consider. If none given, will use whatever internal catalog is available
inv (obspy.core.inventory.Inventory) – Inventory of stations to consider
choice (str) – choice of domain partitioning, can be one of: - cartesian: grid the domain as a cube with nx by ny cells - polar: grid the domain with polar coordinates and ntheta bins
zedges (list of float) – depth [km] slices to partition domain into. Each slice will be given equal weighting w.r.t. all other slices, independent of slice size. e.g., allows upweighting crustal events
min_mags (list) – a list of minimum magnitude thresholds for each depth slice. If zedges is None, should be a list of length 1, which provides the minimum magnitude for the entire catalog. If zedges is given, should be a list of length len(zedges)-1, which defines the minimum magnitude for each depth bin. For example, if zedges=[0, 35, 400], then one option is min_mags=[4, 6], meaning that between 0-34km the minimum magnitude is 4, and between 35-400km the minimum magnitude is 6.
nkeep (int or list of int) – number of events to keep per cell. If zedges is None, then this must be an integer which defines a blanket value to apply. If zedges is given, then this must be a list of length len(zedges)-1, defining the number of events to keep per cell, per depth slice. See min_mags definition for example.
select_by (str) – determine how to prioritize events in the cell - magnitude (default): largest magnitudes prioritized - magnitude_r: smallest magnitudes prioritized - depth: shallower depths prioritized - depth_r: deeper depths prioritized - data: prioritize events which have the most data availability
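How nkeep and select_by interact within a single cell can be sketched as a sort followed by a slice. The event dictionaries, their field names, and the helper below are hypothetical; the orderings follow the select_by options as described for this function.

```python
def select_events(events, nkeep=1, select_by="magnitude"):
    """Rank the events of one cell and keep the first nkeep
    (hypothetical helper, not the PySEP implementation)."""
    # Each entry maps an option to (field to sort on, sort descending?)
    orderings = {
        "magnitude": ("mag", True),      # largest magnitudes first
        "magnitude_r": ("mag", False),   # smallest magnitudes first
        "depth": ("depth_km", False),    # shallowest first
        "depth_r": ("depth_km", True),   # deepest first
        "data": ("nsta", True),          # most available stations first
    }
    field, descending = orderings[select_by]
    ranked = sorted(events, key=lambda ev: ev[field], reverse=descending)
    return ranked[:nkeep]


# Three made-up events that fall in the same cell
cell = [
    {"id": "a", "mag": 4.2, "depth_km": 10.0, "nsta": 12},
    {"id": "b", "mag": 5.8, "depth_km": 80.0, "nsta": 5},
    {"id": "c", "mag": 5.1, "depth_km": 33.0, "nsta": 20},
]
print([ev["id"] for ev in select_events(cell, nkeep=1, select_by="magnitude")])
print([ev["id"] for ev in select_events(cell, nkeep=2, select_by="depth")])
```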
- _decluster_events_cartesian(nx=10, ny=10, zedges=None, nkeep=1, select_by='magnitude_r', plot=False, plot_dir='./', **kwargs)[source]¶
Decluster event catalog by partitioning the 3D domain in the X, Y and Z directions, and then selecting a given number of events in each cell.
- Parameters:
nx (int) – Number of X/longitude cells to partition domain into
ny (int) – Number of Y/latitude cells to partition domain into
zedges (list of float) – depth [km] slices to partition domain into. Each slice will be given equal weighting w.r.t. all other slices, independent of slice size. e.g., allows upweighting crustal events
nkeep (int or list of int) – number of events to keep per cell. If zedges is None, then this must be an integer which defines a blanket value to apply. If zedges is given, then this must be a list of length len(zedges)-1, defining the number of events to keep per cell, per depth slice. See min_mags definition for example.
select_by (str) – determine how to prioritize events in the cell - magnitude (default): largest magnitudes prioritized - magnitude_r: smallest magnitudes prioritized - depth: shallower depths prioritized - depth_r: deeper depths prioritized - data: less data availability prioritized - data_r: more data availability prioritized
plot (bool) – create a before and after catalog scatter plot to compare which events were kept/removed. Plots within the cwd
plot_dir (str) – directory to save figures to. file names will be generated automatically
- Return type:
obspy.core.catalog.Catalog
- Returns:
a declustered event catalog
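The horizontal part of the cartesian partitioning can be sketched as mapping each event to an (ix, iy) cell of an nx by ny grid over the bounding box. The helper name, the bounding box, and the event coordinates are hypothetical, and assigning points on the upper boundary to the last cell is an assumption of this sketch.

```python
from collections import defaultdict


def cell_index(lon, lat, bounds, nx=10, ny=10):
    """Return the (ix, iy) grid cell containing a lon/lat point
    (hypothetical helper, not the PySEP implementation)."""
    min_lon, max_lon, min_lat, max_lat = bounds
    ix = int((lon - min_lon) / (max_lon - min_lon) * nx)
    iy = int((lat - min_lat) / (max_lat - min_lat) * ny)
    # ASSUMPTION: points on the upper boundary belong to the last cell
    return min(ix, nx - 1), min(iy, ny - 1)


# Made-up bounding box (min_lon, max_lon, min_lat, max_lat) and events
bounds = (-160.0, -140.0, 55.0, 65.0)
events = {"ev1": (-150.0, 60.0), "ev2": (-149.5, 60.2), "ev3": (-140.0, 65.0)}

# Group event IDs by cell; thinning to nkeep events would then act per cell
cells = defaultdict(list)
for evid, (lon, lat) in events.items():
    cells[cell_index(lon, lat, bounds)].append(evid)

print(dict(cells))
```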
- _decluster_events_polar(ntheta=16, zedges=None, nkeep=1, select_by='magnitude_r', plot=False, plot_dir='./', **kwargs)[source]¶
Run the declustering algorithm but partition the domain in polar coordinates. That is, divide each depth slice into a pie with ntheta partitions and keep events based on events within each slice of the pie. Option to cut each slice of pie by radius (distance from center of domain) and put additional constraints (e.g., more distant events require larger magnitude).
- Parameters:
ntheta (int) – Number of theta bins to break a polar search into. Used to break up 360 degrees, so e.g., `ntheta`==17 will return bins of size 22.5 degrees ([0, 22.5, 45., 67.5 …. 360.])
zedges (list of float) – depth [km] slices to partition domain into. Each slice will be given equal weighting w.r.t. all other slices, independent of slice size. e.g., allows upweighting crustal events
nkeep (int or list of int) – number of events to keep per cell. If zedges is None, then this must be an integer which defines a blanket value to apply. If zedges is given, then this must be a list of length len(zedges)-1, defining the number of events to keep per cell, per depth slice. See min_mags definition for example.
select_by (str) – determine how to prioritize events in the cell - magnitude (default): largest magnitudes prioritized - magnitude_r: smallest magnitudes prioritized - depth: shallower depths prioritized - depth_r: deeper depths prioritized - data: less data availability prioritized - data_r: more data availability prioritized
plot (bool) – create a before and after catalog scatter plot to compare which events were kept/removed. Plots within the cwd
plot_dir (str) – directory to save figures to. file names will be generated automatically
- Return type:
obspy.core.catalog.Catalog
- Returns:
a declustered event catalog
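The pie-slice partitioning can be sketched by computing an angle for each event about the domain center and mapping it to a theta bin. The sketch uses its own nbins parameter rather than ntheta, because a list of edges such as [0, 22.5, ..., 360] contains 17 values that bound 16 bins. The helper names and the angle convention (counter-clockwise from east in flat lon/lat space) are assumptions, not PySEP behavior.

```python
import math


def theta_edges(nbins):
    """Bin edges in degrees spanning 0 to 360 (nbins + 1 values)."""
    width = 360.0 / nbins
    return [i * width for i in range(nbins + 1)]


def theta_bin(lon, lat, center_lon, center_lat, nbins):
    """Index of the pie slice containing a point (hypothetical helper)."""
    # ASSUMPTION: angle measured counter-clockwise from east, treating
    # longitude and latitude as flat x and y coordinates
    angle = math.degrees(math.atan2(lat - center_lat, lon - center_lon))
    angle %= 360.0
    return min(int(angle // (360.0 / nbins)), nbins - 1)


edges = theta_edges(16)
print(len(edges), edges[:4], edges[-1])
print(theta_bin(1.0, 0.2, 0.0, 0.0, nbins=16))
print(theta_bin(-1.0, -0.2, 0.0, 0.0, nbins=16))
```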
- plot(cat=None, inv=None, color_by='depth', connect_data_avail=False, vmin=None, vmax=None, title=None, cmap='inferno_r', show=True, save=None, equal_scale=False, **kwargs)[source]¶
Generalized plot function used to plot an event catalog and station inventory.
- Parameters:
cat (obspy.core.catalog.Catalog) – Catalog of events to consider. Events must include origin information (latitude and longitude)
inv (obspy.core.inventory.Inventory) – Inventory of stations to consider
color_by (str) – how to color the event markers, available are - ‘depth’: color by the event depth - ‘data’: color by data availability for given event - ‘custom’: used by internal plotting routines to provide custom color array to the plot
connect_data_avail (bool) – connect sources and receivers with a thin line based on data availability
vmin (float) – min value for the colorbar, defaults to smallest value in array defined by color_by
vmax (float) – max value for the colorbar, defaults to largest value in array defined by color_by
title (str) – custom user-input title for the figure, otherwise defaults to useful information about the catalog and inventory
cmap (str) – matplotlib colormap to use for array defined by color_by
show (bool) – show figure in GUI
save (str) – if given, file ID for the name of the output figure to save. If not given, will not save figure
equal_scale (bool) – set the scale of lat and lon equal, False by default
- pysep.read_sem(fid, origintime='1970-01-01T00:00:00', source=None, stations=None, location='', precision=4, source_format='CMTSOLUTION')[source]¶
Specfem3D outputs seismograms to ASCII (.sem? or .sem.ascii) files. Converts SPECFEM synthetics into ObsPy Stream objects with the correct header information. If source and stations files are also provided, PySEP will write appropriate SAC headers to the underlying data.
- Parameters:
fid (str) – path of the given ascii file
origintime (obspy.UTCDateTime) – UTCDateTime object for the origin time of the event. If none is given, defaults to a dummy value of ‘1970-01-01T00:00:00’
source (str) – optional SPECFEM source file (e.g., CMTSOLUTION, SOURCE) defining the event which generated the synthetics. Used to grab event information and append as SAC headers to the ObsPy Stream
stations (str) – optional STATIONS file defining the station locations for the SPECFEM generated synthetics, used to generate SAC headers
location (str) – location value for a given station/component
precision (int) – dt precision determined by differencing two adjacent time steps in the underlying ascii text file.
- Return type:
obspy.core.stream.Stream
- Returns:
stream containing header and data info taken from ascii file
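The role of precision can be illustrated without ObsPy or PySEP. The sketch below parses a few lines of two-column (time, amplitude) text, which is the layout of SPECFEM ASCII seismograms, and derives dt by differencing the first two time values. The sample values are made up, and this is only an illustration of why rounding is needed, not the read_sem implementation.

```python
import io

# Synthetic two-column (time, amplitude) text standing in for a .sem? file
sample = io.StringIO(
    "-1.0000000E+00  0.0000000E+00\n"
    "-9.9500000E-01  1.2000000E-10\n"
    "-9.9000000E-01  3.4000000E-10\n"
)

times, data = [], []
for line in sample:
    time_str, amplitude_str = line.split()
    times.append(float(time_str))
    data.append(float(amplitude_str))

# Differencing floats leaves round-off noise in the trailing digits, so the
# raw difference is rounded to `precision` decimal places
precision = 4
raw_dt = times[1] - times[0]
dt = round(raw_dt, precision)
print(raw_dt, dt)
```

If the time step of a simulation is smaller than the chosen precision can resolve, a larger precision value would be needed to avoid rounding dt incorrectly.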
- pysep.read_stations(path_to_stations)[source]¶
Convert a SPECFEM STATIONS file into an ObsPy Inventory object.
Specfem3D STATIONS files contain no channel or location information, so the inventory can only go down to the station level.
Note
This assumes that each row of the STATIONS file is structured as: STA, NET, LAT [deg], LON [deg], ELEVATION [m], BURIAL [m]
- Parameters:
path_to_stations (str) – the path to the STATIONS file that is associated with the Specfem3D DATA directory
- Return type:
obspy.core.inventory.Inventory
- Returns:
a station-level Inventory object
- Raises:
ValueError – if latitude and longitude values are not in geographic coordinates (i.e., in cartesian coordinates). Thrown by the init of the Station class.
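The assumed row structure and the ValueError for non-geographic coordinates can be sketched with a small standalone parser. The helper name, the sample rows, and the explicit range check are hypothetical; in read_stations the error is raised by the Station class, as noted above, not by a check like this one.

```python
import io


def parse_stations(lines):
    """Parse rows of STA NET LAT LON ELEVATION BURIAL into dictionaries
    (hypothetical helper, not the PySEP implementation)."""
    stations = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        sta, net, lat, lon, elevation, burial = line.split()[:6]
        lat, lon = float(lat), float(lon)
        # ASSUMPTION: geographic means -90 <= lat <= 90, -180 <= lon <= 180
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"{net}.{sta}: coordinates are not geographic")
        stations.append({
            "code": f"{net}.{sta}",
            "latitude": lat,
            "longitude": lon,
            "elevation_m": float(elevation),
            "burial_m": float(burial),
        })
    return stations


# Made-up rows, for illustration only
geographic = io.StringIO(
    "COLA AK 64.8736 -147.8616 200.0 0.0\n"
    "ATKA AK 52.2016 -174.1975 55.0 0.0\n"
)
print(parse_stations(geographic))
```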
- pysep.read_events_plus(fid, format, **kwargs)[source]¶
Extension of the base ObsPy.read_events() function that, in addition to the formats accepted by ObsPy, can also read the following: * SPECFEM2D SOURCE * SPECFEM3D/3D_GLOBE FORCESOLUTION * SPECFEM3D/3D_GLOBE CMTSOLUTION (both geographic and non-geographic)
See the following link for acceptable ObsPy formats: https://docs.obspy.org/packages/autogen/obspy.core.event.read_events.html
- Parameters:
fid (str) – full path to the event file to be read
format (str) – Expected format of the file (case-insensitive), available are - SOURCE - FORCESOLUTION - CMTSOLUTION - any of ObsPy’s accepted arguments for ObsPy.read_events()
- Return type:
obspy.core.catalog.Catalog
- Returns:
Catalog which should only contain one event, read from the fid for the given fmt (format)
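The case-insensitive handling of format can be sketched as a simple classification step. The function name and the returned labels are hypothetical; the sketch only shows how a format string might be normalized and sorted into the SPECFEM formats listed above versus everything else, which would be left to ObsPy.

```python
# SPECFEM source formats named in the description above
SPECFEM_FORMATS = {"SOURCE", "FORCESOLUTION", "CMTSOLUTION"}


def classify_format(fmt):
    """Return (handler label, normalized format) for a format string
    (hypothetical helper, not the PySEP implementation)."""
    normalized = fmt.upper()  # makes the comparison case-insensitive
    if normalized in SPECFEM_FORMATS:
        return "specfem", normalized
    return "obspy", normalized


print(classify_format("cmtsolution"))
print(classify_format("QuakeML"))
```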