pygmt.select
- pygmt.select(data=None, outfile=None, *, area_thresh=None, resolution=None, gridmask=None, reverse=None, projection=None, mask=None, region=None, verbose=None, z_subregion=None, binary=None, nodata=None, find=None, coltypes=None, gap=None, header=None, incols=None, outcols=None, registration=None, skiprows=None, wrap=None, **kwargs)
- Select data table subsets based on multiple spatial criteria.
- This is a filter that reads (x, y) or (longitude, latitude) positions from the first 2 columns of data and uses a combination of 1-7 criteria to pass or reject the records. Records can be selected based on whether or not they are:
- inside a rectangular region (region [and projection])
- within dist km of any point in pointfile 
- within dist km of any line in linefile 
- inside one of the polygons in the polygonfile 
- inside geographical features (based on coastlines) 
- have z-values within a given range, or
- inside bins of a grid mask whose nodes are non-zero 
- The sense of the tests can be reversed for each of these 7 criteria by using the reverse option.
- Full option list at https://docs.generic-mapping-tools.org/latest/gmtselect.html
- Aliases:
- A = area_thresh
- D = resolution 
- G = gridmask 
- I = reverse 
- J = projection 
- N = mask 
- R = region 
- V = verbose 
- Z = z_subregion 
- b = binary 
- d = nodata 
- e = find 
- f = coltypes 
- g = gap 
- h = header 
- i = incols 
- o = outcols 
- r = registration 
- s = skiprows 
- w = wrap 
 - Parameters
- data (str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame) – Pass in either a file name to an ASCII data table, a 2D numpy.ndarray, a pandas.DataFrame, an xarray.Dataset made up of 1D xarray.DataArray data variables, or a geopandas.GeoDataFrame containing the tabular data.
- outfile (str) – The file name for the output ASCII file. 
- area_thresh (int or float or str) – min_area[/min_level/max_level][+a[g|i][s|S]][+l|r][+ppercent]. Features with an area smaller than min_area in km² or of hierarchical level that is lower than min_level or higher than max_level will not be plotted [Default is 0/0/4 (all features)].
- resolution (str) – resolution[+f]. Ignored unless mask is set. Selects the resolution of the coastline data set to use ((f)ull, (h)igh, (i)ntermediate, (l)ow, or (c)rude). The resolution drops off by ~80% between data sets. [Default is l]. Append (+f) to automatically select a lower resolution should the one requested not be available [Default is abort if not found]. Note that because the coastlines differ in details it is not guaranteed that a point will remain inside [or outside] when a different resolution is selected. 
- gridmask (str) – Pass all locations that are inside the valid data area of the grid gridmask. Nodes that are outside are either NaN or zero. 
- reverse (str) – [cfglrsz]. Reverses the sense of the test for each of the criteria specified:
- c select records NOT inside any point’s circle of influence.
- f select records NOT inside any of the polygons. 
- g will pass records inside the cells with z equal zero of the grid mask in gridmask. 
- l select records NOT within the specified distance of any line. 
- r select records NOT inside the specified rectangular region. 
- s select records NOT considered inside as specified by mask (and area_thresh, resolution). 
- z select records NOT within the range specified by z_subregion. 
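As a rough mental model of what reversing a test means, the sketch below mimics the rectangular-region criterion in plain Python and then shows how the same intent might be passed to pygmt.select. The helper names passes_region and select_outside_region are hypothetical, and the plain-Python test is a simplification that ignores projections and longitude periodicity; it is not how GMT implements the check.

```python
def passes_region(lon, lat, region, reverse=False):
    """Simplified stand-in for the rectangular-region test (not GMT's code)."""
    west, east, south, north = region
    inside = west <= lon <= east and south <= lat <= north
    # With reverse set, keep the records that are NOT inside the region.
    return (not inside) if reverse else inside


def select_outside_region(data, region):
    """Hypothetical wrapper: keep only records outside region.

    Needs pygmt and GMT installed; the import is deferred so that the
    pure-Python helper above can be used without them.
    """
    import pygmt

    return pygmt.select(data=data, region=region, reverse="r")
```

Calling passes_region(246.5, 20.5, [246, 247, 20, 21]) accepts the point, while the same call with reverse=True rejects it.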
 
- projection (str) – projcode[projparams/]width. Select map projection. 
- mask (str or list) – Pass all records whose location is inside specified geographical features. Specify if records should be skipped (s) or kept (k) using 1 of 2 formats:
- wet/dry.
- ocean/land/lake/island/pond. 
 - [Default is s/k/s/k/s (i.e., s/k), which passes all points on dry land]. 
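The stated default, s/k/s/k/s (i.e., s/k), suggests that the two-part wet/dry form maps wet onto ocean, lake, and pond, and dry onto land and island. The sketch below spells out that inferred mapping; expand_mask and select_ocean_points are hypothetical helper names, and the expansion is an illustration of the notation rather than something pygmt requires you to do.

```python
def expand_mask(mask):
    """Expand a wet/dry mask into ocean/land/lake/island/pond form."""
    parts = mask.split("/")
    if len(parts) == 5:
        return mask  # already in the five-part form
    if len(parts) != 2:
        raise ValueError("mask must have 2 (wet/dry) or 5 parts")
    wet, dry = parts
    # Inferred from the documented default: ocean, lake, pond are "wet";
    # land and island are "dry".
    return "/".join([wet, dry, wet, dry, wet])


def select_ocean_points(data):
    """Hypothetical wrapper: keep wet points and skip dry ones.

    Needs pygmt, GMT, and the coastline data set; import is deferred.
    """
    import pygmt

    return pygmt.select(data=data, mask="k/s")
```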
- region (str or list) – xmin/xmax/ymin/ymax[+r][+uunit]. Specify the region of interest. 
- verbose (bool or str) – Select verbosity level [Default is w], which modulates the messages written to stderr. Choose among 7 levels of verbosity:
- q - Quiet, not even fatal error messages are produced
- e - Error messages only
- w - Warnings [Default]
- t - Timings (report runtimes for time-intensive algorithms)
- i - Informational messages (same as verbose=True)
- c - Compatibility warnings 
- d - Debugging messages 
 
- z_subregion (str) – min[/max][+a][+ccol][+i]. Pass all records whose 3rd column (z; col = 2) lies within the given range or is NaN (use skiprows to skip NaN records). If max is omitted then we test if z equals min instead. This means equality within 5 ULPs (unit of least precision; http://en.wikipedia.org/wiki/Unit_in_the_last_place). Input file must have at least three columns. To indicate no limit on min or max, specify a hyphen (-). If your 3rd column is absolute time then remember to supply coltypes="2T". To specify another column, append +ccol, and to specify several tests just repeat the z_subregion option as many times as you have columns to test. Note: When more than one z_subregion option is given then the reverse="z" option cannot be used. In the case of multiple tests you may use these modifiers as well: +a passes any record that passes at least one of your z tests [Default is all tests must pass], and +i reverses the tests to pass records with z values NOT in the given range. Finally, if +c is not used then the column is automatically incremented for each new z_subregion option, starting with 2.
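The min/max syntax with the hyphen placeholder and the +ccol modifier can be assembled programmatically. The following is a minimal sketch; z_range and select_depth_band are hypothetical helper names, and only the string format itself comes from the description above.

```python
def z_range(zmin=None, zmax=None, col=None):
    """Build a z_subregion string; None means no limit on that side."""
    lower = "-" if zmin is None else str(zmin)
    upper = "-" if zmax is None else str(zmax)
    spec = f"{lower}/{upper}"
    if col is not None:
        spec += f"+c{col}"  # test a column other than the default (col = 2)
    return spec


def select_depth_band(data, zmin, zmax):
    """Hypothetical wrapper: keep records whose 3rd column is in [zmin, zmax].

    Needs pygmt and GMT installed; the import is deferred.
    """
    import pygmt

    return pygmt.select(data=data, z_subregion=z_range(zmin, zmax))
```

For instance, z_range(-5000, -1000) yields the string -5000/-1000, and z_range(None, 0) yields -/0 for an open lower bound.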
- binary (bool or str) – i|o[ncols][type][w][+l|b]. Select native binary input (using binary="i") or output (using binary="o"), where ncols is the number of data columns of type, which must be one of:
- c - int8_t (1-byte signed char)
- u - uint8_t (1-byte unsigned char) 
- h - int16_t (2-byte signed int) 
- H - uint16_t (2-byte unsigned int) 
- i - int32_t (4-byte signed int) 
- I - uint32_t (4-byte unsigned int) 
- l - int64_t (8-byte signed int) 
- L - uint64_t (8-byte unsigned int) 
- f - 4-byte single-precision float 
- d - 8-byte double-precision float 
- x - use to skip ncols anywhere in the record 
 - For records with mixed types, append additional comma-separated combinations of ncols type (no space). The following modifiers are supported: - w after any item to force byte-swapping. 
- +l|b to indicate that the entire data file should be read as little- or big-endian, respectively. 
 - Full documentation is at https://docs.generic-mapping-tools.org/latest/gmt.html#bi-full. 
- nodata (str) – i|onodata. Substitute specific values with NaN (for tabular data). For example, nodata="-9999" will replace all values equal to -9999 with NaN during input and all NaN values with -9999 during output. Prepend i to the nodata value for input columns only. Prepend o to the nodata value for output columns only.
- find (str) – [~]“pattern” | [~]/regexp/[i]. Only pass records that match the given pattern or regular expression [Default processes all records]. Prepend ~ to the pattern or regexp to instead only pass records that do not match the pattern. Append i for case-insensitive matching. This does not apply to headers or segment headers.
- coltypes (str) – [i|o]colinfo. Specify data types of input and/or output columns (time or geographical data). Full documentation is at https://docs.generic-mapping-tools.org/latest/gmt.html#f-full. 
- gap (str or list) – x|y|z|d|X|Y|Dgap[u][+a][+ccol][+n|p]. Examine the spacing between consecutive data points in order to impose breaks in the line. To specify multiple criteria, provide a list with each item containing a string describing one set of criteria.
- x|X - define a gap when there is a large enough change in the x coordinates (upper case to use projected coordinates).
- y|Y - define a gap when there is a large enough change in the y coordinates (upper case to use projected coordinates). 
- d|D - define a gap when there is a large enough distance between coordinates (upper case to use projected coordinates). 
- z - define a gap when there is a large enough change in the z data. Use +ccol to change the z data column [Default col is 2 (i.e., 3rd column)]. 
 - A unit u may be appended to the specified gap: - For geographic data (x|y|d), the unit may be arc d(egree), m(inute), and s(econd), or (m)e(ter), f(eet), k(ilometer), M(iles), or n(autical miles) [Default is (m)e(ter)]. 
- For projected data (X|Y|D), the unit may be i(nch), c(entimeter), or p(oint). 
 - Append modifier +a to specify that all the criteria must be met [default imposes breaks if any one criterion is met]. - One of the following modifiers can be appended: - +n - specify that the previous value minus the current column value must exceed gap for a break to be imposed. 
- +p - specify that the current value minus the previous value must exceed gap for a break to be imposed. 
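To make the gap criteria concrete, the sketch below splits a sequence of values into segments wherever the change between consecutive values exceeds the gap. The function name split_on_gap is hypothetical, and treating the unmodified case as an absolute difference is an assumption; the +n and +p branches follow the wording above.

```python
def split_on_gap(values, gap, mode="abs"):
    """Split values into segments at gaps (illustration only).

    mode="abs": absolute change must exceed gap (assumed default).
    mode="n":   previous minus current must exceed gap (like +n).
    mode="p":   current minus previous must exceed gap (like +p).
    """
    if not values:
        return []
    segments = [[values[0]]]
    for prev, curr in zip(values, values[1:]):
        if mode == "n":
            change = prev - curr
        elif mode == "p":
            change = curr - prev
        else:
            change = abs(curr - prev)
        if change > gap:
            segments.append([])  # start a new segment at the break
        segments[-1].append(curr)
    return segments
```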
 
- header (str) – [i|o][n][+c][+d][+msegheader][+rremark][+ttitle]. Specify that input and/or output file(s) have n header records [Default is 0]. Prepend i if only the primary input should have header records. Prepend o to control the writing of header records, with the following modifiers supported:
- +d to remove existing header records.
- +c to add a header comment with column names to the output [Default is no column names]. 
- +m to add a segment header segheader to the output after the header block [Default is no segment header]. 
- +r to add a remark comment to the output [Default is no comment]. The remark string may contain \n to indicate line-breaks. 
- +t to add a title comment to the output [Default is no title]. The title string may contain \n to indicate line-breaks. 
 - Blank lines and lines starting with # are always skipped. 
- incols (str or 1d array) – Specify data columns for primary input in arbitrary order. Columns can be repeated and columns not listed will be skipped [Default reads all columns in order, starting with the first (i.e., column 0)].
- For 1d array: specify individual columns in input order (e.g., incols=[1,0] for the 2nd column followed by the 1st column).
- For str: specify individual columns or column ranges in the format start[:inc]:stop, where inc defaults to 1 if not specified, with columns and/or column ranges separated by commas (e.g., incols="0:2,4+l" to input the first three columns followed by the log-transformed 5th column). To read from a given column until the end of the record, leave off stop when specifying the column range. To read trailing text, add the column t. Append the word number to t to ingest only a single word from the trailing text. Instead of specifying columns, use incols="n" to simply read numerical input and skip trailing text. Optionally, append one of the following modifiers to any column or column range to transform the input columns:
- +l to take the log10 of the input values.
- +d to divide the input values by the factor divisor [Default is 1]. 
- +s to multiply the input values by the factor scale [Default is 1].
- +o to add the given offset to the input values [Default is 0]. 
 
 
- outcols (str or 1d array) – cols[,…][,t[word]]. Specify data columns for primary output in arbitrary order. Columns can be repeated and columns not listed will be skipped [Default writes all columns in order, starting with the first (i.e., column 0)].
- For 1d array: specify individual columns in output order (e.g., outcols=[1,0] for the 2nd column followed by the 1st column).
- For str: specify individual columns or column ranges in the format start[:inc]:stop, where inc defaults to 1 if not specified, with columns and/or column ranges separated by commas (e.g., outcols="0:2,4" to output the first three columns followed by the 5th column). To write from a given column until the end of the record, leave off stop when specifying the column range. To write trailing text, add the column t. Append the word number to t to write only a single word from the trailing text. Instead of specifying columns, use outcols="n" to simply write numerical output and skip trailing text. Note: if incols is also used then the columns given to outcols correspond to the order after the incols selection has taken place.
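The start[:inc]:stop notation uses an inclusive stop, as the "0:2,4" example (first three columns, then the 5th) indicates. A small sketch of that expansion follows; expand_columns is a hypothetical helper for illustration and handles only closed numeric ranges, not open-ended ranges, the t column, or modifiers such as +l.

```python
def expand_columns(spec):
    """Expand a column spec such as "0:2,4" into a list of column indices."""
    columns = []
    for item in spec.split(","):
        fields = [int(field) for field in item.split(":")]
        if len(fields) == 1:
            columns.append(fields[0])
        elif len(fields) == 2:
            start, stop = fields
            columns.extend(range(start, stop + 1))  # stop is inclusive
        elif len(fields) == 3:
            start, inc, stop = fields
            columns.extend(range(start, stop + 1, inc))
        else:
            raise ValueError(f"cannot parse column range: {item!r}")
    return columns
```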
 
- registration (str) – g|p. Force gridline (g) or pixel (p) node registration. [Default is g(ridline)]. 
- skiprows (bool or str) – [cols][+a][+r]. Suppress output for records whose z-value equals NaN [Default outputs all records]. Optionally, supply a comma-separated list of all columns or column ranges to consider for this NaN test [Default only considers the third data column (i.e., cols = 2)]. Column ranges must be given in the format start[:inc]:stop, where inc defaults to 1 if not specified. The following modifiers are supported:
- +r to reverse the suppression, i.e., only output the records whose z-value equals NaN.
- +a to suppress the output of the record if just one or more of the columns equal NaN [Default skips record only if values in all specified cols equal NaN]. 
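The interaction between the column list and the +a and +r modifiers can be sketched as a small predicate. The name is_suppressed and its keyword arguments are hypothetical; the logic is a reading of the description above, not GMT's implementation.

```python
import math


def is_suppressed(record, cols=(2,), any_nan=False, reverse=False):
    """Return True if a record would be dropped by the NaN test.

    cols:    columns to examine (default is the third column, index 2).
    any_nan: like +a, drop if at least one examined column is NaN;
             otherwise drop only if all examined columns are NaN.
    reverse: like +r, drop the records that would otherwise be kept.
    """
    flags = [math.isnan(record[col]) for col in cols]
    hit = any(flags) if any_nan else all(flags)
    return (not hit) if reverse else hit
```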
 
- wrap (str) – y|a|w|d|h|m|s|cperiod[/phase][+ccol]. Convert the input x-coordinate to a cyclical coordinate, or a different column if selected via +ccol. The following cyclical coordinate transformations are supported:
- y - yearly cycle (normalized)
- a - annual cycle (monthly) 
- w - weekly cycle (day) 
- d - daily cycle (hour) 
- h - hourly cycle (minute) 
- m - minute cycle (second) 
- s - second cycle (second) 
- c - custom cycle (normalized) 
 - Full documentation is at https://docs.generic-mapping-tools.org/latest/gmt.html#w-full. 
 
- Returns
- output (pandas.DataFrame or None) – Return type depends on whether the outfile parameter is set:
- pandas.DataFrame table if outfile is not set.
- None if outfile is set (filtered output will be stored in file set by outfile).
 
- Example
>>> import pygmt
>>> # Load a table of ship observations of bathymetry off Baja California
>>> ship_data = pygmt.datasets.load_sample_data(
...     name="bathymetry"
... )
>>> # Only return the data points that lie within the region between
>>> # longitudes 246 and 247 and latitudes 20 and 21
>>> out = pygmt.select(
...     data=ship_data, region=[246, 247, 20, 21]
... )