| Title: | Manipulate Matrix Row and Column Labels with Ease |
|---|---|
| Description: | Functions to assist manipulation of matrix row and column labels for all types of matrix mathematics where row and column labels are to be respected. |
| Authors: | Matthew Heun [aut, cre] (ORCID: <https://orcid.org/0000-0002-7438-214X>) |
| Maintainer: | Matthew Heun <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.12 |
| Built: | 2026-05-29 10:39:13 UTC |
| Source: | https://github.com/matthewheun/rclabels |
A description of arrow notation.
arrow_notation
A vector of notational symbols that provides an arrow separator ("a -> b") between prefix and suffix.
arrow_notation
A description of bracket arrow notation.
bracket_arrow_notation
A vector of notational symbols that provides bracket arrow ("a [-> b]") notation.
bracket_arrow_notation
A description of bracket notation.
bracket_notation
A vector of notational symbols that provides bracket ("a [b]") notation.
bracket_notation
A description of dash notation.
dash_notation
A vector of notational symbols that provides a dash separator ("a - b") between prefix and suffix.
dash_notation
A description of first dot notation. Note that "a.b.c" splits into prefix ("a") and suffix ("b.c").
first_dot_notation
A vector of notational symbols that provides first dot ("a.b") notation.
first_dot_notation
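The first-dot rule can be illustrated with a minimal base R sketch. The helper `split_first_dot()` is hypothetical and is not part of the package; returning an empty suffix when no dot is present is an assumption made only for this illustration.

```r
# Hypothetical helper (not from the package): split a label at its first dot only.
split_first_dot <- function(label) {
  # Position of the first literal "." (-1 if there is none)
  pos <- as.integer(regexpr(".", label, fixed = TRUE))
  if (pos == -1) {
    # Assumption for this sketch: no dot means the whole label is the prefix
    return(c(pref = label, suff = ""))
  }
  c(pref = substr(label, 1, pos - 1),
    suff = substr(label, pos + 1, nchar(label)))
}

res <- split_first_dot("a.b.c")
res
```

Only the first dot acts as a separator, so later dots remain part of the suffix.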
A description of from notation.
from_notation
A vector of notational symbols that provides from ("a [from b]") notation.
from_notation
Nouns are the first part of a row-column label,
"a" in "a [b]".
Internally, this function calls get_pref_suff(which = "pref").
get_nouns(labels, inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = TRUE)

| Argument | Description |
|---|---|
| labels | A list or vector of labels from which nouns are to be extracted. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting nouns. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |

A list of nouns from row and column labels.
    get_nouns("a [b]", notation = bracket_notation)
    # Also works with vectors and lists.
    get_nouns(c("a [b]", "c [d]"))
    get_nouns(list("a [b]", "c [d]"))
This function extracts the objects of prepositional phrases
from row and column labels.
The format of the output is a list of
named items, one name for each preposition encountered in labels.
Objects are NA if there is no prepositional phrase starting
with that preposition.
get_objects(labels, inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = FALSE, prepositions = RCLabels::prepositions_list)

| Argument | Description |
|---|---|
| labels | The row and column labels from which prepositional phrases are to be extracted. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositions. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |
| prepositions | A vector of strings to be treated as prepositions. Note that a space is appended to each word internally, so, e.g., "to" becomes "to ". Default is RCLabels::prepositions_list. |

A list of objects of prepositional phrases, with names being prepositions, and values being objects.
    get_objects(c("a [of b into c]", "d [of Coal from e -> f]"))
This is a wrapper function for get_pref_suff(), get_nouns(), and
get_objects().
It returns a piece of a row or column label.
get_piece(labels, piece = "all", inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = FALSE, prepositions = RCLabels::prepositions_list)

| Argument | Description |
|---|---|
| labels | The row and column labels from which prepositional phrases are to be extracted. |
| piece | The name of the item to return. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositions. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |
| prepositions | A vector of strings to be treated as prepositions. Note that a space is appended to each word internally, so, e.g., "to" becomes "to ". Default is RCLabels::prepositions_list. |

piece is typically one of
"all" (which returns labels directly),
"pref" (for the prefixes),
"suff" (for the suffixes),
"noun" (returns the noun),
"pps" (prepositional phrases, returns prepositional phrases in full),
"prepositions" (returns a list of prepositions),
"objects" (returns a list of objects with prepositions as names), or
a preposition in prepositions (as a string), which will return
the object of that preposition named by the preposition itself.
piece must be a character vector of length 1.
If a piece is missing in a label, "" (empty string) is returned.
If specifying more than one notation, be sure the notations are in a list.
notation = c(RCLabels::bracket_notation, RCLabels::arrow_notation)
is unlikely to produce the desired result, because the notations
are concatenated together to form a long string vector.
Rather say
notation = list(RCLabels::bracket_notation, RCLabels::arrow_notation).
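The difference between c() and list() can be seen with two stand-in vectors in base R. The element names follow the pref_start, pref_end, suff_start, and suff_end elements mentioned in this manual; the values are assumptions for illustration and are not copied from the package.

```r
# Stand-in notation vectors (values assumed for this sketch).
bracket_like <- c(pref_start = "", pref_end = " [", suff_start = " [", suff_end = "]")
arrow_like <- c(pref_start = "", pref_end = " -> ", suff_start = " -> ", suff_end = "")

# c() flattens both notations into one character vector of length 8.
combined <- c(bracket_like, arrow_like)
# list() keeps each notation as a separate element.
separate <- list(bracket_like, arrow_like)

length(combined)
length(separate)
```

After c(), the boundary between the two notations is lost, which is why a list is the safer container when several notations are supplied.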
A piece of labels.
    labs <- c("a [from b in c]", "d [of e in f]", "Export [of Coal from USA to MEX]")
    get_piece(labs, "pref")
    get_piece(labs, "suff")
    get_piece(labs, piece = "noun")
    get_piece(labs, piece = "pps")
    get_piece(labs, piece = "prepositions")
    get_piece(labs, piece = "objects")
    get_piece(labs, piece = "from")
    get_piece(labs, piece = "in")
    get_piece(labs, piece = "of")
    get_piece(labs, piece = "to")
This function extracts prepositional phrases from suffixes of row and column labels of the form "a [preposition b]", where "preposition b" is the prepositional phrase.
get_pps(labels, inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = FALSE, prepositions = RCLabels::prepositions_list)

| Argument | Description |
|---|---|
| labels | A list or vector of labels from which prepositional phrases are to be extracted. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositional phrases. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |
| prepositions | A list of prepositions for which to search. Default is RCLabels::prepositions_list. |

All prepositional phrases in a suffix.
    get_pps(c("a [in b]", "c [of d]"))
    get_pps(c("a [of b in c]", "d [-> e of f]"))
This function extracts prepositions from a list of row and column labels. The list has outer structure of the number of labels and an inner structure of each prepositional phrase in the specific label.
get_prepositions(labels, inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = FALSE, prepositions = RCLabels::prepositions_list)

| Argument | Description |
|---|---|
| labels | The row and column labels from which prepositional phrases are to be extracted. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositions. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |
| prepositions | A vector of strings to be treated as prepositions. Note that a space is appended to each word internally, so, e.g., "to" becomes "to ". Default is RCLabels::prepositions_list. |

If labels are in the form of
from_notation, to_notation or similar,
it is probably best to give bracket_notation in the notation
argument.
Providing
from_notation, to_notation or similar
in the notation argument will lead to empty results.
The preposition is discarded when extracting the suffix,
yielding empty strings for the prepositions.
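The reason a preposition-specific notation yields empty results can be sketched in base R. The helper `extract_suffix()` is hypothetical, and the delimiter strings passed to it are assumed stand-ins for the bracket and of notations rather than values read from the package.

```r
# Hypothetical helper: return the text between suff_start and suff_end.
extract_suffix <- function(label, suff_start, suff_end) {
  start <- as.integer(regexpr(suff_start, label, fixed = TRUE)) + nchar(suff_start)
  substr(label, start, nchar(label) - nchar(suff_end))
}

# Bracket-style delimiters keep the preposition inside the suffix.
suff_bracket <- extract_suffix("a [of b]", suff_start = " [", suff_end = "]")
# Delimiters that already contain "of " consume the preposition.
suff_of <- extract_suffix("a [of b]", suff_start = " [of ", suff_end = "]")

suff_bracket
suff_of
```

With the second set of delimiters, the suffix no longer starts with a preposition, so a later search for prepositions has nothing to find.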
A list of prepositions.
    get_prepositions(c("a [of b into c]", "d [-> e of f]"))
    get_prepositions(c("a [of b]", "d [-> e of f]"),
                     inf_notation = FALSE,
                     notation = bracket_notation)
    # Best to *not* specify notation by the preposition,
    # as the result will be empty strings.
    # Rather, give the notation as `bracket_notation`
    # as shown above, or infer the notation
    # as shown below.
    get_prepositions(c("a [of b]", "d [-> e of f]"),
                     inf_notation = TRUE)
    # The suffix is extracted, and the preposition
    # is lost before looking for the preposition.
    get_prepositions(c("a [of b]", "d [of f]"),
                     inf_notation = FALSE,
                     notation = of_notation)
A description of in notation.
in_notation
A vector of notational symbols that provides in ("a [in b]") notation.
in_notation
It is convenient to know which notation is applicable to row or column labels.
This function infers which notations are appropriate for x.
infer_notation(x, inf_notation = TRUE, notations = RCLabels::notations_list, allow_multiple = FALSE, retain_names = FALSE, choose_most_specific = TRUE, must_succeed = TRUE)
x |
A row or column label (or vector of labels). |
inf_notation |
A boolean that tells whether to infer notation for |
notations |
A list of notations from which matches will be inferred.
This function might not work as expected if
|
allow_multiple |
A boolean that tells whether multiple notation matches
are allowed.
If |
retain_names |
A boolean that tells whether to retain names from |
choose_most_specific |
A boolean that indicates whether the most-specific notation
will be returned when more than one of |
must_succeed |
A boolean that if |
This function is vectorized.
Thus, x can be a vector, in which case the output is a list of notations.
notations is treated as a store from which matches for each label in x
can be determined.
notations should be a named list of notations.
When retain_names = TRUE, the names on notations will be retained,
and the return value is always a list.
By default (allow_multiple = FALSE),
a single notation object is returned for each item in x
if only one notation in notations
is appropriate for x.
If allow_multiple = FALSE (the default) and more than one notation is applicable to x,
an error is thrown.
Multiple matches can be returned when allow_multiple = TRUE.
If multiple notations are matched, the return value is a list.
When choose_most_specific = TRUE (the default),
the most specific notation in notations is returned.
"Most specific" is defined as the matching notation
whose sum of characters in the pref_start, pref_end,
suff_start and suff_end elements
is greatest.
If choose_most_specific = TRUE and
two matching notations in notations have the same number of characters,
only the first match is returned.
When choose_most_specific = TRUE,
the value of allow_multiple no longer matters.
allow_multiple = FALSE is implied and
at most one of the notations will be returned.
When inf_notation = FALSE (default is TRUE),
notations are returned unmodified,
essentially disabling this function.
Although calling with inf_notation = FALSE seems daft,
this behavior enables cleaner code elsewhere.
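The "most specific" rule described above (largest total character count across pref_start, pref_end, suff_start, and suff_end, with the first match winning ties) can be sketched in base R. The candidate notations below are assumed stand-ins, and this is an illustration of the rule, not the package's implementation.

```r
# Stand-in notations that both match a label such as "a [from b]" (values assumed).
candidates <- list(
  bracket_like = c(pref_start = "", pref_end = " [", suff_start = " [", suff_end = "]"),
  from_like = c(pref_start = "", pref_end = " [from ", suff_start = " [from ", suff_end = "]")
)

# Specificity score: total number of characters across the four elements.
scores <- vapply(candidates, function(n) sum(nchar(n)), integer(1))

# which.max() returns the first maximum, which mirrors the
# "only the first match is returned" tie rule.
most_specific <- names(candidates)[which.max(scores)]

scores
most_specific
```

Because the from-style delimiters contain more characters than the plain brackets, the from-style candidate is selected.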
A single notation object (if x is a single row or column label)
or a list of notation objects (if x is a vector or a list).
If no notations match x, NULL is returned,
either alone or in a list.
    # Does not match any notations in RCLabels::notations_list
    # and throws an error, because the default value for `must_succeed`
    # is `TRUE`.
    ## Not run:
    infer_notation("abc")
    ## End(Not run)
    # This returns `NULL`, because `must_succeed = FALSE`.
    infer_notation("abc", must_succeed = FALSE)
    # This succeeds, because the label is in the form of a
    # notation in `RCLabels::notations_list`,
    # the default value of the `notations` argument.
    infer_notation("a -> b")
    # Names of the notations can be retained, in which case
    # the return value is always a list.
    infer_notation("a -> b", retain_names = TRUE)
    # This function is vectorized.
    # The list of labels matches
    # all known notations in `RCLabels::notations_list`.
    infer_notation(c("a -> b", "a (b)", "a [b]", "a [from b]", "a [of b]",
                     "a [to b]", "a [in b]", "a [-> b]", "a.b"),
                   retain_names = TRUE)
    # By default, the most specific notation is returned.
    # But when two or more matches are present,
    # multiple notations can be returned, too.
    infer_notation("a [from b]",
                   allow_multiple = TRUE, retain_names = TRUE,
                   choose_most_specific = FALSE)
    infer_notation(c("a [from b]", "c [to d]"),
                   allow_multiple = TRUE, retain_names = TRUE,
                   choose_most_specific = FALSE)
    # As shown above, "a [from b]" matches 2 notations:
    # `RCLabels::bracket_notation` and `RCLabels::from_notation`.
    # The default value for the `notations` argument is
    # `RCLabels::notations_list`,
    # which includes `RCLabels::bracket_notation`
    # and `RCLabels::from_notation` in that order.
    # Thus, there is some flexibility to how this function works
    # if the value of the `notations` argument is a list of notations
    # ordered from least specific to most specific,
    # as `RCLabels::notations_list` is ordered.
    # To review, the next call returns both `RCLabels::bracket_notation` and
    # `RCLabels::from_notation`, because `allow_multiple = TRUE` and
    # `choose_most_specific = FALSE`, neither of which is the default.
    infer_notation("a [from b]",
                   allow_multiple = TRUE,
                   choose_most_specific = FALSE,
                   retain_names = TRUE)
    # The next call returns `RCLabels::from_notation`, because
    # the most specific notation is requested, and
    # `RCLabels::from_notation` has more characters in its specification than
    # `RCLabels::bracket_notation`.
    infer_notation("a [from b]",
                   choose_most_specific = TRUE,
                   retain_names = TRUE)
    # The next call returns the `RCLabels::bracket_notation`, because
    # `choose_most_specific = FALSE`, and the first matching
    # notation in `RCLabels::notations_list` is `RCLabels::bracket_notation`.
    infer_notation("a [from b]",
                   choose_most_specific = FALSE,
                   retain_names = TRUE)
This is a non-public helper function for vectorized infer_notation().
infer_notation_for_one_label(x, inf_notation = TRUE, notations = RCLabels::notations_list, allow_multiple = FALSE, retain_names = FALSE, choose_most_specific = TRUE, must_succeed = TRUE)
x |
A single row or column label. |
inf_notation |
A boolean that tells whether to infer notation for |
notations |
A list of notations from which matches will be inferred.
This function might not work as expected if
|
allow_multiple |
A boolean that tells whether multiple notation matches
are allowed.
If |
retain_names |
A boolean that tells whether to retain names on the
outgoing matches.
Default is |
choose_most_specific |
A boolean that indicates if the most-specific notation
will be returned when more than one of |
must_succeed |
A boolean that if |
A single matching notation object (if allow_multiple = FALSE, the default)
or possibly multiple matching notation objects (if allow_multiple = TRUE).
If no notations match x, NULL.
Repeats x as necessary to make n of them.
Does not try to simplify x.
make_list(x, n, lenx = ifelse(is.vector(x), length(x), 1))

| Argument | Description |
|---|---|
| x | The object to be duplicated. |
| n | The number of times to be duplicated. |
| lenx | The length of item x. |

If x is itself a vector or list,
you may want to override the default value for lenx.
For example, if x is a list that should be duplicated several times,
set lenx = 1.
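The idea of treating x as a single item, rather than as a collection of elements, has a close analogue in base R. The sketch below uses rep() only; it does not call make_list() and is not meant to reproduce its warnings or its lenx handling.

```r
m <- matrix(1:4, nrow = 2)

# Wrapping m in list() makes it one item, so rep() yields three whole copies.
copies <- rep(list(m), times = 3)

# Without the wrapper, rep() recycles the individual elements of m instead.
flattened <- rep(m, times = 3)

length(copies)
length(flattened)
```

The first form keeps each matrix intact, which corresponds to the situation where the length of x should be regarded as 1.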
A list of x duplicated n times.
    m <- matrix(c(1:6), nrow = 3, dimnames = list(c("r1", "r2", "r3"), c("c2", "c1")))
    make_list(m, n = 1)
    make_list(m, n = 2)
    make_list(m, n = 5)
    make_list(list(c(1, 2), c(1, 2)), n = 4)
    m <- matrix(1:4, nrow = 2)
    l <- list(m, m + 100)
    make_list(l, n = 4)
    make_list(l, n = 1) # Warning because l is trimmed.
    make_list(l, n = 5) # Warning because length(l) (i.e., 2) not evenly divisible by 5
    make_list(list(c("r10", "r11"), c("c10", "c11")), n = 2) # Confused by x being a list
    make_list(list(c("r10", "r11"), c("c10", "c11")), n = 2, lenx = 1) # Fix by setting lenx = 1
This function makes "or" regex patterns from vectors or lists of strings.
This function can be used with the matsbyname::select_rows_byname()
and matsbyname::select_cols_byname() functions.
make_or_pattern() correctly escapes special characters in strings,
such as ( and ), as needed.
Thus, it is highly recommended that make_or_pattern be used when
constructing patterns for row and column selections with
matsbyname::select_rows_byname() and matsbyname::select_cols_byname().
make_or_pattern(strings, pattern_type = c("exact", "leading", "trailing", "anywhere", "literal"))

| Argument | Description |
|---|---|
| strings | A vector of row and column names. |
| pattern_type | One of "exact", "leading", "trailing", "anywhere", or "literal". Default is "exact". |

pattern_type controls the type of pattern created:
exact produces a regex pattern that selects row or column names by exact match.
leading produces a regex pattern that selects row or column names if the item in strings matches
the beginnings of row or column names.
trailing produces a regex pattern that selects row or column names if the item in strings matches
the ends of row or column names.
anywhere produces a regex pattern that selects row or column names if the item in strings matches
any substring of row or column names.
literal returns strings unmodified, and it is up to the caller to formulate a correct regex.
An "or" regex pattern suitable for selecting row and column names.
It can be used with matsbyname::select_rows_byname() or matsbyname::select_cols_byname().
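How an "exact" pattern might be assembled can be sketched in base R. The helpers `escape_regex()` and `or_pattern_exact()` are hypothetical names introduced for illustration; the exact string that make_or_pattern() produces may differ.

```r
# Hypothetical helper: backslash-escape common regex metacharacters.
escape_regex <- function(s) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", s)
}

# Hypothetical helper: anchor each escaped string and join with "|".
or_pattern_exact <- function(strings) {
  paste0("^", escape_regex(strings), "$", collapse = "|")
}

pat <- or_pattern_exact(c("a (b)", "c"))
hits <- grepl(pat, c("a (b)", "c", "ab", "a (b) x"))

pat
hits
```

Without the escaping step, the parentheses in "a (b)" would be read as a regex group and the literal label would not match.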
    make_or_pattern(strings = c("a", "b"), pattern_type = "exact")
    make_or_pattern(strings = c("a", "b"), pattern_type = "leading")
    make_or_pattern(strings = c("a", "b"), pattern_type = "trailing")
    make_or_pattern(strings = c("a", "b"), pattern_type = "anywhere")
    make_or_pattern(strings = c("a", "b"), pattern_type = "literal")
Typical pieces include "noun" or a preposition,
such as "in" or "from".
See RCLabels::prepositions_list for additional examples.
This argument may be a single string or a character vector.
modify_label_pieces(labels, piece, mod_map, prepositions = RCLabels::prepositions_list, inf_notation = TRUE, notation = RCLabels::bracket_notation, choose_most_specific = FALSE)

| Argument | Description |
|---|---|
| labels | A vector of row or column labels in which pieces will be modified. |
| piece | The piece (or pieces) of the row or column label that will be modified. |
| mod_map | A modification map. See details. |
| prepositions | A list of prepositions, used to detect prepositional phrases. Default is RCLabels::prepositions_list. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositions. Default is RCLabels::bracket_notation. |
| choose_most_specific | A boolean that tells whether the most specific notation is selected when more than one notation matches. Default is FALSE. |

This function modifies pieces of row and column labels
according to mod_map, which defines "one or many to one" relationships.
This function is useful for aggregations.
For example, replacing nouns can be done by
modify_label_pieces(labels, piece = "noun", mod_map = list(new_noun = c("a", "b", "c"))).
The string "new_noun" will replace any of "a", "b", or "c"
when they appear as nouns in a row or column label.
See examples for details.
The mod_map argument should consist of a
named list of character vectors in which names indicate
strings to be inserted and values indicate
values that should be replaced.
The sense is new = old or new = olds,
where "new" is the new name (the replacement) and
"old"/"olds" is/are a string/vector of strings,
any one of which will be replaced by "new".
Note that piece can be "pref"/"suff" or "noun"/"prepositions".
If any piece is "pref" or "suff",
all pieces are assumed to be a prefix or a suffix.
If none of the pieces are "pref" or "suff",
all pieces are assumed to be nouns or prepositions,
such as "in" or "from".
See RCLabels::prepositions_list for additional examples.
This argument may be a single string or a character vector.
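The new = olds sense of a modification map can be sketched in base R by inverting the map into a lookup vector. This is only an illustration of the mapping direction applied to bare nouns; it does not parse labels and is not the package's implementation.

```r
# A modification map in the documented form: names are new, values are old.
mod_map <- list(new_noun1 = c("a", "b", "c"), new_noun2 = c("d", "e", "f"))

# Invert the map: names become the old values, values the replacements.
lookup <- setNames(rep(names(mod_map), lengths(mod_map)),
                   unlist(mod_map, use.names = FALSE))

# Replace nouns found in the lookup and leave all others unchanged.
nouns <- c("a", "d", "z")
replaced <- ifelse(nouns %in% names(lookup), unname(lookup[nouns]), nouns)

replaced
```

Several old values collapse onto one new value, which is what makes the map useful for aggregation.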
labels with replacements according to piece and mod_map.
    # Simple case
    modify_label_pieces("a [of b in c]",
                        piece = "noun",
                        mod_map = list(new_noun = c("a", "b")))
    # Works with a vector or list of labels
    modify_label_pieces(c("a [of b in c]", "d [-> e in f]"),
                        piece = "noun",
                        mod_map = list(new_noun = c("d", "e")))
    # Works with multiple items in the mod_map
    modify_label_pieces(c("a [of b in c]", "d [-> e in f]"),
                        piece = "noun",
                        mod_map = list(new_noun1 = c("a", "b", "c"),
                                       new_noun2 = c("d", "e", "f")))
    # Works with multiple pieces to be modified
    modify_label_pieces(c("a [of b in c]", "d [-> e in f]"),
                        piece = c("noun", "in"),
                        mod_map = list(new_noun = c("a", "b", "c"),
                                       new_in = c("c", "f")))
This function modifies the nouns of row and column labels.
The length of new_nouns must be the same as the length of labels.
modify_nouns(labels, new_nouns, inf_notation = TRUE, notation = RCLabels::notations_list, choose_most_specific = FALSE)

| Argument | Description |
|---|---|
| labels | The row and column labels in which the nouns will be modified. |
| new_nouns | The new nouns to be set in labels. |
| inf_notation | A boolean that tells whether to infer notation for labels. |
| notation | The notation type to be used when extracting prepositions. Default is RCLabels::notations_list. |
| choose_most_specific | A boolean that tells whether to choose the most specific notation from notation. |

A character vector of same length as labels
with nouns modified to be new_nouns.
    labels <- c("a [of b in c]", "d [of e in USA]")
    modify_nouns(labels, c("a_plus", "g"))
A list of all bundled notations.
This list is ordered from least specific to most specific,
thereby enabling some unique behaviors in infer_notation().
See the examples for infer_notation().
notations_list
A list of bundled notations.
notations_list
A description of of notation.
of_notation
A vector of notational symbols that provides of ("a [of b]") notation.
of_notation
A description of parenthetical notation.
paren_notation
A vector of notational symbols that provides a parenthetical ("a (b)") notation.
paren_notation
This function recombines (unsplits) row or column labels that have
been separated by split_noun_pp().
paste_noun_pp(splt_labels, notation = RCLabels::bracket_notation, squish = TRUE)
splt_labels |
A vector of split row or column labels, probably created by |
notation |
The notation object that describes the labels.
Default is |
squish |
A boolean that tells whether to remove extra spaces in the output of |
Recombined row and column labels.
    library(dplyr) # Provides the %>% pipe used below
    labs <- c("a [of b in c]", "d [from Coal mines in USA]")
    labs
    split <- split_noun_pp(labs)
    split
    paste_noun_pp(split)
    # Also works in a data frame
    df <- tibble::tibble(labels = c("a [in b]", "c [of d into USA]",
                                    "e [of f in g]", "h [-> i in j]"))
    recombined <- df %>%
      dplyr::mutate(
        splits = split_noun_pp(labels),
        recombined = paste_noun_pp(splits)
      )
    all(recombined$labels == recombined$recombined)
This constant is deprecated.
Please use prepositions_list instead.
prepositions
A vector of prepositions used in row and column labels.
Prepositions used in row and column labels.
prepositions_list
A vector of prepositions used in row and column labels.
prepositions_list
match_by_pattern() tells whether row or column labels
match a regular expression.
Internally, grepl() decides whether a match occurs.
replace_by_pattern() replaces portions of row or column labels
when a regular expression is matched.
Internally, gsub() performs the replacements.
match_by_pattern(labels, regex_pattern, pieces = "all", prepositions = RCLabels::prepositions_list, notation = RCLabels::bracket_notation, inf_notation = TRUE, choose_most_specific = FALSE, ...)
replace_by_pattern(labels, regex_pattern, replacement, pieces = "all", prepositions = RCLabels::prepositions_list, notation = RCLabels::bracket_notation, ...)
labels |
The row and column labels to be modified. |
regex_pattern |
The regular expression pattern to determine matches and replacements.
Consider using |
pieces |
The pieces of row or column labels to be checked for matches or replacements. See details. |
prepositions |
A vector of strings that count as prepositions.
Default is prepositions_list.
Used to detect prepositional phrases
if |
notation |
The notation used in |
inf_notation |
A boolean that tells whether to infer notation for |
choose_most_specific |
A boolean that tells whether to choose the most specific
notation from |
... |
Other arguments passed to |
replacement |
For |
By default (pieces = "all"), complete labels (as strings) are checked for matches
and replacements.
If pieces == "pref" or pieces == "suff",
only the prefix or the suffix is checked for matches and replacements.
Alternatively, pieces = "noun" or pieces = <<preposition>> indicates
that only specific pieces of labels are to be checked for matches and replacements.
When pieces = <<preposition>>, only the object of <<preposition>> is
checked for matches and replacements.
pieces can be a vector, indicating multiple pieces to be checked for matches
and replacements.
But if any of the pieces are "all", all pieces are checked and replaced.
If pieces is "pref" or "suff", only one can be specified.
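Because matching is delegated to grepl() and replacement to gsub(), the whole-label behavior (pieces = "all") can be previewed with base R alone. The following is a minimal sketch under that assumption. It does not call RCLabels and does not attempt to reproduce how the package restricts matching to individual pieces.

```r
# Labels borrowed from the examples in this documentation.
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")

# grepl() returns one logical per label: TRUE where the pattern matches.
matches_whole <- grepl(pattern = "Coal", x = labels)

# Anchors behave as usual: "^" ties the pattern to the start of the label.
matches_start <- grepl(pattern = "^Production", x = labels)

# gsub() replaces every occurrence of the pattern within each label.
replaced <- gsub(pattern = "Coal", replacement = "Oil", x = labels)

matches_whole
matches_start
replaced
```

Restricting the check to a noun or to the object of a preposition requires splitting the label first, which is the part the package functions add on top of grepl() and gsub().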
A logical vector of same length as labels,
where TRUE indicates a match was found and FALSE indicates otherwise.
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")
# With default `pieces` argument, matching is done for whole labels.
match_by_pattern(labels, regex_pattern = "Production")
match_by_pattern(labels, regex_pattern = "Coal")
match_by_pattern(labels, regex_pattern = "USA")
# Check beginnings of labels
match_by_pattern(labels, regex_pattern = "^Production")
# Check at ends of labels: no match.
match_by_pattern(labels, regex_pattern = "Production$")
# Can match on nouns or prepositions.
match_by_pattern(labels, regex_pattern = "Production", pieces = "noun")
# Gives FALSE, because "Production" is a noun.
match_by_pattern(labels, regex_pattern = "Production", pieces = "in")
This function removes pieces from row and column labels.
remove_label_pieces(
  labels,
  pieces_to_remove,
  prepositions = RCLabels::prepositions_list,
  inf_notation = TRUE,
  notation = RCLabels::notations_list,
  choose_most_specific = FALSE
)
labels |
The row and column labels from which prepositional phrases will be removed. |
pieces_to_remove |
The names of pieces of the label to be removed,
typically "noun" or a preposition such as "of" or "in".
See |
prepositions |
A list of prepositions, used to detect prepositional phrases.
Default is |
inf_notation |
A boolean that tells whether to infer notation for |
notation |
The notation type to be used when extracting prepositions.
Default is |
choose_most_specific |
A boolean that tells whether the most specific
notation is selected when more than one notation matches.
Default is |
labels with pieces removed.
labs <- c("a [of b in c]", "d [-> e in f]")
remove_label_pieces(labs, pieces_to_remove = "of")
remove_label_pieces(labs, pieces_to_remove = c("of", "->"))
remove_label_pieces(labs, pieces_to_remove = c("in", "into"))
remove_label_pieces(labs, pieces_to_remove = c("of", "in"))
It is often convenient to represent matrix row and column names with notation that includes a prefix and a suffix, with corresponding separators or start-end string sequences. There are several functions to generate specialized versions or otherwise manipulate row and column names on their own or as row or column names.
flip_pref_suff() Switches the location of prefix and suffix, such that the prefix becomes the suffix, and
the suffix becomes the prefix.
E.g., "a -> b" becomes "b -> a" or "a [b]" becomes "b [a]".
get_pref_suff() Selects only prefix or suffix, discarding notational elements
and the rejected part.
Internally, this function calls split_pref_suff() and selects only the desired portion.
notation_vec() Builds a vector of notation symbols in a standard format.
By default, it builds a list of notation symbols that provides an arrow
separator (" -> ") between prefix and suffix.
paste_pref_suff() paste0's prefixes and suffixes, the inverse of split_pref_suff().
Always returns a character vector.
preposition_notation() Builds a list of notation symbols that provides (by default) square brackets around the suffix with a preposition ("prefix [preposition suffix]").
split_pref_suff() Splits prefixes from suffixes, returning each in a list with names pref and suff.
If no prefix or suffix delimiters are found, x is returned in the pref item, unmodified,
and the suff item is returned as "" (an empty string).
If there is no prefix, an empty string is returned for the pref item.
If there is no suffix, an empty string is returned for the suff item.
switch_notation() Switches from one type of notation to another based on the from and to arguments.
Optionally, prefix and suffix can be flipped.
Parts of a notation vector are
"pref_start", "pref_end", "suff_start", and "suff_end".
None of the strings in a notation vector are considered part of the prefix or suffix.
E.g., "a -> b" in arrow notation means that "a" is the prefix and "b" is the suffix.
If sep only is specified for notation_vec() (default is " -> "),
pref_start, pref_end, suff_start, and suff_end are
set appropriately.
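To make the roles of the notation-vector parts concrete, here is a minimal base R sketch of how delimiters can separate a prefix from a suffix. The vector bracket_like and the helper sketch_split() are hypothetical stand-ins written for this illustration. The element names come from the text above, but the values and the splitting logic are assumptions, not the package's implementation.

```r
# Hypothetical stand-in for a bracket-style notation vector.
# Element names follow the text; the values are assumed for illustration.
bracket_like <- c(pref_start = "", pref_end = " [", suff_start = " [", suff_end = "]")

# Hypothetical helper: split a single label using a notation vector.
sketch_split <- function(x, notation) {
  suff_start <- notation[["suff_start"]]
  suff_end <- notation[["suff_end"]]
  # Find the literal delimiter that ends the prefix and starts the suffix.
  pos <- as.integer(regexpr(suff_start, x, fixed = TRUE))
  if (pos == -1L) {
    # No delimiter found: the whole label is treated as the prefix.
    return(list(pref = x, suff = ""))
  }
  pref <- substr(x, 1, pos - 1)
  suff <- substr(x, pos + nchar(suff_start), nchar(x))
  # Drop the closing delimiter when it is present.
  if (nchar(suff_end) > 0 && endsWith(suff, suff_end)) {
    suff <- substr(suff, 1, nchar(suff) - nchar(suff_end))
  }
  list(pref = pref, suff = suff)
}

with_suffix <- sketch_split("a [b]", bracket_like)
without_suffix <- sketch_split("a", bracket_like)
```

In this sketch none of the delimiter strings end up in pref or suff, which mirrors the statement that notation strings are not part of the prefix or suffix.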
For functions where the notation argument is used to identify portions of the row or column label
(such as split_pref_suff(), get_pref_suff(),
and the from argument to switch_notation()),
if notation is a list, it is treated as a store from which
the most appropriate notation is inferred by infer_notation(choose_most_specific = TRUE).
Because the default is RCLabels::notations_list,
notation is inferred by default.
(Note: flip_pref_suff() cannot infer notation, because it switches prefix and suffix in a known, single notation.)
The argument choose_most_specific tells what to do when two notations match a label:
if TRUE (the default), the notation with the most characters is selected.
If FALSE, the first matching notation in notation will be selected.
See details at infer_notation().
If specifying more than one notation, be sure the notations are in a list.
notation = c(RCLabels::bracket_notation, RCLabels::arrow_notation)
is unlikely to produce the desired result, because the notations
are concatenated together to form a long string vector.
Rather say
notation = list(RCLabels::bracket_notation, RCLabels::arrow_notation).
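The difference between c() and list() here is plain base R behavior and can be checked without the package. In the sketch below, bracket_like and arrow_like are hypothetical stand-ins for two notation vectors. Their element names follow this documentation, while their values are assumed for illustration only.

```r
# Hypothetical stand-ins for two notation vectors (values assumed for illustration).
bracket_like <- c(pref_start = "", pref_end = " [", suff_start = " [", suff_end = "]")
arrow_like <- c(pref_start = "", pref_end = " -> ", suff_start = " -> ", suff_end = "")

# c() flattens both notations into a single character vector of eight strings,
# so the boundary between the two notations is lost.
flattened <- c(bracket_like, arrow_like)
length(flattened)

# list() keeps each notation intact as its own four-element vector.
kept_separate <- list(bracket_like, arrow_like)
length(kept_separate)
```

The flattened vector also carries duplicated names (two pref_start entries, and so on), which is another sign that it no longer describes a single notation.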
For functions that construct labels (such as paste_pref_suff()),
notation can be a list of notations
over which the paste task is mapped.
If notation is a list, it must have as many items as
there are prefix/suffix pairs to be pasted.
If either pref or suff is a zero-length character vector
(essentially an empty character vector
such as obtained from character())
input to paste_pref_suff(),
an error is thrown.
Instead, use an empty character string
(such as obtained from "").
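The distinction between a zero-length character vector and an empty string is easy to miss, so here is a short base R check. It only illustrates the two kinds of input and does not call paste_pref_suff().

```r
zero_length <- character()  # a character vector with no elements
empty_string <- ""          # a character vector with one element that has no characters

n_zero <- length(zero_length)        # number of elements in character()
n_empty <- length(empty_string)      # number of elements in ""
chars_empty <- nchar(empty_string)   # number of characters in the single element
```

Here n_zero is 0 while n_empty is 1, which is why "" can stand in for a missing prefix or suffix but character() cannot.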
notation_vec(
  sep = " -> ",
  pref_start = "",
  pref_end = "",
  suff_start = "",
  suff_end = ""
)
preposition_notation(preposition, suff_start = " [", suff_end = "]")
split_pref_suff(
  x,
  transpose = FALSE,
  inf_notation = TRUE,
  notation = RCLabels::notations_list,
  choose_most_specific = TRUE
)
paste_pref_suff(
  ps = list(pref = pref, suff = suff),
  pref = NULL,
  suff = NULL,
  notation = RCLabels::arrow_notation,
  squish = TRUE
)
flip_pref_suff(
  x,
  notation = RCLabels::notations_list,
  inf_notation = TRUE,
  choose_most_specific = TRUE
)
get_pref_suff(
  x,
  which = c("pref", "suff"),
  inf_notation = TRUE,
  notation = RCLabels::notations_list,
  choose_most_specific = TRUE
)
switch_notation(
  x,
  from = RCLabels::notations_list,
  to,
  flip = FALSE,
  inf_notation = TRUE
)
sep |
A string separator between prefix and suffix. Default is " -> ". |
pref_start |
A string indicating the start of a prefix. Default is |
pref_end |
A string indicating the end of a prefix. Default is the value of |
suff_start |
A string indicating the start of a suffix. Default is the value of |
suff_end |
A string indicating the end of a suffix. Default is |
preposition |
A string used to indicate position for energy flows, typically "from" or "to" in different notations. |
x |
A string or vector of strings to be operated upon. |
transpose |
A boolean that tells whether to |
inf_notation |
A boolean that tells whether to infer notation for |
notation |
A notation vector generated by one of the |
choose_most_specific |
A boolean that tells whether to choose the most specific
notation from the |
ps |
A list of prefixes and suffixes in which each item of the list is itself a list with two items named |
pref |
A string or list of strings that are prefixes. Default is |
suff |
A string or list of strings that are suffixes. Default is |
squish |
A boolean that tells whether to remove extra spaces in the output of |
which |
Tells which to keep, the prefix ("pref") or the suffix ("suff"). |
from |
The |
to |
The |
flip |
A boolean that tells whether to also flip the notation. Default is |
For notation_vec(), arrow_notation, and bracket_notation,
a string vector with named items pref_start, pref_end, suff_start, and suff_end.
For split_pref_suff(), a string list with named items pref and suff.
For paste_pref_suff(), split_pref_suff(), and switch_notation(),
a string list in notation format specified by various notation arguments, including
from and to.
For keep_pref_suff, one of the prefix or suffix or a list of prefixes or suffixes.
notation_vec()
arrow_notation
bracket_notation
split_pref_suff("a -> b", notation = arrow_notation)
# Or infer the notation (by default from notations_list)
split_pref_suff("a -> b")
split_pref_suff(c("a -> b", "c -> d", "e -> f"))
split_pref_suff(c("a -> b", "c -> d", "e -> f"), transpose = TRUE)
flip_pref_suff("a [b]", notation = bracket_notation)
# Infer notation
flip_pref_suff("a [b]")
get_pref_suff("a -> b", which = "suff")
switch_notation("a -> b", from = arrow_notation, to = bracket_notation)
# Infer notation and flip prefix and suffix
switch_notation("a -> b", to = bracket_notation, flip = TRUE)
# Also works for vectors
switch_notation(c("a -> b", "c -> d"), from = arrow_notation, to = bracket_notation)
# Functions can infer the correct notation and return multiple matches
infer_notation("a [to b]", allow_multiple = TRUE, choose_most_specific = FALSE)
# Or choose the most specific notation
infer_notation("a [to b]", allow_multiple = TRUE, choose_most_specific = TRUE)
# When setting the from notation, only that type of notation will be switched
switch_notation(c("a -> b", "c [to d]"), from = arrow_notation, to = bracket_notation)
# But if notations are inferred, all notations can be switched
switch_notation(c("a -> b", "c [to d]"), to = bracket_notation)
# A double-switch can be accomplished.
# In this first example, `RCLabels::first_dot_notation` is inferred.
switch_notation("a.b.c", to = arrow_notation)
# In this second example,
# it is easier to specify the `from` and `to` notations.
switch_notation("a.b.c", to = arrow_notation) %>%
  switch_notation(from = first_dot_notation, to = arrow_notation)
# "" can be used as an input
paste_pref_suff(pref = "a", suff = "", notation = RCLabels::from_notation)
This function is similar to split_pref_suff() in that it returns a list.
However, this function's list is more detailed than
the one returned by split_pref_suff().
The return value from this function is a list
with the first named item being the prefix (with the name noun)
followed by objects of prepositional phrases
(with names being prepositions that precede the objects).
split_noun_pp(
  labels,
  inf_notation = TRUE,
  notation = RCLabels::notations_list,
  choose_most_specific = FALSE,
  prepositions = RCLabels::prepositions_list
)
labels |
The row and column labels from which prepositional phrases are to be extracted. |
inf_notation |
A boolean that tells whether to infer notation for |
notation |
The notation type to be used when extracting prepositions.
Default is |
choose_most_specific |
A boolean that tells whether to choose the most specific
notation from |
prepositions |
A vector of strings to be treated as prepositions.
Note that a space is appended to each word internally,
so, e.g., "to" becomes "to ".
Default is |
Unlike split_pref_suff(), it does not make sense to have a transpose
argument on split_noun_pp().
Labels may not have the same structure,
e.g., they may have different prepositions.
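The point about differing structure can be illustrated with hand-built lists. The two objects below are hypothetical stand-ins for per-label results, shaped after the description above (a noun item followed by items named for prepositions). They are assumptions for illustration, not actual output of split_noun_pp().

```r
# Hand-built stand-ins for the pieces of "a [of b in c]" and "d [of e into f]".
# `in` is a reserved word in R, so it is quoted with backticks as a list name.
first_label <- list(noun = "a", of = "b", `in` = "c")
second_label <- list(noun = "d", of = "e", into = "f")

# The two labels do not share the same set of piece names...
same_structure <- identical(names(first_label), names(second_label))

# ...so there is no common set of "columns" to transpose into.
only_in_second <- setdiff(names(second_label), names(first_label))
```

With same_structure being FALSE, a transposed result would need to invent empty entries for pieces that a label does not have, which is why a transpose argument is not offered.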
A list of lists with items named noun and pp.
# Specify the notation
split_noun_pp(c("a [of b in c]", "d [of e into f]"), notation = bracket_notation)
# Infer the notation via default arguments
split_noun_pp(c("a [of b in c]", "d [of e into f]"))
This function should only ever see a single label (x)
and a single notation.
strip_label_part(x, notation, part, pattern_pref = "", pattern_suff = "")
x |
The label(s) to be split. |
notation |
The notations to be used for each |
part |
The part of the label to work on, such as "pref_start", "pref_end", "suff_start", or "suff_end". |
pattern_pref |
The prefix to a regex pattern to be used in |
pattern_suff |
The suffix to a regex pattern to be used in |
If notation is NULL, x is returned, unmodified.
A label shorn of the part to be stripped.
A description of to notation.
to_notation
A vector of notational symbols that provides to ("a [to b]") notation.
to_notation