# String Operators
## RegEx
This uses Go's native regex functions under the hood. See the RE2 [docs](https://github.com/google/re2/wiki/Syntax) for the supported syntax.

Case-insensitive tip: prefix the regex with `(?i)`, e.g. `test("(?i)cats")`.
### match(regEx)
This operator returns the substring match details of the given regEx.
### capture(regEx)
Capture returns named RegEx capture groups in a map. This can be more convenient than `match`, depending on what you are doing.
### test(regEx)
Returns true if the string matches the RegEx, false otherwise.
### sub(regEx, replacement)
Substitutes matched substrings. The first parameter is the regEx to match substrings within the original string. The second is what to replace those matches with; it can refer to capture groups from the first RegEx (for example `${1}`).
## String blocks, bash and newlines
Bash is notorious for chomping on precious trailing newline characters, making it tricky to set strings with newlines properly. In particular, command substitution, `$( exp )`, _will trim trailing newlines_.
For instance, to get this YAML:
```
a: |
  cat
```
Using `$( exp )` won't work, as it will trim the trailing newline.
```
m=$(printf "cat\n") yq -n '.a = strenv(m)'
a: cat
```
However, using `printf -v` to assign the variable directly works:
```
printf -v m "cat\n" ; m="$m" yq -n '.a = strenv(m)'
a: |
  cat
```
A literal multiline string also works:
```
m="cat
" yq -n '.a = strenv(m)'
a: |
  cat
```
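The trimming itself can be observed without yq by comparing string lengths. The following is a minimal sketch, not part of yq: it also shows a sentinel-character workaround that should work in POSIX shells where `printf -v` is unavailable. The variable names `trimmed` and `kept` are illustrative only.

```bash
# Command substitution strips all trailing newlines from the captured output.
trimmed=$(printf 'cat\n')

# Workaround: emit a sentinel character after the text, then remove it.
kept=$(printf 'cat\n'; printf 'x')
kept=${kept%x}

printf '%s %s\n' "${#trimmed}" "${#kept}"   # prints: 3 4
```

A value built this way can be handed to yq through the environment in the same way as `m` in the examples above.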
Similarly, if you're trying to set the content from a file and want to keep a trailing newline:
```
IFS= read -rd '' output < <(cat my_file)
output=$output ./yq '.data.values = strenv(output)' first.yml
```
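One caveat with the `read` approach: with `-d ''`, `read` reaches end of file without ever finding its delimiter, so it returns a non-zero status even though the variable has been filled. In a script running under `set -e` that would abort execution, which is why the sketch below appends `|| true`. This is a bash-only illustration; the temporary file and the variable names are placeholders, and a plain redirect is used in place of the `cat` process substitution.

```bash
#!/usr/bin/env bash
# Hypothetical sample file ending in two newlines.
tmp_file=$(mktemp)
printf 'line one\nline two\n\n' > "$tmp_file"

# read hits end of file before finding a NUL delimiter, so it exits non-zero;
# "|| true" keeps scripts running under "set -e" from aborting here.
IFS= read -r -d '' output < "$tmp_file" || true

# For comparison, command substitution drops both trailing newlines.
trimmed=$(cat "$tmp_file")

printf '%s %s\n' "${#output}" "${#trimmed}"   # prints: 19 17
rm -f "$tmp_file"
```

The `output` variable can then be passed to yq through the environment and read with `strenv(output)`, as in the example above.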
## To up (upper) case
Works with Unicode characters.
Given a sample.yml file of:
```yaml
água
```
then
```bash
yq 'upcase' sample.yml
```
will output
```yaml
ÁGUA
```
## To down (lower) case
Works with Unicode characters.
Given a sample.yml file of:
```yaml
ÁgUA
```
then
```bash
yq 'downcase' sample.yml
```
will output
```yaml
água
```
## Join strings
Given a sample.yml file of:
```yaml
- cat
- meow
- 1
- null
- true
```
then
```bash
yq 'join("; ")' sample.yml
```
will output
```yaml
cat; meow; 1; ; true
```
## Trim strings
Given a sample.yml file of:
```yaml
- ' cat'
- 'dog '
- ' cow cow '
- horse
```
then
```bash
yq '.[] | trim' sample.yml
```
will output
```yaml
cat
dog
cow cow
horse
```
## Match string
Given a sample.yml file of:
```yaml
foo bar foo
```
then
```bash
yq 'match("foo")' sample.yml
```
will output
```yaml
string: foo
offset: 0
length: 3
captures: []
```
## Match string, case insensitive
Given a sample.yml file of:
```yaml
foo bar FOO
```
then
```bash
yq '[match("(?i)foo"; "g")]' sample.yml
```
will output
```yaml
- string: foo
  offset: 0
  length: 3
  captures: []
- string: FOO
  offset: 8
  length: 3
  captures: []
```
## Match with global capture group
Given a sample.yml file of:
```yaml
abc abc
```
then
```bash
yq '[match("(ab)(c)"; "g")]' sample.yml
```
will output
```yaml
- string: abc
  offset: 0
  length: 3
  captures:
    - string: ab
      offset: 0
      length: 2
    - string: c
      offset: 2
      length: 1
- string: abc
  offset: 4
  length: 3
  captures:
    - string: ab
      offset: 4
      length: 2
    - string: c
      offset: 6
      length: 1
```
## Match with named capture groups
Given a sample.yml file of:
```yaml
foo bar foo foo  foo
```
then
```bash
yq '[match("foo (?P<bar123>bar)? foo"; "g")]' sample.yml
```
will output
```yaml
- string: foo bar foo
  offset: 0
  length: 11
  captures:
    - string: bar
      offset: 4
      length: 3
      name: bar123
- string: foo  foo
  offset: 12
  length: 8
  captures:
    - string: null
      offset: -1
      length: 0
      name: bar123
```
## Capture named groups into a map
Given a sample.yml file of:
```yaml
xyzzy-14
```
then
```bash
yq 'capture("(?P<a>[a-z]+)-(?P<n>[0-9]+)")' sample.yml
```
will output
```yaml
a: xyzzy
n: "14"
```
## Match without global flag
Given a sample.yml file of:
```yaml
cat cat
```
then
```bash
yq 'match("cat")' sample.yml
```
will output
```yaml
string: cat
offset: 0
length: 3
captures: []
```
## Match with global flag
Given a sample.yml file of:
```yaml
cat cat
```
then
```bash
yq '[match("cat"; "g")]' sample.yml
```
will output
```yaml
- string: cat
  offset: 0
  length: 3
  captures: []
- string: cat
  offset: 4
  length: 3
  captures: []
```
## Test using regex
Like jq's equivalent, this works like `match` but returns only true/false instead of the full match details.
Given a sample.yml file of:
```yaml
- cat
- dog
```
then
```bash
yq '.[] | test("at")' sample.yml
```
will output
```yaml
true
false
```
## Substitute / Replace string
This uses Go's regex syntax, described [here](https://github.com/google/re2/wiki/Syntax).
Note the use of `|=` to run in the context of the current string value.
Given a sample.yml file of:
```yaml
a: dogs are great
```
then
```bash
yq '.a |= sub("dogs", "cats")' sample.yml
```
will output
```yaml
a: cats are great
```
## Substitute / Replace string with regex
This uses Go's regex syntax, described [here](https://github.com/google/re2/wiki/Syntax).
Note the use of `|=` to run in the context of the current string value.
Given a sample.yml file of:
```yaml
a: cat
b: heat
```
then
```bash
yq '.[] |= sub("(a)", "${1}r")' sample.yml
```
will output
```yaml
a: cart
b: heart
```
## Custom types that are really strings
When custom tags are encountered, yq will try to decode the underlying type.
Given a sample.yml file of:
```yaml
a: !horse cat
b: !goat heat
```
then
```bash
yq '.[] |= sub("(a)", "${1}r")' sample.yml
```
will output
```yaml
a: !horse cart
b: !goat heart
```
## Split strings
Given a sample.yml file of:
```yaml
cat; meow; 1; ; true
```
then
```bash
yq 'split("; ")' sample.yml
```
will output
```yaml
- cat
- meow
- "1"
- ""
- "true"
```
## Split strings one match
Given a sample.yml file of:
```yaml
word
```
then
```bash
yq 'split("; ")' sample.yml
```
will output
```yaml
- word
```