# String Operators
## RegEx
This uses Golang's native regex functions under the hood. See their [docs](https://github.com/google/re2/wiki/Syntax) for the supported syntax.

Case-insensitive tip: prefix the regex with `(?i)`, e.g. `test("(?i)cats")`.
### match(regEx)
This operator returns the substring match details of the given regEx.
### capture(regEx)
Capture returns named RegEx capture groups in a map. Can be more convenient than `match` depending on what you are doing.
## test(regEx)
Returns true if the string matches the RegEx, false otherwise.
## sub(regEx, replacement)
Substitutes matched substrings. The first parameter is the regEx to match substrings within the original string. The second parameter specifies what to replace those matches with. This can refer to capture groups from the first RegEx.
## String blocks, bash and newlines
Bash is notorious for chomping precious trailing newline characters, which makes it tricky to set strings with newlines properly. In particular, command substitution, `$( exp )`, _will trim trailing newlines_.
For instance, to get this yaml:
```
a: |
  cat
```
Using `$( exp )` won't work, as it will trim the trailing newline.
```
m=$(echo "cat\n") yq -n '.a = strenv(m)'
a: cat
```
However, using `printf -v` works:
```
printf -v m "cat\n" ; m="$m" yq -n '.a = strenv(m)'
a: |
  cat
```
So does a literal multiline string:
```
m="cat
" yq -n '.a = strenv(m)'
a: |
  cat
```
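The trimming itself can be checked without yq by comparing string lengths. The sketch below is plain POSIX shell; the sentinel-character workaround it shows is a common alternative to the approaches above and is not taken from this document's examples. The variable names are illustrative only.

```shell
# Command substitution strips every trailing newline from the captured output.
trimmed=$(printf 'cat\n')

# Workaround: emit a sentinel character after the data, then strip it again.
kept=$(printf 'cat\n'; printf x)
kept=${kept%x}

# A literal newline inside double quotes is preserved too.
literal="cat
"

# Lengths show which variables still hold the newline.
printf 'trimmed=%s kept=%s literal=%s\n' "${#trimmed}" "${#kept}" "${#literal}"
# prints: trimmed=3 kept=4 literal=4
```

Any variable that reports a length of 4 here would produce the `a: |` block form when passed through `strenv`.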
Similarly, if you're trying to set the content from a file, and want a trailing newline:
```
IFS= read -rd '' output < <(cat my_file)
output=$output ./yq '.data.values = strenv(output)' first.yml
```
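One detail of the `read` idiom above is worth knowing: with `-d ''` the delimiter is NUL, so `read` reaches end-of-file before finding it and returns a non-zero status even though the variable has been filled. In scripts that run under `set -e`, that status needs to be absorbed. The following is a minimal bash sketch; the temporary file stands in for `my_file` and is an assumption for illustration.

```shell
# Requires bash: 'read -d' is not available in POSIX sh.
tmp=$(mktemp)
printf 'cat\n' > "$tmp"

# read fills 'output' but returns non-zero at end-of-file;
# '|| true' keeps a 'set -e' script from aborting here.
IFS= read -rd '' output < "$tmp" || true
rm -f "$tmp"

# The trailing newline survived, so the length is 4, not 3.
printf 'length=%s\n' "${#output}"
```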
## To up (upper) case
Works with Unicode characters.
Given a sample.yml file of:
```yaml
água
```
then
```bash
yq 'upcase' sample.yml
```
will output
```yaml
ÁGUA
```
## To down (lower) case
Works with Unicode characters.
Given a sample.yml file of:
```yaml
ÁgUA
```
then
```bash
yq 'downcase' sample.yml
```
will output
```yaml
água
```
## Join strings
Given a sample.yml file of:
```yaml
[cat, meow, 1, null, true]
```
then
```bash
yq 'join("; ")' sample.yml
```
will output
```yaml
cat; meow; 1; ; true
```
## Trim strings
Given a sample.yml file of:
```yaml
[' cat', 'dog ', ' cow cow ', horse]
```
then
```bash
yq '.[] | trim' sample.yml
```
will output
```yaml
cat
dog
cow cow
horse
```
## Match string
Given a sample.yml file of:
```yaml
foo bar foo
```
then
```bash
yq 'match("foo")' sample.yml
```
will output
```yaml
string: foo
offset: 0
length: 3
captures: []
```
## Match string, case insensitive
Given a sample.yml file of:
```yaml
foo bar FOO
```
then
```bash
yq '[match("(?i)foo"; "g")]' sample.yml
```
will output
```yaml
- string: foo
  offset: 0
  length: 3
  captures: []
- string: FOO
  offset: 8
  length: 3
  captures: []
```
## Match with global capture group
Given a sample.yml file of:
```yaml
abc abc
```
then
```bash
yq '[match("(ab)(c)"; "g")]' sample.yml
```
will output
```yaml
- string: abc
  offset: 0
  length: 3
  captures:
    - string: ab
      offset: 0
      length: 2
    - string: c
      offset: 2
      length: 1
- string: abc
  offset: 4
  length: 3
  captures:
    - string: ab
      offset: 4
      length: 2
    - string: c
      offset: 6
      length: 1
```
## Match with named capture groups
Given a sample.yml file of:
```yaml
foo bar foo foo  foo
```
then
```bash
yq '[match("foo (?P<bar123>bar)? foo"; "g")]' sample.yml
```
will output
```yaml
- string: foo bar foo
  offset: 0
  length: 11
  captures:
    - string: bar
      offset: 4
      length: 3
      name: bar123
- string: foo  foo
  offset: 12
  length: 8
  captures:
    - string: null
      offset: -1
      length: 0
      name: bar123
```
## Capture named groups into a map
Given a sample.yml file of:
```yaml
xyzzy-14
```
then
```bash
yq 'capture("(?P<a>[a-z]+)-(?P<n>[0-9]+)")' sample.yml
```
will output
```yaml
a: xyzzy
n: "14"
```
## Match without global flag
Given a sample.yml file of:
```yaml
cat cat
```
then
```bash
yq 'match("cat")' sample.yml
```
will output
```yaml
string: cat
offset: 0
length: 3
captures: []
```
## Match with global flag
Given a sample.yml file of:
```yaml
cat cat
```
then
```bash
yq '[match("cat"; "g")]' sample.yml
```
will output
```yaml
- string: cat
  offset: 0
  length: 3
  captures: []
- string: cat
  offset: 4
  length: 3
  captures: []
```
## Test using regex
Like jq's equivalent, this works like `match` but returns only true/false instead of the full match details.
Given a sample.yml file of:
```yaml
[cat, dog]
```
then
```bash
yq '.[] | test("at")' sample.yml
```
will output
```yaml
true
false
```
## Substitute / Replace string
This uses Golang's regex, described [here](https://github.com/google/re2/wiki/Syntax).
Note the use of `|=` to run in the context of the current string value.
Given a sample.yml file of:
```yaml
a: dogs are great
```
then
```bash
yq '.a |= sub("dogs", "cats")' sample.yml
```
will output
```yaml
a: cats are great
```
## Substitute / Replace string with regex
This uses Golang's regex, described [here](https://github.com/google/re2/wiki/Syntax).
Note the use of `|=` to run in the context of the current string value.
Given a sample.yml file of:
```yaml
a: cat
b: heat
```
then
```bash
yq '.[] |= sub("(a)", "${1}r")' sample.yml
```
will output
```yaml
a: cart
b: heart
```
## Custom types that are really strings
When custom tags are encountered, yq will try to decode the underlying type.
Given a sample.yml file of:
```yaml
a: !horse cat
b: !goat heat
```
then
```bash
yq '.[] |= sub("(a)", "${1}r")' sample.yml
```
will output
```yaml
a: !horse cart
b: !goat heart
```
## Split strings
Given a sample.yml file of:
```yaml
"cat; meow; 1; ; true"
```
then
```bash
yq 'split("; ")' sample.yml
```
will output
```yaml
- cat
- meow
- "1"
- ""
- "true"
```
## Split strings one match
Given a sample.yml file of:
```yaml
"word"
```
then
```bash
yq 'split("; ")' sample.yml
```
will output
```yaml
- word
```