# How it works
In `yq`, expressions are made up of operators and pipes. A context of nodes is passed through the expression: each operation takes the context as input and returns a new context as output. That output is piped in as the input to the next operation in the expression. Initially, the context is set to the first YAML document of the first YAML file (when processing files in sequence using `eval`).

Let's look at a couple of examples.
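Before the worked examples, the piping model described above can be sketched as a toy Python model. This is only an illustration of the idea, not how `yq` is implemented: the names `pipe`, `field` and `splat` are hypothetical, and a "context" is assumed to be a plain list of nodes.

```python
from functools import reduce

# Toy model: a context is a list of nodes, and an operation is a function
# that maps one context to the next. None of these names come from yq.

def pipe(*operations):
    """Chain operations so that each one's output context feeds the next."""
    def run(context):
        return reduce(lambda ctx, op: op(ctx), operations, context)
    return run

def field(name):
    """Rough stand-in for `.name`: look up a key on every node in the context."""
    return lambda context: [node[name] for node in context]

def splat(context):
    """Rough stand-in for `.[]`: emit each element of every node in the context."""
    return [item for node in context for item in node]

document = {"pets": [{"name": "cat"}, {"name": "dog"}]}

# Roughly `.pets | .[] | .name`, starting from a context holding the document.
names = pipe(field("pets"), splat, field("name"))([document])
print(names)  # ['cat', 'dog']
```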
## Simple assignment example
Given a document like:
```yaml
a: cat
b: dog
```
with an expression:
```
.a = .b
```
As with math expressions, operator precedence is important.

The `=` operator takes two arguments: an `lhs` expression, which in this case is `.a`, and an `rhs` expression, which is `.b`.
It pipes the current context (let's call it the 'root' context) through the `lhs` expression `.a` to return the node:
```yaml
cat
```
Side note: this node holds not only its value 'cat', but also comments and metadata, including path and parent information.

The `=` operator then pipes the 'root' context through the `rhs` expression `.b` to return the node:
```yaml
dog
```
Both sides have now been evaluated, so the operator copies the value from the RHS (`.b`) to the LHS (`.a`) and returns the updated context:
```yaml
a: dog
b: dog
```
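The steps above can be mirrored in a few lines of Python. This is a minimal sketch of the evaluation order only: `assign` is a hypothetical helper, the document is assumed to be a plain dictionary, and the comments and metadata that real nodes carry are ignored.

```python
# Toy model of plain assignment (`=`). The helper name is hypothetical.

def assign(root, lhs_key, rhs):
    """Evaluate the RHS against the root context, then copy its value to the LHS node."""
    value = rhs(root)          # the RHS sees the whole 'root' document
    updated = dict(root)       # copy, so the caller's document is left untouched
    updated[lhs_key] = value   # copy the RHS value onto the node picked by the LHS
    return updated             # the updated root context is what gets returned

doc = {"a": "cat", "b": "dog"}
result = assign(doc, "a", lambda root: root["b"])  # models `.a = .b`
print(result)  # {'a': 'dog', 'b': 'dog'}
```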
## Complex assignment, operator precedence rules
Just like math expressions, `yq` expressions have an order of precedence. The pipe `|` operator has low precedence, so operators with higher precedence are evaluated first.

Most of the time, this is intuitively what you'd want; for instance, `.a = "cat" | .b = "dog"` is effectively `(.a = "cat") | (.b = "dog")`.
However, this is not always the case, particularly if you have a complex LHS or RHS expression, for instance if you want to select particular nodes to update.

Let's say you had:
```yaml
- name: bob
  fruit: apple
- name: sally
  fruit: orange
```
Let's say you wanted to update the `sally` entry to have fruit: 'mango'. The _incorrect_ way to do that is:
`.[] | select(.name == "sally") | .fruit = "mango"`.

Because `|` has low operator precedence, this will be evaluated (_incorrectly_) as: `(.[]) | (select(.name == "sally")) | (.fruit = "mango")`. What you'll see is only the updated segment returned:
```yaml
name: sally
fruit: mango
```
To properly update this YAML, you will need to use brackets (think BODMAS from maths) and wrap the entire LHS:
`(.[] | select(.name == "sally") | .fruit) = "mango"`

Now that entire LHS expression is passed to the 'assign' (`=`) operator, and the YAML is correctly updated and returned:
```yaml
- name: bob
  fruit: apple
- name: sally
  fruit: mango
```
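To see why the two forms return different things, here is a rough Python analogy. It is a sketch under simplifying assumptions, not `yq`'s actual evaluator: the functions `piped_update` and `bracketed_update` are hypothetical, and each one hard-codes the grouping that the corresponding expression produces.

```python
import copy

people = [
    {"name": "bob", "fruit": "apple"},
    {"name": "sally", "fruit": "orange"},
]

def piped_update(doc):
    """Models `.[] | select(.name == "sally") | .fruit = "mango"`."""
    doc = copy.deepcopy(doc)
    selected = [p for p in doc if p["name"] == "sally"]  # .[] | select(...)
    for person in selected:
        person["fruit"] = "mango"   # `=` runs on each node that was piped in
    return selected                 # so only those nodes are returned

def bracketed_update(doc):
    """Models `(.[] | select(.name == "sally") | .fruit) = "mango"`."""
    doc = copy.deepcopy(doc)
    for person in doc:              # the whole bracketed path is the LHS
        if person["name"] == "sally":
            person["fruit"] = "mango"
    return doc                      # `=` receives, and returns, the root context

print(piped_update(people))
# [{'name': 'sally', 'fruit': 'mango'}]
print(bracketed_update(people))
# [{'name': 'bob', 'fruit': 'apple'}, {'name': 'sally', 'fruit': 'mango'}]
```

In both cases the assignment itself works; the difference is which context the `=` operator was handed, and therefore which context it returns.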
## Relative update (e.g. `|=`)
There is another form of the `=` operator, which we call the relative form. It's very similar to `=`, but with one key difference when evaluating the RHS expression.

In the plain form, we pass the 'root' level context to the RHS expression. In the relative form, we pass _each result of the LHS_ to the RHS expression. Let's go through an example.

Given a document like:
```yaml
a: 1
b: thing
```
with an expression:
```
.a |= . + 1
```
Similar to the `=` operator, `|=` takes two operands, the LHS and RHS.

It pipes the current context (the whole document) through the LHS expression `.a` to get the node value:
```
1
```
Now it pipes _that LHS context_ into the RHS expression `. + 1` (whereas in the `=` plain form it piped the original document context into the RHS) to yield:
```
2
```
The assignment operator then copies the value from the RHS onto the LHS node and returns the updated 'root' context:
```yaml
a: 2
b: thing
```
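The difference between the two forms comes down to what the RHS is evaluated against, which the following toy Python model tries to make explicit. The helpers `assign_plain` and `assign_relative` are hypothetical, and the model assumes the LHS selects a single top-level key, whereas in `yq` the LHS may yield several results.

```python
def assign_plain(root, key, rhs):
    """Models `.key = rhs`: the RHS is evaluated against the root document."""
    return {**root, key: rhs(root)}

def assign_relative(root, key, rhs):
    """Models `.key |= rhs`: the RHS is evaluated against the LHS node itself."""
    return {**root, key: rhs(root[key])}

doc = {"a": 1, "b": "thing"}

# `.a |= . + 1`: inside the RHS, `.` is the node selected by the LHS.
relative = assign_relative(doc, "a", lambda node: node + 1)

# With the plain form, the RHS has to navigate from the root: `.a = .a + 1`.
plain = assign_plain(doc, "a", lambda root: root["a"] + 1)

print(relative)  # {'a': 2, 'b': 'thing'}
print(plain)     # {'a': 2, 'b': 'thing'}
```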