# XML
Encode and decode to and from XML. Whitespace is not preserved across round trips, but the order of the fields is.

Consecutive XML nodes with the same name are assumed to be arrays.

XML content data and attributes are created as fields. This can be controlled with the `--xml-attribute-prefix` and `--xml-content-name` flags; see below for examples.

{% hint style="warning" %}
Note that versions prior to 4.18 require the 'eval/e' command to be specified. 
`yq e <exp> <file>`
{% endhint %}
## Parse xml: simple
Notice how all the values are strings; see the next example for how to fix that.
Given a sample.xml file of:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<cat>
  <says>meow</says>
  <legs>4</legs>
  <cute>true</cute>
</cat>
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
cat:
  says: meow
  legs: "4"
  cute: "true"
```
## Parse xml: number
All values are assumed to be strings when parsing XML, but you can apply the `from_yaml` operator to every string value to parse it into the correct type.
Given a sample.xml file of:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<cat>
  <says>meow</says>
  <legs>4</legs>
  <cute>true</cute>
</cat>
```
then
```bash
yq -p=xml ' (.. | select(tag == "!!str")) |= from_yaml' sample.xml
```
will output
```yaml
cat:
  says: meow
  legs: 4
  cute: true
```
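
To make the idea of re-typing string values concrete, here is a minimal Python sketch. It is not how yq or `from_yaml` is implemented, and it only approximates YAML's scalar resolution (booleans, integers, and floats). The names `coerce_scalar` and `coerce_tree` are hypothetical helpers introduced for illustration.

```python
def coerce_scalar(text):
    """Return a bool, int, or float when the string looks like one, else the string."""
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass  # not parseable as this type; try the next one
    return text


def coerce_tree(node):
    """Walk nested dicts and lists, coercing every string leaf."""
    if isinstance(node, dict):
        return {key: coerce_tree(value) for key, value in node.items()}
    if isinstance(node, list):
        return [coerce_tree(value) for value in node]
    if isinstance(node, str):
        return coerce_scalar(node)
    return node


# The all-strings structure produced by the simple parse example.
parsed = {"cat": {"says": "meow", "legs": "4", "cute": "true"}}
typed = coerce_tree(parsed)
print(typed)  # {'cat': {'says': 'meow', 'legs': 4, 'cute': True}}
```

Python's `int` and `float` accept some spellings that YAML does not (and vice versa), so treat this as an illustration of the traversal rather than a faithful type resolver.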
## Parse xml: array
Consecutive nodes with identical xml names are assumed to be arrays.
Given a sample.xml file of:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<animal>cat</animal>
<animal>goat</animal>
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
animal:
  - cat
  - goat
```
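
The grouping rule can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is an illustration under assumptions, not yq's implementation: the `<zoo>` wrapper and `<keeper>` element are made up so the input is a single well-formed document, `children_to_dict` is a hypothetical helper, and it groups children by name whether or not they are consecutive.

```python
import xml.etree.ElementTree as ET


def children_to_dict(element):
    """Map leaf child elements to a dict; repeated names collect into a list."""
    result = {}
    for child in element:
        value = child.text
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                # Second occurrence of this name: promote the scalar to a list.
                result[child.tag] = existing = [existing]
            existing.append(value)
        else:
            result[child.tag] = value
    return result


root = ET.fromstring(
    "<zoo><animal>cat</animal><animal>goat</animal><keeper>sam</keeper></zoo>"
)
grouped = children_to_dict(root)
print(grouped)  # {'animal': ['cat', 'goat'], 'keeper': 'sam'}
```

A name that appears once stays a scalar, which is why a single `<animal>` would not come out as a one-element array.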
## Parse xml: attributes
Attributes are converted to fields, with the default attribute prefix `+`. Use `--xml-attribute-prefix` to set your own.
Given a sample.xml file of:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">
  <legs>7</legs>
</cat>
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
cat:
  +legs: "4"
  legs: "7"
```
## Parse xml: attributes with content
Content is added as a field, using the default content name of `+content`. Use `--xml-content-name` to set your own.
Given a sample.xml file of:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">meow</cat>
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
cat:
  +content: meow
  +legs: "4"
```
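
The attribute and content mapping described in the last two sections can be sketched as follows. This is a rough Python illustration, not yq's code: `element_to_value` is a hypothetical helper, its keyword arguments merely mirror the roles of `--xml-attribute-prefix` and `--xml-content-name`, and it only handles elements without child elements.

```python
import xml.etree.ElementTree as ET


def element_to_value(element, attribute_prefix="+", content_name="+content"):
    """Convert a leaf element to a string, or to a dict when it has attributes."""
    # Each attribute becomes a field whose key carries the prefix.
    fields = {attribute_prefix + name: value for name, value in element.attrib.items()}
    text = (element.text or "").strip()
    if not fields:
        return text  # no attributes: the element is just its text
    if text:
        fields[content_name] = text  # text sits alongside the attribute fields
    return fields


with_attr = element_to_value(ET.fromstring('<cat legs="4">meow</cat>'))
plain = element_to_value(ET.fromstring("<cat>meow</cat>"))
custom = element_to_value(
    ET.fromstring('<cat legs="4">meow</cat>'),
    attribute_prefix="@",
    content_name="#text",
)
print(with_attr)  # {'+legs': '4', '+content': 'meow'}
print(custom)     # {'@legs': '4', '#text': 'meow'}
```

The prefix is what keeps an attribute such as `legs="4"` from colliding with a child element that is also named `legs`.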
## Parse xml: custom dtd
DTD entities are ignored.
Given a sample.xml file of:
```xml
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY writer "Blah.">
<!ENTITY copyright "Blah">
]>
<root>
<item>&writer;&copyright;</item>
</root>
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
root:
  item: '&writer;&copyright;'
```
## Parse xml: with comments
A best attempt is made to preserve comments.
Given a sample.xml file of:
```xml
<!-- before cat -->
<cat>
<!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x>
<!-- before y -->
<y>
<!-- in y before -->
<d><!-- in d before -->z<!-- in d after --></d>
<!-- in y after -->
</y>
<!-- in_cat_after -->
</cat>
<!-- after cat -->
```
then
```bash
yq -p=xml '.' sample.xml
```
will output
```yaml
# before cat
cat:
  # in cat before
  x: "3" # multi
  # line comment
  # for x
  # before y
  y:
    # in y before
    # in d before
    d: z # in d after
    # in y after
  # in_cat_after
# after cat
## Parse xml: keep attribute namespace
Given a sample.xml file of:
```xml
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url">
</map>
```
then
```bash
yq -p=xml -o=xml --xml-keep-namespace '.' sample.xml
```
will output
```xml
<map xmlns="some-namespace" xmlns:xsi="some-instance" some-instance:schemaLocation="some-url"></map>
```
instead of
```xml
<map xmlns="some-namespace" xsi="some-instance" schemaLocation="some-url"></map>
```
## Parse xml: keep raw attribute namespace
Given a sample.xml file of:
```xml
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url">
</map>
```
then
```bash
yq -p=xml -o=xml --xml-keep-namespace --xml-raw-token '.' sample.xml
```
will output
```xml
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>
```
instead of
```xml
<map xmlns="some-namespace" xsi="some-instance" schemaLocation="some-url"></map>
```
## Encode xml: simple
Given a sample.yml file of:
```yaml
cat: purrs
```
then
```bash
yq -o=xml '.' sample.yml
```
will output
```xml
<cat>purrs</cat>
```
## Encode xml: array
Given a sample.yml file of:
```yaml
pets:
  cat:
    - purrs
    - meows
```
then
```bash
yq -o=xml '.' sample.yml
```
will output
```xml
<pets>
  <cat>purrs</cat>
  <cat>meows</cat>
</pets>
```
## Encode xml: attributes
Fields with the matching xml-attribute-prefix are assumed to be attributes.
Given a sample.yml file of:
```yaml
cat:
  +name: tiger
  meows: true
```
then
```bash
yq -o=xml '.' sample.yml
```
will output
```xml
<cat name="tiger">
  <meows>true</meows>
</cat>
```
## Encode xml: attributes with content
Fields with the matching xml-content-name are assumed to be content.
Given a sample.yml file of:
```yaml
cat:
  +name: tiger
  +content: cool
```
then
```bash
yq -o=xml '.' sample.yml
```
will output
```xml
<cat name="tiger">cool</cat>
```
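
The encoding direction can be sketched the same way. The Python below is an illustrative approximation rather than yq's encoder: `dict_to_element` is a hypothetical helper, it assumes all leaf values are already strings, and it ignores lists and comments.

```python
import xml.etree.ElementTree as ET


def dict_to_element(name, value, attribute_prefix="+", content_name="+content"):
    """Build an element: prefixed keys become attributes, the content key becomes text."""
    element = ET.Element(name)
    if not isinstance(value, dict):
        element.text = str(value)
        return element
    for key, field in value.items():
        # Check the content name first, since it may also start with the prefix.
        if key == content_name:
            element.text = str(field)
        elif key.startswith(attribute_prefix):
            element.set(key[len(attribute_prefix):], str(field))
        else:
            element.append(dict_to_element(key, field, attribute_prefix, content_name))
    return element


encoded = ET.tostring(
    dict_to_element("cat", {"+name": "tiger", "+content": "cool"}),
    encoding="unicode",
)
nested = ET.tostring(
    dict_to_element("cat", {"+name": "tiger", "meows": "true"}),
    encoding="unicode",
)
print(encoded)  # <cat name="tiger">cool</cat>
print(nested)   # <cat name="tiger"><meows>true</meows></cat>
```

Testing for the content name before the attribute prefix matters here because the default content name `+content` also begins with the default prefix `+`.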
## Encode xml: comments
A best attempt is made to copy comments to xml.
Given a sample.yml file of:
```yaml
# above_cat
cat: # inline_cat
  # above_array
  array: # inline_array
    - val1 # inline_val1
    # above_val2
    - val2 # inline_val2
# below_cat
```
then
```bash
yq -o=xml '.' sample.yml
```
will output
```xml
<!-- above_cat inline_cat --><cat><!-- above_array inline_array -->
<array>val1<!-- inline_val1 --></array>
<array><!-- above_val2 -->val2<!-- inline_val2 --></array>
</cat><!-- below_cat -->
```
## Round trip: with comments
A best effort is made, but comment positions and white space are not preserved perfectly.
Given a sample.xml file of:
```xml
<!-- before cat -->
<cat>
<!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x>
<!-- before y -->
<y>
<!-- in y before -->
<d><!-- in d before -->z<!-- in d after --></d>
<!-- in y after -->
</y>
<!-- in_cat_after -->
</cat>
<!-- after cat -->
```
then
```bash
yq -p=xml -o=xml '.' sample.xml
```
will output
```xml
<!-- before cat --><cat><!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x><!-- before y -->
<y><!-- in y before
in d before -->
<d>z<!-- in d after --></d><!-- in y after -->
</y><!-- in_cat_after -->
</cat><!-- after cat -->
```