yq/pkg/yqlib/doc/usage/xml.md
commit b9309a42a4
XML decoder additions (#1239)
* Add xml-keep-namespace and xml-raw-token features

* Add tests

* Change flags usage strings

* Append docs
2022-06-15 09:40:31 +10:00

XML

Encode and decode to and from XML. Whitespace is not preserved on round trips, but the order of the fields is.

Consecutive xml nodes with the same name are assumed to be arrays.

XML content data and attributes are created as fields. This can be controlled by the `--xml-attribute-prefix` and `--xml-content-name` flags; see below for examples.

{% hint style="warning" %} Note that versions prior to 4.18 require the 'eval/e' command to be specified.

yq e <exp> <file> {% endhint %}

Parse xml: simple

Notice that all the values are strings; see the next example for how to fix that.

Given a sample.xml file of:

<?xml version="1.0" encoding="UTF-8"?>
<cat>
  <says>meow</says>
  <legs>4</legs>
  <cute>true</cute>
</cat>

then

yq -p=xml '.' sample.xml

will output

cat:
  says: meow
  legs: "4"
  cute: "true"

Parse xml: number

All values are assumed to be strings when parsing XML, but you can apply the from_yaml operator to every string value to parse it into the correct type.

Given a sample.xml file of:

<?xml version="1.0" encoding="UTF-8"?>
<cat>
  <says>meow</says>
  <legs>4</legs>
  <cute>true</cute>
</cat>

then

yq -p=xml ' (.. | select(tag == "!!str")) |= from_yaml' sample.xml

will output

cat:
  says: meow
  legs: 4
  cute: true
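
The effect of re-parsing every string can be modelled outside yq. The following is a rough Python sketch, not yq's implementation: `resolve_scalar` and `retype` are hypothetical helpers, and the scalar rules cover only integers, floats, booleans and null, which is an assumption about the common cases rather than the full YAML specification.

```python
import re


def resolve_scalar(text):
    """Guess the type of a plain scalar (assumption: a small subset of YAML's rules)."""
    if re.fullmatch(r"[-+]?[0-9]+", text):
        return int(text)
    if re.fullmatch(r"[-+]?([0-9]+\.[0-9]*|\.[0-9]+)", text):
        return float(text)
    if text in ("true", "True", "TRUE"):
        return True
    if text in ("false", "False", "FALSE"):
        return False
    if text in ("null", "Null", "NULL", "~", ""):
        return None
    return text  # anything else stays a string


def retype(node):
    """Walk maps and lists, re-typing every string leaf."""
    if isinstance(node, dict):
        return {key: retype(value) for key, value in node.items()}
    if isinstance(node, list):
        return [retype(value) for value in node]
    if isinstance(node, str):
        return resolve_scalar(node)
    return node


# The decoded sample from above, with every value still a string.
cat = {"cat": {"says": "meow", "legs": "4", "cute": "true"}}
print(retype(cat))  # {'cat': {'says': 'meow', 'legs': 4, 'cute': True}}
```

Because every string is re-typed, a value that should stay a string (a zip code, for example) would also be converted; narrowing the selection to specific paths avoids that.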

Parse xml: array

Consecutive nodes with identical xml names are assumed to be arrays.

Given a sample.xml file of:

<?xml version="1.0" encoding="UTF-8"?>
<animal>cat</animal>
<animal>goat</animal>

then

yq -p=xml '.' sample.xml

will output

animal:
  - cat
  - goat
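
The grouping rule can be sketched in a few lines of Python. This is a model of the documented behaviour under stated assumptions, not yq's code: `children_to_fields` is a hypothetical helper, the `<pets>` wrapper and the `<owner>` element are added only so the snippet parses with the standard library, and what happens to repeated names that are not adjacent is not covered.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def children_to_fields(parent):
    """Map child elements to fields; a run of same-named siblings becomes a list."""
    fields = {}
    # groupby only merges adjacent items, which mirrors "consecutive nodes".
    for name, run in groupby(parent, key=lambda child: child.tag):
        values = [child.text for child in run]
        fields[name] = values if len(values) > 1 else values[0]
    return fields


doc = ET.fromstring(
    "<pets><animal>cat</animal><animal>goat</animal><owner>sam</owner></pets>"
)
print(children_to_fields(doc))  # {'animal': ['cat', 'goat'], 'owner': 'sam'}
```

One consequence of this rule is that a single `<animal>` element decodes to a scalar rather than a one-element array, so downstream expressions may need to handle both shapes.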

Parse xml: attributes

Attributes are converted to fields, with the default attribute prefix `+`. Use `--xml-attribute-prefix` to set your own.

Given a sample.xml file of:

<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">
  <legs>7</legs>
</cat>

then

yq -p=xml '.' sample.xml

will output

cat:
  +legs: "4"
  legs: "7"

Parse xml: attributes with content

Content is added as a field, using the default content name of +content. Use --xml-content-name to set your own.

Given a sample.xml file of:

<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">meow</cat>

then

yq -p=xml '.' sample.xml

will output

cat:
  +content: meow
  +legs: "4"
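
The two naming rules (attribute prefix and content name) can be illustrated with a small Python model. It is a sketch only: `element_to_value` is a hypothetical function whose parameters mirror the two flags, the `@` and `#text` values are arbitrary examples, and array handling is left out for brevity.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem, attribute_prefix="+", content_name="+content"):
    """Convert an element to a string or a map of fields."""
    text = (elem.text or "").strip()
    if not elem.attrib and len(elem) == 0:
        return text  # plain element: just its text
    fields = {}
    if text:
        fields[content_name] = text  # text next to attributes or children
    for name, value in elem.attrib.items():
        fields[attribute_prefix + name] = value  # attributes get the prefix
    for child in elem:
        fields[child.tag] = element_to_value(child, attribute_prefix, content_name)
    return fields


with_content = ET.fromstring('<cat legs="4">meow</cat>')
with_child = ET.fromstring('<cat legs="4"><legs>7</legs></cat>')

print(element_to_value(with_content))  # {'+content': 'meow', '+legs': '4'}
print(element_to_value(with_child))    # {'+legs': '4', 'legs': '7'}
print(element_to_value(with_content, "@", "#text"))  # {'#text': 'meow', '@legs': '4'}
```

The prefix is what keeps the `legs` attribute and the `legs` child element from colliding in the resulting map.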

Parse xml: custom dtd

DTD entities are ignored: entity references are left unexpanded in the output.

Given a sample.xml file of:


<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY writer "Blah.">
<!ENTITY copyright "Blah">
]>
<root>
    <item>&writer;&copyright;</item>
</root>

then

yq -p=xml '.' sample.xml

will output

root:
  item: '&writer;&copyright;'

Parse xml: with comments

A best attempt is made to preserve comments.

Given a sample.xml file of:


<!-- before cat -->
<cat>
	<!-- in cat before -->
	<x>3<!-- multi
line comment 
for x --></x>
	<!-- before y -->
	<y>
		<!-- in y before -->
		<d><!-- in d before -->z<!-- in d after --></d>
		
		<!-- in y after -->
	</y>
	<!-- in_cat_after -->
</cat>
<!-- after cat -->

then

yq -p=xml '.' sample.xml

will output

# before cat
cat:
  # in cat before
  x: "3" # multi
  # line comment 
  # for x
  # before y

  y:
    # in y before
    # in d before
    d: z # in d after
    # in y after
  # in_cat_after
# after cat

Parse xml: keep attribute namespace

Given a sample.xml file of:


<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url">
</map>

then

yq -p=xml -o=xml --xml-keep-namespace '.' sample.xml

will output

<map xmlns="some-namespace" xmlns:xsi="some-instance" some-instance:schemaLocation="some-url"></map>

instead of

<map xmlns="some-namespace" xsi="some-instance" schemaLocation="some-url"></map>

Parse xml: keep raw attribute namespace

Given a sample.xml file of:


<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url">
</map>

then

yq -p=xml -o=xml --xml-keep-namespace --xml-raw-token '.' sample.xml

will output

<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>

instead of

<map xmlns="some-namespace" xsi="some-instance" schemaLocation="some-url"></map>
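
The difference between the two namespace examples comes down to whether the parser resolves a prefix such as `xsi` to its namespace URI or reports the name exactly as written. The Python sketch below shows both modes using the standard library's expat bindings. It is an illustration, not yq's decoder: `attribute_names` is a hypothetical helper, and the assumption that `--xml-raw-token` corresponds to a non-resolving ("raw") tokenizer mode is inferred from the flag name and the outputs above.

```python
import xml.parsers.expat

SAMPLE = (
    '<map xmlns="some-namespace" xmlns:xsi="some-instance" '
    'xsi:schemaLocation="some-url"></map>'
)


def attribute_names(xml_text, resolve_namespaces):
    """Return the attribute names the parser reports for the document."""
    names = []

    def on_start(tag, attrs):
        # In namespace mode expat joins URI and local name with the separator;
        # swap it for ':' so the result reads like the output shown above.
        names.extend(name.replace(" ", ":") for name in attrs)

    if resolve_namespaces:
        parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    else:
        parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return names


print(attribute_names(SAMPLE, resolve_namespaces=True))
print(attribute_names(SAMPLE, resolve_namespaces=False))
```

In resolving mode the attribute is reported as `some-instance:schemaLocation`, matching the `--xml-keep-namespace` output; in raw mode the original `xsi:schemaLocation` and the `xmlns` declarations come through unchanged, matching the output with `--xml-raw-token` added.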

Encode xml: simple

Given a sample.yml file of:

cat: purrs

then

yq -o=xml '.' sample.yml

will output

<cat>purrs</cat>

Encode xml: array

Given a sample.yml file of:

pets:
  cat:
    - purrs
    - meows

then

yq -o=xml '.' sample.yml

will output

<pets>
  <cat>purrs</cat>
  <cat>meows</cat>
</pets>

Encode xml: attributes

Fields with the matching xml-attribute-prefix are assumed to be attributes.

Given a sample.yml file of:

cat:
  +name: tiger
  meows: true

then

yq -o=xml '.' sample.yml

will output

<cat name="tiger">
  <meows>true</meows>
</cat>

Encode xml: attributes with content

Fields with the matching xml-content-name are assumed to be content.

Given a sample.yml file of:

cat:
  +name: tiger
  +content: cool

then

yq -o=xml '.' sample.yml

will output

<cat name="tiger">cool</cat>
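
The encoding rules from the last three examples (arrays, attributes, content) can be combined into one short Python model. As before this is a sketch under assumptions rather than yq's encoder: `build_element` is a hypothetical function, its parameters mirror the two flags, and values are kept as strings to sidestep type formatting.

```python
import xml.etree.ElementTree as ET


def build_element(name, value, attribute_prefix="+", content_name="+content"):
    """Build an XML element from a string or a map of fields."""
    elem = ET.Element(name)
    if not isinstance(value, dict):
        elem.text = str(value)
        return elem
    for key, field in value.items():
        if key == content_name:
            # Checked first, because the default content name also starts with '+'.
            elem.text = str(field)
        elif key.startswith(attribute_prefix):
            elem.set(key[len(attribute_prefix):], str(field))
        elif isinstance(field, list):
            # A list becomes repeated elements that share the field's name.
            for item in field:
                elem.append(build_element(key, item, attribute_prefix, content_name))
        else:
            elem.append(build_element(key, field, attribute_prefix, content_name))
    return elem


cat = build_element("cat", {"+name": "tiger", "+content": "cool"})
pets = build_element("pets", {"cat": ["purrs", "meows"]})

print(ET.tostring(cat, encoding="unicode"))   # <cat name="tiger">cool</cat>
print(ET.tostring(pets, encoding="unicode"))  # <pets><cat>purrs</cat><cat>meows</cat></pets>
```

Unlike yq, this sketch does not indent its output; it only shows which fields end up as attributes, text, or child elements.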

Encode xml: comments

A best attempt is made to copy comments to xml.

Given a sample.yml file of:

# above_cat
cat: # inline_cat
  # above_array
  array: # inline_array
    - val1 # inline_val1
    # above_val2
    - val2 # inline_val2
# below_cat

then

yq -o=xml '.' sample.yml

will output

<!-- above_cat inline_cat --><cat><!-- above_array inline_array -->
  <array>val1<!-- inline_val1 --></array>
  <array><!-- above_val2 -->val2<!-- inline_val2 --></array>
</cat><!-- below_cat -->

Round trip: with comments

A best effort is made, but comment positions and white space are not preserved perfectly.

Given a sample.xml file of:


<!-- before cat -->
<cat>
	<!-- in cat before -->
	<x>3<!-- multi
line comment 
for x --></x>
	<!-- before y -->
	<y>
		<!-- in y before -->
		<d><!-- in d before -->z<!-- in d after --></d>
		
		<!-- in y after -->
	</y>
	<!-- in_cat_after -->
</cat>
<!-- after cat -->

then

yq -p=xml -o=xml '.' sample.xml

will output

<!-- before cat --><cat><!-- in cat before -->
  <x>3<!-- multi
line comment 
for x --></x><!-- before y -->
  <y><!-- in y before
in d before -->
    <d>z<!-- in d after --></d><!-- in y after -->
  </y><!-- in_cat_after -->
</cat><!-- after cat -->