General rules
✅ DO:
- You can use ./yq with the --debug-node-info flag to get a deeper understanding of the AST.
- Run ./scripts/format.sh then ./scripts/check.sh to format, then validate linting and spelling
- Add comprehensive tests to cover the changes
- Run test suite to ensure there is no regression
❌ DON'T:
- Git add or commit
Adding a New Encoder/Decoder
This guide explains how to add support for a new format (encoder/decoder) to yq without modifying candidate_node.go.
Overview
The encoder/decoder architecture in yq is based on two main interfaces:
- Encoder: Converts a CandidateNode to output in a specific format
- Decoder: Reads input in a specific format and creates a CandidateNode
Each format is registered in pkg/yqlib/format.go and made available through factory functions.
Architecture
Key Files
- pkg/yqlib/encoder.go - Defines the Encoder interface
- pkg/yqlib/decoder.go - Defines the Decoder interface
- pkg/yqlib/format.go - Format registry and factory functions
- pkg/yqlib/operator_encoder_decoder.go - Encode/decode operators
- pkg/yqlib/encoder_*.go - Encoder implementations
- pkg/yqlib/decoder_*.go - Decoder implementations
Interfaces
Encoder Interface:
type Encoder interface {
    Encode(writer io.Writer, node *CandidateNode) error
    PrintDocumentSeparator(writer io.Writer) error
    PrintLeadingContent(writer io.Writer, content string) error
    CanHandleAliases() bool
}
Decoder Interface:
type Decoder interface {
    Init(reader io.Reader) error
    Decode() (*CandidateNode, error)
}
Step-by-Step: Adding a New Encoder/Decoder
Step 1: Create the Encoder File
Create pkg/yqlib/encoder_<format>.go implementing the Encoder interface:
- Encode() - Convert a CandidateNode to your format and write to the output writer
- PrintDocumentSeparator() - Handle document separators if your format requires them
- PrintLeadingContent() - Handle leading content/comments if supported
- CanHandleAliases() - Return whether your format supports YAML aliases
See encoder_json.go or encoder_base64.go for examples.
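As a rough illustration of Step 1, the sketch below implements the four Encoder methods for a made-up "lines" format that writes one scalar per line. Everything here is an assumption for the sake of a self-contained example: Kind and CandidateNode are pared-down stand-ins for the real yqlib types, and linesEncoder, NewLinesEncoder and encodeToString are hypothetical names that do not exist in yq.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// Kind and CandidateNode are simplified stand-ins for the yqlib types,
// declared here only so the sketch compiles on its own.
type Kind int

const (
	ScalarNode Kind = iota
	SequenceNode
	MappingNode
)

type CandidateNode struct {
	Kind    Kind
	Tag     string
	Value   string
	Content []*CandidateNode
}

// Encoder mirrors the interface shown earlier in this guide.
type Encoder interface {
	Encode(writer io.Writer, node *CandidateNode) error
	PrintDocumentSeparator(writer io.Writer) error
	PrintLeadingContent(writer io.Writer, content string) error
	CanHandleAliases() bool
}

// linesEncoder is a hypothetical format: one scalar per output line.
type linesEncoder struct{}

func NewLinesEncoder() Encoder { return &linesEncoder{} }

func (e *linesEncoder) Encode(writer io.Writer, node *CandidateNode) error {
	switch node.Kind {
	case ScalarNode:
		_, err := fmt.Fprintln(writer, node.Value)
		return err
	case SequenceNode:
		for _, child := range node.Content {
			if child.Kind != ScalarNode {
				return fmt.Errorf("lines format only supports sequences of scalars, got %v", child.Tag)
			}
			if _, err := fmt.Fprintln(writer, child.Value); err != nil {
				return err
			}
		}
		return nil
	default:
		return fmt.Errorf("cannot encode %v as lines", node.Tag)
	}
}

// The hypothetical format has no document separators, comments or aliases.
func (e *linesEncoder) PrintDocumentSeparator(io.Writer) error      { return nil }
func (e *linesEncoder) PrintLeadingContent(io.Writer, string) error { return nil }
func (e *linesEncoder) CanHandleAliases() bool                      { return false }

// encodeToString is a small helper for trying the encoder out.
func encodeToString(enc Encoder, node *CandidateNode) (string, error) {
	var sb strings.Builder
	err := enc.Encode(&sb, node)
	return sb.String(), err
}

func main() {
	seq := &CandidateNode{Kind: SequenceNode, Tag: "!!seq", Content: []*CandidateNode{
		{Kind: ScalarNode, Tag: "!!str", Value: "cat"},
		{Kind: ScalarNode, Tag: "!!str", Value: "dog"},
	}}
	if err := NewLinesEncoder().Encode(os.Stdout, seq); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Returning an error for unsupported node kinds, rather than silently skipping them, keeps the format-specific limits inside the encoder file.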
Step 2: Create the Decoder File
Create pkg/yqlib/decoder_<format>.go implementing the Decoder interface:
- Init() - Initialize the decoder with the input reader and set up any needed state
- Decode() - Decode one document from the input and return a CandidateNode, or io.EOF when finished
See decoder_json.go or decoder_base64.go for examples.
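To make the Init/Decode contract concrete, here is a minimal sketch of a decoder for a made-up "lines" format in which every non-empty line is its own document. The CandidateNode type is a simplified stand-in, and linesDecoder, NewLinesDecoder and decodeAll are hypothetical names used only for this example.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Kind and CandidateNode are simplified stand-ins for the yqlib types.
type Kind int

const (
	ScalarNode Kind = iota
	SequenceNode
	MappingNode
)

type CandidateNode struct {
	Kind    Kind
	Tag     string
	Value   string
	Content []*CandidateNode
}

// Decoder mirrors the interface shown earlier in this guide.
type Decoder interface {
	Init(reader io.Reader) error
	Decode() (*CandidateNode, error)
}

// linesDecoder is a hypothetical format: each non-empty line is one
// document holding a single string scalar.
type linesDecoder struct {
	scanner *bufio.Scanner
}

func NewLinesDecoder() Decoder { return &linesDecoder{} }

// Init stores the reader state needed by later Decode calls.
func (d *linesDecoder) Init(reader io.Reader) error {
	d.scanner = bufio.NewScanner(reader)
	return nil
}

// Decode returns the next document, or io.EOF once the input is exhausted.
func (d *linesDecoder) Decode() (*CandidateNode, error) {
	for d.scanner.Scan() {
		line := strings.TrimSpace(d.scanner.Text())
		if line == "" {
			continue // skip blank lines between documents
		}
		return &CandidateNode{Kind: ScalarNode, Tag: "!!str", Value: line}, nil
	}
	if err := d.scanner.Err(); err != nil {
		return nil, err
	}
	return nil, io.EOF
}

// decodeAll drains a decoder until io.EOF and collects the scalar values.
func decodeAll(dec Decoder, input string) ([]string, error) {
	if err := dec.Init(strings.NewReader(input)); err != nil {
		return nil, err
	}
	var values []string
	for {
		node, err := dec.Decode()
		if errors.Is(err, io.EOF) {
			return values, nil
		}
		if err != nil {
			return nil, err
		}
		values = append(values, node.Value)
	}
}

func main() {
	values, err := decodeAll(NewLinesDecoder(), "cat\n\ndog\n")
	fmt.Println(values, err) // prints: [cat dog] <nil>
}
```

The caller loop in decodeAll shows why io.EOF matters: it is the only signal that tells the caller to stop asking for documents.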
Step 3: Create Tests (Mandatory)
Create a test file pkg/yqlib/<format>_test.go using the formatScenario pattern:
- Define test scenarios as formatScenario structs with fields: description, input, expected, scenarioType
- scenarioType can be "decode" (test decoding to YAML) or "roundtrip" (encode/decode preservation)
- Create a helper function test<Format>Scenario() that switches on scenarioType
- Create main test function Test<Format>FormatScenarios() that iterates over scenarios
Test coverage must include:
- Basic data types (scalars, arrays, objects/maps)
- Nested structures
- Edge cases (empty inputs, special characters, escape sequences)
- Format-specific features or syntax
- Round-trip tests: decode → encode → decode should preserve data
See hcl_test.go for a complete example.
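The scenario pattern described in this step can be sketched as a table of cases plus a helper that switches on scenarioType. This is only an outline under assumptions: the formatScenario struct below lists just the four fields named above, decodeDemo and encodeDemo stand in for a real decoder/encoder pair for a made-up comma-separated "demo" format, and the helper returns an error where a real test helper would take *testing.T.

```go
package main

import (
	"fmt"
	"strings"
)

// formatScenario lists only the fields named in this guide; the real
// struct in pkg/yqlib may carry more.
type formatScenario struct {
	description  string
	input        string
	expected     string
	scenarioType string
}

// decodeDemo and encodeDemo are hypothetical stand-ins: the "demo" format
// is comma-separated, the decoded form has one value per line.
func decodeDemo(input string) string {
	return strings.Join(strings.Split(input, ","), "\n")
}

func encodeDemo(decoded string) string {
	return strings.Join(strings.Split(decoded, "\n"), ",")
}

var demoFormatScenarios = []formatScenario{
	{description: "basic list", input: "a,b,c", expected: "a\nb\nc", scenarioType: "decode"},
	{description: "empty input", input: "", expected: "", scenarioType: "decode"},
	{description: "roundtrip list", input: "a,b,c", expected: "a,b,c", scenarioType: "roundtrip"},
}

// testDemoScenario switches on scenarioType. A real helper would take
// *testing.T and fail the test instead of returning an error.
func testDemoScenario(s formatScenario) error {
	var actual string
	switch s.scenarioType {
	case "decode":
		actual = decodeDemo(s.input)
	case "roundtrip":
		actual = encodeDemo(decodeDemo(s.input))
	default:
		return fmt.Errorf("%s: unhandled scenario type %q", s.description, s.scenarioType)
	}
	if actual != s.expected {
		return fmt.Errorf("%s: expected %q, got %q", s.description, s.expected, actual)
	}
	return nil
}

func main() {
	for _, s := range demoFormatScenarios {
		if err := testDemoScenario(s); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", s.description)
	}
}
```

Failing on an unknown scenarioType, instead of ignoring it, catches typos in the scenario table early.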
Step 4: Register the Format in format.go
Edit pkg/yqlib/format.go:
- Add a new format variable:
  - "<format>" is the formal name (e.g., "json", "yaml")
  - []string{...} contains short aliases (can be empty)
  - The first function creates an encoder (can be nil for decode-only formats)
  - The second function creates a decoder (can be nil for encode-only formats)
- Add the format to the Formats slice in the same file
See existing formats in format.go for the exact structure.
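For orientation, a registration along the lines described in this step might look like the sketch below. The Format struct layout, its field names, and the lookup helpers are assumptions chosen to match the four parts listed above (formal name, aliases, encoder factory, decoder factory); check format.go for the real definitions. The Encoder and Decoder types are empty stand-ins, and the "demo" formats are hypothetical.

```go
package main

import "fmt"

// Encoder and Decoder are empty stand-ins for the yqlib interfaces.
type Encoder interface{}
type Decoder interface{}

// Format is an assumed shape: formal name, aliases, then the two factories.
type Format struct {
	FormalName     string
	Names          []string
	EncoderFactory func() Encoder
	DecoderFactory func() Decoder
}

type demoEncoder struct{}
type demoDecoder struct{}

// DemoFormat registers both directions for a hypothetical "demo" format.
var DemoFormat = &Format{"demo", []string{"d"},
	func() Encoder { return &demoEncoder{} },
	func() Decoder { return &demoDecoder{} },
}

// DemoOutputOnlyFormat is encode-only, so its decoder factory is nil.
var DemoOutputOnlyFormat = &Format{"demoout", []string{},
	func() Encoder { return &demoEncoder{} },
	nil,
}

// Formats is the registry slice that lookups iterate over.
var Formats = []*Format{DemoFormat, DemoOutputOnlyFormat}

// matchesName reports whether name is the formal name or one of the aliases.
func (f *Format) matchesName(name string) bool {
	if f.FormalName == name {
		return true
	}
	for _, alias := range f.Names {
		if alias == name {
			return true
		}
	}
	return false
}

// formatFromString is a hypothetical lookup over the registry.
func formatFromString(name string) (*Format, error) {
	for _, f := range Formats {
		if f.matchesName(name) {
			return f, nil
		}
	}
	return nil, fmt.Errorf("unknown format %q", name)
}

func main() {
	f, err := formatFromString("d")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(f.FormalName, f.EncoderFactory != nil, f.DecoderFactory != nil) // prints: demo true true
}
```

Callers should check a factory for nil before invoking it, since one-directional formats leave the other factory unset.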
Step 5: Handle Encoder Configuration (if needed)
If your format has preferences/configuration options:
- Create a preferences struct with your configuration fields
- Update the encoder to accept preferences in its factory function
- Update format.go to pass the configured preferences
- Update operator_encoder_decoder.go if special indent handling is needed (see existing formats like JSON and YAML for the pattern)
This pattern is optional and only needed if your format has user-configurable options.
Build Tags
Use build tags to allow optional compilation of formats:
- Add //go:build !yq_no<format> at the top of your encoder and decoder files
- Create a no-build version in pkg/yqlib/no_<format>.go that returns nil for encoder/decoder factories
This allows users to compile yq without certain formats using: go build -tags yq_no<format>
Working with CandidateNode
The CandidateNode struct represents a YAML node with:
- Kind: The node type (ScalarNode, SequenceNode, MappingNode)
- Tag: The YAML tag (e.g., "!!str", "!!int", "!!map")
- Value: The scalar value (for ScalarNode only)
- Content: Child nodes (for SequenceNode and MappingNode)
Key methods:
- node.guessTagFromCustomType() - Infer the tag from Go type
- node.AsList() - Convert to a list for processing
- node.CreateReplacement() - Create a new replacement node
- NewCandidate() - Create a new CandidateNode
Key Points
✅ DO:
- Implement only the Encoder and Decoder interfaces
- Register your format in format.go only
- Keep format-specific logic in your encoder/decoder files
- Use the candidate_node style attribute to store style information for round-trip. Ask if this needs to be updated with new styles.
- Use build tags for optional compilation
- Add comprehensive tests
- Run the specific encoder/decoder test (e.g. <format>_test.go) whenever you make any changes to the encoder or decoder
- Handle errors gracefully
- Add the no-build directive, like the xml encoder and decoder, that enables minimal yq builds, e.g. //go:build !yq_no<format>. Be sure to also update build_small-yq.sh and build-tinygo-yq.sh to not include the new format.
❌ DON'T:
- Modify candidate_node.go to add format-specific logic
- Add format-specific fields to CandidateNode
- Create special cases in core navigation or evaluation logic
- Bypass the encoder/decoder interfaces
- Use the candidate_node tag attribute for anything other than indicating the data type
Examples
Refer to existing format implementations for patterns:
- Simple encoder/decoder: encoder_json.go, decoder_json.go
- Complex with preferences: encoder_yaml.go, decoder_yaml.go
- Encoder-only: encoder_sh.go (ShFormat has nil decoder)
- String-only operations: encoder_base64.go, decoder_base64.go
Testing Your Implementation (Mandatory)
Tests must be implemented in <format>_test.go following the formatScenario pattern:
- Create test scenarios using the formatScenario struct with fields:
  - description: Brief description of what's being tested
  - input: Sample input in your format
  - expected: Expected output (typically in YAML for decode tests)
  - scenarioType: Either "decode" or "roundtrip"
- Test coverage must include:
  - Basic data types (scalars, arrays, objects/maps)
  - Nested structures
  - Edge cases (empty inputs, special characters, escape sequences)
  - Format-specific features or syntax
  - Round-trip tests: decode → encode → decode should preserve data
- Test function pattern:
  - test<Format>Scenario(): Helper function that switches on scenarioType
  - Test<Format>FormatScenarios(): Main test function that iterates over scenarios
- Example from existing formats:
  - See hcl_test.go for a complete example
  - See yaml_test.go for YAML-specific patterns
  - See json_test.go for more complex scenarios
Common Patterns
Scalar-Only Formats
Some formats only work with scalars (like base64, uri):
if node.guessTagFromCustomType() != "!!str" {
    return fmt.Errorf("cannot encode %v as <format>, can only operate on strings", node.Tag)
}
Format with Indentation
Use preferences to control output formatting:
type <format>Preferences struct {
    Indent int
}

func (prefs *<format>Preferences) Copy() <format>Preferences {
    return *prefs
}
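As a hedged illustration of why the Copy() method is useful, the sketch below copies a set of package-level defaults before overriding the indent for one encoder, so the shared value stays untouched. All names here (demoPreferences, ConfiguredDemoPreferences, demoEncoder, newDemoEncoder, indentLine) are hypothetical and only mirror the pattern above.

```go
package main

import (
	"fmt"
	"strings"
)

// demoPreferences follows the preferences pattern for a hypothetical format.
type demoPreferences struct {
	Indent int
}

// Copy returns an independent value, so callers can tweak it safely.
func (prefs *demoPreferences) Copy() demoPreferences {
	return *prefs
}

// ConfiguredDemoPreferences plays the role of the package-level defaults.
var ConfiguredDemoPreferences = demoPreferences{Indent: 2}

type demoEncoder struct {
	prefs demoPreferences
}

// newDemoEncoder accepts preferences in its factory function.
func newDemoEncoder(prefs demoPreferences) *demoEncoder {
	return &demoEncoder{prefs: prefs}
}

// indentLine prefixes text with depth*Indent spaces.
func (e *demoEncoder) indentLine(depth int, text string) string {
	return strings.Repeat(" ", depth*e.prefs.Indent) + text
}

func main() {
	// Copy the defaults before overriding, so the shared value is untouched.
	prefs := ConfiguredDemoPreferences.Copy()
	prefs.Indent = 4
	enc := newDemoEncoder(prefs)

	fmt.Printf("%q\n", enc.indentLine(1, "key: value")) // prints: "    key: value"
	fmt.Println(ConfiguredDemoPreferences.Indent)       // prints: 2
}
```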
Multiple Documents
Decoders should support reading multiple documents:
func (dec *<format>Decoder) Decode() (*CandidateNode, error) {
    if dec.finished {
        return nil, io.EOF
    }
    // ... decode next document ...
    if noMoreDocuments {
        dec.finished = true
    }
    return candidate, nil
}
Adding a New Operator
This guide explains how to add a new operator to yq. Operators are the core of yq's expression language and process CandidateNode objects without requiring modifications to candidate_node.go itself.
Overview
Operators transform data by implementing a handler function that processes a Context containing CandidateNode objects. Each operator is:
- Defined as an operationType in operation.go
- Registered in the lexer in lexer_participle.go
- Implemented in its own operator_<type>.go file
- Tested in operator_<type>_test.go
- Documented in pkg/yqlib/doc/operators/headers/<type>.md
Architecture
Key Files
- pkg/yqlib/operation.go - Defines operationType and operator registry
- pkg/yqlib/lexer_participle.go - Registers operators with their syntax patterns
- pkg/yqlib/operator_<type>.go - Operator implementation
- pkg/yqlib/operator_<type>_test.go - Operator tests using expressionScenario
- pkg/yqlib/doc/operators/headers/<type>.md - Documentation header
Core Types
operationType:
type operationType struct {
    Type                 string                  // Unique operator name (e.g., "REVERSE")
    NumArgs              uint                    // Number of arguments (0 for no args)
    Precedence           uint                    // Operator precedence (higher = higher precedence)
    Handler              operatorHandler         // The function that executes the operator
    CheckForPostTraverse bool                    // Whether to apply post-traversal logic
    ToString             func(*Operation) string // Custom string representation
}
operatorHandler signature:
type operatorHandler func(*dataTreeNavigator, Context, *ExpressionNode) (Context, error)
expressionScenario for tests:
type expressionScenario struct {
    description    string
    subdescription string
    document       string
    expression     string
    expected       []string
    skipDoc        bool
    expectedError  string
}
Step-by-Step: Adding a New Operator
Step 1: Create the Operator Implementation File
Create pkg/yqlib/operator_<type>.go implementing the operator handler function:
- Implement the operatorHandler function signature
- Process nodes from context.MatchingNodes
- Return a new Context with results using context.ChildContext()
- Use candidate.CreateReplacement() or candidate.CreateReplacementWithComments() to create new nodes
- Handle errors gracefully with meaningful error messages
See operator_reverse.go or operator_keys.go for examples.
Step 2: Register the Operator in operation.go
Add the operator type definition to pkg/yqlib/operation.go:
var <type>OpType = &operationType{
    Type:       "<TYPE>",        // All caps, matches pattern in lexer
    NumArgs:    0,               // 0 for no args, 1+ for args
    Precedence: 50,              // Typical range: 40-55
    Handler:    <type>Operator,  // Reference to handler function
}
Precedence guidelines:
- 10-20: Logical operators (OR, AND, UNION)
- 30: Pipe operator
- 40: Assignment and comparison operators
- 42: Arithmetic operators (ADD, SUBTRACT, MULTIPLY, DIVIDE)
- 50-52: Most other operators
- 55: High precedence (e.g., GET_VARIABLE)
Optional fields:
- CheckForPostTraverse: true - Set this if another operator can directly follow yours without the pipe character. Most of the time this is false.
- ToString: customToString - Custom string representation (rarely needed)
Step 3: Register the Operator in lexer_participle.go
Edit pkg/yqlib/lexer_participle.go to add the operator to the lexer rules:
- Use simpleOp() for simple keyword patterns
- Use object syntax for regex patterns or complex syntax
- Support optional characters with _? and aliases with |
See existing operators in lexer_participle.go for pattern examples.
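To show what the _? and | pattern pieces mean, the sketch below checks tokens against a pattern using Go's standard regexp package. It does not use the participle lexer or simpleOp(); matchesOperator and the my_?op|mo pattern are hypothetical and exist only to demonstrate the regex semantics.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesOperator reports whether token matches pattern in full.
// The pattern is wrapped in a non-capturing group so that "|" alternatives
// are anchored as a whole.
func matchesOperator(pattern, token string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	return re.MatchString(token)
}

func main() {
	// Hypothetical operator: "my_op" with an optional underscore, alias "mo".
	pattern := `my_?op|mo`
	for _, token := range []string{"my_op", "myop", "mo", "my__op"} {
		fmt.Printf("%-7s %v\n", token, matchesOperator(pattern, token))
	}
}
```

Here _? makes the underscore optional, so both spellings parse as the same operator, while | adds a short alias.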
Step 4: Create Tests (Mandatory)
Create pkg/yqlib/operator_<type>_test.go using the expressionScenario pattern:
- Define test scenarios with description, document, expression, and expected fields
- expected is a slice of strings showing output format: "D<doc>, P[<path>], (<tag>)::<value>\n"
- Set skipDoc: true for edge cases you don't want in generated documentation
- Include subdescription for longer test names
- Set expectedError if testing error cases
- Create main test function that iterates over scenarios
Test coverage must include:
- Basic data types and nested structures
- Edge cases (empty inputs, special characters, type errors)
- Multiple outputs if applicable
- Format-specific features
See operator_reverse_test.go for a simple example and operator_keys_test.go for complex cases.
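A scenario table for a made-up operator might be laid out as in the sketch below. The expressionScenario struct copies the fields listed earlier; the "shout" operator, its expression syntax, and the concrete expected strings are assumptions for illustration and are not taken from the yq test suite. The real test function would hand each scenario to the shared test helpers instead of printing it.

```go
package main

import "fmt"

// expressionScenario copies the fields listed in this guide.
type expressionScenario struct {
	description    string
	subdescription string
	document       string
	expression     string
	expected       []string
	skipDoc        bool
	expectedError  string
}

// shoutOperatorScenarios covers a hypothetical "shout" operator.
var shoutOperatorScenarios = []expressionScenario{
	{
		description: "Upper-case a string",
		document:    `a: cat`,
		expression:  `.a | shout`,
		// One entry per result: "D<doc>, P[<path>], (<tag>)::<value>\n"
		expected: []string{"D0, P[a], (!!str)::CAT\n"},
	},
	{
		description:   "Non-string input is rejected",
		skipDoc:       true, // edge case, kept out of the generated docs
		document:      `a: [1]`,
		expression:    `.a | shout`,
		expectedError: "shout only works on strings, got !!seq",
	},
}

func main() {
	for _, s := range shoutOperatorScenarios {
		fmt.Printf("%s: %q -> %q (error: %q)\n", s.description, s.expression, s.expected, s.expectedError)
	}
}
```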
Step 5: Create Documentation Header
Create pkg/yqlib/doc/operators/headers/<type>.md:
- Use the exact operator name as the title
- Include a concise 1-2 sentence summary
- Add additional context or examples if the operator is complex
See existing headers in doc/operators/headers/ for examples.
Working with Context and CandidateNode
Context Management
- context.ChildContext(results) - Create child context with results
- context.GetVariable("varName") - Get variables stored in context
- context.SetVariable("varName", value) - Set variables in context
CandidateNode Operations
- candidate.CreateReplacement(ScalarNode, "!!str", stringValue) - Create a replacement node
- candidate.CreateReplacementWithComments(SequenceNode, "!!seq", candidate.Style) - With style preserved
- candidate.Kind - The node type (ScalarNode, SequenceNode, MappingNode)
- candidate.Tag - The YAML tag (!!str, !!int, etc.)
- candidate.Value - The scalar value (for ScalarNode only)
- candidate.Content - Child nodes (for SequenceNode and MappingNode)
- candidate.guessTagFromCustomType() - Infer the tag from Go type
- candidate.AsList() - Convert to a list representation
Key Points
✅ DO:
- Implement the operator handler with the correct signature
- Register in operation.go with appropriate precedence
- Add the lexer pattern in lexer_participle.go
- Write comprehensive tests covering normal and edge cases
- Create a documentation header in doc/operators/headers/
- Use Context.ChildContext() for proper context threading
- Handle all node types gracefully
- Return meaningful error messages
❌ DON'T:
- Modify candidate_node.go (operators shouldn't need this)
- Modify core navigation or evaluation logic
- Bypass the handler function pattern
- Add format-specific or operator-specific fields to CandidateNode
- Skip tests or documentation
Examples
Refer to existing operator implementations for patterns:
- No-argument operator: operator_reverse.go - Processes arrays/sequences
- Single-argument operator: operator_map.go - Takes an expression argument
- Complex multi-output: operator_keys.go - Produces multiple results
- With preferences: operator_to_number.go - Configuration options
- Error handling: operator_error.go - Control flow with errors
- String operations: operator_strings.go - Multiple related operators
Testing Patterns
Refer to existing test files for specific patterns:
- Basic expression tests in operator_reverse_test.go
- Multi-output tests in operator_keys_test.go
- Error handling tests in operator_error_test.go
- Tests with skipDoc flag to exclude from generated documentation
Common Patterns
Refer to existing operator implementations for these patterns:
- Simple transformation: see operator_reverse.go
- Type checking: see operator_error.go
- Working with arguments: see operator_map.go
- Post-traversal operators: see operator_with.go