# General rules ✅ **DO:** - You can use ./yq with the `--debug-node-info` flag to get a deeper understanding of the ast. - run ./scripts/format.sh to format the code; then ./scripts/check.sh lint and finally ./scripts/spelling.sh to check spelling. - Add comprehensive tests to cover the changes - Run test suite to ensure there is no regression - Use UK english spelling (e.g. Colorisation not Colorization) ❌ **DON'T:** - Git add or commit # Adding a New Encoder/Decoder This guide explains how to add support for a new format (encoder/decoder) to yq without modifying `candidate_node.go`. ## Overview The encoder/decoder architecture in yq is based on two main interfaces: - **Encoder**: Converts a `CandidateNode` to output in a specific format - **Decoder**: Reads input in a specific format and creates a `CandidateNode` Each format is registered in `pkg/yqlib/format.go` and made available through factory functions. ## Architecture ### Key Files - `pkg/yqlib/encoder.go` - Defines the `Encoder` interface - `pkg/yqlib/decoder.go` - Defines the `Decoder` interface - `pkg/yqlib/format.go` - Format registry and factory functions - `pkg/yqlib/operator_encoder_decoder.go` - Encode/decode operators - `pkg/yqlib/encoder_*.go` - Encoder implementations - `pkg/yqlib/decoder_*.go` - Decoder implementations ### Interfaces **Encoder Interface:** ```go type Encoder interface { Encode(writer io.Writer, node *CandidateNode) error PrintDocumentSeparator(writer io.Writer) error PrintLeadingContent(writer io.Writer, content string) error CanHandleAliases() bool } ``` **Decoder Interface:** ```go type Decoder interface { Init(reader io.Reader) error Decode() (*CandidateNode, error) } ``` ## Step-by-Step: Adding a New Encoder/Decoder ### Step 1: Create the Encoder File Create `pkg/yqlib/encoder_.go` implementing the `Encoder` interface: - `Encode()` - Convert a `CandidateNode` to your format and write to the output writer - `PrintDocumentSeparator()` - Handle document separators if your format requires them - `PrintLeadingContent()` - Handle leading content/comments if supported - `CanHandleAliases()` - Return whether your format supports YAML aliases See `encoder_json.go` or `encoder_base64.go` for examples. ### Step 2: Create the Decoder File Create `pkg/yqlib/decoder_.go` implementing the `Decoder` interface: - `Init()` - Initialize the decoder with the input reader and set up any needed state - `Decode()` - Decode one document from the input and return a `CandidateNode`, or `io.EOF` when finished See `decoder_json.go` or `decoder_base64.go` for examples. ### Step 3: Create Tests (Mandatory) Create a test file `pkg/yqlib/_test.go` using the `formatScenario` pattern: - Define test scenarios as `formatScenario` structs with fields: `description`, `input`, `expected`, `scenarioType` - `scenarioType` can be `"decode"` (test decoding to YAML) or `"roundtrip"` (encode/decode preservation) - Create a helper function `testScenario()` that switches on `scenarioType` - Create main test function `TestFormatScenarios()` that iterates over scenarios Test coverage must include: - Basic data types (scalars, arrays, objects/maps) - Nested structures - Edge cases (empty inputs, special characters, escape sequences) - Format-specific features or syntax - Round-trip tests: decode → encode → decode should preserve data See `hcl_test.go` for a complete example. ### Step 4: Register the Format in format.go Edit `pkg/yqlib/format.go`: 1. Add a new format variable: - `""` is the formal name (e.g., "json", "yaml") - `[]string{...}` contains short aliases (can be empty) - The first function creates an encoder (can be nil for encode-only formats) - The second function creates a decoder (can be nil for decode-only formats) 2. Add the format to the `Formats` slice in the same file See existing formats in `format.go` for the exact structure. ### Step 5: Handle Encoder Configuration (if needed) If your format has preferences/configuration options: 1. Create a preferences struct with your configuration fields 2. Update the encoder to accept preferences in its factory function 3. Update `format.go` to pass the configured preferences 4. Update `operator_encoder_decoder.go` if special indent handling is needed (see existing formats like JSON and YAML for the pattern) This pattern is optional and only needed if your format has user-configurable options. ## Build Tags Use build tags to allow optional compilation of formats: - Add `//go:build !yq_no` at the top of your encoder and decoder files - Create a no-build version in `pkg/yqlib/no_.go` that returns nil for encoder/decoder factories This allows users to compile yq without certain formats using: `go build -tags yq_no` ## Working with CandidateNode The `CandidateNode` struct represents a YAML node with: - `Kind`: The node type (ScalarNode, SequenceNode, MappingNode) - `Tag`: The YAML tag (e.g., "!!str", "!!int", "!!map") - `Value`: The scalar value (for ScalarNode only) - `Content`: Child nodes (for SequenceNode and MappingNode) Key methods: - `node.guessTagFromCustomType()` - Infer the tag from Go type - `node.AsList()` - Convert to a list for processing - `node.CreateReplacement()` - Create a new replacement node - `NewCandidate()` - Create a new CandidateNode ## Key Points ✅ **DO:** - Implement only the `Encoder` and `Decoder` interfaces - Register your format in `format.go` only - Keep format-specific logic in your encoder/decoder files - Use the candidate_node style attribute to store style information for round-trip. Ask if this needs to be updated with new styles. - Use build tags for optional compilation - Add comprehensive tests - Run the specific encoder/decoder test (e.g. _test.go) whenever you make ay changes to the encoder_ or decoder_ - Handle errors gracefully - Add the no build directive, like the xml encoder and decoder, that enables a minimal yq builds. e.g. `//go:build !yq_`. Be sure to also update the build_small-yq.sh and build-tinygo-yq.sh to not include the new format. ❌ **DON'T:** - Modify `candidate_node.go` to add format-specific logic - Add format-specific fields to `CandidateNode` - Create special cases in core navigation or evaluation logic - Bypass the encoder/decoder interfaces - Use candidate_node tag attribute for anything other than indicate the data type ## Examples Refer to existing format implementations for patterns: - **Simple encoder/decoder**: `encoder_json.go`, `decoder_json.go` - **Complex with preferences**: `encoder_yaml.go`, `decoder_yaml.go` - **Encoder-only**: `encoder_sh.go` (ShFormat has nil decoder) - **String-only operations**: `encoder_base64.go`, `decoder_base64.go` ## Testing Your Implementation (Mandatory) Tests must be implemented in `_test.go` following the `formatScenario` pattern: 1. **Create test scenarios** using the `formatScenario` struct with fields: - `description`: Brief description of what's being tested - `input`: Sample input in your format - `expected`: Expected output (typically in YAML for decode tests) - `scenarioType`: Either `"decode"` or `"roundtrip"` 2. **Test coverage must include:** - Basic data types (scalars, arrays, objects/maps) - Nested structures - Edge cases (empty inputs, special characters, escape sequences) - Format-specific features or syntax - Round-trip tests: decode → encode → decode should preserve data 3. **Test function pattern:** - `testScenario()`: Helper function that switches on `scenarioType` - `TestFormatScenarios()`: Main test function that iterates over scenarios 4. **Example from existing formats:** - See `hcl_test.go` for a complete example - See `yaml_test.go` for YAML-specific patterns - See `json_test.go` for more complex scenarios ## Common Patterns ### Scalar-Only Formats Some formats only work with scalars (like base64, uri): ```go if node.guessTagFromCustomType() != "!!str" { return fmt.Errorf("cannot encode %v as , can only operate on strings", node.Tag) } ``` ### Format with Indentation Use preferences to control output formatting: ```go type Preferences struct { Indent int } func (prefs *Preferences) Copy() Preferences { return *prefs } ``` ### Multiple Documents Decoders should support reading multiple documents: ```go func (dec *Decoder) Decode() (*CandidateNode, error) { if dec.finished { return nil, io.EOF } // ... decode next document ... if noMoreDocuments { dec.finished = true } return candidate, nil } ``` --- # Adding a New Operator This guide explains how to add a new operator to yq. Operators are the core of yq's expression language and process `CandidateNode` objects without requiring modifications to `candidate_node.go` itself. ## Overview Operators transform data by implementing a handler function that processes a `Context` containing `CandidateNode` objects. Each operator is: 1. Defined as an `operationType` in `operation.go` 2. Registered in the lexer in `lexer_participle.go` 3. Implemented in its own `operator_.go` file 4. Tested in `operator__test.go` 5. Documented in `pkg/yqlib/doc/operators/headers/.md` ## Architecture ### Key Files - `pkg/yqlib/operation.go` - Defines `operationType` and operator registry - `pkg/yqlib/lexer_participle.go` - Registers operators with their syntax patterns - `pkg/yqlib/operator_.go` - Operator implementation - `pkg/yqlib/operator__test.go` - Operator tests using `expressionScenario` - `pkg/yqlib/doc/operators/headers/.md` - Documentation header ### Core Types **operationType:** ```go type operationType struct { Type string // Unique operator name (e.g., "REVERSE") NumArgs uint // Number of arguments (0 for no args) Precedence uint // Operator precedence (higher = higher precedence) Handler operatorHandler // The function that executes the operator CheckForPostTraverse bool // Whether to apply post-traversal logic ToString func(*Operation) string // Custom string representation } ``` **operatorHandler signature:** ```go type operatorHandler func(*dataTreeNavigator, Context, *ExpressionNode) (Context, error) ``` **expressionScenario for tests:** ```go type expressionScenario struct { description string subdescription string document string expression string expected []string skipDoc bool expectedError string } ``` ## Step-by-Step: Adding a New Operator ### Step 1: Create the Operator Implementation File Create `pkg/yqlib/operator_.go` implementing the operator handler function: - Implement the `operatorHandler` function signature - Process nodes from `context.MatchingNodes` - Return a new `Context` with results using `context.ChildContext()` - Use `candidate.CreateReplacement()` or `candidate.CreateReplacementWithComments()` to create new nodes - Handle errors gracefully with meaningful error messages See `operator_reverse.go` or `operator_keys.go` for examples. ### Step 2: Register the Operator in operation.go Add the operator type definition to `pkg/yqlib/operation.go`: ```go var OpType = &operationType{ Type: "", // All caps, matches pattern in lexer NumArgs: 0, // 0 for no args, 1+ for args Precedence: 50, // Typical range: 40-55 Handler: Operator, // Reference to handler function } ``` **Precedence guidelines:** - 10-20: Logical operators (OR, AND, UNION) - 30: Pipe operator - 40: Assignment and comparison operators - 42: Arithmetic operators (ADD, SUBTRACT, MULTIPLY, DIVIDE) - 50-52: Most other operators - 55: High precedence (e.g., GET_VARIABLE) **Optional fields:** - `CheckForPostTraverse: true` - If your operator can have another directly after it without the pipe character. Most of the time this is false. - `ToString: customToString` - Custom string representation (rarely needed) ### Step 3: Register the Operator in lexer_participle.go Edit `pkg/yqlib/lexer_participle.go` to add the operator to the lexer rules: - Use `simpleOp()` for simple keyword patterns - Use object syntax for regex patterns or complex syntax - Support optional characters with `_?` and aliases with `|` See existing operators in `lexer_participle.go` for pattern examples. ### Step 4: Create Tests (Mandatory) Create `pkg/yqlib/operator__test.go` using the `expressionScenario` pattern: - Define test scenarios with `description`, `document`, `expression`, and `expected` fields - `expected` is a slice of strings showing output format: `"D, P[], ()::\n"` - Set `skipDoc: true` for edge cases you don't want in generated documentation - Include `subdescription` for longer test names - Set `expectedError` if testing error cases - Create main test function that iterates over scenarios Test coverage must include: - Basic data types and nested structures - Edge cases (empty inputs, special characters, type errors) - Multiple outputs if applicable - Format-specific features See `operator_reverse_test.go` for a simple example and `operator_keys_test.go` for complex cases. ### Step 5: Create Documentation Header Create `pkg/yqlib/doc/operators/headers/.md`: - Use the exact operator name as the title - Include a concise 1-2 sentence summary - Add additional context or examples if the operator is complex See existing headers in `doc/operators/headers/` for examples. ## Working with Context and CandidateNode ### Context Management - `context.ChildContext(results)` - Create child context with results - `context.GetVariable("varName")` - Get variables stored in context - `context.SetVariable("varName", value)` - Set variables in context ### CandidateNode Operations - `candidate.CreateReplacement(ScalarNode, "!!str", stringValue)` - Create a replacement node - `candidate.CreateReplacementWithComments(SequenceNode, "!!seq", candidate.Style)` - With style preserved - `candidate.Kind` - The node type (ScalarNode, SequenceNode, MappingNode) - `candidate.Tag` - The YAML tag (!!str, !!int, etc.) - `candidate.Value` - The scalar value (for ScalarNode only) - `candidate.Content` - Child nodes (for SequenceNode and MappingNode) - `candidate.guessTagFromCustomType()` - Infer the tag from Go type - `candidate.AsList()` - Convert to a list representation ## Key Points ✅ **DO:** - Implement the operator handler with the correct signature - Register in `operation.go` with appropriate precedence - Add the lexer pattern in `lexer_participle.go` - Write comprehensive tests covering normal and edge cases - Create a documentation header in `doc/operators/headers/` - Use `Context.ChildContext()` for proper context threading - Handle all node types gracefully - Return meaningful error messages ❌ **DON'T:** - Modify `candidate_node.go` (operators shouldn't need this) - Modify core navigation or evaluation logic - Bypass the handler function pattern - Add format-specific or operator-specific fields to `CandidateNode` - Skip tests or documentation ## Examples Refer to existing operator implementations for patterns: - **No-argument operator**: `operator_reverse.go` - Processes arrays/sequences - **Single-argument operator**: `operator_map.go` - Takes an expression argument - **Complex multi-output**: `operator_keys.go` - Produces multiple results - **With preferences**: `operator_to_number.go` - Configuration options - **Error handling**: `operator_error.go` - Control flow with errors - **String operations**: `operator_strings.go` - Multiple related operators ## Testing Patterns Refer to existing test files for specific patterns: - Basic expression tests in `operator_reverse_test.go` - Multi-output tests in `operator_keys_test.go` - Error handling tests in `operator_error_test.go` - Tests with `skipDoc` flag to exclude from generated documentation ## Common Patterns Refer to existing operator implementations for these patterns: - Simple transformation: see `operator_reverse.go` - Type checking: see `operator_error.go` - Working with arguments: see `operator_map.go` - Post-traversal operators: see `operator_with.go`