Data Designer Configuration

DataDesignerConfig is the top-level configuration object passed to Data Designer. It declares the columns to generate and may include model configs, tool configs, seed settings, sampler constraints, processors, and profiler configs.

Prefer DataDesignerConfigBuilder for programmatic construction. Direct DataDesignerConfig instantiation is also supported.

Classes:

Name Description
DataDesignerConfig

Configuration for NeMo Data Designer.

DataDesignerConfig

Bases: ExportableConfigBase

Configuration for NeMo Data Designer.

This class defines the main configuration structure for NeMo Data Designer, which the engine consumes when generating synthetic data.

Attributes:

Name Type Description
columns list[Annotated[ColumnConfigT, Field(discriminator='column_type')]]

Required list of column configurations defining how each column should be generated. Must contain at least one column.

model_configs list[ModelConfig] | None

Optional list of model configurations for LLM-based generation. Each model config defines the model, provider, and inference parameters.

tool_configs list[ToolConfig] | None

Optional list of tool configurations for MCP tool calling. Each tool config defines the provider, allowed tools, and execution limits.

seed_config SeedConfig | None

Optional seed dataset settings to use for generation.

constraints list[ColumnConstraintInputT] | None

Optional list of column constraints.

profilers list[ColumnProfilerConfigT] | None

Optional list of column profilers for analyzing generated data characteristics.

processors list[Annotated[ProcessorConfigT, Field(discriminator='processor_type')]] | None

Optional list of processor configurations for post-generation transformations.
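
The `Field(discriminator='column_type')` annotation on `columns` means each entry carries a tag that selects which config class parses it. The sketch below imitates that dispatch with the standard library only, so it runs without Pydantic or Data Designer installed. The class names, tag values, and helper functions are hypothetical and are not part of the Data Designer API; the real validation is done by Pydantic and may behave differently in detail.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for real column config classes.
@dataclass
class SamplerColumn:
    name: str
    column_type: str = "sampler"


@dataclass
class LLMTextColumn:
    name: str
    prompt: str
    column_type: str = "llm-text"


# Maps each tag value to the class that should parse it (assumed tags).
COLUMN_TYPES = {"sampler": SamplerColumn, "llm-text": LLMTextColumn}


def parse_column(raw: dict):
    """Pick the config class from the 'column_type' tag, then build it."""
    tag = raw.get("column_type")
    try:
        cls = COLUMN_TYPES[tag]
    except KeyError:
        raise ValueError(f"unknown column_type: {tag!r}") from None
    return cls(**raw)


def parse_columns(raw_columns: list) -> list:
    """Parse every entry, enforcing the 'at least one column' rule."""
    if not raw_columns:
        raise ValueError("columns must contain at least one column")
    return [parse_column(raw) for raw in raw_columns]


parsed = parse_columns(
    [
        {"column_type": "sampler", "name": "age"},
        {"column_type": "llm-text", "name": "bio", "prompt": "Write a short bio."},
    ]
)
```

Because the tag is looked up first, an unrecognized `column_type` fails immediately with a targeted error instead of being tried against every possible class in turn.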

Methods:

Name Description
fingerprint

Compute a deterministic content-addressable fingerprint of this config.

fingerprint()

Compute a deterministic content-addressable fingerprint of this config.

See data_designer.config.fingerprint.fingerprint_config for the full list of identity-relevant and excluded fields, and how custom column generators are identified.

Returns:

Type Description
dict[str, str | int]

A dict with config_hash, config_hash_algo, and config_hash_version.

Source code in packages/data-designer-config/src/data_designer/config/data_designer_config.py
def fingerprint(self) -> dict[str, str | int]:
    """Compute a deterministic content-addressable fingerprint of this config.

    See `data_designer.config.fingerprint.fingerprint_config` for the full
    list of identity-relevant and excluded fields, and how custom column
    generators are identified.

    Returns:
        A dict with `config_hash`, `config_hash_algo`, and
        `config_hash_version`.
    """
    return fingerprint_config(self)
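
To illustrate what "deterministic content-addressable" means in practice, here is a minimal sketch that hashes a canonical JSON rendering of a plain dict. It is not the implementation of `fingerprint_config`: the function name `sketch_fingerprint`, the choice of SHA-256, the version value, and the absence of any excluded fields are all assumptions made for illustration. Only the three returned keys come from the documentation above.

```python
import hashlib
import json
from typing import Dict, Union

# Hypothetical version tag; the real value is defined by the library.
HASH_VERSION = 1


def sketch_fingerprint(config: dict) -> Dict[str, Union[str, int]]:
    """Hash a canonical JSON form of a plain-dict config (illustrative only)."""
    # Sorting keys and fixing separators makes the serialized text
    # independent of dict insertion order and whitespace.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "config_hash": digest,
        "config_hash_algo": "sha256",  # assumed algorithm
        "config_hash_version": HASH_VERSION,
    }


# Same content, different key order.
a = {"columns": [{"name": "age", "column_type": "sampler"}], "seed_config": None}
b = {"seed_config": None, "columns": [{"column_type": "sampler", "name": "age"}]}
# Different content.
c = {"columns": [{"name": "height", "column_type": "sampler"}], "seed_config": None}

same = sketch_fingerprint(a) == sketch_fingerprint(b)
different = sketch_fingerprint(a)["config_hash"] != sketch_fingerprint(c)["config_hash"]
```

In this sketch `same` and `different` are both true: key order does not affect the hash, while any change in content does. Recording the algorithm and version next to the hash lets stored fingerprints be interpreted correctly if the hashing scheme changes later.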