Skip to content

Configuration

Every migration run is driven by a single TOML file:

Terminal window
pgferry migrate migration.toml
pgferry validate migration.toml

If you want the fastest safe starting point, use the wizard first:

Terminal window
pgferry wizard

Shorthand: pgferry run is the same as migrate, and generate / init are aliases for wizard — use whichever muscle memory you have.

The wizard keeps the default path short, then offers an optional advanced section for validation, resume, chunk_size, and index_workers. If you skip that section, the generated TOML stays on the minimal safe/default path.

These are command-line flags, not TOML keys:

FlagDefaultMeaning
--log-formattextValues: text, json. json writes one machine-readable summary object to stdout when migrate finishes. Human progress and errors stay on stderr.
--log-levelverboseValues: verbose, table, schema. Controls row-copy progress detail. verbose includes per-chunk start/done logs. table emits one row-copy start/done pair per table. schema suppresses row-copy detail.
--quiet, -qfalseShorthand for --log-level table. Cannot be combined with an explicit --log-level.
schema = "app"
[source]
type = "mysql" # or "mariadb" / "sqlite" / "mssql"
dsn = "root:root@tcp(127.0.0.1:3306)/source_db"
[target]
dsn = "postgres://postgres:postgres@127.0.0.1:5432/target_db?sslmode=disable"

Any PostgreSQL sslmode is supported. sslmode=disable is just a local example.

If you do not want credentials committed into the TOML, pgferry can override the DSNs from environment variables after loading the file:

Terminal window
export PGFERRY_SOURCE_DSN='root:root@tcp(127.0.0.1:3306)/source_db'
export PGFERRY_TARGET_DSN='postgres://postgres:postgres@127.0.0.1:5432/target_db?sslmode=disable'
pgferry migrate migration.toml

Rules:

  • PGFERRY_SOURCE_DSN overrides source.dsn when set to a non-empty value.
  • PGFERRY_TARGET_DSN overrides target.dsn when set to a non-empty value.
  • Empty strings do not override the TOML.
  • Validation still runs on the effective DSNs after the override is applied.

That means a committed config can omit the secret values entirely:

schema = "app"
[source]
type = "mysql"
[target]

[source].type is still required so pgferry knows which source backend and validation rules to apply.

Use one of these before you start tuning smaller details.

schema = "app"
on_schema_exists = "error"
unlogged_tables = false
resume = true
validation = "row_count"
chunk_size = 100000

Why: this keeps target data durable, enables checkpoint reuse, and gives you a row-count sanity check before cutover.

schema = "app"
on_schema_exists = "recreate"
unlogged_tables = true
clean_orphans = true

Why: this is the fastest full-load path when the target schema can be dropped and rebuilt.

KeyTypeDefaultMeaning
schemastringrequiredTarget PostgreSQL schema name.
table_filter_modestring"exact"Table-filter matching mode. "exact" preserves current exact-name behavior. "glob" enables case-insensitive * and ? patterns in include_tables and exclude_tables. Character classes like [] are intentionally not supported.
column_filter_modestring"exact"Column-filter matching mode. "exact" matches source column names exactly, case-insensitively. "glob" enables case-insensitive * and ? patterns in exclude_columns.
include_tablesarray of strings[]If non-empty, only migrate these source tables. Entries are exact names by default, or glob patterns when table_filter_mode = "glob".
exclude_tablesarray of strings[]Source tables to skip. Applied after include_tables, so excludes win when both match the same table.
exclude_columnsarray of strings[]Source columns to skip. Use ColumnName to skip matching columns in any table, or TableName.ColumnName to scope a rule to one source table. Entries are exact by default, or glob patterns in either the table or column part when column_filter_mode = "glob".
[table_renames]table of strings{}Explicit target table names keyed by source table name. Values are final PostgreSQL table names and are applied after table filters.
table_collision_modestring"error""error" reports generated PostgreSQL identifier collisions. "auto" automatically renames truncation-only table collisions within the target schema using deterministic hashed names.
[column_renames]table of strings{}Explicit target column names keyed by source TableName.ColumnName. Values are final PostgreSQL column names and are applied after table and column filters.
column_collision_modestring"error""error" reports generated PostgreSQL identifier collisions. "auto" automatically renames truncation-only column collisions within a table using deterministic hashed names.
on_schema_existsstring"error""error" aborts if the schema exists. "recreate" drops and recreates it. "use" requires the schema to already exist and be empty, then pgferry creates objects inside it.
schema_onlyboolfalseCreate schema objects only. Skip data COPY.
data_onlyboolfalseLoad data into an existing schema, then advance existing sequences with setval. Requires the target role to disable and re-enable triggers on the selected target tables during the load.
truncate_before_copybool or stringfalsetrue runs TRUNCATE TABLE ... on the selected target tables after before_data hooks and before COPY. "once" discovers all target tables in truncate_before_copy_schemas and runs one TRUNCATE TABLE ... CASCADE statement, which is useful before a multi-schema data_only batch with cross-schema foreign keys. In full migrations, target tables are normally freshly created, so this is usually a no-op.
truncate_before_copy_schemasarray of strings[schema] when "once"Target PostgreSQL schemas used only when truncate_before_copy = "once". Set this to every schema in the multi-schema batch, and enable "once" only for the first pgferry config in that batch. Each listed schema must exist and contain at least one ordinary target table.
source_snapshot_modestring"none""none" is fastest. "single_tx" gives one consistent source snapshot on MySQL, MariaDB, and MSSQL.
identifier_casestring"snake"How source identifiers map to PostgreSQL names. "snake" converts OrderItemsorder_items. "lower" lowercases only (OrderItemsorderitems). "preserve" keeps the source casing unchanged (OrderItemsOrderItems); PostgreSQL DDL is always quoted so mixed-case names are safe.
unlogged_tablesbooltrueUse UNLOGGED tables during full loads, then SET LOGGED later.
preserve_defaultsbooltrueKeep source column defaults in the created PostgreSQL schema.
add_unsigned_checksboolfalseAdd CHECK constraints for MySQL-family unsigned ranges.
clean_orphansbooltrueAutomatically delete or null invalid child rows before FK creation.
clean_orphans_modestring"apply""apply" fixes rows before FKs. "report" logs what would happen and stops before FK creation (dry-run style).
clean_orphans_max_rowsint0Safety cap: abort if orphan rows to fix would exceed this total. 0 means no cap.
replicate_on_update_current_timestampboolfalseCreate PostgreSQL trigger emulation for MySQL-family ON UPDATE CURRENT_TIMESTAMP.
workersintmin(runtime.NumCPU(), 8)Parallel worker count for data loading. SQLite is internally capped at 1.
index_workersintworkersConcurrent index builds during post-migration.
chunk_sizeint100000Key-range width for chunkable single-column numeric PK tables. Actual rows per chunk vary with key density.
copy_risk_analysisbooltrueLightweight COUNT/MIN/MAX probes to flag awkward COPY chunking (huge tables, sparse keys, etc.). Feeds plan copy-risk output and optional migrate warnings. On long runs, emits periodic progress logs (current table and probe stage). Turn off for quieter runs when you already know the shape.
resumeboolfalseReuse pgferry_checkpoint.json after interruptions.
validationstring"none""row_count" compares per-table counts after load. "sampled_hash" adds bounded content fingerprints for deterministic primary-key-addressable rows. pgferry validate reuses this setting without rerunning migration stages.

Table filters always match source table names, not the transformed PostgreSQL names. In glob mode the matching is case-insensitive, mirroring exact-mode normalization. If a filter entry matches no source table, pgferry fails early instead of silently migrating the wrong scope.

Column filters also match source names, not transformed PostgreSQL names. When a column is excluded, pgferry omits it from CREATE TABLE, source SELECT, and target COPY. Primary keys, plain-column indexes, and foreign keys that reference excluded columns are skipped. pgferry cannot inspect arbitrary expression index definitions for excluded column references; expression indexes are already reported as unsupported and skipped separately. If a filter entry matches no source column in the migrated schema, pgferry fails early.

Table renames override the target PostgreSQL name for a source table without changing source reads. Keys are source table names matched case-insensitively after table filters; schema-qualified keys such as dbo.Orders are not supported. Values are not passed through identifier_case; they are used as the final quoted PostgreSQL names and must fit PostgreSQL’s 63-byte identifier limit. Foreign keys are updated to reference renamed target tables.

[table_renames]
KP_SUMINA = "reserve_summary"

Column renames override the target PostgreSQL name for a source column without changing source reads. Keys must be source-qualified as TableName.ColumnName, matched case-insensitively after table and column filters; schema-qualified keys such as dbo.Table.Column are not supported. Values are not passed through identifier_case; they are used as the final quoted PostgreSQL names and must fit PostgreSQL’s 63-byte identifier limit. Primary keys, plain-column indexes, and foreign keys are updated to reference the renamed target columns.

[column_renames]
"KP_SUMINA.% ставка резерва по категории качества" = "reserve_rate_quality"
"KP_SUMINA.% ставка резерва по категории качества КД" = "reserve_rate_quality_kd"

For large schemas with many long source table or column names, the automatic collision modes can resolve PostgreSQL truncation collisions without listing every object manually:

table_collision_mode = "auto"
column_collision_mode = "auto"

Automatic collision renames are applied after explicit [table_renames] and [column_renames], so explicit mappings take precedence. pgferry only auto-renames table groups or column groups whose generated names are distinct before PostgreSQL’s 63-byte truncation but collide after that limit. Exact generated-name collisions, index/constraint/type collisions, and ambiguous references still fail and require explicit renames or manual handling. Each generated target name uses a UTF-8-safe prefix plus a stable hash and is logged during planning, validation, and migration.

Changing table_filter_mode, include_tables, exclude_tables, table_renames, table_collision_mode, column_filter_mode, exclude_columns, column_renames, or column_collision_mode can change the selected migration scope or target schema. When resume = true, that can invalidate the checkpoint fingerprint and force a fresh run.

truncate_before_copy = true follows the selected table scope after filters. Excluded tables are not named in pgferry’s generated TRUNCATE TABLE ... statement. pgferry intentionally omits CASCADE so PostgreSQL will fail rather than silently truncating dependent tables outside the selected scope.

Use truncate_before_copy = "once" for multi-schema data_only cutovers where cross-schema foreign keys make per-schema TRUNCATE fail. Configure truncate_before_copy_schemas with every target schema in the batch, enable "once" only on the first schema config, and leave later schema configs at false. pgferry discovers ordinary target tables in those schemas and runs a single TRUNCATE TABLE ... CASCADE before COPY, so the cascade is resolved while all configured schemas are still empty. truncate_before_copy_schemas is rejected with false or true modes to catch stale or misplaced config.

KeyTypeDefaultNotes
typestringrequired"mysql", "mariadb", "sqlite", or "mssql".
dsnstringrequired unless PGFERRY_SOURCE_DSN is setSource connection string or SQLite file path/URI.
charsetstring"utf8mb4"MySQL and MariaDB only. Injected into the DSN unless already present.
source_schemastring"dbo"MSSQL only. Limits introspection to one source schema.
KeyTypeDefaultNotes
dsnstringrequired unless PGFERRY_TARGET_DSN is setPostgreSQL connection string. If it does not set pool_max_conns, pgferry raises target pool MaxConns to at least max(workers, index_workers). If pool_max_conns is set explicitly, pgferry preserves it and warns when it is lower than that concurrency.

Use Type Mapping for the full behavior tables. These are the knobs you can set directly:

[type_mapping]
tinyint1_as_boolean = false
binary16_as_uuid = false
datetime_as_timestamptz = false
varchar_as_text = false
json_as_jsonb = true
widen_unsigned_integers = true
sanitize_json_null_bytes = true
unknown_as_text = false
enum_mode = "check"
set_mode = "text"
bit_mode = "bytea"
string_uuid_as_uuid = false
binary16_uuid_mode = "rfc4122"
time_mode = "time"
zero_date_mode = "null"
spatial_mode = "off"
nvarchar_as_text = false
money_as_numeric = true
xml_as_text = false
collation_mode = "none"
ci_as_citext = false

Optional MySQL-family collation remapping:

[type_mapping.collation_map]
utf8mb4_general_ci = "und-x-icu"
utf8mb4_unicode_ci = "und-x-icu"
KeyTypeDefaultNotes
enabledboolfalseMySQL only. Maps spatial columns to PostgreSQL geometry. MariaDB should use type_mapping.spatial_mode fallback modes instead.
create_extensionboolfalseWhen true, pgferry runs CREATE EXTENSION IF NOT EXISTS postgis.
[hooks]
before_data = []
after_data = []
before_fk = []
after_all = []

Hook file paths are resolved relative to the config file directory. See Hooks for phase details and templating behavior.

pgferry opens SQLite in read-only mode and requires a real file.

FormatExampleResult
Plain path/data/app.dbNormalized to file:/data/app.db?mode=ro
Relative path./relative.dbNormalized to file:./relative.db?mode=ro
File URIfile:/data/app.db?cache=sharedExisting params kept, mode=ro appended

Not supported: :memory:, file::memory:, or mode=memory.

FormatExampleNotes
URLsqlserver://user:pass@host:1433?database=mydbRecommended.
URL with instancesqlserver://user:pass@host/instance?database=mydbNamed instances.
ADOserver=host;user id=user;password=pass;database=mydbLegacy format.

The database parameter is required because pgferry extracts the DB name for introspection queries.

SettingMySQLMariaDBSQLiteMSSQL
source_snapshot_mode = "single_tx"YesYesNoYes
Worker parallelismYesYesForced to 1Yes
source.charsetYesYesErrorError
source.source_schemaN/AN/AN/AYes
MySQL-family type optionsYesYesErrorError
datetime_as_timestamptz for date/time typesYesYesYes (DATETIME / TIMESTAMPtimestamptz)Yes
MSSQL-only type optionsErrorErrorErrorYes
[postgis]YesErrorErrorError
collation_mode / collation_map / ci_as_citextYesYesErrorError
CombinationResult
resume = true + on_schema_exists = "recreate"Invalid because the target schema would be dropped.
resume = true + on_schema_exists = "use"Invalid because a resumed run would re-enter a now non-empty schema.
resume = true + schema_only = trueInvalid because there is no data stage to resume.
resume = true + truncate_before_copy = true or "once"Invalid because a resumed run would truncate rows that the checkpoint may skip.
resume = true + unlogged_tables = trueInvalid because checkpointed progress could outlive crash-truncated tables.
schema_only = true + data_only = trueInvalid. Choose one or neither.
schema_only = true + truncate_before_copy = true or "once"Invalid because no data phase runs.
SQLite + source_snapshot_mode = "single_tx"Invalid. SQLite only supports none.
MariaDB + [postgis]Invalid. Use type_mapping.spatial_mode fallback modes instead.
  • Run pgferry plan migration.toml before the first real migration.
  • Use pgferry validate migration.toml when you need to rerun built-in validation later without rerunning schema creation or data copy.
  • Use Operator tuning when runtime is dominated by PostgreSQL settings, source pressure, or network distance rather than config shape alone.
  • Use unlogged_tables = false whenever you also need resume = true.
  • Use source_snapshot_mode = "single_tx" when the source stays live during the migration and you need one consistent read view.
  • Keep on_schema_exists = "error" for the first production dry runs so you do not destroy previous target state by mistake.
  • Use on_schema_exists = "use" only when infrastructure or policy pre-creates the schema shell and you want pgferry to own everything inside it from empty.
  • Use truncate_before_copy = true with data_only when the target schema was created by another migrator and contains seed rows that should be replaced before COPY.
  • Use truncate_before_copy = "once" on only the first config in a multi-schema data_only batch when cross-schema foreign keys require one pre-copy TRUNCATE ... CASCADE across all batch schemas.
  • Expect data_only to fail early on PostgreSQL roles that cannot run ALTER TABLE ... DISABLE/ENABLE TRIGGER ALL; rehearse that path with the real target role before cutover.
  • Prefer hooks for views, routines, cleanup SQL, or post-load validation queries that are specific to your application.