Skip to content

Configuration

Every migration run is driven by a single TOML file:

Terminal window
pgferry migrate migration.toml
pgferry validate migration.toml

If you want the fastest safe starting point, use the wizard first:

Terminal window
pgferry wizard

Shorthand: pgferry run is the same as migrate, and generate / init are aliases for wizard — use whichever muscle memory you have.

The wizard keeps the default path short, then offers an optional advanced section for validation, resume, chunk_size, and index_workers. If you skip that section, the generated TOML stays on the minimal safe/default path.

schema = "app"
[source]
type = "mysql" # or "mariadb" / "sqlite" / "mssql"
dsn = "root:root@tcp(127.0.0.1:3306)/source_db"
[target]
dsn = "postgres://postgres:postgres@127.0.0.1:5432/target_db?sslmode=disable"

Any PostgreSQL sslmode is supported. sslmode=disable is just a local example.

If you do not want credentials committed into the TOML, pgferry can override the DSNs from environment variables after loading the file:

Terminal window
export PGFERRY_SOURCE_DSN='root:root@tcp(127.0.0.1:3306)/source_db'
export PGFERRY_TARGET_DSN='postgres://postgres:postgres@127.0.0.1:5432/target_db?sslmode=disable'
pgferry migrate migration.toml

Rules:

  • PGFERRY_SOURCE_DSN overrides source.dsn when set to a non-empty value.
  • PGFERRY_TARGET_DSN overrides target.dsn when set to a non-empty value.
  • Empty strings do not override the TOML.
  • Validation still runs on the effective DSNs after the override is applied.

That means a committed config can omit the secret values entirely:

schema = "app"
[source]
type = "mysql"
[target]

[source].type is still required so pgferry knows which source backend and validation rules to apply.

Use one of these before you start tuning smaller details.

schema = "app"
on_schema_exists = "error"
unlogged_tables = false
resume = true
validation = "row_count"
chunk_size = 100000

Why: this keeps target data durable, enables checkpoint reuse, and gives you a row-count sanity check before cutover.

schema = "app"
on_schema_exists = "recreate"
unlogged_tables = true
clean_orphans = true

Why: this is the fastest full-load path when the target schema can be dropped and rebuilt.

KeyTypeDefaultMeaning
schemastringrequiredTarget PostgreSQL schema name.
table_filter_modestring"exact"Table-filter matching mode. "exact" preserves current exact-name behavior. "glob" enables case-insensitive * and ? patterns in include_tables and exclude_tables. Character classes like [] are intentionally not supported.
column_filter_modestring"exact"Column-filter matching mode. "exact" matches source column names exactly, case-insensitively. "glob" enables case-insensitive * and ? patterns in exclude_columns.
include_tablesarray of strings[]If non-empty, only migrate these source tables. Entries are exact names by default, or glob patterns when table_filter_mode = "glob".
exclude_tablesarray of strings[]Source tables to skip. Applied after include_tables, so excludes win when both match the same table.
exclude_columnsarray of strings[]Source columns to skip. Use ColumnName to skip matching columns in any table, or TableName.ColumnName to scope a rule to one source table. Entries are exact by default, or glob patterns in either the table or column part when column_filter_mode = "glob".
on_schema_existsstring"error""error" aborts if the schema exists. "recreate" drops and recreates it. "use" requires the schema to already exist and be empty, then pgferry creates objects inside it.
schema_onlyboolfalseCreate schema objects only. Skip data COPY.
data_onlyboolfalseLoad data into an existing schema, then reset sequences. Requires the target role to disable and re-enable triggers on the selected target tables during the load.
source_snapshot_modestring"none""none" is fastest. "single_tx" gives one consistent source snapshot on MySQL, MariaDB, and MSSQL.
identifier_casestring"snake"How source identifiers map to PostgreSQL names. "snake" converts OrderItemsorder_items. "lower" lowercases only (OrderItemsorderitems). "preserve" keeps the source casing unchanged (OrderItemsOrderItems); PostgreSQL DDL is always quoted so mixed-case names are safe.
unlogged_tablesbooltrueUse UNLOGGED tables during full loads, then SET LOGGED later.
preserve_defaultsbooltrueKeep source column defaults in the created PostgreSQL schema.
add_unsigned_checksboolfalseAdd CHECK constraints for MySQL-family unsigned ranges.
clean_orphansbooltrueAutomatically delete or null invalid child rows before FK creation.
clean_orphans_modestring"apply""apply" fixes rows before FKs. "report" logs what would happen and stops before FK creation (dry-run style).
clean_orphans_max_rowsint0Safety cap: abort if orphan rows to fix would exceed this total. 0 means no cap.
replicate_on_update_current_timestampboolfalseCreate PostgreSQL trigger emulation for MySQL-family ON UPDATE CURRENT_TIMESTAMP.
workersintmin(runtime.NumCPU(), 8)Parallel worker count for data loading. SQLite is internally capped at 1.
index_workersintworkersConcurrent index builds during post-migration.
chunk_sizeint100000Key-range width for chunkable single-column numeric PK tables. Actual rows per chunk vary with key density.
copy_risk_analysisbooltrueLightweight COUNT/MIN/MAX probes to flag awkward COPY chunking (huge tables, sparse keys, etc.). Feeds plan copy-risk output and optional migrate warnings. On long runs, emits periodic progress logs (current table and probe stage). Turn off for quieter runs when you already know the shape.
resumeboolfalseReuse pgferry_checkpoint.json after interruptions.
validationstring"none""row_count" compares per-table counts after load. "sampled_hash" adds bounded content fingerprints for deterministic primary-key-addressable rows. pgferry validate reuses this setting without rerunning migration stages.

Table filters always match source table names, not the transformed PostgreSQL names. In glob mode the matching is case-insensitive, mirroring exact-mode normalization. If a filter entry matches no source table, pgferry fails early instead of silently migrating the wrong scope.

Column filters also match source names, not transformed PostgreSQL names. When a column is excluded, pgferry omits it from CREATE TABLE, source SELECT, and target COPY. Primary keys, plain-column indexes, and foreign keys that reference excluded columns are skipped. pgferry cannot inspect arbitrary expression index definitions for excluded column references; expression indexes are already reported as unsupported and skipped separately. If a filter entry matches no source column in the migrated schema, pgferry fails early.

Changing table_filter_mode, include_tables, exclude_tables, column_filter_mode, or exclude_columns can change the selected migration scope. When resume = true, that can invalidate the checkpoint fingerprint and force a fresh run.

KeyTypeDefaultNotes
typestringrequired"mysql", "mariadb", "sqlite", or "mssql".
dsnstringrequired unless PGFERRY_SOURCE_DSN is setSource connection string or SQLite file path/URI.
charsetstring"utf8mb4"MySQL and MariaDB only. Injected into the DSN unless already present.
source_schemastring"dbo"MSSQL only. Limits introspection to one source schema.
KeyTypeDefaultNotes
dsnstringrequired unless PGFERRY_TARGET_DSN is setPostgreSQL connection string. If it does not set pool_max_conns, pgferry raises target pool MaxConns to at least max(workers, index_workers). If pool_max_conns is set explicitly, pgferry preserves it and warns when it is lower than that concurrency.

Use Type Mapping for the full behavior tables. These are the knobs you can set directly:

[type_mapping]
tinyint1_as_boolean = false
binary16_as_uuid = false
datetime_as_timestamptz = false
varchar_as_text = false
json_as_jsonb = true
widen_unsigned_integers = true
sanitize_json_null_bytes = true
unknown_as_text = false
enum_mode = "check"
set_mode = "text"
bit_mode = "bytea"
string_uuid_as_uuid = false
binary16_uuid_mode = "rfc4122"
time_mode = "time"
zero_date_mode = "null"
spatial_mode = "off"
nvarchar_as_text = false
money_as_numeric = true
xml_as_text = false
collation_mode = "none"
ci_as_citext = false

Optional MySQL-family collation remapping:

[type_mapping.collation_map]
utf8mb4_general_ci = "und-x-icu"
utf8mb4_unicode_ci = "und-x-icu"
KeyTypeDefaultNotes
enabledboolfalseMySQL only. Maps spatial columns to PostgreSQL geometry. MariaDB should use type_mapping.spatial_mode fallback modes instead.
create_extensionboolfalseWhen true, pgferry runs CREATE EXTENSION IF NOT EXISTS postgis.
[hooks]
before_data = []
after_data = []
before_fk = []
after_all = []

Hook file paths are resolved relative to the config file directory. See Hooks for phase details and templating behavior.

pgferry opens SQLite in read-only mode and requires a real file.

FormatExampleResult
Plain path/data/app.dbNormalized to file:/data/app.db?mode=ro
Relative path./relative.dbNormalized to file:./relative.db?mode=ro
File URIfile:/data/app.db?cache=sharedExisting params kept, mode=ro appended

Not supported: :memory:, file::memory:, or mode=memory.

FormatExampleNotes
URLsqlserver://user:pass@host:1433?database=mydbRecommended.
URL with instancesqlserver://user:pass@host/instance?database=mydbNamed instances.
ADOserver=host;user id=user;password=pass;database=mydbLegacy format.

The database parameter is required because pgferry extracts the DB name for introspection queries.

SettingMySQLMariaDBSQLiteMSSQL
source_snapshot_mode = "single_tx"YesYesNoYes
Worker parallelismYesYesForced to 1Yes
source.charsetYesYesErrorError
source.source_schemaN/AN/AN/AYes
MySQL-family type optionsYesYesErrorError
datetime_as_timestamptz for date/time typesYesYesYes (DATETIME / TIMESTAMPtimestamptz)Yes
MSSQL-only type optionsErrorErrorErrorYes
[postgis]YesErrorErrorError
collation_mode / collation_map / ci_as_citextYesYesErrorError
CombinationResult
resume = true + on_schema_exists = "recreate"Invalid because the target schema would be dropped.
resume = true + on_schema_exists = "use"Invalid because a resumed run would re-enter a now non-empty schema.
resume = true + schema_only = trueInvalid because there is no data stage to resume.
resume = true + unlogged_tables = trueInvalid because checkpointed progress could outlive crash-truncated tables.
schema_only = true + data_only = trueInvalid. Choose one or neither.
SQLite + source_snapshot_mode = "single_tx"Invalid. SQLite only supports none.
MariaDB + [postgis]Invalid. Use type_mapping.spatial_mode fallback modes instead.
  • Run pgferry plan migration.toml before the first real migration.
  • Use pgferry validate migration.toml when you need to rerun built-in validation later without rerunning schema creation or data copy.
  • Use Operator tuning when runtime is dominated by PostgreSQL settings, source pressure, or network distance rather than config shape alone.
  • Use unlogged_tables = false whenever you also need resume = true.
  • Use source_snapshot_mode = "single_tx" when the source stays live during the migration and you need one consistent read view.
  • Keep on_schema_exists = "error" for the first production dry runs so you do not destroy previous target state by mistake.
  • Use on_schema_exists = "use" only when infrastructure or policy pre-creates the schema shell and you want pgferry to own everything inside it from empty.
  • Expect data_only to fail early on PostgreSQL roles that cannot run ALTER TABLE ... DISABLE/ENABLE TRIGGER ALL; rehearse that path with the real target role before cutover.
  • Prefer hooks for views, routines, cleanup SQL, or post-load validation queries that are specific to your application.