# `.cloning.yaml` Reference

A `.cloning.yaml` file describes how to transfer and anonymize one source database. It is safe to commit to version control because it contains only a connection *name*, never credentials.

Generate a starting file with [`cloning:dump`](03-cloning-dump.md), then review and adjust before running [`cloning:run`](04-cloning-run.md).

---

## File structure

```yaml
# yaml-language-server: $schema=https://clonio.dev/schema/cloning-v1.json
version: "1"
connection: <connection-name>

options:
  ...

tables:
  <table-name>:
    rows:
      ...
    columns:
      <column-name>:
        ...
```

| Field | Required | Description |
|---|:---:|---|
| `version` | yes | Schema version. Must be `"1"`. |
| `connection` | yes | Name of the source connection from `clonio.json`. |
| `options` | yes | Global transfer settings. |
| `tables` | yes | Map of table names to their transfer configuration. At least one table is required. |
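
As a rough illustration of the four required top-level fields, the smallest useful file might look like the sketch below. The connection name `staging-db` and the table name `countries` are hypothetical placeholders; the option values are the defaults listed in the Options section.

```yaml
# yaml-language-server: $schema=https://clonio.dev/schema/cloning-v1.json
version: "1"
connection: staging-db   # hypothetical connection name defined in clonio.json

options:
  chunk_size: 1000
  enforce_column_types: false
  drop_unknown_tables: false
  drop_extra_columns: false
  disable_foreign_key_checks: true
  faker_locale: en_US

tables:
  countries:             # hypothetical table without sensitive data
    rows:
      strategy: full     # copy every row; no columns listed, so values are kept as-is
```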

---

## Options

All fields in `options` must be present. `cloning:dump` writes all of them automatically.

```yaml
options:
  chunk_size: 1000
  enforce_column_types: false
  drop_unknown_tables: false
  drop_extra_columns: false
  disable_foreign_key_checks: true
  faker_locale: en_US
```

| Field | Type | Default | Description |
|---|---|---|---|
| `chunk_size` | integer ≥ 1 | `1000` | Rows fetched and inserted per batch. |
| `enforce_column_types` | boolean | `false` | Add columns that are present in the source but missing from an existing target table. |
| `drop_unknown_tables` | boolean | `false` | Drop target tables that do not exist in the source. |
| `drop_extra_columns` | boolean | `false` | Drop columns from existing target tables that do not exist in the source. ⚠ Irreversible. |
| `disable_foreign_key_checks` | boolean | `true` | Disable FK constraint checks on the target during transfer. |
| `faker_locale` | string | `en_US` | [FakerPHP locale](https://fakerphp.github.io/localization/). Examples: `de_DE`, `fr_FR`, `ja_JP`. |

> **⚠ Caution — `drop_extra_columns: true`:** Dropping columns is irreversible. Enable only on ephemeral environments.

---

## Tables

Each key under `tables` is the exact table name in the source database.

### `rows`

Controls which rows are transferred and whether the target is cleared first.

| Field | Required | Values | Description |
|---|:---:|---|---|
| `strategy` | yes | `full` \| `first` \| `last` | Row selection strategy. |
| `limit` | when `first`/`last` | integer ≥ 1 | Number of rows to transfer. |
| `sort_by` | no | column name | Column to order by. Defaults to the primary key. |
| `clear` | no | `false` \| `truncate` \| `delete` | Whether to empty the target table before inserting. Default: `false`. |

| Strategy | Behaviour |
|---|---|
| `full` | Copy all rows. |
| `first` | Copy the first `limit` rows (ascending by `sort_by`). |
| `last` | Copy the last `limit` rows (descending by `sort_by`). |

| Clear value | SQL issued | Notes |
|---|---|---|
| `false` | *(none)* | Rows are appended to existing target rows. |
| `truncate` | `TRUNCATE TABLE …` | Fastest. Falls back to `DELETE FROM` on SQLite. |
| `delete` | `DELETE FROM …` | Safer on targets with FK constraints enforced at the statement level. |
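
The three strategies and the `clear` values can be combined per table. The fragment below is a sketch only; the table names and the `occurred_at` column are hypothetical.

```yaml
tables:
  events:
    rows:
      strategy: last
      limit: 10000           # required because the strategy is last
      sort_by: occurred_at   # hypothetical timestamp column
      clear: truncate        # empty the target table before inserting
  countries:
    rows:
      strategy: full         # limit is not needed for full
      clear: delete
  tickets:
    rows:
      strategy: first
      limit: 50              # sort_by omitted: ordered by the primary key
      # clear omitted: defaults to false, so rows are appended
```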

### `columns`

Lists columns that need transformation. **Columns not listed are implicitly kept as-is.**
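
For example, assuming a hypothetical `customers` table with the columns `id`, `email`, `plan`, and `created_at`, only the column that needs anonymizing has to appear:

```yaml
tables:
  customers:
    rows:
      strategy: full
    columns:
      email:                 # the only column that is transformed
        strategy: fake
        faker_method: safeEmail
        faker_arguments: []
      # id, plan, and created_at are not listed, so they are copied unchanged
```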

---

## Column strategies

### `keep`

Copy the value unchanged. You rarely need to write this explicitly.

```yaml
id:
  strategy: keep
```

---

### `fake`

Replace with a realistic synthetic value generated by [FakerPHP](https://fakerphp.github.io/).

```yaml
email:
  strategy: fake
  faker_method: safeEmail
  faker_arguments: []
```

| Field | Required | Description |
|---|:---:|---|
| `faker_method` | yes | FakerPHP method name. See [Faker method reference](#faker-method-reference) below. |
| `faker_arguments` | yes | Positional arguments. Use `[]` when none are needed. |
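
When a method takes arguments, they are listed in order. The sketch below uses methods and argument shapes from the Faker method reference; the column names and the concrete values are made up for illustration.

```yaml
age:
  strategy: fake
  faker_method: numberBetween
  faker_arguments: [18, 90]                    # min, max
plan:
  strategy: fake
  faker_method: randomElement
  faker_arguments: [["free", "pro", "team"]]   # one argument: the list of choices
discount_code:
  strategy: fake
  faker_method: numerify
  faker_arguments: ["###-###"]                 # each # becomes a random digit
```

Note the nested list for `randomElement`: the outer list holds the positional arguments, and the inner list is the single argument passed to the method.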

---

### `hash`

Replace the value with a deterministic one-way hash. Given the same algorithm and salt, the same input always produces the same output, so hashed values still match across rows and tables without exposing the real values.

```yaml
password:
  strategy: hash
  algorithm: sha256
  salt: ""
```

| Field | Required | Values | Description |
|---|:---:|---|---|
| `algorithm` | yes | `sha256` \| `sha512` \| `md5` \| `sha1` | PHP `hash()` algorithm. |
| `salt` | yes | string | String prepended to the value before hashing. Use `""` for no salt. |
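
Because the output depends only on the algorithm, the salt, and the input, two columns that hold the same natural key should still join after hashing if both use identical settings. The following is a sketch under that assumption; the table names, column names, and salt value are hypothetical.

```yaml
tables:
  customers:
    rows:
      strategy: full
    columns:
      tax_id:
        strategy: hash
        algorithm: sha256
        salt: "example-salt"   # must match the salt used in invoices below
  invoices:
    rows:
      strategy: full
    columns:
      customer_tax_id:
        strategy: hash
        algorithm: sha256      # same algorithm and salt as customers.tax_id
        salt: "example-salt"
```

Changing the algorithm or the salt on only one side would produce different hashes and break the match.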

---

### `mask`

Keep the first `visible_chars` characters and replace the rest with `mask_char`.

```yaml
phone:
  strategy: mask
  visible_chars: 4
  mask_char: "*"
  preserve_format: false
```

| Field | Required | Description |
|---|:---:|---|
| `visible_chars` | yes | Number of leading characters to leave unmasked (`0` masks everything). |
| `mask_char` | yes | Single character for masked positions (e.g. `"*"` or `"X"`). |
| `preserve_format` | yes | When `true`, structural characters (such as `.`, `@`, `-`, `+`, and spaces) are preserved in their original positions. |

**Examples:**

| Input | `visible_chars` | `preserve_format` | Output |
|---|:---:|:---:|---|
| `alice@example.com` | `3` | `false` | `ali**************` |
| `alice@example.com` | `3` | `true` | `ali**@*******.***` |
| `+44 20 7946 0958` | `0` | `true` | `+** ** **** ****` |
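
The configuration behind the last two rows of the table would look roughly like this; the column names `email` and `phone` are assumed for illustration.

```yaml
email:
  strategy: mask
  visible_chars: 3
  mask_char: "*"
  preserve_format: true    # alice@example.com -> ali**@*******.***
phone:
  strategy: mask
  visible_chars: 0         # mask every non-structural character
  mask_char: "*"
  preserve_format: true    # +44 20 7946 0958 -> +** ** **** ****
```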

---

### `null`

Set the column value to `NULL`. Only valid on nullable columns.

```yaml
notes:
  strategy: "null"
```

Quote `"null"`: an unquoted `null` is parsed by YAML as a null scalar, not as the strategy name.

---

### `static`

Replace every value in the column with the fixed string given in `value`.

```yaml
environment_tag:
  strategy: static
  value: "dev-imported"
```

---

### `remapping`

Assign a new primary key value to each transferred row and rewrite all FK columns that reference it, preventing ID collisions on the target.

```yaml
id:
  strategy: remapping
  arguments:
    - use: random_integer
    - min: 100000
    - max: 9999999
    - foreign_keys:
        - table: orders
          column: user_id
        - table: employees
          column: manager_id
          self_referential: true
```

See [`cloning:run` — Key Remapping](04-cloning-run.md#key-remapping) for a detailed worked example.

---

## Faker method reference

The Faker locale is set globally via `options.faker_locale`.

### Personal identity

| Method | Example |
|---|---|
| `name` | Jane Smith |
| `firstName` | Alice |
| `lastName` | Johnson |
| `prefix` | Dr. |
| `suffix` | Jr. |
| `gender` | female |
| `title` | Ms. |

### Contact

| Method | Example |
|---|---|
| `safeEmail` | alice.johnson@example.com |
| `email` | alice@domain.tld |
| `freeEmail` | alice.johnson@gmail.com |
| `companyEmail` | alice@acme.com |
| `userName` | alice.j42 |
| `phoneNumber` | +1-555-0142 |
| `e164PhoneNumber` | +15550142000 |

### Location

| Method | Example |
|---|---|
| `address` | 123 Main St, Springfield, IL 62701 |
| `streetAddress` | 123 Main St |
| `city` | Springfield |
| `state` | Illinois |
| `stateAbbr` | IL |
| `postcode` | 62701 |
| `country` | United States |
| `countryCode` | US |
| `latitude` | 37.7749 |
| `longitude` | -122.4194 |

### Company and finance

| Method | Example |
|---|---|
| `company` | Acme Corp |
| `jobTitle` | Senior Developer |
| `iban` | DE89370400440532013000 |
| `creditCardNumber` | 4111111111111111 |
| `currencyCode` | EUR |

### Internet and technology

| Method | Example |
|---|---|
| `url` | https://example.com/page |
| `domainName` | example.com |
| `ipv4` | 192.168.1.100 |
| `ipv6` | 2001:0db8:85a3::8a2e:0370:7334 |
| `macAddress` | 00:1A:2B:3C:4D:5E |
| `uuid` | 550e8400-e29b-41d4-a716-446655440000 |

### Numbers and patterns

| Method | Arguments | Example |
|---|---|---|
| `randomNumber` | `[max_digits, strict]` | 42 |
| `numberBetween` | `[min, max]` | 18 |
| `randomFloat` | `[decimals, min, max]` | 3.14 |
| `numerify` | `["###-###"]` | 123-456 |
| `boolean` | `[chance_percent]` | true |
| `randomElement` | `[["a","b","c"]]` | b |

### Date and time

| Method | Arguments | Example |
|---|---|---|
| `date` | `["Y-m-d", "now"]` | 1985-03-15 |
| `time` | `["H:i:s"]` | 14:32:00 |
| `dateTime` | `["now"]` | 1985-03-15 14:32:00 |
| `dateTimeBetween` | `["-5 years", "now"]` | 2022-07-14 09:11:00 |
| `unixTime` | — | 1708000000 |
| `year` | — | 1985 |
| `timezone` | — | Europe/Berlin |
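
The Arguments column maps directly onto `faker_arguments`. In the sketch below the column names are hypothetical, and a method listed without arguments gets an empty list.

```yaml
last_login_at:
  strategy: fake
  faker_method: dateTimeBetween
  faker_arguments: ["-5 years", "now"]   # start, end
created_ts:
  strategy: fake
  faker_method: unixTime
  faker_arguments: []                    # no arguments listed for this method
```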

---

## Complete example

```yaml
# yaml-language-server: $schema=https://clonio.dev/schema/cloning-v1.json
version: "1"
connection: production-db

options:
  chunk_size: 500
  enforce_column_types: true
  drop_unknown_tables: false
  drop_extra_columns: false
  disable_foreign_key_checks: true
  faker_locale: de_DE

tables:
  users:
    rows:
      strategy: last
      limit: 5000
      sort_by: created_at
      clear: delete
    columns:
      id:
        strategy: remapping
        arguments:
          - use: random_integer
          - min: 100000
          - max: 9999999
          - foreign_keys:
              - table: orders
                column: user_id
      email:
        strategy: fake
        faker_method: safeEmail
        faker_arguments: []
      first_name:
        strategy: fake
        faker_method: firstName
        faker_arguments: []
      last_name:
        strategy: fake
        faker_method: lastName
        faker_arguments: []
      phone:
        strategy: mask
        visible_chars: 0
        mask_char: "*"
        preserve_format: true
      date_of_birth:
        strategy: fake
        faker_method: date
        faker_arguments: ["Y-m-d"]
      password:
        strategy: hash
        algorithm: sha256
        salt: "clonio"
      internal_notes:
        strategy: "null"
      account_tag:
        strategy: static
        value: "dev-imported"

  orders:
    rows:
      strategy: full
      clear: delete
    columns:
      id:
        strategy: remapping
        arguments:
          - use: random_integer
          - min: 100000
          - max: 9999999
          - foreign_keys:
              - table: order_items
                column: order_id
      shipping_address:
        strategy: fake
        faker_method: address
        faker_arguments: []

  order_items:
    rows:
      strategy: full
    # no PII — columns kept as-is

  audit_logs:
    rows:
      strategy: first
      limit: 100
      sort_by: created_at
```
