Automate Exports with PGToTxt: Scripts, Tips, and Best Practices

Overview

PGToTxt is a tool (and general approach) for exporting PostgreSQL table data into plain-text formats (CSV, TSV, fixed-width, or custom-delimited). Automating these exports saves time, ensures reproducibility, and enables downstream processing (ETL, reporting, and backups).

Key Features to Automate

  • Scheduled exports (cron, systemd timers)
  • Format options: CSV, TSV, pipe-delimited, fixed-width
  • Column selection and ordering
  • Incremental exports using timestamps or high-water marks
  • Compression (gzip) and rotation
  • Secure credential handling (environment variables, .pgpass)

Example Bash Script (cron-friendly)

```bash
#!/usr/bin/env bash
set -euo pipefail

# Config
PGHOST=localhost
PGPORT=5432
PGUSER=export_user
PGDATABASE=mydb
EXPORT_DIR=/var/exports/pgtotxt
TABLE=my_table
COLUMNS="id,name,created_at,amount"
WHERE="created_at >= NOW() - INTERVAL '1 day'"
OUTFILE="${EXPORT_DIR}/${TABLE}_$(date +%F).csv.gz"

mkdir -p "$EXPORT_DIR"

psql "host=$PGHOST port=$PGPORT user=$PGUSER dbname=$PGDATABASE" \
  -c "\copy (SELECT $COLUMNS FROM $TABLE WHERE $WHERE) TO STDOUT WITH CSV HEADER" \
  | gzip > "$OUTFILE"
```

Scheduling

  • Use cron: add 0 2 * * * /path/to/export.sh for daily 02:00 runs.
  • For better reliability, prefer systemd timers (restart policies, logging) in production.
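A systemd timer can replace the cron entry above while adding journald logging and catch-up runs after downtime. A minimal sketch, assuming a unit name of pgtotxt-export and the /path/to/export.sh script from the cron example; the unit files are written under /tmp here for demonstration and would go in /etc/systemd/system in production:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo directory; real units belong in /etc/systemd/system.
UNIT_DIR="${UNIT_DIR:-/tmp/systemd-demo}"
mkdir -p "$UNIT_DIR"

# One-shot service: runs the export script once per activation.
cat > "$UNIT_DIR/pgtotxt-export.service" <<'EOF'
[Unit]
Description=Daily PGToTxt export

[Service]
Type=oneshot
ExecStart=/path/to/export.sh
EOF

# Timer: fires daily at 02:00; Persistent=true runs a missed job after downtime.
cat > "$UNIT_DIR/pgtotxt-export.timer" <<'EOF'
[Unit]
Description=Run PGToTxt export daily at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

# Deploy with: systemctl daemon-reload && systemctl enable --now pgtotxt-export.timer
```

Logs then land in journald (`journalctl -u pgtotxt-export.service`), and failures are visible via `systemctl list-timers` and the service's exit status.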

Incremental Export Patterns

  • High-water mark column (e.g., updated_at): store last max(updated_at) in a state file and export rows > last_value.
  • Use transactional consistency: capture the new high-water mark and export the rows against the same snapshot (e.g., inside one transaction, or under REPEATABLE READ isolation) so rows committed mid-export are neither skipped nor double-counted.
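The high-water-mark pattern above can be sketched as follows. The state-file path, table name (my_table), and output path are assumptions for illustration; the database call is gated behind a RUN_DB_EXPORT flag so the state-handling logic is demonstrable without a live server:

```shell
#!/usr/bin/env bash
set -euo pipefail

STATE_FILE="${STATE_FILE:-/tmp/pgtotxt_hwm}"   # hypothetical state-file path

# 1. Read the last high-water mark; the epoch default exports everything on the first run.
last_value=$(cat "$STATE_FILE" 2>/dev/null || echo "1970-01-01 00:00:00")

if [ -n "${RUN_DB_EXPORT:-}" ]; then
  # 2. Capture the new mark FIRST, then export rows in (last_value, new_value]:
  #    rows committed mid-export stay above the mark and are picked up next run.
  new_value=$(psql -Atc "SELECT max(updated_at) FROM my_table")
  psql -c "\copy (SELECT * FROM my_table WHERE updated_at > '$last_value' AND updated_at <= '$new_value') TO STDOUT WITH CSV HEADER" \
    | gzip > /var/exports/pgtotxt/my_table_incremental.csv.gz
else
  new_value="2024-06-01 00:00:00"   # placeholder so the state update is demonstrable offline
fi

# 3. Persist the new mark atomically (temp file + mv), only after a successful export.
printf '%s' "$new_value" > "${STATE_FILE}.tmp"
mv "${STATE_FILE}.tmp" "$STATE_FILE"
```

Updating the state file only after the export succeeds means a failed run simply re-exports the same window next time, which is safe as long as downstream consumers tolerate duplicates.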

Security & Credentials

  • Prefer .pgpass with strict file permissions or environment variables injected by a secrets manager.
  • Avoid hardcoding passwords in scripts.
  • Restrict DB role to only SELECT privileges needed for export.
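For the .pgpass route, a short sketch of the setup; the file path here uses libpq's PGPASSFILE override so the example does not touch a real ~/.pgpass, and the credentials shown are placeholders:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Point libpq at a dedicated password file (defaults to ~/.pgpass when unset).
export PGPASSFILE="${PGPASSFILE:-/tmp/pgtotxt.pgpass}"   # hypothetical path

# Format: host:port:database:user:password, one connection per line.
printf 'localhost:5432:mydb:export_user:s3cret\n' > "$PGPASSFILE"   # placeholder password

# libpq silently ignores the file if permissions are looser than 0600.
chmod 600 "$PGPASSFILE"

# psql/pg_dump now authenticate without a password appearing in the script,
# the process list, or the shell history.
```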

Performance Tips

  • Export only needed columns and use indexes in WHERE clauses.
  • Use COPY/psql \copy to stream data efficiently.
  • For large tables, export by range (partition or id-range) and parallelize.
  • When an export is one step of a multi-statement pipeline, run the statements in a single transaction (e.g., psql --single-transaction) instead of relying on per-statement autocommit, so every step sees a consistent snapshot.
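The range-based parallel export can be sketched as below. The id bounds, chunk size, and manifest path are illustrative assumptions; the actual psql call is shown as a comment so the chunking logic runs standalone:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Split the table's id range into fixed-size chunks; each chunk is one export job.
MIN_ID=1
MAX_ID=1000000
CHUNK=250000
MANIFEST="${MANIFEST:-/tmp/pgtotxt_chunks.txt}"   # hypothetical manifest path

: > "$MANIFEST"
start=$MIN_ID
while [ "$start" -le "$MAX_ID" ]; do
  end=$(( start + CHUNK - 1 ))
  if [ "$end" -gt "$MAX_ID" ]; then end=$MAX_ID; fi
  # Real run, one background job per chunk:
  # psql -c "\copy (SELECT * FROM my_table WHERE id BETWEEN $start AND $end) TO STDOUT WITH CSV" > "part_${start}.csv" &
  echo "id BETWEEN $start AND $end" >> "$MANIFEST"
  start=$(( end + 1 ))
done
# wait   # real run: block until all background psql jobs finish, then concatenate the parts
```

Keep the degree of parallelism modest (a handful of jobs) so the export does not starve the database of I/O or connections.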

File Handling & Retention

  • Compress outputs (gzip, zstd) to save storage.
  • Use predictable filenames with timestamps.
  • Implement retention: delete files older than N days or move to cold storage.
  • Verify checksums (sha256) for integrity when transferring.
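The atomic-write, checksum, and retention steps combine naturally into one post-export routine. A sketch, with a demo directory and inline sample data standing in for the real psql output:

```shell
#!/usr/bin/env bash
set -euo pipefail

EXPORT_DIR="${EXPORT_DIR:-/tmp/pgtotxt_demo}"   # hypothetical demo directory
mkdir -p "$EXPORT_DIR"
OUTFILE="$EXPORT_DIR/my_table_$(date +%F).csv.gz"

# Write to a temp file, then mv into place: readers never see a partial export.
printf 'id,name\n1,alpha\n' | gzip > "${OUTFILE}.tmp"   # stand-in for the psql | gzip pipeline
mv "${OUTFILE}.tmp" "$OUTFILE"

# Record a checksum alongside the export so transfers can be verified downstream.
sha256sum "$OUTFILE" > "${OUTFILE}.sha256"

# Retention: delete compressed exports older than 30 days.
find "$EXPORT_DIR" -name '*.csv.gz' -mtime +30 -delete
```

The receiving side then runs `sha256sum -c my_table_<date>.csv.gz.sha256` before trusting the file.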

Error Handling & Monitoring

  • Exit with nonzero status on failure and log stderr/stdout to rotating logs.
  • Emit a small success/failure JSON or status file for orchestrators.
  • Integrate alerting (email, Slack) for repeated failures.
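A status file for orchestrators can be emitted from an ERR trap, so failures are recorded even when the script aborts mid-pipeline. A sketch; the status-file path, job name, and row count are illustrative assumptions:

```shell
#!/usr/bin/env bash
set -euo pipefail

STATUS_FILE="${STATUS_FILE:-/tmp/pgtotxt_status.json}"   # hypothetical path polled by the orchestrator

write_status() {
  # One-line machine-readable summary: status, row count, UTC completion time.
  printf '{"job":"my_table_export","status":"%s","rows":%s,"finished_at":"%s"}\n' \
    "$1" "$2" "$(date -u +%FT%TZ)" > "$STATUS_FILE"
}

# Any command that exits nonzero below triggers a failure record before the script dies.
trap 'write_status failure 0' ERR

# ... the export pipeline (psql | gzip > outfile) would run here ...
rows_exported=1234   # placeholder; a real script counts rows (e.g., wc -l on the uncompressed CSV)

write_status success "$rows_exported"
```

An alerting job can then fire when the file reports failure, or when its finished_at timestamp goes stale.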

Best Practices Checklist

  • Automation: schedule with cron/systemd; add retries/backoff.
  • Security: avoid embedded secrets; minimal privileges.
  • Reliability: atomic writes (temp file + mv), checksums, retries.
  • Performance: use COPY, limit columns, parallelize large exports.
  • Observability: logs, metrics, alerts, and exported-file manifests.
