Troubleshooting and Optimizing Btrieve Pervasive Data Definition File Makers

This article covers common problems with Data Definition File (DDF) makers for Btrieve/Pervasive (Pervasive.SQL) and practical steps to diagnose, fix, and optimize both the DDF generation process and the resulting DDFs so your applications and queries run reliably and efficiently.

1. Quick overview: what a DDF maker does

A DDF maker reads raw Btrieve file structures (fixed/variable key definitions, record layouts) and produces the three DDF files — FILE.DDF, FIELD.DDF, and INDEX.DDF — needed to expose Btrieve data via Pervasive.SQL and ODBC/JDBC. Problems arise from incomplete metadata, ambiguous key definitions, incorrect data types, or mismatches between the actual file contents and the generated DDFs.

2. Common problems and immediate checks

  • Corrupt or inaccessible Btrieve files
    • Verify file system integrity and access permissions.
    • Use file-copy tools that preserve record boundaries; avoid editors that may alter binary structure.
  • Incorrect record/field lengths
    • Compare produced FIELD.DDF definitions with a binary dump of sample records.
    • Look for off‑by‑one length mistakes, wrong padding, or overlooked trailing fields.
  • Wrong data types (numeric vs. string, little/big‑endian)
    • Confirm numeric formats (signed/unsigned, integer vs. packed decimal, float) by sampling values and checking for gibberish.
  • Misdefined keys and duplicate or missing indexes
    • Ensure key start positions and lengths match the actual file layout.
    • Verify uniqueness flags on keys; a unique key marked non‑unique (or vice versa) breaks some queries.
  • Unexpected NULL/blank handling
    • Btrieve files often use sentinels or all‑FF/00 bytes for “missing” values — map those deterministically in the DDF.
  • Collation and character set mismatches
    • Ensure the DDF uses the correct code page/charset (ASCII vs. EBCDIC vs. UTF‑8). Wrong charset causes garbled strings and failed joins.
  • Permissions and locking when using generated DDFs
    • Check Pervasive engine settings and file-level locks; concurrent writes may require specific locking modes.
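To pin down the wrong-data-type and endianness cases above, a quick trick is to unpack the same sampled bytes under several candidate interpretations and see which one yields plausible values. A minimal sketch in Python (the sample bytes are a hypothetical captured field, not real data):

```python
import struct

# Four bytes captured from a suspect numeric field in a sample record
# (hypothetical values for illustration).
raw = bytes([0x10, 0x27, 0x00, 0x00])

# Interpret the same bytes under several candidate types.
candidates = {
    "int32 little-endian":   struct.unpack("<i", raw)[0],
    "int32 big-endian":      struct.unpack(">i", raw)[0],
    "uint32 little-endian":  struct.unpack("<I", raw)[0],
    "float32 little-endian": struct.unpack("<f", raw)[0],
}

for name, value in candidates.items():
    print(f"{name:22} -> {value}")

# The little-endian int32 reading is 10000, a plausible quantity;
# the big-endian reading is 270991360, which looks like gibberish.
```

Repeating this across several sample records quickly shows which interpretation is consistent; the correct type reads sensibly in every record, not just one.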

3. Diagnostic workflow (step-by-step)

  1. Back up the original Btrieve files.
  2. Extract a representative sample set of records (start/middle/end) for inspection.
  3. Use a hex editor or binary viewer to map field offsets and lengths.
  4. Compare those offsets to the FIELD.DDF output; note any mismatches.
  5. Inspect INDEX.DDF for key definitions; test simple index lookups with ISQL or ODBC.
  6. Run representative queries (SELECT, JOIN, ORDER BY) and capture any SQL errors or incorrect results.
  7. If results are wrong, isolate whether the issue is type conversion, indexing, or data corruption by testing:
    • Reading raw records directly with a small test program
    • Querying single fields via ODBC
    • Scanning index ranges
  8. Iterate on DDF edits, reloading them in the Pervasive engine after each change.
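Steps 3 and 4 of the workflow can be scripted once the layout is mapped: parse sample records with Python's struct module and compare the decoded values against what the DDF-backed table returns. The layout below is a hypothetical 16-byte record, purely for illustration:

```python
import struct

# Hypothetical fixed-length record layout inferred from a hex dump.
# Field names, formats, and offsets are assumptions for illustration;
# take real values from your own inspection.
LAYOUT = {
    "cust_id": ("<I", 0),   # uint32 at offset 0
    "name":    ("8s", 4),   # 8-byte blank-padded string at offset 4
    "balance": ("<i", 12),  # int32 (e.g. cents) at offset 12
}
RECORD_LEN = 16

def parse_record(rec: bytes) -> dict:
    """Decode one fixed-length record according to LAYOUT."""
    if len(rec) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} bytes, got {len(rec)}")
    out = {}
    for field, (fmt, off) in LAYOUT.items():
        (value,) = struct.unpack_from(fmt, rec, off)
        if isinstance(value, bytes):
            value = value.rstrip(b" \x00").decode("ascii", "replace")
        out[field] = value
    return out

# A synthetic record built with the same layout, for a self-check.
sample = struct.pack("<I8si", 42, b"ACME    ", 1299)
print(parse_record(sample))  # {'cust_id': 42, 'name': 'ACME', 'balance': 1299}
```

If the decoded values disagree with what ISQL/ODBC returns for the same record, the FIELD.DDF offsets or types are the prime suspects.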

4. How to fix specific issues

  • Field offsets/lengths wrong
    • Recalculate offsets from the record layout and update FIELD.DDF; regenerate dependent INDEX.DDF entries.
  • Numeric display or arithmetic errors
    • Change the FIELD.DDF data type to the correct numeric format, and set the appropriate precision/scale.
  • Keys returning wrong rows
• Correct the key position and length; ensure pad characters (trailing blanks or nulls) are handled consistently in key comparisons.
  • Character garbling
    • Set the correct code page in the engine or convert the data to a consistent encoding before generating DDFs.
  • NULL/missing data treated as real values
    • Use application-side translation or define sentinel mappings in the DDF where supported.
  • Performance issues with generated DDFs
    • Reevaluate key definitions, add composite keys where queries need them, and remove unused indexes that slow inserts/updates.
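For the NULL/sentinel fix above, a deterministic application-side translation can be as simple as the following sketch; the all-0xFF and all-0x00 patterns are assumptions, so confirm what your files actually use:

```python
# Sentinel byte values assumed to mean "missing"; verify against your data.
SENTINEL_BYTES = (0xFF, 0x00)

def field_or_none(raw: bytes):
    """Return None if the field consists entirely of one sentinel byte,
    otherwise return the raw bytes unchanged."""
    if raw and any(all(b == s for b in raw) for s in SENTINEL_BYTES):
        return None
    return raw

print(field_or_none(b"\xff\xff\xff\xff"))  # None
print(field_or_none(b"\x00\x00"))          # None
print(field_or_none(b"AB"))                # b'AB'
```

The key property is determinism: every consumer of the data must apply the same mapping, otherwise joins and aggregates will disagree.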

5. Performance optimization checklist

  • Keep DDFs minimal: only define fields and indexes that applications require.
  • Match index types to query patterns:
    • Use single-column indexes for frequent equality lookups.
    • Use composite indexes for common multi-column filtering and ORDER BY requirements.
  • Avoid wide or variable-length fields in leading key positions.
  • Align field offsets to natural boundaries for numeric types to prevent misaligned reads.
  • Use appropriate data types (packed decimal for financials, integers for counters) to reduce storage and speed comparisons.
  • Rebuild or compact Btrieve files if fragmentation or deleted-record bloat is suspected.
  • Monitor statistics and query plans (where available) to find slow scans vs. indexed seeks.
  • For read-heavy reporting workloads, consider shadow copies or read-only replicas to offload the primary files.
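On the data-type point above, packed decimal is easy to get wrong, and a small decoder makes the format explicit. The nibble layout assumed here (two digits per byte, sign in the low nibble of the final byte, 0xD meaning negative) is one common COMP-3 style convention; verify it against your own files:

```python
from decimal import Decimal

def unpack_bcd(raw: bytes, scale: int = 2) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, with the low
    nibble of the final byte holding the sign (0xD negative, else positive)."""
    digits = []
    sign = 1
    for i, b in enumerate(raw):
        hi, lo = b >> 4, b & 0x0F
        digits.append(hi)
        if i == len(raw) - 1:
            sign = -1 if lo == 0x0D else 1
        else:
            digits.append(lo)
    value = int("".join(map(str, digits)) or "0")
    return Decimal(sign * value).scaleb(-scale)

print(unpack_bcd(bytes([0x01, 0x23, 0x4C])))  # 12.34
print(unpack_bcd(bytes([0x12, 0x3D])))        # -1.23
```

Decoding a handful of sample fields this way and comparing against known business values (invoice totals, balances) confirms both the nibble convention and the scale before you commit them to FIELD.DDF.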

6. Automation and tooling tips

  • Validate DDFs automatically by running a suite of test queries after generation.
  • Incorporate binary-structure parsers into the DDF maker to infer types, but always surface ambiguous fields for review.
  • Provide a “dry-run” mode in the maker that reports inferred offsets, types, and confidence levels.
  • Version-control generated DDF files and add checksums for the source Btrieve files to detect drift.
  • Use scripting to regenerate DDFs and reload them into the engine as part of a deployment pipeline.
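The checksum idea above takes only a few lines of Python; this sketch assumes you store the snapshot (for example as JSON) alongside the generated DDFs:

```python
import hashlib
import pathlib

def checksum(path) -> str:
    """SHA-256 of a file, read in chunks so large Btrieve files are cheap to hash."""
    h = hashlib.sha256()
    with pathlib.Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(paths) -> dict:
    """Checksums of the source .btr files, taken at DDF-generation time."""
    return {str(p): checksum(p) for p in paths}

def drifted(old: dict, new: dict) -> list:
    """Files whose contents changed since the snapshot was taken."""
    return [p for p in new if old.get(p) != new[p]]
```

Whenever drifted() reports changes, regenerate the DDFs, or at minimum rerun the validation query suite, before trusting query results.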

7. Migration considerations (long-term)

  • If DDF inconsistencies are frequent, evaluate migrating the Btrieve data to a modern RDBMS (PostgreSQL, MySQL, SQL Server). Migration eliminates ongoing DDF maintenance, but requires:
    • Mapping DDF schemas to relational types.
    • Migrating indexes and constraints.
    • Converting legacy encodings and numeric formats.
  • For phased migrations, keep accurate, updated DDFs for compatibility while you extract historical data.
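For the schema-mapping step, a small translation table keeps the conversion explicit and reviewable. The Btrieve-side type names and width rules below are illustrative assumptions; take the real types and sizes from your FIELD.DDF:

```python
def pg_type(src_type: str, size: int, scale: int = 0) -> str:
    """Map an illustrative Btrieve/DDF field type to a PostgreSQL column type."""
    mapping = {
        "string":  f"varchar({size})",
        "integer": "integer" if size <= 4 else "bigint",
        # Packed decimal stores two digits per byte, minus the sign nibble.
        "decimal": f"numeric({size * 2 - 1},{scale})",
        "float":   "double precision",
        "date":    "date",
    }
    # Fall back to bytea and flag the field for manual review.
    return mapping.get(src_type, f"bytea /* unmapped: {src_type} */")

print(pg_type("string", 30))     # varchar(30)
print(pg_type("decimal", 5, 2))  # numeric(9,2)
print(pg_type("integer", 8))     # bigint
```

Keeping the mapping as data rather than ad-hoc conversions means every unmapped or ambiguous type surfaces visibly instead of being silently coerced.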

8. Example: quick checklist to run when a DDF-generated table returns wrong data

  1. Backup files.
  2. Dump sample records.
  3. Confirm field offsets/lengths.
  4. Verify numeric/char types and code page.
  5. Test indexes with simple WHERE clauses.
  6. Fix DDF entries, reload, retest.

9. When to seek expert help

  • Consistent data corruption across many files.
  • Ambiguous packed/BCD numeric formats you cannot safely interpret.
  • Complex, nested variable-length records with repeating segments.
  • Performance problems that persist after index and DDF tuning.

10. Useful commands and tools (examples)

  • Use Pervasive ISQL or ODBC client to run quick queries.
  • Hex editor (HxD, 010 Editor) to inspect raw records.
  • Scripting languages (Python with struct module) to parse and validate record layouts.
  • DDF maker utilities with verbose/dry-run modes.
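The code-page garbling described earlier is easy to reproduce, and therefore easy to test for: decode the same raw bytes under the candidate encodings and compare. The sample bytes here are an illustrative Latin-1 string:

```python
# "Müller" encoded in Latin-1; decoding with the wrong code page garbles it.
raw = bytes([0x4D, 0xFC, 0x6C, 0x6C, 0x65, 0x72])

print(raw.decode("latin-1"))  # Müller  (correct code page)
print(raw.decode("cp437"))    # garbled under the wrong code page
```

Running a check like this over a sample of string fields, and counting how many decode cleanly under each candidate encoding, usually identifies the right code page without guesswork.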

Conclusion

Follow a systematic diagnostic workflow: inspect raw data, compare to generated DDFs, fix types/offsets/indexes, and iterate with tests. Optimize by tailoring indexes to query patterns, minimizing unnecessary fields/indexes, and automating validation. If problems persist or the maintenance burden is high, plan a migration to a modern RDBMS.

(date: 2026-02-06)
