Pipeline Scaffold
This page describes the pipeline registry scaffolds. Each section shows the folder layout a scaffold generates and explains the purpose of its key folders.
Python implementation scaffold
No Python scaffold found.
TypeScript implementation scaffold
What the folders mean
{pipeline}/{version}/{author}/{language}/{implementation}/_meta
- Holds all pipeline metadata at the implementation level.
- Files: `pipeline.json` (identifier, name, author, version, source/destination config, schedule, etc.), `README.md`, `CHANGELOG.md`, `LICENSE`, and `assets/` for logos and lineage diagrams.
- The `assets/` folder contains:
  - `from/` subdirectory for source system logos
  - `to/` subdirectory for destination system logos
  - Lineage diagrams (e.g., `lineage.mmd`, `lineage.svg`)
- Each implementation has its own `_meta` folder, allowing different implementations to have different configurations and schedules.
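
As a rough illustration of what `pipeline.json` might hold, the sketch below builds a metadata record and checks that the fields listed above are present. The key names (`identifier`, `source`, `destination`, `schedule.cron`, `schedule.timezone`), the helper `missing_keys`, and all values are assumptions made for this example; the scaffold's actual schema may differ.

```python
import json

# Hypothetical key names; the real pipeline.json schema may differ.
REQUIRED_KEYS = (
    "identifier",
    "name",
    "author",
    "version",
    "source",
    "destination",
    "schedule",
)

# Example metadata with made-up values.
pipeline_meta = {
    "identifier": "example-pipeline",
    "name": "Example Pipeline",
    "author": "example-author",
    "version": "v1",
    "source": {"type": "example-source"},
    "destination": {"type": "example-destination"},
    "schedule": {"cron": "0 * * * *", "timezone": "UTC"},
}


def missing_keys(meta: dict) -> list[str]:
    """Return the required keys that are absent from the metadata."""
    return [key for key in REQUIRED_KEYS if key not in meta]


if __name__ == "__main__":
    # Print the record as it could appear in _meta/pipeline.json.
    print(json.dumps(pipeline_meta, indent=2))
    print("missing:", missing_keys(pipeline_meta))
```

Because each implementation carries its own `_meta` folder, two implementations of the same pipeline could ship records like this with different `schedule` values.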
Language implementations under {pipeline}/{version}/{author}/{language}/{implementation}
- `python/{implementation}/` and `typescript/{implementation}/` contain helpers and runnable code.
- Prefer keeping docs, schemas, and code together inside each implementation:
  - `docs/` for human-facing guides (getting started, config, outputs)
  - `schemas/` at the top level of the implementation (not under `src/`) for machine-readable datasets/index
  - `src/` for code (with subfolders like `extract/`, `transform/`, `load/`)
  - `tests/` for unit tests
  - `scripts/` for automation like lineage generation
  - `lineage/` for lineage-specific schemas and manifests
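
The layout above can be sketched as a small script that creates the empty folders for one implementation. This is a minimal sketch, not the registry's actual scaffold generator: the function names `implementation_root` and `create_scaffold` and the example path segments are hypothetical, and only the folder names come from the description above.

```python
from pathlib import Path
import tempfile

# Folder names taken from the layout described above.
IMPLEMENTATION_DIRS = (
    "_meta/assets/from",
    "_meta/assets/to",
    "docs",
    "schemas",
    "src/extract",
    "src/transform",
    "src/load",
    "tests",
    "scripts",
    "lineage",
)


def implementation_root(base, pipeline, version, author, language, implementation) -> Path:
    """Build {pipeline}/{version}/{author}/{language}/{implementation} under base."""
    return Path(base) / pipeline / version / author / language / implementation


def create_scaffold(root: Path) -> list[str]:
    """Create the empty folders and return every directory under root, sorted."""
    for rel in IMPLEMENTATION_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


if __name__ == "__main__":
    # Use a temporary directory so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        root = implementation_root(
            tmp, "example-pipeline", "v1", "example-author", "python", "default"
        )
        for folder in create_scaffold(root):
            print(folder)
```

Running the script prints the created folders relative to the implementation root, including intermediate ones such as `_meta/assets` and `src`.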
Key Differences from Connectors
- Lineage tracking: pipelines include lineage diagrams and manifests.
- Source/destination config: pipeline metadata includes source and destination specifications.
- Transformation focus: more emphasis on transformation logic and data flow.
- Scheduling: built-in support for cron schedules and timezone configuration.
Notes
- The `_meta` folder is now at the implementation level, containing all metadata for that specific pipeline implementation.
- Documentation goes in the `docs/` folder within each implementation, not in the `_meta` folder.
- Place schemas at the top level of each language implementation in the `schemas/` folder (not under `src`).
- Lineage diagrams and definitions are stored within the implementation:
  - `lineage/` folder for lineage schemas and manifests
  - `moose/` folder for Moose-specific lineage manifests
  - `_meta/assets/` for generated lineage diagrams and system logos
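
The placement rules in these notes could be checked mechanically. The sketch below is one possible approach, assuming a hypothetical helper named `layout_problems`; it is not part of the registry tooling and covers only three of the rules (a `_meta` folder at the implementation level, no `schemas/` under `src/`, and no `docs/` under `_meta/`).

```python
from pathlib import Path
import tempfile


def layout_problems(impl_root: Path) -> list[str]:
    """Return human-readable descriptions of layout rule violations."""
    problems = []
    if not (impl_root / "_meta").is_dir():
        problems.append("missing _meta/ at the implementation level")
    if (impl_root / "src" / "schemas").exists():
        problems.append("schemas/ found under src/; move it to the top level")
    if (impl_root / "_meta" / "docs").exists():
        problems.append("docs/ found under _meta/; move it to the top level")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        impl = Path(tmp) / "default"
        # Deliberately misplace schemas/ and omit _meta/ to trigger two findings.
        (impl / "src" / "schemas").mkdir(parents=True)
        for problem in layout_problems(impl):
            print(problem)
```

A check like this could run from the implementation's `scripts/` folder or from a unit test in `tests/`.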