lighthouse Configuration | Peisong's Lighthouse

This document describes the YAML configuration schema consumed by the lighthouse CLI, including both the static-site pipeline and the Harbor platform runtime.

Config Loading
Top-Level Keys
GLOBAL
LIGHTHOUSE
HARBOR
EMAIL
CLOUDFLARE
REMOTE
RSS
RATINGS
SOURCES
Validation Rules
Command Notes

Config Loading

The CLI loads configuration from either:

a single YAML file
a config directory with command-specific files

Default config directory:

~/.lighthouse/config/

Use --config to point to a different file or directory.

The preferred layout is a config directory with separate files:

lighthouse.yml or lighthouse.yaml
harbor.yml or harbor.yaml

Those files should use the namespaced roots:

LIGHTHOUSE:
HARBOR:

Directory-loading behavior is command-specific:

static Lighthouse commands only scan lighthouse.yml|yaml
Harbor commands only scan harbor.yml|yaml

GLOBAL is reserved for future shared settings, but the current directory loaders do not scan global.yml.

For backward compatibility, explicit flat Lighthouse-only files still load when you point --config at that file directly.
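
As a rough illustration of the two file shapes described above, the sketch below contrasts a namespaced lighthouse.yml with a flat Lighthouse-only file. It assumes that "flat" means the LIGHTHOUSE keys sit at the top level with no LIGHTHOUSE: root. The file name legacy-config.yml is hypothetical, all values are placeholders, and only two keys are shown.

```yaml
# File 1: ~/.lighthouse/config/lighthouse.yml (preferred, namespaced root)
LIGHTHOUSE:
  LIGHTHOUSE_BRANCH: "main"            # placeholder value
  DEPLOY_ROOT: "/var/www/lighthouse"   # placeholder value
---
# File 2: legacy-config.yml (hypothetical name), loaded only when
# --config points at this file directly.
# Assumption: a flat file carries the same keys without the LIGHTHOUSE: root.
LIGHTHOUSE_BRANCH: "main"
DEPLOY_ROOT: "/var/www/lighthouse"
```

The --- separator only divides the two files for display; each would live in its own file on disk.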

Top-Level Keys

Supported root namespaces:

GLOBAL Reserved for shared operator settings.
LIGHTHOUSE Static-site build, deploy, rating, RSS, and source materialization config.
HARBOR Harbor build, runtime, auth, and app-hosting config.

GLOBAL

GLOBAL is intentionally sparse in v1. It exists so the config surface can stay unified while Lighthouse and Harbor keep separate schemas.

LIGHTHOUSE

LIGHTHOUSE uses the existing static-site schema. The keys below are required unless marked optional:

LIGHTHOUSE_CLONE_URL
LIGHTHOUSE_DIRECTORY
LIGHTHOUSE_BRANCH
BUILD_COMMAND
BUILD_OUTPUT_DIR
DEPLOY_ROOT
RETAIN_DEPLOYMENTS
EMAIL
CLOUDFLARE
REMOTE
RSS
RATINGS
SOURCES

The sections below still refer to the keys inside LIGHTHOUSE:.
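
To show how these keys might sit together, here is a minimal skeleton of a namespaced LIGHTHOUSE block. This is a sketch, not a validated config: every value is a placeholder, the Jekyll build command and output directory are assumptions, and whether a disabled block may omit its other keys is also an assumption.

```yaml
LIGHTHOUSE:
  LIGHTHOUSE_CLONE_URL: "https://example.com/you/site.git"   # placeholder
  LIGHTHOUSE_DIRECTORY: "/srv/lighthouse-site"               # placeholder path
  LIGHTHOUSE_BRANCH: "main"
  BUILD_COMMAND: "bundle exec jekyll build"                  # assumed command
  BUILD_OUTPUT_DIR: "_site"                                  # assumed output dir
  DEPLOY_ROOT: "/var/www/lighthouse"                         # placeholder path
  RETAIN_DEPLOYMENTS: 3        # validation requires a value >= 1
  EMAIL:
    ENABLED: false             # assumption: other keys may be omitted when disabled
  CLOUDFLARE:
    ENABLED: false
  REMOTE:
    ENABLED: false
  RSS:
    ENABLED: true
  RATINGS:
    ENABLED: false
  # SOURCES: entries are described in the SOURCES section
```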

HARBOR

Supported keys:

SOURCE_DIR
CLONE_URL
BRANCH
DEPLOY_ROOT
RETAIN_DEPLOYMENTS
RUNTIME_DIR
DATABASE_URL
BIND_HOST
BIND_PORT
PUBLIC_BASE_URL
OIDC
PERMISSIONS
MAINTENANCE
WORKSPACE
APPS

Key Harbor behaviors:

Harbor is server-local in v1
Harbor itself is sourced from exactly one mode:
- SOURCE_DIR
- or CLONE_URL + BRANCH
Harbor stores mutable platform state in Postgres via DATABASE_URL
docked apps are sourced from exactly one mode:
- SOURCE_DIR
- or CLONE_URL + BRANCH
docked apps are built and staged into Harbor releases
docked apps must load runtime settings from a config file
Harbor injects resolved identity, roles, and permissions when it proxies requests
Harbor owns the built-in /myself/ route; app slug myself is reserved and may not be used by docked apps

DATABASE_URL should point Harbor at a Postgres schema it owns for:

users
direct permissions
user preferences
maintenance state
Harbor API keys

When Harbor itself uses CLONE_URL + BRANCH, the CLI clones or updates that checkout inside the Harbor repo cache before building the Harbor runtime.
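
The source-mode rule can be easier to see in a config fragment. The sketch below uses the CLONE_URL + BRANCH mode for Harbor itself; paths, URLs, the port, and the value types are placeholders and assumptions rather than documented defaults.

```yaml
HARBOR:
  # Exactly one source mode: SOURCE_DIR, or CLONE_URL together with BRANCH.
  CLONE_URL: "https://example.com/you/harbor.git"    # placeholder
  BRANCH: "main"
  # SOURCE_DIR: "/srv/harbor-src"                    # alternative mode; do not set both
  DEPLOY_ROOT: "/var/lib/harbor/releases"            # placeholder path
  RETAIN_DEPLOYMENTS: 3
  RUNTIME_DIR: "/var/lib/harbor/runtime"             # placeholder path
  # Postgres database that Harbor owns for users, permissions, preferences,
  # maintenance state, and API keys.
  DATABASE_URL: "postgresql://harbor:replace-me@localhost:5432/harbor"
  BIND_HOST: "127.0.0.1"                             # placeholder
  BIND_PORT: 8080                                    # assumed to be an integer
  PUBLIC_BASE_URL: "https://harbor.example.com"      # placeholder
```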

OIDC supported keys:

ENABLED
MODE
ISSUER_PATH
CLIENT_ID
CLIENT_SECRET
SESSION_COOKIE_NAME
SESSION_SECRET
EMBEDDED_USERS

Only MODE: embedded is supported in v1.

Each EMBEDDED_USERS entry supports:

USERNAME
DISPLAY_USERNAME
PASSWORD
SUBJECT
NAME
EMAIL
ROLES

Harbor normalizes canonical usernames to lowercase and only allows:

a-z
0-9
_
-

The normalized canonical username is used for identity and route-safe slugs. DISPLAY_USERNAME is the human-facing label shown in Harbor and apps. If DISPLAY_USERNAME is omitted, Harbor displays the raw configured USERNAME value as it was written, before normalization.

EMBEDDED_USERS are bootstrap-only seed users, not the long-term mutable user store. Run lighthouse harbor migrate-db to create the Harbor schema and import either legacy state or embedded users into Postgres. After the Harbor database is initialized, Harbor ignores EMBEDDED_USERS for steady-state auth and reads mutable users, permissions, preferences, and maintenance state from Postgres. Once that database state exists, EMBEDDED_USERS may be removed from config.
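
A bootstrap OIDC block might look roughly like the fragment below. It is a sketch under assumptions: the issuer path, cookie name, and user details are placeholders, and the shape of ROLES (a list of role names) is assumed rather than documented here.

```yaml
HARBOR:
  OIDC:
    ENABLED: true
    MODE: "embedded"                       # the only mode supported in v1
    ISSUER_PATH: "/oidc"                   # placeholder
    CLIENT_ID: "replace-me"
    CLIENT_SECRET: "replace-me"
    SESSION_COOKIE_NAME: "harbor_session"  # placeholder
    SESSION_SECRET: "replace-me"
    EMBEDDED_USERS:
      - USERNAME: "Alice"                  # canonical username is lowercased to "alice"
        # DISPLAY_USERNAME is omitted, so the raw value "Alice" is shown as the label
        PASSWORD: "replace-me"
        SUBJECT: "replace-me"
        NAME: "Example User"
        EMAIL: "alice@example.com"
        ROLES: ["replace-me"]              # assumed shape: list of role names
```

After lighthouse harbor migrate-db has imported these users into Postgres, the EMBEDDED_USERS list can be removed from the file.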

Harbor DB-backed admin commands:

lighthouse harbor add-user
lighthouse harbor set-password
lighthouse harbor list-permissions
lighthouse harbor list-all-permissions
lighthouse harbor add-permissions
lighthouse harbor delete-permissions
lighthouse harbor delete-user
lighthouse harbor migrate-db
lighthouse harbor maintenance on
lighthouse harbor maintenance off

Permission notes:

effective permissions are the union of role-derived permissions and direct user permissions
the direct admin permission acts as a wildcard and grants all app permissions
Harbor resolves principals in this order:
- API key bearer token
- Harbor browser session cookie
- guest
docked apps receive the derived Harbor identity via forwarded X-Harbor-* headers, including auth method, theme, timezone, and API key ID when applicable

MAINTENANCE remains in config as default bootstrap values only. Harbor writes the live maintenance state into Postgres during database initialization and then mutates the database-backed value through the Harbor CLI.

WORKSPACE supported keys:

ROOT_DIR
MAX_LEASES
DEFAULT_SIZE_MB

APPS is a list. Each app supports:

ID
NAME
DESCRIPTION
SLUG
SOURCE_DIR
CLONE_URL
BRANCH
BUILD_COMMAND
RUN_COMMAND
CONFIG_PATH
PUBLIC_BIND_ADDR
MANAGEMENT_SOCKET_PATH
GUEST_CAN_VIEW
READ_PERMISSIONS
WRITE_PERMISSIONS

Each app must specify exactly one source mode:

SOURCE_DIR
or CLONE_URL together with BRANCH

DESCRIPTION is optional. Harbor uses it on the landing page cards.

Whiteboard example:

SLUG: whiteboard
GUEST_CAN_VIEW: true
READ_PERMISSIONS: []
WRITE_PERMISSIONS: []
Whiteboard relies on Harbor principal resolution rather than app-specific Harbor permissions. Private server documents and MCP still require an authenticated Harbor principal inside Whiteboard.

BUILD_COMMAND and RUN_COMMAND support these placeholders:

{release_dir}
{app_dir}
{config_path}
{runtime_dir}
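
For illustration, a clone-backed app entry using those placeholders could look like the fragment below. The app, its commands, paths, and permission names are all hypothetical, and what each placeholder expands to is inferred from its name rather than documented here.

```yaml
HARBOR:
  APPS:
    - ID: "notes"                          # hypothetical app
      NAME: "Notes"
      DESCRIPTION: "Shared notes"          # optional; used on landing page cards
      SLUG: "notes"                        # "myself" is reserved and not allowed
      # Exactly one source mode: SOURCE_DIR, or CLONE_URL together with BRANCH.
      CLONE_URL: "https://example.com/you/notes.git"
      BRANCH: "main"
      BUILD_COMMAND: "make -C {app_dir} build"                      # hypothetical command
      RUN_COMMAND: "{release_dir}/bin/notes --config {config_path}" # hypothetical command
      CONFIG_PATH: "/etc/notes/config.yml"        # placeholder path
      PUBLIC_BIND_ADDR: "127.0.0.1:9001"          # placeholder; format is assumed
      MANAGEMENT_SOCKET_PATH: "/run/harbor/notes.sock"  # placeholder path
      GUEST_CAN_VIEW: false
      READ_PERMISSIONS: ["notes.read"]     # hypothetical permission names
      WRITE_PERMISSIONS: ["notes.write"]
```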

EMAIL

Supported keys:

ENABLED
FROM
USER
SMTP_ADDR
SMTP_PORT
PASSWD
DESTINATIONS
STARTTLS
IMPLICIT_TLS

If EMAIL.ENABLED is true, the SMTP fields above must be fully specified and DESTINATIONS must be non-empty.

Run-log delivery is controlled per invocation with --email-log. SMTP configuration alone does not automatically send mail on every run.
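
Example (a sketch with placeholder values; the value types, and the pairing of port 587 with STARTTLS, are assumptions based on common SMTP submission setups rather than documented defaults):

```yaml
EMAIL:
  ENABLED: true
  FROM: "lighthouse@example.com"
  USER: "lighthouse@example.com"
  SMTP_ADDR: "smtp.example.com"
  SMTP_PORT: 587
  PASSWD: "replace-me"
  DESTINATIONS:                  # must be non-empty when ENABLED is true
    - "ops@example.com"
  STARTTLS: true
  IMPLICIT_TLS: false
```

Even with this block enabled, a run only mails its log when invoked with --email-log.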

CLOUDFLARE

Supported keys:

ENABLED
ZONE_ID
CLOUDFLARE_API_KEY

When CLOUDFLARE.ENABLED is true, these fields become required:

ZONE_ID
CLOUDFLARE_API_KEY

Behavior:

runs only during apply
executes after the new release is activated
sends a Cloudflare purge_everything request for the configured zone
fails the command loudly if the Cloudflare API request fails or the API reports an unsuccessful purge

Example:

CLOUDFLARE:
  ENABLED: true
  ZONE_ID: "replace-me"
  CLOUDFLARE_API_KEY: "replace-me"

REMOTE

Supported keys:

ENABLED
ROLE
SSH_HOST
SSH_PORT
REMOTE_RUN_DIR
LOCAL_CACHE_DIR
REMOTE_CACHE_DIR
LIGHTHOUSE_CLI_CLONE_URL
LIGHTHOUSE_CLI_REPO_PATH

Roles:

local
remote

Validation rules:

ROLE is required when REMOTE.ENABLED is true
REMOTE_RUN_DIR is required when REMOTE.ENABLED is true
SSH_PORT must be an integer >= 1
SSH_HOST is required when ROLE is local
LIGHTHOUSE_CLI_CLONE_URL is required when ROLE is local
LIGHTHOUSE_CLI_REPO_PATH is required when ROLE is local

Defaults:

SSH_PORT: 22
LOCAL_CACHE_DIR: <run-dir>/remote-cache
REMOTE_CACHE_DIR: <REMOTE_RUN_DIR>/remote-cache

Behavior:

remote build is valid only for ROLE=local
remote send is valid only for ROLE=local
remote apply is valid for both roles
on ROLE=local, remote apply checks the remote cache and then SSHes into the target to invoke remote apply there, where it runs with ROLE=remote
before triggering remote deployment, remote apply on ROLE=local ensures the configured remote lighthouse-cli checkout exists (cloning or pulling it as needed), installs or upgrades the package, and syncs a derived remote-role config.yml
remote apply on ROLE=local also syncs portable content metadata (content-state.json) into the remote run dir
if local and remote content metadata differ, the CLI pauses and asks you to type local or remote
on ROLE=remote, remote apply deploys a tarball already staged in REMOTE_CACHE_DIR/incoming/
remote artifact deploys never rebuild from source

Example local-side config:

REMOTE:
  ENABLED: true
  ROLE: "local"
  SSH_HOST: "lighthouse@ubuntu-main"
  SSH_PORT: 22
  REMOTE_RUN_DIR: "/var/lib/lighthouse"
  LIGHTHOUSE_CLI_CLONE_URL: "https://git.peisongxiao.com/peisongxiao/lighthouse-cli.git"
  LIGHTHOUSE_CLI_REPO_PATH: "/srv/lighthouse-cli"

Example remote-side config:

REMOTE:
  ENABLED: true
  ROLE: "remote"
  REMOTE_RUN_DIR: "/var/lib/lighthouse"

RSS

Supported keys:

ENABLED
RSS_FEED_RATING_THRESHOLD
UPDATED_DIFF_RATING_THRESHOLD

Defaults:

ENABLED: true
RSS_FEED_RATING_THRESHOLD: 7.0
UPDATED_DIFF_RATING_THRESHOLD: -1.0

Behavior:

RSS_FEED_RATING_THRESHOLD gates RSS inclusion by combined post rating
UPDATED_DIFF_RATING_THRESHOLD <= 0 disables the future diff-rating LLM path and treats updated posts mechanically
the CLI materializes RSS policy into machine-owned site data so the Jekyll site can render the final XML feeds

Example:

RSS:
  ENABLED: true
  RSS_FEED_RATING_THRESHOLD: 7.0
  UPDATED_DIFF_RATING_THRESHOLD: -1.0

RATINGS

Supported keys:

ENABLED
PROVIDER
OPENROUTER_API_KEY
RATINGS_MODEL
DEFAULT_SCORE
MAX_RETRIES
MAX_THREADS
HTTP_TIMEOUT_SECONDS
REASONING_EFFORT
PROMPT

Current provider support is intentionally narrow:

PROVIDER must be openrouter

When RATINGS.ENABLED is true, these fields become required:

OPENROUTER_API_KEY
RATINGS_MODEL
PROMPT

DEFAULT_SCORE must stay within [0.0, 5.0].

MAX_RETRIES must be an integer >= 1.

MAX_THREADS must be an integer >= 1.

HTTP_TIMEOUT_SECONDS must be an integer >= 1.

REASONING_EFFORT must be one of:

xhigh
high
medium
low
minimal
none

Rating generation behavior:

runs during build, apply, and local
never runs during validate
retries provider failures and invalid structured outputs up to MAX_RETRIES
runs fresh rating jobs through a bounded central worker pool sized by MAX_THREADS
uses HTTP_TIMEOUT_SECONDS for the OpenRouter HTTP read timeout
sends REASONING_EFFORT through OpenRouter’s reasoning.effort field
explicitly disables streaming for rating requests
clamps out-of-bounds scores into [0.0, 5.0] instead of retrying
falls back to DEFAULT_SCORE after the retry budget is exhausted

The prompt lives in config, not in the repo. The CLI appends the raw source document below that prompt at runtime.

Example:

RATINGS:
  ENABLED: true
  PROVIDER: "openrouter"
  OPENROUTER_API_KEY: "replace-me"
  RATINGS_MODEL: "openai/gpt-5-mini"
  DEFAULT_SCORE: 2.5
  MAX_RETRIES: 3
  MAX_THREADS: 1
  HTTP_TIMEOUT_SECONDS: 120
  REASONING_EFFORT: "low"
  PROMPT: |
    --- BEGIN TASK DESCRIPTION ---
    Rate the provided document for standalone long-term
    showcase value on a personal website.
    --- END TASK DESCRIPTION ---

    --- BEGIN SCORE EXPLANATION ---
    Use a 0.0 to 5.0 scale where 2.5 is neutral.
    --- END SCORE EXPLANATION ---

    --- BEGIN OUTPUT FORMAT ---
    Return JSON with score, reason, and signals.
    --- END OUTPUT FORMAT ---

SOURCES

Each source entry supports:

NAME
CLONE_URL
BRANCH
URL_PATH
POST_TAG
RECENT_POSTS

URL_PATH is optional. If omitted, the default is:

/projects/<normalized-name>/

Special cases can override this explicitly, for example:

/blogs/
/maestro/

URL_PATH is the source root navigator URL. Source roots cannot be nested inside one another: /prefix-1/ and /prefix-1/subprefix/ conflict, while /prefix/subprefix-1/ and /prefix/subprefix-2/ are valid siblings if /prefix/ is not itself configured as a source root.

POST_TAG is optional. It controls the badge shown on cards for materialized posts from that source. If omitted, the CLI derives the badge from NAME by replacing - and _ with spaces and uppercasing the result.

Examples:

POST_TAG: "BLOGS"

POST_TAG: "LANGUAGE DESIGN"

RECENT_POSTS is an optional list of glob patterns evaluated relative to the source repo root. Matching discovered source documents are marked for the homepage recent-posts section. Any **/ segment also matches the current directory at that level, so:

**/*.md includes both README.md and nested Markdown files
**/*.tex includes both README.tex and nested LaTeX files
thoughts/**/*.md includes both thoughts/post.md and deeper files

Examples:

RECENT_POSTS:
  - "**/*.md"
  - "**/*.tex"

RECENT_POSTS:
  - "thoughts/**/*.md"
  - "announcements/*.md"

If RECENT_POSTS is omitted or empty, that source repo does not contribute surfaced posts to the homepage recent-posts strip or to the surfaced-post ordering in navigator views.
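
Putting the per-source keys together, a SOURCES block might look like the sketch below. It assumes SOURCES is a list of mappings; the repository names and URLs are placeholders.

```yaml
SOURCES:
  - NAME: "language-design"
    CLONE_URL: "https://example.com/you/language-design.git"   # placeholder
    BRANCH: "main"
    # URL_PATH omitted: defaults to /projects/<normalized-name>/
    # POST_TAG omitted: badge is derived from NAME as "LANGUAGE DESIGN"
    RECENT_POSTS:
      - "**/*.md"                # README.md and nested Markdown files
  - NAME: "blogs"
    CLONE_URL: "https://example.com/you/blogs.git"              # placeholder
    BRANCH: "main"
    URL_PATH: "/blogs/"          # explicit override; not nested inside another root
    POST_TAG: "BLOGS"
    RECENT_POSTS:
      - "thoughts/**/*.md"
      - "announcements/*.md"
```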

Validation Rules

Validation is intentionally strict. The CLI will fail if it sees:

unknown config keys
wrong value types
empty required strings
RETAIN_DEPLOYMENTS < 1
RATINGS.DEFAULT_SCORE outside [0.0, 5.0]
RATINGS.MAX_RETRIES < 1
RATINGS.MAX_THREADS < 1
REMOTE.SSH_PORT < 1
enabled REMOTE local-role blocks without SSH_HOST
enabled REMOTE blocks without ROLE or REMOTE_RUN_DIR
enabled CLOUDFLARE blocks without ZONE_ID or CLOUDFLARE_API_KEY
invalid URL_PATH formatting
nested or duplicate source root URL_PATH values
URL or generated-path conflicts
missing local document references
references to unmanaged .md or .tex targets
missing referenced local assets
existing runtime lock files for non-config-only operations

Error messages are designed to say:

what key or path failed
what type or shape was expected
what value was actually received
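
As a hedged illustration of a few of the rules above, the fragment below collects values that should be rejected according to this document. Only the offending keys are shown, other required keys are left out for brevity, and the source names are hypothetical. The exact error text is not reproduced here.

```yaml
LIGHTHOUSE:
  RETAIN_DEPLOYMENTS: 0            # rejected: RETAIN_DEPLOYMENTS < 1
  RATINGS:
    DEFAULT_SCORE: 7.5             # rejected: outside [0.0, 5.0]
    MAX_THREADS: 0                 # rejected: MAX_THREADS < 1
  REMOTE:
    ENABLED: true
    ROLE: "local"                  # rejected: local role without SSH_HOST
    REMOTE_RUN_DIR: "/var/lib/lighthouse"
  SOURCES:
    - NAME: "first"                # hypothetical source
      URL_PATH: "/prefix-1/"
    - NAME: "second"               # hypothetical source
      URL_PATH: "/prefix-1/subprefix/"   # rejected: nested inside /prefix-1/
```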

Command Notes

validate --config-only validates merged config only
validate validates config and the pre-build deployment inputs
validate --fresh validates using the same full input pipeline but ignores prior incremental rating metadata for the current run
build validates, syncs, materializes, rates incrementally, builds without deployment, and writes a state snapshot for future incremental runs
apply runs the full rating, build, state, and deployment path, then purges Cloudflare if CLOUDFLARE.ENABLED is true
clean supports --target lighthouse and --target harbor
clean --target lighthouse acquires the runtime lock, removes the static-site cache/repos directory with rm -rf, and recreates the empty cache directory
clean --target harbor removes the Harbor repo cache directory under the configured Harbor runtime tree and recreates it empty
apply --target harbor treats both Harbor itself and clone-backed docked apps as deploy inputs for change detection, so app-only repo updates trigger Harbor redeploys
check-deps validates that the current environment has the executables required for the configured feature set
local PORT rates incrementally, builds, writes state, and serves locally without deploying to /var/www/lighthouse. It ignores DEPLOY_ROOT.
remote build runs the full local materialize-and-build path, then packages the built site under LOCAL_CACHE_DIR
remote send uploads a packaged tarball into REMOTE_CACHE_DIR/incoming/
remote apply on ROLE=local verifies the staged remote tarball and triggers deployment over SSH
remote apply on ROLE=local also resolves and syncs portable content metadata before deployment
remote apply on ROLE=remote unpacks the staged tarball, activates a release, writes remote deploy state, rotates old releases, and optionally purges Cloudflare
