# g729 GitHub Pages

This directory is the source for <https://g729.huny.dev/>.

GitHub Pages should be configured to publish from the `main` branch and the
`/docs` path. `CNAME` pins the custom domain to `g729.huny.dev`.

## Search and AI-readable metadata

The Pages root includes static metadata intended to make the project easier for
search engines, GitHub previews, and AI retrieval systems to understand:

- `docs/index.html` — canonical landing page with static project copy,
  Open Graph/Twitter metadata, and JSON-LD structured data.
- `docs/robots.txt` — crawler policy with a sitemap pointer.
- `docs/sitemap.xml` — canonical public Pages URLs.
- `docs/llms.txt` — informal AI-readable orientation file.
- `docs/ai-summary.md` — quote-safe canonical facts and claim boundaries.
- `docs/seo-geo-checklist.md` — manual search-console and GitHub metadata
  checklist.

Google's guidance on generative AI features in Search treats answer engine
optimization (AEO) and generative engine optimization (GEO) as ordinary SEO:
crawlable, useful, people-first content remains the main path. The `llms.txt`
file is provided as a practical orientation aid only. It is not special Google
Search markup, and its presence is not a claim that major AI platforms
officially support or rank `llms.txt` files.

## Verification note

The Pages assets are demo and listening assets, not conformance oracle data.
The current strict decoder conformance result is recorded in the repository
README: private ITU Annex A oracle verification matches `740800/740800` final
PCM samples sample-for-sample. The private oracle CSVs and official ITU
test-vector files are intentionally not stored in `docs/` or redistributed as
MIT-licensed source.

## Audio samples

The landing page publishes a small owner-provided source WAV and generated
listening samples:

- `docs/assets/audio/source-8k-16bit.wav` — source speech sample downloaded
  from `https://download.huny.dev/d/./8k_16bit.wav`.
- `docs/assets/audio/bcg729-encode.g729` — raw G.729 payload generated from
  the padded source sample by a local `bcg729` black-box executable.
- `docs/assets/audio/bcg729-encode-ffmpeg-decode.wav` — `bcg729` payload
  decoded by the local FFmpeg executable as a black-box decoder.
- `docs/assets/audio/bcg729-encode-g729-decode.wav` — `bcg729` payload decoded
  by this repository's exact local decoder.
- `docs/assets/audio/g729-encode.g729` — raw G.729 payload generated by this
  repository's current `EncoderProfileCore` default from the padded source
  sample.
- `docs/assets/audio/g729-encode-ffmpeg-decode.wav` — local encoder payload
  decoded by the local FFmpeg executable as a black-box decoder.
- `docs/assets/audio/g729-encode-g729-decode.wav` — local encoder payload
  decoded by this repository's exact local decoder.

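The current sample has 64,271 source samples. A 10 ms G.729 frame is 80 samples
at 8 kHz, so the sample is padded with 49 zero samples to close the final
frame, giving 64,320 samples (804 frames, 8.04 s). The generated files listed
above are produced with: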

```sh
src=docs/assets/audio/source-8k-16bit.wav
work=/tmp/g729-pages-audio
mkdir -p "$work"

# Pad the source with 49 zero samples and write raw 8 kHz mono s16le PCM
# (8.04 s, 804 whole 10 ms frames).
ffmpeg -y -hide_banner -loglevel error \
  -i "$src" -af apad=pad_len=49 -t 8.04 -ar 8000 -ac 1 \
  -f s16le "$work/source-padded.pcm"

# Encode the padded PCM with this repository's encoder, then with bcg729.
go run ./examples/encode_pcm < "$work/source-padded.pcm" \
  > docs/assets/audio/g729-encode.g729

third-party/bcg729-blackbox/bcg729_encode < "$work/source-padded.pcm" \
  > docs/assets/audio/bcg729-encode.g729

# Decode both payloads with this repository's decoder (raw PCM output).
go run ./examples/decode_g729 < docs/assets/audio/g729-encode.g729 \
  > "$work/g729-encode-g729-decode.pcm"

go run ./examples/decode_g729 < docs/assets/audio/bcg729-encode.g729 \
  > "$work/bcg729-encode-g729-decode.pcm"

# Wrap the locally decoded raw PCM in 16-bit WAV containers.
ffmpeg -y -hide_banner -loglevel error \
  -f s16le -ar 8000 -ac 1 -i "$work/g729-encode-g729-decode.pcm" \
  -c:a pcm_s16le docs/assets/audio/g729-encode-g729-decode.wav

ffmpeg -y -hide_banner -loglevel error \
  -f s16le -ar 8000 -ac 1 -i "$work/bcg729-encode-g729-decode.pcm" \
  -c:a pcm_s16le docs/assets/audio/bcg729-encode-g729-decode.wav

# Decode both payloads with FFmpeg as a black-box decoder.
ffmpeg -y -hide_banner -loglevel error \
  -f g729 -i docs/assets/audio/g729-encode.g729 \
  -ar 8000 -ac 1 -c:a pcm_s16le \
  docs/assets/audio/g729-encode-ffmpeg-decode.wav

ffmpeg -y -hide_banner -loglevel error \
  -f g729 -i docs/assets/audio/bcg729-encode.g729 \
  -ar 8000 -ac 1 -c:a pcm_s16le \
  docs/assets/audio/bcg729-encode-ffmpeg-decode.wav
```
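
The `pad_len=49` and `-t 8.04` values in the commands above follow from the
frame arithmetic. The sketch below recomputes them for a given sample count.
The name `padToFrame` is hypothetical and is not part of this repository's API.

```go
package main

import "fmt"

const (
	sampleRate   = 8000 // Hz, the G.729 input rate
	frameSamples = 80   // one 10 ms frame at 8 kHz
)

// padToFrame returns how many zero samples must be appended so that n
// samples end on a whole-frame boundary. It returns 0 when n is already
// a multiple of the frame size.
func padToFrame(n int) int {
	return (frameSamples - n%frameSamples) % frameSamples
}

func main() {
	const source = 64271 // sample count of source-8k-16bit.wav
	pad := padToFrame(source)
	total := source + pad
	fmt.Printf("pad=%d frames=%d seconds=%.2f\n",
		pad, total/frameSamples, float64(total)/sampleRate)
	// prints "pad=49 frames=804 seconds=8.04"
}
```

If the source WAV is ever replaced, rerunning this arithmetic gives the new
`pad_len` and `-t` values to use.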

## Blind arena samples

The blind listening arena uses an additional short, freely available speech WAV
from the Open Speech Repository:

- `docs/assets/audio/arena/source-osr-us-0010-8k.wav` —
  `OSR_us_000_0010_8k.wav`, American English Harvard sentences, 16-bit PCM at
  8 kHz.
- `docs/assets/audio/arena/trial-XX-bcg729-ffmpeg.wav` — the corresponding
  2.8 second speech-active clip encoded by a local `bcg729` black-box
  executable and decoded by FFmpeg as a black-box decoder.
- `docs/assets/audio/arena/trial-XX-our-loopback.wav` — the same 2.8 second
  speech-active clip encoded by this repository's `EncoderProfileCore` default
  and decoded by this repository's exact local decoder.

The arena trial order is fixed. The web page randomizes only the left/right
placement for each trial. Each selected source clip is peak-normalized to a
sample amplitude of 18000 (on the 16-bit PCM scale, where full scale is 32767)
before both codec paths so the blind comparison is not dominated by source
level differences.
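
As a rough illustration of the normalization step, the sketch below scales a
clip so its largest absolute sample becomes the target value. It assumes that
18000 is an absolute sample value on the 16-bit scale and that rounding is to
the nearest integer. The name `peakNormalize` is hypothetical, and the test
helper in this repository may differ in both respects.

```go
package main

import "fmt"

// peakNormalize returns a copy of samples scaled so that the largest
// absolute value equals target. target must not exceed 32767. An all-zero
// clip stays all zero.
func peakNormalize(samples []int16, target int) []int16 {
	peak := 0
	for _, s := range samples {
		v := int(s)
		if v < 0 {
			v = -v
		}
		if v > peak {
			peak = v
		}
	}
	out := make([]int16, len(samples))
	if peak == 0 {
		return out // silence: nothing to scale
	}
	for i, s := range samples {
		scaled := int(s) * target
		// Round to nearest, symmetrically around zero.
		if scaled >= 0 {
			scaled = (scaled + peak/2) / peak
		} else {
			scaled = -((-scaled + peak/2) / peak)
		}
		out[i] = int16(scaled)
	}
	return out
}

func main() {
	clip := []int16{1000, -9000, 4500}
	fmt.Println(peakNormalize(clip, 18000)) // prints "[2000 -18000 9000]"
}
```

Because both codec paths receive the same normalized clip, any remaining level
difference between the two arena outputs comes from the codecs themselves.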

Source attribution required by the provider: "Open Speech Repository".
Source page:
`https://www.voiptroubleshooter.com/open_speech/american.html`.

Regenerate arena outputs with:

```sh
G729_WRITE_PAGES_AUDIO_ARENA=1 go test -run TestPagesAudioArenaWriteGoldenOutputs -count=1 -v
```

Keep generated sample files small enough for GitHub Pages. If samples become
large, publish them through a reviewed static asset host and update
`docs/index.html` to point at those URLs.

## WebAssembly demo

The Pages site includes a Go WebAssembly build of the public codec API:

- `docs/assets/wasm/g729.wasm`
- `docs/assets/wasm/wasm_exec.js`

The browser demo accepts any audio file the browser can decode, resamples it
to 8 kHz mono signed 16-bit PCM through Web Audio, runs the Go WASM
`EncoderProfileCore` encode/decode path, and previews the exact 8 kHz input
and decoded WAV through the same custom waveform player used by the listening
samples. The live loopback control feeds the same 8 kHz PCM to a WASM
streaming encoder session in 10 ms or 20 ms chunks and schedules decoded chunks
through Web Audio as they are emitted. Raw `.g729` payload uploads are decoded
directly through the WASM decoder. The WASM demo is a smoke/interoperability
check; the reviewed listening references are the generated sample files above.
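
The chunk sizes used by the live loopback map directly onto G.729 frames: a
10 ms chunk is one 80-sample frame and a 20 ms chunk is two. The sketch below
shows only that slicing arithmetic in Go. The name `chunkPCM` is hypothetical,
the actual demo does its chunking in the browser, and how the streaming session
treats a short final chunk is not specified here.

```go
package main

import "fmt"

const samplesPerMs = 8 // 8000 Hz / 1000

// chunkPCM splits pcm into consecutive chunks of chunkMs milliseconds.
// The last chunk is shorter when len(pcm) is not a multiple of the chunk
// size. It returns nil for a non-positive chunkMs.
func chunkPCM(pcm []int16, chunkMs int) [][]int16 {
	n := chunkMs * samplesPerMs
	if n <= 0 {
		return nil
	}
	var chunks [][]int16
	for start := 0; start < len(pcm); start += n {
		end := start + n
		if end > len(pcm) {
			end = len(pcm)
		}
		chunks = append(chunks, pcm[start:end])
	}
	return chunks
}

func main() {
	pcm := make([]int16, 400) // 50 ms of silence at 8 kHz
	for _, ms := range []int{10, 20} {
		chunks := chunkPCM(pcm, ms)
		fmt.Printf("%d ms: %d chunks, last has %d samples\n",
			ms, len(chunks), len(chunks[len(chunks)-1]))
	}
}
```

With 50 ms of input, the 10 ms setting yields five full chunks, while the
20 ms setting yields two full chunks and one 80-sample remainder.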

The checked-in WASM binary must be rebuilt whenever the codec algorithm or
default encoder profile changes. The current public asset is built from the
Core-default code path and has SHA-256:

```text
1799f324b282916afe3382c79c934f213709906a2d365e263e28533bc7ff43cf  docs/assets/wasm/g729.wasm
```
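
One way to confirm that a checkout still carries the recorded binary is to
recompute the digest and compare it. The sketch below is a standalone helper
under assumed names (`sha256Hex`, `wantSHA256`); it is not a tool shipped in
this repository.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// wantSHA256 is the digest recorded in this README for g729.wasm.
const wantSHA256 = "1799f324b282916afe3382c79c934f213709906a2d365e263e28533bc7ff43cf"

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	if len(os.Args) != 2 {
		// No file given: print usage and do nothing.
		fmt.Fprintln(os.Stderr, "usage: pass the path to g729.wasm")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if got := sha256Hex(data); got != wantSHA256 {
		fmt.Fprintf(os.Stderr, "digest mismatch: got %s\n", got)
		os.Exit(1)
	}
	fmt.Println("g729.wasm matches the recorded digest")
}
```

Saved under a hypothetical name such as `verifywasm.go`, it could be run as
`go run verifywasm.go docs/assets/wasm/g729.wasm`. A mismatch means either the
binary or the recorded digest needs updating.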

Rebuild the WASM asset with:

```sh
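# Go 1.24 and later ship wasm_exec.js under lib/wasm; older releases keep it
# under misc/wasm, so fall back to that location.
wasm_exec="$(go env GOROOT)/lib/wasm/wasm_exec.js"
[ -f "$wasm_exec" ] || wasm_exec="$(go env GOROOT)/misc/wasm/wasm_exec.js"
cp "$wasm_exec" docs/assets/wasm/wasm_exec.js

# Build the demo binary, then print its digest so the value recorded above
# can be updated.
GOOS=js GOARCH=wasm go build -o docs/assets/wasm/g729.wasm ./cmd/g729wasm
sha256sum docs/assets/wasm/g729.wasm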
```

## Golden regression gate

`TestPagesAudioGoldenOutputs` is part of the default Go test suite. It parses
`source-8k-16bit.wav`, pads it to a 10 ms frame boundary, re-encodes it through
the current Go encoder, and byte-compares the result against
`g729-encode.g729`. It also decodes that payload through the current Go decoder
and byte-compares the PCM payload against `g729-encode-g729-decode.wav`. The
same test decodes the checked-in `bcg729-encode.g729` payload through the local
decoder and byte-compares it against `bcg729-encode-g729-decode.wav`; FFmpeg
decode WAVs are format/length checked because they are black-box executable
outputs.
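
Byte-comparing "the PCM payload" of a WAV means comparing only the contents of
its `data` chunk, not the header. The sketch below shows one way to extract
that chunk from a RIFF/WAVE file. `wavData` and `minimalWAV` are hypothetical
names, and the helpers used by `TestPagesAudioGoldenOutputs` may be structured
differently.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// wavData returns the payload of the first "data" chunk in a RIFF/WAVE file.
func wavData(wav []byte) ([]byte, error) {
	if len(wav) < 12 || string(wav[0:4]) != "RIFF" || string(wav[8:12]) != "WAVE" {
		return nil, errors.New("not a RIFF/WAVE file")
	}
	for off := 12; off+8 <= len(wav); {
		id := string(wav[off : off+4])
		size := int(binary.LittleEndian.Uint32(wav[off+4 : off+8]))
		body := off + 8
		if size < 0 || size > len(wav)-body {
			return nil, fmt.Errorf("chunk %q overruns the file", id)
		}
		if id == "data" {
			return wav[body : body+size], nil
		}
		off = body + size + size%2 // chunk bodies are padded to even length
	}
	return nil, errors.New("no data chunk found")
}

// minimalWAV wraps 16-bit mono 8 kHz PCM bytes in a 44-byte WAV header.
// It exists only to give the example something to parse.
func minimalWAV(pcm []byte) []byte {
	var b bytes.Buffer
	le := binary.LittleEndian
	b.WriteString("RIFF")
	binary.Write(&b, le, uint32(36+len(pcm)))
	b.WriteString("WAVEfmt ")
	binary.Write(&b, le, uint32(16))    // fmt chunk size
	binary.Write(&b, le, uint16(1))     // format tag: PCM
	binary.Write(&b, le, uint16(1))     // channels: mono
	binary.Write(&b, le, uint32(8000))  // sample rate
	binary.Write(&b, le, uint32(16000)) // byte rate
	binary.Write(&b, le, uint16(2))     // block align
	binary.Write(&b, le, uint16(16))    // bits per sample
	b.WriteString("data")
	binary.Write(&b, le, uint32(len(pcm)))
	b.Write(pcm)
	return b.Bytes()
}

func main() {
	decoded := []byte{0x01, 0x00, 0xff, 0x7f} // two little-endian samples
	golden, err := wavData(minimalWAV(decoded))
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(decoded, golden)) // prints "true"
}
```

Comparing only the `data` payload keeps the gate insensitive to harmless
header differences, such as extra metadata chunks written by a different tool.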

If an intentional codec algorithm change updates the public sample output,
regenerate the sample files and review the byte-level diff in the same change.
