HTML Keyless Extractor

Browser-only prototype for converting pasted HTML into raw DOM evidence, cleaned text records, and positional keyless arrays.

  • Record JAI-TOOL-0024
  • Path /tools/html-keyless-extractor/
  • Use Canonical public record

Document status

Public standards page: published on JustAnIota.com as part of the current public standards record.

  • Code JAI-TOOL-0024
  • Surface Tools
  • Access Public and linkable

How to use this page

Use this page as part of the current Tools public record, then follow its linked standards pages for the next step.

Plain English

The HTML Keyless Extractor turns HTML source into a reviewable compact-data candidate without treating markup as semantic truth.

Technical summary

It separates raw parse evidence, cleaned visible text, normalization choice, registry candidates, warnings, and the final positional array so a reviewer can see what was preserved and what was compressed.
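
As a rough illustration of that separation, the sketch below models the six outputs as distinct fields of one record, so that none of them is folded into another. The class name `ExtractionReport` and every field name are hypothetical, chosen for this example only; this page does not document the prototype's actual data shapes.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionReport:
    """Hypothetical container that keeps each extractor output separate."""

    raw_tag_counts: dict[str, int]   # raw parse evidence (source tag inventory)
    cleaned_text: str                # visible text after cleaning
    normalization: str               # the chosen form, "NFC" or "NFKC"
    registry_candidates: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)
    keyless_array: list[str] = field(default_factory=list)


# Illustrative values only; not output from the real tool.
report = ExtractionReport(
    raw_tag_counts={"p": 1, "script": 1},
    cleaned_text="Hello world",
    normalization="NFC",
    keyless_array=["Hello world"],
)
```

Keeping the positional array in its own field, next to the evidence it was derived from, is one way to let a reviewer compare what was preserved with what was compressed.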

Deep spec

This is a local implementation prototype. It does not replace schema validation, signed registry snapshots, production canonicalization, or UAIX.org protocol authority.

Extractor stages

  • Parse pasted HTML in the browser and inventory source tags.
  • Exclude script-like and invisible source content from the visible-text candidate.
  • Normalize text with NFC or NFKC and split it into reviewable units.
  • Attach demo registry hits separately from the positional keyless array.
  • Warn when raw HTML evidence and cleaned text are not enough for production reuse.
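
The stages above can be sketched outside the browser with the Python standard library. This is a minimal sketch under stated assumptions, not the prototype's code: the `HIDDEN_TAGS` set, the `VisibleTextParser` class, the `extract` function, and the warning text are all hypothetical, and the sketch treats only a few tags as invisible (it does not inspect CSS or `hidden` attributes). Registry hits are left out.

```python
import unicodedata
from collections import Counter
from html.parser import HTMLParser

# Assumption: text inside these tags never counts as visible text.
HIDDEN_TAGS = {"script", "style", "noscript", "template"}


class VisibleTextParser(HTMLParser):
    """Inventories source tags and collects text found outside hidden tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tag_counts = Counter()   # raw evidence: how often each tag opens
        self.chunks = []              # visible-text candidates, in source order
        self._hidden_depth = 0        # > 0 while inside a hidden tag

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag in HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._hidden_depth == 0:
            self.chunks.append(text)


def extract(html, form="NFC"):
    """Parse, clean, and normalize; return tags, text units, and warnings."""
    if form not in ("NFC", "NFKC"):
        raise ValueError("form must be 'NFC' or 'NFKC'")
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    units = [unicodedata.normalize(form, chunk) for chunk in parser.chunks]
    warnings = [] if units else ["no visible text found"]
    return {"tags": dict(parser.tag_counts), "units": units, "warnings": warnings}


sample = "<p>\ufb01ne</p><script>var x = 1;</script>"
print(extract(sample, form="NFKC")["units"])  # ['fine']
```

The sample shows why the normalization choice matters for review: NFKC folds the "ﬁ" ligature (U+FB01) into the two letters "fi", while NFC leaves the ligature in place.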

Turn HTML into a reviewable keyless candidate

This browser-only prototype follows the intake reports: parse HTML as source evidence, clean visible text, normalize it, then emit a compact positional array that still depends on declared registry meaning.
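
To illustrate why a positional array still depends on declared registry meaning, the sketch below drops keys from a record and then restores them. The field names and `DEMO_FIELD_ORDER` are hypothetical stand-ins for a registry entry; they are not taken from the prototype or from any published registry.

```python
# Hypothetical declared field order; stands in for a registry entry.
DEMO_FIELD_ORDER = ("title", "summary", "status")


def to_keyless(record, field_order=DEMO_FIELD_ORDER):
    """Drop keys and keep values in the declared order (None fills gaps)."""
    return [record.get(name) for name in field_order]


def from_keyless(values, field_order=DEMO_FIELD_ORDER):
    """Restore keys; this only works with the same declared order."""
    if len(values) != len(field_order):
        raise ValueError("array length does not match the declared field order")
    return dict(zip(field_order, values))


record = {"title": "Extractor", "summary": "HTML to array", "status": "draft"}
array = to_keyless(record)
restored = from_keyless(array)
```

The array alone carries no field names, so a reader who lacks the declared order cannot tell which position holds which value.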

The tool reports three outputs:

  • Raw DOM evidence
  • Cleaned text record
  • Keyless array