ASG Model Guide
The Abstract Semantic Graph (ASG) is Ascribe's output model, matching the official AsciiDoc specification's semantic schema. While the AST captures parser-level structure, the ASG represents the document in the canonical form expected by the AsciiDoc TCK.
ASG vs AST
| Aspect | AST | ASG |
|---|---|---|
| Purpose | Parser output | Canonical document model |
| Positions | Span (past-end) | Location (inclusive, 1-based) |
| Headings | Flat Heading nodes | Nested Section containers |
| Inline markup | Bold, Italic, Mono | Span with variant/form |
| Lists | UnorderedList/OrderedList | List with variant/marker |
| JSON | Not serializable | Schema-derived codecs |
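
The Headings row is the largest structural difference, so a small illustration may help. The sketch below nests a flat run of headings and paragraphs into sections by level. `FlatBlock`, `Nested`, and `nest` are hypothetical, simplified stand-ins written for this example; they are not Ascribe's actual AST or ASG types, and the sketch ignores discrete headings.

```scala
// Hypothetical stand-ins for a flat AST block list and a nested ASG tree.
enum FlatBlock:
  case Heading(level: Int, title: String)
  case Para(text: String)

enum Nested:
  case Section(level: Int, title: String, blocks: Vector[Nested])
  case Para(text: String)

// Turn a flat block list into nested sections (assumed algorithm, not Ascribe's).
def nest(blocks: List[FlatBlock]): Vector[Nested] =
  // Collect the blocks that belong to a container at `parentLevel`.
  // Stops at the first heading whose level is <= parentLevel and
  // returns the collected blocks plus the unconsumed remainder.
  def go(rest: List[FlatBlock], parentLevel: Int): (Vector[Nested], List[FlatBlock]) =
    rest match
      case Nil =>
        (Vector.empty, Nil)
      case FlatBlock.Heading(level, _) :: _ if level <= parentLevel =>
        (Vector.empty, rest)
      case FlatBlock.Heading(level, title) :: tail =>
        val (children, afterChildren) = go(tail, level)
        val (siblings, remaining) = go(afterChildren, parentLevel)
        (Nested.Section(level, title, children) +: siblings, remaining)
      case FlatBlock.Para(text) :: tail =>
        val (siblings, remaining) = go(tail, parentLevel)
        (Nested.Para(text) +: siblings, remaining)

  // No heading level is <= Int.MinValue, so the top level consumes everything.
  go(blocks, Int.MinValue)._1
```

Given a level-1 heading, a paragraph, a level-2 heading, and another level-1 heading, `nest` yields two top-level sections, the first containing the paragraph and the level-2 section. The recursion is not tail-recursive, which is acceptable for a sketch but may matter for very long documents.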
Node Type Hierarchy
All ASG types are defined in io.eleven19.ascribe.asg:
sealed trait Node derives Schema:
  def location: Location
  def nodeType: String // "block", "inline", or "string"

sealed trait Block extends Node // all block-level content
sealed trait Inline extends Node // all inline content
Block Types
- Document -- top-level container (extends Node directly, not Block)
- Section -- heading + nested blocks, with level
- Heading -- discrete heading (not part of a section)
- Paragraph -- contains inlines: Chunk[Inline]
- Listing, Literal, Pass, Stem, Verse -- verbatim/special blocks with form, delimiter, inlines
- Sidebar, Example, Admonition, Open, Quote -- parent blocks with form, delimiter, blocks
- List, DList -- ordered/unordered and description lists
- ListItem, DListItem -- list entries with marker, principal
- Table -- table block with cols (column specs), rows (header, body, footer groups)
- TableRow -- a row containing cells
- TableCell -- a cell with optional style, colspan, rowspan, and content (inlines or nested blocks)
- Break -- thematic/page breaks
- Audio, Video, Image, Toc -- block macros
Inline Types
- Span -- formatting (strong, emphasis, code, mark) with variant and form (constrained/unconstrained)
- Ref -- links and cross-references with variant, target
- Text -- plain text content (nodeType = "string")
- CharRef -- character references (nodeType = "string")
- Raw -- passthrough content (nodeType = "string")
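
To make the relationship between AST inline nodes and Span concrete, here is a minimal sketch of one way the mapping could look. `AstInline`, `AsgInline`, and `toAsg` are hypothetical stand-ins for this example only. The variant names come from the list above; hard-coding the form as "constrained" is an assumption, since this simplified AST does not record which delimiter style was used.

```scala
// Hypothetical, simplified AST inline nodes (not Ascribe's real types).
enum AstInline:
  case Plain(text: String)
  case Bold(children: List[AstInline])
  case Italic(children: List[AstInline])
  case Mono(children: List[AstInline])

// Hypothetical, simplified ASG inline nodes.
enum AsgInline:
  case Text(value: String)
  case Span(variant: String, form: String, inlines: List[AsgInline])

// Map each formatting node to a Span whose variant names the formatting.
// The form is fixed to "constrained" here purely for illustration.
def toAsg(node: AstInline): AsgInline = node match
  case AstInline.Plain(text) =>
    AsgInline.Text(text)
  case AstInline.Bold(children) =>
    AsgInline.Span("strong", "constrained", children.map(toAsg))
  case AstInline.Italic(children) =>
    AsgInline.Span("emphasis", "constrained", children.map(toAsg))
  case AstInline.Mono(children) =>
    AsgInline.Span("code", "constrained", children.map(toAsg))
```

Collapsing three node types into one Span with a variant field keeps consumers to a single case for all formatting, at the cost of string-typed variants.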
Schema-Derived Codecs
JSON serialization uses zio-blocks-schema:
object AsgCodecs:
  private val codec = summon[Schema[Node]].derive(
    JsonBinaryCodecDeriver
      .withDiscriminatorKind(DiscriminatorKind.Field("name"))
      .withCaseNameMapper(NameMapper.Custom(mapCaseName))
      .withTransientDefaultValue(true)
  )

  def encode(node: Node): String = new String(codec.encode(node).toArray)
  def decode(json: String): Either[String, Node] = ...
Key aspects:
- DiscriminatorKind.Field("name") -- The type discriminator is a "name" field in the JSON, e.g., "name": "paragraph".
- mapCaseName -- Converts Scala case class names to ASG names: Paragraph becomes "paragraph", DList becomes "dlist", CharRef becomes "charref".
- withTransientDefaultValue(true) -- Fields with None or empty default values are omitted from JSON output.
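
The combined effect of these three settings can be imitated by hand without the schema library. The sketch below is an illustration only: `mapCaseNameSketch` and `encodeFields` are hypothetical helpers, and plain lowercasing is an assumption that merely reproduces the three name examples above. The real mapCaseName may treat other names differently.

```scala
import java.util.Locale

// Assumed mapping: plain lowercasing, consistent with Paragraph, DList, CharRef.
def mapCaseNameSketch(scalaName: String): String =
  scalaName.toLowerCase(Locale.ROOT)

// Wrap a string in double quotes. JSON escaping is deliberately left out,
// so this is only safe for simple values without quotes or backslashes.
def quoted(s: String): String = "\"" + s + "\""

// Emit a flat JSON object: the "name" discriminator first, then every
// field that has a value. Fields set to None are dropped from the output.
def encodeFields(caseName: String, fields: List[(String, Option[String])]): String =
  val discriminator = quoted("name") + ": " + quoted(mapCaseNameSketch(caseName))
  val present = fields.collect { case (key, Some(value)) =>
    quoted(key) + ": " + quoted(value)
  }
  (discriminator :: present).mkString("{", ", ", "}")
```

For instance, `encodeFields("Paragraph", List("id" -> Some("intro"), "title" -> None))` produces an object with a "name" of "paragraph" and an "id", but no "title" key.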
Private Constructors and Smart Apply
ASG case classes have private constructors to enforce invariants:
case class Paragraph private (
  id: Option[String],
  title: Option[Chunk[Inline]],
  // ...
  @Modifier.rename("type") nodeType: String
) extends Block derives Schema

object Paragraph:
  def apply(
    id: Option[String] = None,
    // ...
    location: Location
  ): Paragraph = new Paragraph(id, ..., location, "block")
The nodeType field (serialized as "type" in JSON) is always set to the correct value ("block", "inline", or "string") by the companion apply method.
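
The pattern can be tried in isolation with a cut-down version that has no schema dependency. The following is a minimal sketch: this `Paragraph` has only three fields and the `Position` and `Location` types are simplified stand-ins, so it shows the shape of the technique rather than the real class.

```scala
// Simplified stand-ins so the example compiles on its own.
final case class Position(line: Int, col: Int)
final case class Location(start: Position, end: Position)

// The primary constructor is private, so code outside the companion
// cannot choose the nodeType value.
final case class Paragraph private (
  id: Option[String],
  location: Location,
  nodeType: String
)

object Paragraph:
  // Public smart constructor: callers supply content, and nodeType is
  // always fixed to "block".
  def apply(id: Option[String] = None, location: Location): Paragraph =
    new Paragraph(id, location, "block")
```

Calling `Paragraph(location = loc)` yields a value whose nodeType is "block". In Scala 3 the compiler-generated apply and copy take on the constructor's private modifier, so passing a different nodeType from outside the companion does not compile.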
Location Type
Location wraps start and end Position values. It serializes as a JSON array rather than an object, matching the TCK schema:
case class Location(start: Position, end: Position)

object Location:
  given Schema[Location] = summon[Schema[Chunk[Position]]].transform[Location](
    chunk => Location(chunk(0), chunk(1)),
    loc => Chunk(loc.start, loc.end)
  )
This produces JSON like: "location": [[1, 1], [3, 15]]
Position has 1-based line and col fields, with an optional file for multi-file documents.
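
The array shape can be reproduced by hand to see what the transform does in each direction. This sketch uses only the standard library; `toJsonArray` and `fromPositions` are hypothetical helpers, and `Position` omits the optional file field for brevity. Unlike indexing with chunk(0) and chunk(1), the decoding side here reports a wrong element count as a Left instead of throwing.

```scala
// Simplified stand-ins; the optional file field is left out.
final case class Position(line: Int, col: Int)
final case class Location(start: Position, end: Position)

// Encoding direction: a Location becomes a two-element array of
// [line, col] pairs.
def toJsonArray(loc: Location): String =
  def pair(p: Position): String = s"[${p.line}, ${p.col}]"
  s"[${pair(loc.start)}, ${pair(loc.end)}]"

// Decoding direction: accept exactly two positions, reject anything else.
def fromPositions(positions: Seq[Position]): Either[String, Location] =
  positions match
    case Seq(start, stop) => Right(Location(start, stop))
    case other            => Left(s"expected 2 positions, got ${other.length}")
```

With a start of line 1, column 1 and an end of line 3, column 15, `toJsonArray` returns the same array text as the example above.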