Skip to content

Docx to Text

Docx to Text (docx-to-text)

Extract content from Microsoft Word documents with optional image emission.

Transform binary json

Minimal example

actions:
- docx-to-text: {}
JSON
{
"actions": [
{
"docx-to-text": {}
}
]
}

Contents

Fields

FieldTypeRequiredDescription
description GeneralstringShort summary shown next to the action in the editor.
condition Generallua-expression (string)Conditional expression that must evaluate truthy for the action to run.
Examples: 2 * count()
mode OutputstringOutput mode: markdown
split OutputstringSplitting strategy: none
parser ParserstringParser backend: ooxml (quick-xml).
include-images Advancedboolean (bool)Emit embedded images as separate binary events alongside extracted text.
preserve-styles Advancedboolean (bool)Preserve inline style markers (bold/italic/etc.) in markdown output.
emit-document-events Advancedboolean (bool)Emit a synthetic document-level event with metadata alongside page/paragraph events.

General

Show fields
FieldTypeRequiredDescription
descriptionstringShort summary shown next to the action in the editor.
conditionlua-expression (string)Conditional expression that must evaluate truthy for the action to run.
Examples: 2 * count()

Output

Show fields
FieldTypeRequiredDescription
modestringOutput mode: markdown
splitstringSplitting strategy: none

Parser

Show fields
FieldTypeRequiredDescription
parserstringParser backend: ooxml (quick-xml).

Advanced

Show fields
FieldTypeRequiredDescription
include-imagesboolean (bool)Emit embedded images as separate binary events alongside extracted text.
preserve-stylesboolean (bool)Preserve inline style markers (bold/italic/etc.) in markdown output.
emit-document-eventsboolean (bool)Emit a synthetic document-level event with metadata alongside page/paragraph events.