Docx to Text (docx-to-text)

Extract content from Microsoft Word documents with optional image emission.

Transform binary json

Minimal example

YAML
actions:
  - docx-to-text: {}

JSON
{
  "actions": [
    {
      "docx-to-text": {}
    }
  ]
}

Contents

Advanced

Field | Type | Required | Default | Description
include-images | boolean (bool) | No | false | Emit embedded images as separate binary events alongside extracted text.
preserve-styles | boolean (bool) | No | false | Preserve inline style markers (bold, italic, etc.) in markdown output.
emit-document-events | boolean (bool) | No | false | Emit a synthetic document-level event with metadata alongside page/paragraph events.
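
As an illustration of the Advanced options, the sketch below enables image emission and style preservation. It assumes the fields sit directly under the `docx-to-text` key, as the empty `{}` in the minimal example suggests; the section names on this page are treated as groupings only, not as nesting levels.

```yaml
actions:
  - docx-to-text:
      include-images: true         # also emit embedded images as separate binary events
      preserve-styles: true        # keep bold/italic markers in the markdown output
      emit-document-events: false  # documented default, written out for clarity
```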

General

Field | Type | Required | Description
description | string | No | Short summary shown next to the action in the editor.
condition | lua-expression (string) | No | Conditional expression that must evaluate truthy for the action to run. Example: 2 * count()
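
A possible use of the General fields is sketched below. The `count()` function is taken from the documented example; its meaning is not described on this page, so the comparison here is an assumption about what it returns (a number). The description text is invented for illustration.

```yaml
actions:
  - docx-to-text:
      description: Extract text from uploaded Word files  # hypothetical summary text
      condition: "count() > 0"                            # assumes count() returns a number
```

The expression is quoted so that YAML passes it through as a plain string for the Lua evaluator.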

Output

Field | Type | Required | Description
mode | string | No | Output mode: markdown
split | string | No | Splitting strategy: none

Parser

Field | Type | Required | Description
parser | string | No | Parser backend: ooxml (quick-xml).
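
For completeness, a configuration that spells out the Output and Parser fields might look like the sketch below. It uses only the values listed on this page. Two assumptions: the fields are flat keys under `docx-to-text`, and the parser value is the literal string `ooxml`, with "quick-xml" naming the underlying XML library rather than a separate value.

```yaml
actions:
  - docx-to-text:
      mode: markdown   # only output mode listed on this page
      split: none      # only splitting strategy listed on this page
      parser: ooxml    # assumed literal value for the OOXML backend
```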