Building the Apache Daffodil™ Extension for Visual Studio Code for development requires some additional prerequisites:
```
yarn watch
```

As `watch` runs, fix any problems that arise in the Problems tab.

* Run `yarn` to update the local dependencies.
* Press `F5` (or launch “Extension” under the “Run and Debug” pane) to build and launch the extension in another VS Code window.

The local Apache Daffodil™ Extension for Visual Studio Code downloads and caches the Apache Daffodil™ Debugger corresponding to the latest extension release. To test a local version of the Apache Daffodil Debugger:

* add `"useExistingServer": true` to the configuration in your `launch.json` in the sample workspace;
* launch the backend debugger locally, using a launch configuration like the one below:

  ```json
  {
    "type": "scala",
    "name": "DAPodil",
    "request": "launch",
    "mainClass": "org.apache.daffodil.debugger.dap.DAPodil",
    "args": []
  }
  ```

  This will start the debug adapter and await a connection from the Apache Daffodil VS Code Extension (usually on TCP port 4711); and
* debug your schema file, as long as its launch configuration has the `useExistingServer` setting above.
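Before starting a debug session with `useExistingServer`, it can help to confirm that the locally launched backend is actually accepting connections. The following is a minimal Python sketch and is not part of the extension; the `is_listening` helper is a hypothetical name, and the default host and port are assumptions based on the usual TCP port 4711 mentioned above.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 4711, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: host and port defaults are assumptions, adjust as needed.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_listening()` after launching the backend should return `True` once the debug adapter is waiting for the extension to connect.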
The Apache Daffodil VS Code Extension comes with an automated test suite. Run it as follows:
```
yarn test
```

By default, the test suite uses the earliest supported release of VS Code. To test against a specific version of VS Code (in this example, version 1.74.3), set `DAFFODIL_TEST_VSCODE_VERSION` to the desired version when executing the test suite:

```
DAFFODIL_TEST_VSCODE_VERSION=1.74.3 yarn test
```

Set `DAFFODIL_TEST_VSCODE_VERSION` to `stable` to use the latest stable release, or to `insiders` to use the latest (nightly) insiders build.
HTTPS TLS certificates are verified by default. When running the test suite in certain environments (e.g., a company VPN that uses endpoint protection), TLS certificate verification may fail with a self-signed certificate error. If this is the case, either have node trust the endpoint protection certificate, or use one of these workarounds to disable certificate verification:

```
NODE_TLS_REJECT_UNAUTHORIZED=0 yarn test
```

or

```
node ./out/tests/runTest.js --disable_cert_verification
```

WARNING: Do not `export NODE_TLS_REJECT_UNAUTHORIZED=0` into your environment, as it will disable TLS certificate verification on all node HTTPS connections made in that shell session.
To build `docx` (Word formatted) documentation, from the top of the cloned repository, run:

```
cd docs && make all
```
For GitHub CI action updates (pull requests that start with Bump actions/…), make sure the affected workflows still operate as expected (they are automatically CI tested). These updates change workflow YAML files; they are part of the CI infrastructure, not a code dependency, so they should be relatively quick and easy to assess compared to code dependencies.
If the updates are not GitHub CI action updates, then additional scrutiny is required. When reviewing and verifying dependency bot updates that are part of the software supply chain being distributed, please use the following checklist:
Milestone-level project status can be monitored using the Projects tab in the Project’s GitHub repository.
While the most recent release of the Apache Daffodil™ Extension for Visual Studio Code focused on the schema and the infoset, the theme of the next version will place additional emphasis on the input data. The input data could be any kind of file, with different byte sizes, byte ordering, and alignments, so having robust hex editing capabilities is important.
It is also important to be able to set breakpoints not only in the schema but also in the data, and to manipulate the data and watch how the changes affect the parse outcome; in other words, to see what happens to the parse when the data changes in some way. While stepping through the debugger, the schema, the infoset, and the data views need to be kept in sync.
For organizational purposes, the desired features for the Apache Daffodil™ Extension for Visual Studio Code are broken down into ten functional areas.
1.1 The data editor needs to support any fixed length (non-streaming) file Daffodil is capable of opening. Generally, any file type can be opened and displayed by a hex editor. The file type and extension do not influence the rendering of the file in hex or binary formats.
2.1 The data editor needs to be responsive and provide a good VS Code user experience. Existing third-party VS Code hex editors become less responsive when rendering medium to large files. The editor will handle file sizes common to Daffodil without impacting overall usability.
2.2 The data editor needs to be designed as a composition of display panels that allow for multiple data representations to be rendered on the same screen. A data file may be segmented into multiple representations of data that differ in anything from byte boundaries to endianness. The editor will render differing representations within the same user interface.
2.3 The data editor needs to allow individual display panels to maintain their own position in the data to allow viewing different segments of data in different display panels. The editor will manage each composable view as a separate Viewport capable of displaying a view into the data at a specified offset and capacity.
2.4 The data editor viewports need to be interactive to allow mouse and keyboard interactions such as scrolling and context menus. User interaction will drive the function of the editor; as such, the ability to interpret keyboard and mouse actions on individual and block data selections is critical.
2.5 The data editor needs to include a Properties View component. The Properties View will provide a static region on the display to place file and selection metadata. The Properties View is not associated with a specific region in the file, so it is not a viewport component. It is tied to events such as selection events and is updated when it is notified that those events have occurred.
2.6 The data editor needs to include a property display mode for a single unit selection. The Properties View will allow multiple representations for a single unit, e.g., a byte, to be displayed simultaneously.
2.7 The data editor needs to include a property display mode for multiple unit selection. Selecting up to some limit of bytes, for example four, could still be rendered in the Properties View. For example, selecting four bytes could render a 32-bit integer value.
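To make the single-unit and multiple-unit display modes concrete, here is a minimal Python sketch of how a small selection could be rendered in several representations at once. It is illustrative only and does not reflect the extension's implementation; the `describe_selection` name and the choice of fields are assumptions.

```python
def describe_selection(data: bytes) -> dict:
    """Return several representations of a small byte selection.

    Hypothetical helper: the field names are illustrative assumptions.
    """
    return {
        "hex": data.hex(" "),                                   # space-separated hex bytes
        "binary": " ".join(f"{b:08b}" for b in data),           # eight bits per byte
        "unsigned_be": int.from_bytes(data, "big"),             # big-endian unsigned integer
        "unsigned_le": int.from_bytes(data, "little"),          # little-endian unsigned integer
        "signed_be": int.from_bytes(data, "big", signed=True),  # big-endian two's complement
    }


# Four selected bytes rendered as 32-bit integer values
props = describe_selection(bytes([0x00, 0x00, 0x01, 0x00]))
print(props)
```

The same four bytes read as 256 in big-endian order and 65536 in little-endian order, which is why a Properties View would need to show the byte order alongside the value.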
3.1 The data editor needs to allow edits to be saved as a new file. The editor will not attempt to write the file that is held open by Daffodil. Instead, a copy of the file will be written to disk.
3.2 The data editor needs to provide an auto-incremented file revision number to save without prompting the user. When saving edits to a file, it may be preferable for the save-as-new-file to be transparent to the user. In this case the user will not be prompted for a file name; an autogenerated name will be used instead.
3.3 The data editor needs to provide a save-as option to name a new file. When saving edits to a file, the user may want to specify where the edited file will be saved. In this case a file picker dialog or something similar can be used to allow the user to specify the location of the saved file.
3.4 The data editor will provide a convenient way of restarting the Daffodil debugger with the specified edits. After saving the edits to a file, the debugger can be restarted and automatically set to use the new file's path as the input. This convenience allows the user to avoid editing their launch profile to point to the new file.
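The auto-incremented save described in 3.1 and 3.2 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `name.N.ext` naming scheme and both function names are hypothetical, since the requirements above do not prescribe a format.

```python
from pathlib import Path


def next_revision_path(original: Path) -> Path:
    """Return the first unused 'name.N.ext' path next to the original file.

    The naming scheme is an assumption made for this sketch.
    """
    revision = 1
    while True:
        candidate = original.with_name(f"{original.stem}.{revision}{original.suffix}")
        if not candidate.exists():
            return candidate
        revision += 1


def save_revision(original: Path, edited: bytes) -> Path:
    """Write the edited content to a new revision file; the original is never touched."""
    target = next_revision_path(original)
    target.write_bytes(edited)
    return target
```

Because the original file is only read, never written, this stays consistent with the requirement that the editor not write to the file held open by Daffodil. The returned path is what a debugger restart would use as its new input.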
Hex and binary representations for both viewing and editing.
4.1 The data editor needs to implement support for multiple data representations. The editor will use the viewport component design to deliver a composable multiple representation rendering capability.
4.2 The data editor needs to provide a viewport for viewing byte-delimited data. The viewport will display hex bytes similar to a common hex editor display.
4.3 The data editor needs to provide a viewport for viewing data as individual bits. The viewport will render a binary (1/0) display. The details of the rendering, such as unit length, can be modified using properties associated with the viewport.
4.4 The data editor needs to provide configurable rendering properties for any given representation. The UI will allow the user to view and edit viewport properties.
4.5 The data editor needs to provide configurable endianness properties for viewport rendering. The user will be able to configure big or little endian for a viewport.
4.6 The data editor needs to represent data where either the MSB or the LSB can be the first bit displayed. Bytes represented in binary can be viewed and edited with the most significant bit as either the first bit or the last bit of the byte.
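The bit-order requirement in 4.6 can be illustrated with a short Python sketch. The `render_bits` and `parse_bits` names are hypothetical; the sketch only shows the rendering rule, not how a viewport would be implemented.

```python
def render_bits(value: int, msb_first: bool = True) -> str:
    """Render one byte (0-255) as eight binary digits in the requested bit order."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    bits = f"{value:08b}"  # Python formats with the most significant bit first
    return bits if msb_first else bits[::-1]


def parse_bits(text: str, msb_first: bool = True) -> int:
    """Inverse of render_bits: turn eight binary digits back into a byte value."""
    if len(text) != 8 or set(text) - {"0", "1"}:
        raise ValueError("expected exactly eight characters of 0 or 1")
    return int(text if msb_first else text[::-1], 2)


print(render_bits(0x01))                   # MSB first
print(render_bits(0x01, msb_first=False))  # LSB first
```

Editing in the binary view needs the inverse mapping as well, so that a value typed in LSB-first order is stored as the intended byte.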
5.1 The data editor needs to implement inline editing within a viewport. The viewport will support mouse and keyboard interaction to initiate editing a value.
5.2 The data editor needs to default to editing in the same representation as the view. The editor will allow editing using the same rendering as the viewport's representation, e.g., hex from hex or binary from binary, using the native rendering logic of the viewport.
5.3 The data editor needs to provide undo / redo capability related to edits. A common expectation of editors such as this would be to provide commands to undo and redo edits that have been made.
5.4 The data editor needs to provide editing in a representation that differs from the view. The editor could provide something similar to a pop-out component that allows editing a value in a format that differs from the viewport representation, e.g., editing binary from the hex view.
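The undo / redo capability in 5.3 is commonly built on two stacks. The Python sketch below shows that general technique using whole-buffer snapshots; it is a simplification chosen for clarity (a real editor for large files would more likely store individual changes), and the `EditHistory` class is a hypothetical name.

```python
class EditHistory:
    """Undo/redo over whole-buffer snapshots (simple, not memory efficient)."""

    def __init__(self, initial: bytes) -> None:
        self.current = initial
        self._undo = []  # earlier states, most recent last
        self._redo = []  # undone states, most recent last

    def apply(self, new_content: bytes) -> None:
        """Record an edit; a new edit invalidates anything that could be redone."""
        self._undo.append(self.current)
        self.current = new_content
        self._redo.clear()

    def undo(self) -> bool:
        """Step back one edit. Returns False when there is nothing to undo."""
        if not self._undo:
            return False
        self._redo.append(self.current)
        self.current = self._undo.pop()
        return True

    def redo(self) -> bool:
        """Reapply the most recently undone edit. Returns False when there is none."""
        if not self._redo:
            return False
        self._undo.append(self.current)
        self.current = self._redo.pop()
        return True
```

Clearing the redo stack on every new edit is the design choice that keeps the history linear, which matches what users expect from most editors.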
6.1 The debugger needs to provide extension points which allow executing debug commands from the editor. There are certain non-standard operations such as setting breakpoints on data locations that are to be supported. This will require the debugger to provide extension points that allow the editor to pass instructions that augment the debugger flow.
6.2 The debugger will support breakpoints to be set at data positions in the input file. Setting breakpoints on data locations indicates to the debugger that when the input stream reaches a specified point in the file it will break execution as if it hit a code breakpoint.
6.3 The data editor will allow breakpoints to be set at data positions in the input file. The data editor will allow creation of and then render data breakpoints in a similar way to how code breakpoints are set and rendered.
6.4 The data editor will support starting debug from a specified position. The editor provides a function via a context menu that indicates a starting point in the file for the input stream. This will drop all bytes prior to this location when starting the debug.
6.5 The data editor will support stopping debug at a specified position. The editor provides a function via a context menu that indicates the stopping point in the input stream. All data after this point will be ignored by the input stream, ending the debug at the specified point.
6.6 The debugger will support the latest version of Apache Daffodil™ released. The extension will be kept up to date with the latest version of Apache Daffodil™.
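The start position, stop position, and data breakpoint requirements above can be sketched in a few lines of Python. This is an illustration only: both function names are hypothetical, offsets are assumed to be zero-based byte offsets, and a breakpoint is assumed to be hit when the parse position advances past it.

```python
from typing import Iterable, Optional


def window_input(data: bytes, start: int = 0, stop: Optional[int] = None) -> bytes:
    """Drop all bytes before start and ignore all bytes at or after stop."""
    end = len(data) if stop is None else stop
    if not 0 <= start <= end <= len(data):
        raise ValueError("expected 0 <= start <= stop <= len(data)")
    return data[start:end]


def breakpoints_hit(previous: int, current: int, breakpoints: Iterable[int]) -> list:
    """Return the data breakpoints crossed as the parse position moves
    from previous to current (assumption: offsets in [previous, current) are hit)."""
    return sorted(bp for bp in breakpoints if previous <= bp < current)


print(window_input(b"abcdef", start=2, stop=4))
print(breakpoints_hit(0, 4, {2, 4, 9}))
```

In the debugger itself the check would run each time the input stream position advances, pausing execution in the same way a schema breakpoint does.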
In this section a “block” is defined as a range that has been selected by the user.
7.1 The data editor needs to support adding individual bytes. The editor will provide a function to insert a single byte at a position in the file.
7.2 The data editor needs to support adding blocks of bytes. The editor will provide a function to insert multiple bytes starting at a position in the file.
7.3 The data editor needs to support deleting individual bytes. The editor will provide a function to delete a single byte from the file.
7.4 The data editor needs to support deleting blocks of bytes. The editor will provide a function to delete blocks of bytes from the file.
7.5 The data editor needs to support modifying the value of an individual byte. The editor will provide a function to overwrite the value of a byte in the file.
7.6 The data editor needs to support modifying the value of a block of bytes. The editor will provide a function to overwrite the value of a block of bytes in the file.
7.7 The data editor needs to support copying byte(s). The editor will provide the ability to select and copy a range of bytes to the clipboard for convenience and interoperability. The number of bytes that can be copied will need an upper limit depending on the file size and system memory availability.
7.8 The data editor needs to support pasting byte(s). The editor will provide the ability to paste bytes from the system clipboard into the file at a specified position for convenience and interoperability.
7.9 The data editor needs to support searching for patterns. The editor will provide a search function, similar to a text editor's find, using literal text. The pattern would be searched for literally in each given representation.
7.10 The data editor needs to support replacing search results with new patterns. The editor will provide a search-and-replace function, similar to a text editor's find and replace, that uses literal text and replaces the found text with alternate text. The pattern would be searched for literally in each given representation and replaced using text that is valid within that representation.
7.11 The data editor needs to use the native clipboard provided by the operating system for interoperability with other applications. The editor will use the operating system clipboard for copy and paste operations to improve interoperability with other applications.
7.12 The data editor needs to support applying a bit mask to an individual byte. The editor will provide a function to apply a mask to a byte at a position in the file.
7.13 The data editor needs to support applying a bit mask to a block of bytes. The editor will provide a function to apply a mask to a selection of bytes in the file.
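The editing primitives listed above (insert, delete, overwrite, search, and bit masks) can be sketched on an in-memory `bytearray`. This Python sketch is illustrative only and is not how the data editor is implemented; all function names are hypothetical, and an in-memory buffer is an assumption that would not suit very large files.

```python
def insert_bytes(buf: bytearray, offset: int, new: bytes) -> None:
    """Insert new bytes at offset, shifting the rest of the buffer right."""
    buf[offset:offset] = new


def delete_bytes(buf: bytearray, offset: int, length: int) -> None:
    """Delete a block of length bytes starting at offset."""
    del buf[offset:offset + length]


def overwrite_bytes(buf: bytearray, offset: int, new: bytes) -> None:
    """Overwrite bytes in place; refuses to write past the end of the buffer."""
    if offset < 0 or offset + len(new) > len(buf):
        raise ValueError("overwrite would run outside the buffer")
    buf[offset:offset + len(new)] = new


def find_all(buf: bytearray, pattern: bytes) -> list:
    """Return the offsets of every (possibly overlapping) match of pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    start = buf.find(pattern)
    while start != -1:
        offsets.append(start)
        start = buf.find(pattern, start + 1)
    return offsets


def apply_mask(buf: bytearray, offset: int, length: int, mask: int, op: str = "and") -> None:
    """Apply a one-byte mask with AND, OR, or XOR to a block of bytes."""
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in one byte")
    ops = {"and": lambda b: b & mask, "or": lambda b: b | mask, "xor": lambda b: b ^ mask}
    for i in range(offset, offset + length):
        buf[i] = ops[op](buf[i])


buf = bytearray(b"\x01\x02\x03")
insert_bytes(buf, 1, b"\xff\xfe")
delete_bytes(buf, 0, 1)
overwrite_bytes(buf, 2, b"\xaa\xbb")
apply_mask(buf, 0, 2, 0x0F, "and")
print(buf.hex(" "))
```

Single-byte operations are the same calls with a length of one, which is why the requirements pair each individual-byte function with a block version.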
8.1 All external files needed by the TDML file will be incorporated as relative paths into the TDML file.
8.2 TDML features need to be as modular as possible. Modularization allows for the future removal of TDML from the repository of the DFDL extension and addition to a library that can be shared by the DFDL repository.
8.3 TDML features need to be written in Scala and will read/write XML by using XML bindings (e.g., JAXB/scalaxb).
8.4 The extension needs to provide an item in the command palette (ctrl + shift + p) for ‘Generate TDML File’.
Selecting this command will display menus allowing the user to select the following:
This selection will work in the same way as the DFDL debugger. If the user selects the command from a DFDL Schema, it will automatically use that in place of a selection.
8.5 The extension needs to provide an item in the command palette (ctrl + shift + p) for ‘Add Test Case to TDML File’.
Selecting this command will display menus allowing the user to select the following:
This selection will work in the same way as the DFDL debugger. If the user selects the command from a DFDL Schema, it will automatically use that in place of a selection.
8.6 The extension needs to provide an item in the command palette (ctrl + shift + p) for ‘Run Test Case in TDML File’.
Selecting this command will display menus allowing the user to select the following:
This command will start the Daffodil process in run mode. This command will provide an option to start the Daffodil process in debug mode. The location of the DFDL Schema is expected to be relative to the location of the TDML File. It will be the responsibility of the user who created the TDML file to ensure that packaging of their TDML file is correct.
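Requirement 8.1 and the note above both depend on paths being expressed relative to the TDML file. The following Python sketch shows one way to compute and resolve such paths; it is only an illustration (the requirements call for a Scala implementation), and the function names and example paths are hypothetical.

```python
import os
from pathlib import Path


def relative_to_tdml(tdml_file: Path, target: Path) -> str:
    """Express target relative to the directory that contains the TDML file."""
    return Path(os.path.relpath(target, start=tdml_file.parent)).as_posix()


def resolve_from_tdml(tdml_file: Path, relative: str) -> Path:
    """Turn a path stored in the TDML file back into a normalized location."""
    return Path(os.path.normpath(tdml_file.parent / relative))


# Hypothetical layout used only for this example
tdml = Path("/work/tests/case.tdml")
schema = Path("/work/schemas/fmt.dfdl.xsd")
stored = relative_to_tdml(tdml, schema)
print(stored)
print(resolve_from_tdml(tdml, stored))
```

Storing only relative paths means a zipped directory containing the TDML file, schema, and data can be unpacked anywhere and still run, provided the layout inside the archive is preserved.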
9.1 The extension needs to provide context-sensitive auto-completion suggestions (IntelliSense) based on the DFDL language.
9.2 The IntelliSense suggestions for attributes need to supply an appropriate list of choices where applicable.
9.3 The IntelliSense for element tags needs to supply attributes appropriate for that specific tag.
9.4 The IntelliSense for element tags needs to supply attribute suggestions for newly inserted tags as well as when editing existing tags.
9.5 The IntelliSense needs to supply suggestions based on the contextual cursor position.
9.6 The IntelliSense suggestions need to work when multiple tags are on a single line as well as when each tag is on its own line.
9.7 IntelliSense needs to supply a closing tag when a closing tag is missing.
9.8 IntelliSense suggestions need to work when attributes are split on multiple lines.
10.1 Provide DFDL syntax colorization.
10.2 Matching tags within the DFDL Schema need to be highlighted.
10.3 XPath expressions embedded within the DFDL Schema should be highlighted.
The goal is to have these Apache Daffodil VS Code Extension capabilities incrementally released, and published to the Marketplace every few months.
The following table will be updated as new releases are published, or the themes/emphasis of a release change.
However, this is all highly subject to change based on the needs of the user community, and on what community developers choose to work on.
The semantic versioning release identifications are also subject to change.
| Release | Published to Marketplace? | Description | Issues |
|---|---|---|---|
| 1.1.0 (Target: July, 2022) | ✅ Yes | UI wireframes showing a vision of the data editor have been posted for discussion and feedback. The main editing viewport now has support for the delete and insert editing primitives in addition to overwrite. Support for multiple viewports, being able to undo and redo changes, cut and paste, and file saving are implemented. | Issues |
| 1.2.0 (Target: December, 2022) | ✅ Yes | Search and replace is implemented. Full-stack testing is in place. | Issues |
| 1.3.0 (Target: July, 2023) | ✅ Yes | Improvements to DFDL auto-completion (aka, “Intellisense”). Basic support for TDML. Editing is permitted in any of several viewports. Each viewport can display data in different formats (e.g., binary, hex, ASCII, big and little endian integers). | Issues |
| 1.3.1 (Target: August, 2023) | ✅ Yes | Refinement of DFDL auto-completion (aka, “Intellisense”), data editor large file support, mode simplification, incremental search and replace, updates to views and selections, multitasking support, data profiler, content discovery and editing additions. | Issues |
| 1.4.0 (Target: November, 2023) | ❌ Not yet | Unicode detection and profiling, language guessing, adjustable viewports, additional data display features, segment saving to file, streaming transforms MVP, breakpoints can be set at data offsets and debugging can start and stop at specified offsets. | Issues |
Support for:

* `Properties View` component.

More to come…
The Apache Daffodil™ Extension for Visual Studio Code is an extension to the Microsoft® Visual Studio Code (VS Code) editor which enables Data Format Description Language (DFDL) syntax highlighting, code completion, and the interactive debugging of DFDL Schema parsing operations using Apache Daffodil™.
DFDL is a data modeling language used to describe file formats. The DFDL language is a subset of eXtensible Markup Language (XML) Schema Definition (XSD). Just as file formats are rich and complex, so is the modeling language to describe them. Developing DFDL Schemas can be challenging, requiring a lot of iterative development and testing.
The purpose of the Apache Daffodil™ Extension for Visual Studio Code is to ease the burden on DFDL Schema developers, enabling them to develop high-quality DFDL Schemas in less time. VS Code is free, open source, cross-platform, well-maintained, extensible, and ubiquitous in the developer community. These attributes align well with the Apache Daffodil™ project and the Apache Daffodil™ Extension for Visual Studio Code.
DFDL is rich and complex. Developers using modern code editors expect some degree of built-in language support for the language in which they are developing, and DFDL should be no different. The Apache Daffodil™ Extension for Visual Studio Code provides syntax highlighting to improve the readability and context of the text. In addition, the syntax highlighting provides feedback to the developer indicating the structure and code appear syntactically correct.
The Apache Daffodil™ Extension for Visual Studio Code provides code completion, also known as “Intellisense”, offering context-aware code segment predictions that can dramatically speed up DFDL Schema development by reducing keyboard input, memorization by the developer, and typos.
The Apache Daffodil™ Extension for Visual Studio Code provides a Daffodil Data Parse Debugger which enables the developer to carefully control the execution of Apache Daffodil™ parse operations. Given a DFDL Schema and a target data file, the developer can step through the execution of a parse line by line, or until the parse reaches some developer-defined location, known as a break point, in the DFDL Schema. What is particularly helpful is that the developer can watch the parsed output, known as the “infoset”, as it’s being created by the parser, and see where the parser is parsing in the data file. This enables the developer to quickly discover and correct issues, improving DFDL Schema development and testing cycles.
The Apache Daffodil™ Extension for Visual Studio Code provides an integrated data editor as a new experimental feature that is currently under development. It is akin to a hex editor, but tuned specifically for challenging Daffodil use cases. It is designed to support files of virtually any size, well beyond the limits of the standard text editor in VS Code, and it can handle non-text data just as well as text data. It has support for setting Daffodil debugger breakpoints on offset positions in the data file in addition to the positions in the DFDL Schema. It handles non-standard byte sizes, non-aligned bytes, and bit ordering where the Least Significant Bit (LSB) can be the first or last bit in a byte. As an editor designed for Daffodil developers by Daffodil developers, features of the tool will evolve quickly to address the specific needs of the Daffodil community.
This guide assumes VS Code and a Java Runtime Environment (Java 8 or greater) are installed.
The Apache Daffodil™ Extension for Visual Studio Code can be installed using one of two methods.
The Apache Daffodil™ Extension for Visual Studio Code is available in the Visual Studio Code Extension Marketplace.
The latest `.vsix` file (the file extension used for VS Code extensions) can also be downloaded from the Apache Daffodil™ Extension for Visual Studio Code releases page and installed by either:

* running `code --install-extension <path-to-downloaded-vsix-file>`; or
* typing `vsix` in the VS Code command palette to bring up the install command and pointing it at the downloaded `.vsix` file, as demonstrated in the following animation.

Since DFDL Schema files end with `.xsd` (XML Schema Definition or XSD), the editor needs to be informed specifically that DFDL mode is desired over the more general XML mode. The following animation demonstrates how to set the desired mode for DFDL.
Auto suggest is triggered by pressing Control+Space or by typing the beginning characters of an item, as demonstrated in the following animation.

📝 NOTE: Intellisense is context aware, so there is no need to begin a block with `<`; just start typing the tag name and code completion will automatically handle it as appropriate.
Typing one or more unique characters will further limit the results, as demonstrated in the following animation.
Code completion can be used to add the schema block, with just a couple of keystrokes, as demonstrated in the following animation.
Code completion can make short work out of completing a DFDL Format Block, offering context-sensitive suggestions for the format attribute values, as demonstrated in the following animation.
The `>` or `/` characters are used to close XML tags. Use `tab` to select an item from the drop-down and to exit double quotes, as demonstrated in the following animation.

Code completion supports creating self-defined `dfdl:complextypes` and `dfdl:simpleTypes`, as demonstrated in the following animation.

The `tab` key can be used to complete an auto-complete item within an XML tag. After auto-complete is triggered, typing the initial character or characters will limit the suggestion results. Inside an XML tag, a `space` or `carriage return` will trigger a list of context-sensitive attribute suggestions, as demonstrated in the following animation.
The following animation demonstrates how code completion can be used to efficiently help create self-defined types.
The following animation demonstrates how code completion can be used to efficiently create `xs:choice`s and `dfdl:discriminator`s.

The following animation demonstrates how code completion can help authors use hidden references and `dfdl:inputValueCalc`.

The following animation demonstrates how code completion can help with creating elements using `dfdl:outputValueCalc`.
The following animation demonstrates examples of code completion assisting in the creation of more user-defined types.
XPath expressions can be code completed. The following animation demonstrates how XPath expressions are completed when calculating `dfdl:Length` values.

The following animation demonstrates how code completion can be used to help create `dfdl:assert` blocks.

The following animation demonstrates another couple of examples of `dfdl:assert` block creation using code completion.
Debugging a DFDL Schema needs both the DFDL Schema to use and a data file to parse. Instead of having to select the DFDL Schema and the data file each time from a file picker, a “launch configuration” can be created, which is a JSON description of the debugging session.
To create the launch profile:
1. Select `Run -> Open Configurations` from the VS Code menubar. This will load a `launch.json` file into the editor. There may be existing `configurations`, or it may be empty.
2. Press `Add Configuration...` and select the `Daffodil Debug - Launch` option.

Once the `launch.json` file has been created, it will look something like this:

```json
{
  "type": "dfdl",
  "request": "launch",
  "name": "Ask for file name",
  "program": "${command:AskForProgramName}",
  "stopOnEntry": true,
  "data": "${command:AskForDataName}",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
```
This default configuration will prompt the user to select the DFDL Schema and data files. If desired, the “program” and “data” elements can be mapped specifically to the user’s files to avoid being prompted each time.
📝 Note: Use `${workspaceFolder}` for files in the VS Code workspace and use absolute paths for files outside of the workspace.

```json
{
  "type": "dfdl",
  "request": "launch",
  "name": "DFDL parse: My Data",
  "program": "${workspaceFolder}/schema.dfdl.xsd",
  "stopOnEntry": true,
  "data": "/path/to/my/data",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
```
Using the launch profile above, a `DFDL parse: My Data` menu item will display at the top of the `Run and Debug` pane (Command-Shift-D). Then press the `play` button to start the debugging session.
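When several data files need their own launch profile, the configuration objects can be generated rather than written by hand. The Python sketch below builds an object with the same keys as the examples in this guide; the `make_launch_config` helper is a hypothetical name, and the output still has to be placed into the `launch.json` file manually.

```python
import json


def make_launch_config(name: str, schema: str, data: str,
                       infoset: str = "${workspaceFolder}/infoset.xml",
                       port: int = 4711) -> dict:
    """Build one launch configuration using the keys shown in this guide."""
    return {
        "type": "dfdl",
        "request": "launch",
        "name": name,
        "program": schema,       # the DFDL Schema to debug
        "stopOnEntry": True,
        "data": data,            # the data file to parse
        "infosetOutput": {"type": "file", "path": infoset},
        "debugServer": port,
    }


config = make_launch_config(
    "DFDL parse: My Data",
    "${workspaceFolder}/schema.dfdl.xsd",
    "/path/to/my/data",
)
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the list of configurations in `launch.json`, after which the new entry appears by name in the `Run and Debug` pane.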
In the Terminal, log output from the DFDL debugger backend service will display. If something is not working as expected, check the output in this Terminal window for hints.
The DFDL Schema file will also be loaded in VS Code, and there should be a visible marking at the beginning where the debugger has paused upon entry to the debugging session. Control the debugger using the available VS Code debugger controls such as `setting breakpoints`, `removing breakpoints`, `continue`, `step over`, `step into`, and `step out`.
Daffodil Debug:

* `Daffodil Debug: Debug File` - This allows the user to fully step through the DFDL Schema. Once completed, it writes the infoset to a file named `SCHEMA-infoset.xml`, which it then opens as well.
* `Daffodil Debug: Run File` - This runs the DFDL Schema, writing the infoset to a file named `SCHEMA-infoset.xml`.
* `Debug File` - This allows the user to fully step through the schema (WIP). Once completed, it writes an infoset to a file named `SCHEMA-infoset.xml`, which it then opens as well.
* `Run File` - This runs the DFDL Schema, writing the infoset to a file named `SCHEMA-infoset.xml`, which it then opens as well.

Find the infoset tools from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
Find the hex view from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
To enable the Apache Daffodil™ Extension for Visual Studio Code experimental features, from the command menu start typing ‘daffodil’, then select `Daffodil Debug: Enable Experimental Features`, then select `Yes`.
🧪 Warning: This is currently an experimental feature in development.
Ωedit is being integrated as the experimental data editor in the Apache Daffodil™ Extension for Visual Studio Code. Once experimental features are enabled, find the Data Editor in the command menu by typing ‘omega’, then select `OmegaEdit: Data Editor`.
After selecting a file to edit, a Data Editor tab will appear.
As of v1.2.0, this experimental feature is far from functional, but will be improving over time.
On macOS, using Homebrew:

```
# Install Java 11 from a macOS terminal
brew install java11
```

Add or change `JAVA_HOME` in the `~/.zshrc` file (or equivalent):

```
# Java 11
export JAVA_HOME=/usr/local/Cellar/openjdk@11/11.0.12
```

Be sure `code` is in the `PATH` by following the instructions here.

With `JAVA_HOME` set to the Java 11 install, run `code` in the terminal.
If problems are encountered or new features are desired, create tickets here.
If additional help or guidance on using Daffodil and its tooling is needed, please engage with the community on mailing lists and/or review the archives.
The Apache Daffodil™ Extension for Visual Studio Code provides an integrated data editor. It is akin to a hex editor, but tuned specifically for challenging Daffodil use cases. As an editor designed for Daffodil developers by Daffodil developers, features of the tool will evolve quickly to address the specific needs of the Daffodil community.
The Apache Daffodil™ Extension for Visual Studio Code provides TDML support. TDML is a way of specifying a DFDL schema, input test data, and the expected result or expected error/diagnostic messages, all self-contained in an XML file. A TDML file is often useful just to ask a question about how something in DFDL works. For example, when uploading files to the Daffodil users mailing list, it may be easier to upload a zip file containing a TDML file, the DFDL Schema file, the input data file, and, optionally, the infoset file. Other users can then unpack the zip file and run your test case, which becomes even more convenient if you have multiple test cases. TDML allows for a level of precision that is often lacking, but also often required, when discussing complex data format issues. As such, providing a TDML file along with a bug report is the best way to demonstrate a problem. You can read more about TDML on the Apache Daffodil™ website.
This guide assumes VS Code and a Java Runtime Environment (Java 8 or greater) are installed.
The Apache Daffodil™ Extension for Visual Studio Code can be installed using one of two methods.
The Apache Daffodil™ Extension for Visual Studio Code is available in the Visual Studio Code Extension Marketplace.
The latest .vsix file (the file extension used for VS Code extensions) can also be downloaded from the Apache Daffodil™ Extension for Visual Studio Code releases page and installed by either:
* running code --install-extension <path-to-downloaded-vsix-file> from the command line; or
* opening the Command Palette in VS Code, typing vsix to bring up the install command, and pointing it at the downloaded .vsix file.
Since DFDL Schema files end with .xsd (XML Schema Definition or XSD), the editor needs to be informed specifically that DFDL mode is desired over the more general XML mode. The mode is selected in the status bar at the bottom of the editor window.
Auto suggest is triggered using Ctrl+Space or by typing the beginning characters of an item. Typing one or more unique characters will further limit the results.
📝 NOTE: Intellisense is context aware, so there is no need to begin a block with <. Just start typing the tag name and code completion will automatically handle it as appropriate.
Code completion can be used to add a schema block with just a couple of keystrokes. Code completion can make short work of completing a DFDL Format Block, offering context-sensitive suggestions for attribute values.
The > or / characters are used to close XML tags. Use tab to select an item from the drop down and to exit double quotes.
Code completion supports creating self-defined dfdl:complextypes and dfdl:simpleTypes.
The tab key can be used to complete an auto-complete item within an XML tag. After auto-complete is triggered, typing the initial character or characters will limit the suggestion results.
Inside an XML tag, a space or carriage return will trigger a list of context-sensitive attribute suggestions.
Install the Apache Daffodil VS Code Extension from the VS Code Marketplace.
Open a schema file in the editor and set the language mode located in the bottom right corner to dfdl.
Click the language in the bottom right of the status bar or type Ctrl+Shift+p and enter ‘language mode’, then select dfdl from the list of available languages.
Press ctrl+space in the empty editor window. The XML version declaration should appear as the only choice. Select that choice by pressing the enter key.
Press ctrl+space again and the schema choice will show. Press enter to accept the schema choice.
Select nul, or one of the other choices in the choice list. If you select nul for no namespace, you will need to backspace over the null character to remove it. If you want to type in a different namespace choice, remove null and type in your namespace choice followed by a colon ‘:’. If you select a namespace option here, it will be used throughout the schema as a namespace prefix to standard XML elements. The dfdl namespace prefix will automatically be added to dfdl elements. After selecting or writing in a namespace option, press the tab key to move to the end of the schema tag block.
At the end of the schema tag block, you can type ‘>’ to auto-end the schema block. Intellisense will place the end tag character on the schema open tag block, create the schema closing tag, and position the cursor between the tags.
Press ctrl+space to get a list of element type choices available within the schema tags. Select a choice and press enter.
Attributes can be supplied in the sequence open tag. To get a list of attribute choices press space at the cursor position. Intellisense will open a menu that allows a selection of an attribute. If the attribute has predetermined choices a list of those will appear after the attribute is selected.
The separator attribute doesn’t have a specific list of choices. The comma was manually entered to provide a value to the field. Press tab to exit the double quotes. The cursor will be positioned immediately after the ending double quote.
Type space again to choose another attribute, or type / to create a self-closing tag. After typing a slash to close the tag, the cursor will be positioned at the end of the tag. Press enter to continue on the next line.
Press ctrl+space to get a list of element choices.
A tag can also be closed by typing ‘>’ at the cursor position after the tag.
Closing a tag with a ‘>’ will normally result in a closing tag on a new line and the cursor positioned between the two tags. (If an open tag is split over multiple lines, the closing tag is not moved to the next line. This behavior can be changed based on community input).
Press ctrl+space on the empty line to get a list of element choices available between tags.
Select a choice by pressing enter. In this example the element tag with the attribute name was selected and a value for name entered. Press tab to exit the double quotes after entering a name value. The name attribute doesn’t have a specific list of choices.
Type ctrl+space to get a list of attribute choices for the element tag.
Selecting an attribute that has predetermined choices will supply a list of those choices. Select an item from the list and press enter. End the tag with ‘>’ to get a closing tag on a new line with the cursor positioned between the tags.
On the new line press ctrl+space to get a list of element choices for the element tag.
Select a choice and press ctrl+space to get list of choices for the selected annotation tag set.
Select a choice and press ctrl+space to supply a list of choices available in the appinfo tag set.
Select a choice by pressing enter.
The discriminator test dfdl attribute doesn’t have a specific list of choices. Press tab to exit the double quotes. The cursor will be positioned immediately after the ending double quote.
To add additional attributes to an existing element tag, position the cursor within the opening tag, press ctrl+space, or space to get a list of attribute choices for that tag.
Adding a new line anywhere in the schema and pressing ctrl+space will provide a list of choices available between the tags at the current position.
If a closing tag is deleted or missing, type ‘>’ to re-add the closing tag at the cursor position.
The closing tag will be re-added and the cursor will be placed at the end of the line.
XPath expressions can be code completed.
Debugging a DFDL Schema needs both the DFDL Schema to use and a data file to parse. Instead of having to select the DFDL Schema and the data file each time from a file picker, a “launch configuration” can be created, which is a JSON description of the debugging session.
A launch configuration can be created using the Launch Wizard or manually through the .vscode/launch.json file.
The Launch Wizard can be accessed in two ways: from the edit window when editing a DFDL schema file,
or through the Command Palette (Ctrl + Shift + P) by searching for Configure launch.json.
A new tab will be created with the Launch Config Wizard.
Here you can create or edit Daffodil Debugger Config Settings.
The drop down under Launch Config will allow you to create a new config and name it, or you can select an already created config from the drop down.
The Daffodil Debugger Classpath is for additional classpaths that you would like the debugger to retrieve files from. Use ${workspaceFolder} for files in the VS Code workspace, and use absolute paths for files outside of the workspace.
Under the Data section, you can specify an absolute path to the data input file, or leave it as a command and the debugger will ask you each time you run it.
The Debug Server setting specifies the port that the debug server should be running on.
The Infoset Format setting lets the user have infosets generated in either XML or JSON format.
The Infoset Output Type setting lets the user specify a destination for the infoset: a file placed at the path given by the user, the console, or none for no infoset output.
The three checkboxes open additional views when the debugger runs:
* Hex View – Shows daffodil schema in a datafile-hex view
* Infoset Diff View – Shows a side-by-side diff of the previous and current infoset file
* Infoset View – Shows the infoset file being created in real time as the debugger runs
The TDML Action section allows the user to specify whether a TDML file should be generated, appended to the end of a previously created TDML file, or not created.
If set to generate or append, a TDML file name, description, and file path must be given.
Under Program, an absolute path to the DFDL schema file can be given, or leave it as a command and the debugger will ask you each time you run it.
The Stop On Entry checkbox will make the debugger automatically pause after launching. This allows the user to set breakpoints before running the file through.
The Trace checkbox enables logging of the Debug Adapter Protocol.
Under Data Editor Settings, there are configurations for Omega Edit. Here you can specify the port, log file location, and log level.
The Use Existing Server checkbox will enable a connection to a Debug Adapter Protocol (DAP) server.
Once all configurations have been completed, they can be saved and a launch.json file will be created.
Select Run -> Open Configurations from the VS Code menubar. This will load a launch.json file into the editor. There may be existing configurations, or it may be empty.
Press Add Configuration... and select the Daffodil Debug - Launch option.
Once the launch.json file has been created, it will look something like this:
{
  "type": "dfdl",
  "request": "launch",
  "name": "Ask for file name",
  "program": "${command:AskForProgramName}",
  "stopOnEntry": true,
  "data": "${command:AskForDataName}",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
This default configuration will prompt the user to select the DFDL Schema and data files. If desired, the “program” and “data” elements can be mapped specifically to the user’s files to avoid being prompted each time.
📝 Note: Use ${workspaceFolder} for files in the VS Code workspace, and use absolute paths for files outside of the workspace.
{
  "type": "dfdl",
  "request": "launch",
  "name": "DFDL parse: My Data",
  "program": "${workspaceFolder}/schema.dfdl.xsd",
  "stopOnEntry": true,
  "data": "/path/to/my/data",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
Using the launch profile above, a DFDL parse: My Data menu item will display at the top of the Run and Debug pane (Command-Shift-D). Then press the play button to start the debugging session.
In the Terminal, log output from the DFDL debugger backend service will display. If something is not working as expected, check the output in this Terminal window for hints.
The DFDL Schema file will also be loaded in VS Code and there should be a visible marking at the beginning where the debugger has paused upon entry to the debugging session. Control the debugger using the available VS Code debugger controls such as setting breakpoints, removing breakpoints, continue, step over, step into, and step out.
Daffodil Debug:
* Daffodil Debug: Debug File - This will allow the user to fully step through the DFDL Schema. Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Daffodil Debug: Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml.
* Debug File - This will allow the user to fully step through the schema (WIP). Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
Find the infoset tools from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
Find the hex view from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
This version of the Apache Daffodil™ Extension for Visual Studio Code includes a new Data Editor. To use the Data Editor, open the VS Code command palette and select Daffodil Debug: Data Editor.
A notification message will appear stating where the Data Editor will write its logs. If problems occur, check this log file for clues.
Once the extension is connected to the server, the bottom left corner of the Data Editor shows the version of the Ωedit server powering the editor, and the port it is connected to. Hovering over the filled circle shows the CPU load average, the memory usage of the server in bytes, the server session count, the server uptime measured in seconds, and the round trip latency measured in milliseconds.
After selecting a file to edit, there will be a table with controls at the top of the Data Editor.
The first section of the table is called File Metrics and it contains the path of the file being edited, its initial size in bytes, and the size as the file is being edited. When changes are committed, the Save button will become enabled, allowing the changes to be saved to file.
The second section of the table is called Search, and it allows for searching of byte sequences in the given Edit Encoding. If the Edit Encoding can be case-insensitive, a Case Insensitive checkbox will be displayed allowing for that option to be enabled. The found sequences can be examined using the Prev and Next buttons found in this section. Found sequences can also be replaced in the given Edit Encoding by filling in a replacement sequence. Currently all the sequences will be replaced.
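As a rough illustration of what searching "in the given Edit Encoding" means, the sketch below encodes the search text into bytes, scans a buffer for every match, and replaces all matches, mirroring the current replace-all behavior. It is a simplified Python model, not the Data Editor's implementation; the function names find_all and replace_all are hypothetical, and case-insensitive matching is not modeled.

```python
def find_all(data: bytes, text: str, encoding: str = "latin-1") -> list:
    """Return the byte offset of every occurrence of text, encoded with encoding."""
    needle = text.encode(encoding)
    if not needle:
        return []  # an empty search pattern matches nothing in this sketch
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def replace_all(data: bytes, text: str, replacement: str,
                encoding: str = "latin-1") -> bytes:
    """Replace every occurrence of text; there is no single-match replace here."""
    needle = text.encode(encoding)
    if not needle:
        return data
    return data.replace(needle, replacement.encode(encoding))


buffer = b"\x89PNG..PNG"
print(find_all(buffer, "PNG"))             # [1, 6]
print(replace_all(buffer, "PNG", "JPEG"))  # b'\x89JPEG..JPEG'
```

The same text yields a different byte sequence under a different encoding (for example utf-16-le), which is why the search depends on the selected Edit Encoding.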
The third section of the table is called Settings, and it allows for toggling the Byte Edit Mode from Single to Multiple.
In Single byte edit mode, individual bytes may be deleted, inserted (to the left or to the right of the selected byte), and overwritten in the Ephemeral Edit Window that appears when a byte in the Physical or Logical viewports is clicked. Mouse over the buttons of the Ephemeral Edit Window to determine what each button does. Mouse over the Input Box and it will show the byte offset position in the selected Address Radix. Buttons will become enabled or disabled depending on whether there is valid input in the Input Box. Values entered in the Input Box must match the format set by the byte display radix when editing bytes in the Physical viewport, or be in Latin-1 (8-bit ASCII) format when editing bytes in the Logical viewport.
In Multiple byte edit mode, a segment of bytes is selected from either the Physical or Logical viewports, then the selected segment of bytes is edited in the Edit viewport using the selected Edit Encoding. Once editing of the selected segment is completed, the Commit button is pressed, and the edited segment replaces the selected segment.
Byte addresses can be expressed in hexadecimal, decimal, or octal. The selected Address Radix is also used when entering an offset into the Offset input. If an offset was entered in the Offset input and the Address Radix is changed, the offset will automatically be converted into the selected radix.
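The radix conversion described above amounts to parsing the offset in the old radix and printing it in the new one. The following Python sketch shows that arithmetic; the function name convert_offset is hypothetical and this is not the extension's own code.

```python
# Format codes for the three address radixes mentioned in the text.
RADIX_FORMATS = {16: "x", 10: "d", 8: "o"}


def convert_offset(offset_text: str, from_radix: int, to_radix: int) -> str:
    """Re-express an offset string written in from_radix using to_radix."""
    value = int(offset_text, from_radix)  # raises ValueError on invalid digits
    return format(value, RADIX_FORMATS[to_radix])


print(convert_offset("ff", 16, 10))  # 255
print(convert_offset("255", 10, 8))  # 377
print(convert_offset("377", 8, 16))  # ff
```

The byte position itself never changes; only its textual representation does.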
In Single byte edit mode, byte editing can be done in the Physical viewport, or the Logical viewport. The Physical viewport shows the bytes as they are stored in the file and can be represented in Hexadecimal, Decimal, Octal, or Binary depending on the Byte Display Radix. The Logical viewport always shows the bytes as Latin-1. The Data View shows the integer and floating point values of the bytes starting at the selected address. The values in the Data View will be expressed in the selected Endianness (Little or Big).
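To make the effect of the Endianness setting concrete, here is a small Python sketch that interprets the bytes starting at a selected address as a few integer and floating point types. The function name data_view and the particular set of types are assumptions for illustration; the actual Data View may show a different set of values.

```python
import struct


def data_view(data: bytes, offset: int, little_endian: bool) -> dict:
    """Interpret the bytes starting at offset in the chosen byte order."""
    prefix = "<" if little_endian else ">"
    window = data[offset:]
    view = {}
    for label, code in (("int16", "h"), ("uint32", "I"), ("float32", "f")):
        size = struct.calcsize(prefix + code)
        if len(window) >= size:  # skip types that need more bytes than remain
            view[label] = struct.unpack(prefix + code, window[:size])[0]
    return view


sample = bytes([0x00, 0x00, 0x80, 0x3F])
print(data_view(sample, 0, little_endian=True))
print(data_view(sample, 0, little_endian=False))
```

The same four bytes read as the float 1.0 in little-endian order but as a very different value in big-endian order, which is why the setting matters when inspecting binary formats.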
In Multiple byte edit mode, byte editing can only be done in the Edit viewport using a selection of bytes from the Physical or Logical viewports. The Edit viewport shows the bytes represented in Hexadecimal, Binary, ASCII, Latin-1, UTF-8, or UTF-16LE (UTF-16 Little Endian), depending on the Edit Encoding. Once the editing of that segment is done, the Commit button is pressed, and the edited segment replaces the selected segment in the Physical and Logical viewports.
Regardless of the Byte Edit Mode, changes can be undone and redone using the Undo and Redo buttons. The Revert All button will revert all changes made to the file since it was opened in the Data Editor.
The Data Editor supports light and dark modes. The mode is determined by the VSCode theme. If the VSCode theme is set to a light theme, the Data Editor will be in light mode. If the VSCode theme is set to a dark theme, the Data Editor will be in dark mode.
Users can update the settings for the Data Editor using the launch config file (.vscode/launch.json). The settings are added to a configuration like this:
{
  "version": "0.2.0",
  "configurations": [
    {
      ...
      "dataEditor": {
        "port": 9001,
        "logFile": "/tmp/dataEditor-9001.log",
        "logLevel": "debug"
      }
    }
  ]
}
If one or more of these items are not set, they will be set to their default values. Below are the default values:
"dataEditor": {
  "port": 9000,
  "logFile": "${workspaceFolder}/dataEditor-${omegaEditPort}.log",
  "logLevel": "info"
}
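The fallback to defaults can be pictured as a simple merge in which any key present in the configuration overrides the default. The Python sketch below models that behavior under this assumption; it is not the extension's code, and the function name effective_data_editor_settings is hypothetical.

```python
# Default values as listed above.
DEFAULT_DATA_EDITOR = {
    "port": 9000,
    "logFile": "${workspaceFolder}/dataEditor-${omegaEditPort}.log",
    "logLevel": "info",
}


def effective_data_editor_settings(configuration: dict) -> dict:
    """Overlay the configuration's dataEditor block on top of the defaults."""
    overrides = configuration.get("dataEditor", {})
    return {**DEFAULT_DATA_EDITOR, **overrides}


config = {"type": "dfdl", "dataEditor": {"port": 9001, "logLevel": "debug"}}
print(effective_data_editor_settings(config))
```

Here only port and logLevel are overridden, so logFile keeps its default value.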
The current editing limit is 1,000,000 bytes. This is due to the amount of memory it takes to encode and display all the bytes in the viewports.
Only one Data Editor instance can be opened at one time.
Viewport selections do not persist when they lose focus. This is a limitation of implementing the display viewports using textarea elements.
Currently Replace will replace all instances of the given search pattern with the replacement pattern.
As of v1.3.0, this feature is minimally viable and will be improving over time. Expect these limitations to be removed in the next release.
📝 Note: The glyph used for non-printable characters (░) may appear different on different platforms and OS/font configurations.
To generate a TDML file, use steps similar to those for launching a DFDL parse debugging session:
* Open the DFDL Schema file
* From inside the file, open the Command Palette (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
* Once the Command Palette is opened, select the Daffodil Debug: Generate TDML command
* From there, you will be asked to provide the input data file, the TDML test case name, the TDML test case description, and the location/name for the TDML file.
Once the Daffodil Parse has finished, an infoset and a TDML file will be created. The TDML file contains relative paths to the DFDL Schema file, input data file, and infoset file. When creating an archive for these files, preserve the directory structure in the archive.
To append a new test case to an existing TDML file, use steps similar to those for generating a TDML file:
* Open the DFDL Schema file
* From inside the file, open the Command Palette (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
* Once the Command Palette is opened, select the Daffodil Debug: Append TDML command
* From there, you will be asked to provide the input data file, the TDML test case name, the TDML test case description, and the TDML file
Once the Daffodil Parse has finished, an infoset will be created, and a test case will be added to the existing TDML file. Test cases may share a name or a description, but no two test cases should share both the same name and the same description. To create an archive for a TDML file with multiple test cases, follow the same guidelines as for archiving a TDML file created by a ‘Generate TDML’ operation. All DFDL schema files, input data files, the TDML file, and, optionally, the infosets should be added to the archive. Additionally, any directory structure should be preserved in the archive so that the relative paths in the TDML file can be resolved.
When running a zip archive created from another user, extract the archive into your workspace folder. If there is an infoset in the zip archive that you wish to compare with your infoset, make sure that the infoset from the zip archive is not located at the same place as the default infoset for the Daffodil Parse that will be run when executing a test case from the TDML file. This is because the Daffodil Parse run by executing the TDML test case uses the default location for its infoset and will overwrite anything that already exists there.
To execute a test case from a TDML file, use the following steps:
* Open a DFDL Schema file
* From inside the file, open the Command Palette (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
* Once the Command Palette is opened, select the Daffodil Debug: Execute TDML command
* From there, you will be asked to provide the TDML file, TDML test case name, and TDML test case description
A Daffodil Parse will then be launched. The DFDL Schema file and input data file to be used are determined by the selected test case in the TDML file. The infoset that is generated from this parse can optionally be compared to an infoset included in the zip archive the TDML file was extracted from.
A TDML file consists of Test Cases. Each test case describes a DFDL parse operation and points to the inputs and outputs of the DFDL parse operation.
* Inputs - DFDL Schema file and input data file
* Outputs - Infoset file
Additionally, each Test Case should be uniquely identified by the combination of its name and description. Currently, this is not enforced, and a duplicated test case can never be selected by the TDML Execute operation.
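Because uniqueness is not enforced, it can be worth checking a TDML file for clashing test cases before sharing it. The Python sketch below flags any (name, description) pair that occurs more than once; the function name duplicate_test_cases is hypothetical and the check is not part of the extension.

```python
from collections import Counter


def duplicate_test_cases(cases):
    """Return the (name, description) pairs that appear more than once."""
    counts = Counter(cases)
    return sorted(pair for pair, count in counts.items() if count > 1)


cases = [
    ("Default Test Case", "Generated by DFDL VSCode Extension"),
    ("Default Test Case", "Second data file"),  # same name, new description: fine
    ("Default Test Case", "Generated by DFDL VSCode Extension"),  # clash
]
print(duplicate_test_cases(cases))
```

Only the first and third entries clash, since they share both the name and the description.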
Below is a sample TDML file with a single Test Case, along with XPath expressions describing where each item can be found inside of a Test Case.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns1:testSuite xmlns:ns1="http://www.ibm.com/xmlns/dfdl/testData" xmlns:ns2="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:ns3="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext" xmlns:ns4="http://www.ogf.org/dfdl/dfdl-1.0/extensions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns6="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:int" suiteName="Default Test Case" defaultRoundTrip="onePass">
  <ns1:parserTestCase name="Default Test Case" root="file" model="png.dfdl.xsd" roundTrip="onePass" description="Generated by DFDL VSCode Extension">
    <ns1:document>
      <ns1:documentPart type="file">di4zg8Kie.png</ns1:documentPart>
    </ns1:document>
    <ns1:infoset>
      <ns1:dfdlInfoset type="file">png-infoset.xml</ns1:dfdlInfoset>
    </ns1:infoset>
  </ns1:parserTestCase>
</ns1:testSuite>
* /ns1:testSuite/ns1:parserTestCase/@model contains the relative path to the DFDL Schema file. This path is relative to the location of the TDML file
* /ns1:testSuite/ns1:parserTestCase/@name contains the name of the Test Case
* /ns1:testSuite/ns1:parserTestCase/@description contains a description of the Test Case
* /ns1:testSuite/ns1:parserTestCase/ns1:document/ns1:documentPart/text() contains the relative path to the input data file. This path is relative to the location of the TDML file
* /ns1:testSuite/ns1:parserTestCase/ns1:infoset/ns1:dfdlInfoset/text() contains the relative path to the infoset file created with the parameters of this test case. This path is relative to the location of the TDML file
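The locations listed above can be read programmatically. The following Python sketch uses the standard library's ElementTree to pull the same five items out of a trimmed, well-formed copy of the sample TDML file. The function name read_test_cases and the dictionary keys are hypothetical; the namespace URI is taken from the sample.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping for the ns1 prefix used in the sample TDML file.
TDML_NS = {"ns1": "http://www.ibm.com/xmlns/dfdl/testData"}

SAMPLE_TDML = """\
<ns1:testSuite xmlns:ns1="http://www.ibm.com/xmlns/dfdl/testData" suiteName="Default Test Case">
  <ns1:parserTestCase name="Default Test Case" root="file" model="png.dfdl.xsd"
                      description="Generated by DFDL VSCode Extension">
    <ns1:document>
      <ns1:documentPart type="file">di4zg8Kie.png</ns1:documentPart>
    </ns1:document>
    <ns1:infoset>
      <ns1:dfdlInfoset type="file">png-infoset.xml</ns1:dfdlInfoset>
    </ns1:infoset>
  </ns1:parserTestCase>
</ns1:testSuite>
"""


def read_test_cases(tdml_text: str) -> list:
    """Collect name, description, and relative file paths for each test case."""
    root = ET.fromstring(tdml_text)
    cases = []
    for case in root.findall("ns1:parserTestCase", TDML_NS):
        cases.append({
            "name": case.get("name"),
            "description": case.get("description"),
            "model": case.get("model"),
            "data": case.findtext("ns1:document/ns1:documentPart", namespaces=TDML_NS),
            "infoset": case.findtext("ns1:infoset/ns1:dfdlInfoset", namespaces=TDML_NS),
        })
    return cases


for test_case in read_test_cases(SAMPLE_TDML):
    print(test_case)
```

All three paths returned here are relative to the TDML file's location, so they need to be resolved against that directory before opening the files.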
If problems are encountered or new features are desired, create tickets here.
If additional help or guidance on using Daffodil and its tooling is needed, please engage with the community on mailing lists and/or review the archives.
The Apache Daffodil™ Extension for Visual Studio Code is an extension to the Microsoft® Visual Studio Code (VS Code) editor which enables Data Format Description Language (DFDL) syntax highlighting, code completion, and the interactive debugging of DFDL Schema parsing operations using Apache Daffodil™.
DFDL is a data modeling language used to describe file formats. The DFDL language is a subset of eXtensible Markup Language (XML) Schema Definition (XSD). Just as file formats are rich and complex, so is the modeling language used to describe them. Developing DFDL Schemas can be challenging, requiring a lot of iterative development and testing.
The purpose of the Apache Daffodil™ Extension for Visual Studio Code is to ease the burden on DFDL Schema developers, enabling them to develop high-quality DFDL Schemas in less time. VS Code is free, open source, cross-platform, well-maintained, extensible, and ubiquitous in the developer community. These attributes align well with the Apache Daffodil™ project and the Apache Daffodil™ Extension for Visual Studio Code.
DFDL is rich and complex. Developers using modern code editors expect some degree of built-in language support for the language in which they are developing, and DFDL should be no different. The Apache Daffodil™ Extension for Visual Studio Code provides syntax highlighting to improve the readability and context of the text. In addition, the syntax highlighting provides feedback to the developer indicating the structure and code appear syntactically correct.
The Apache Daffodil™ Extension for Visual Studio Code provides code completion, also known as “Intellisense”, offering context-aware code segment predictions that can dramatically speed up DFDL Schema development by reducing keyboard input, memorization by the developer, and typos.
The Apache Daffodil™ Extension for Visual Studio Code provides a Daffodil Data Parse Debugger which enables the developer to carefully control the execution of Apache Daffodil™ parse operations. Given a DFDL Schema and a target data file, the developer can step through the execution of a parse line by line, or until the parse reaches some developer-defined location, known as a break point, in the DFDL Schema. What is particularly helpful is that the developer can watch the parsed output, known as the “infoset”, as it’s being created by the parser, and see where the parser is parsing in the data file. This enables the developer to quickly discover and correct issues, improving DFDL Schema development and testing cycles.
The Apache Daffodil™ Extension for Visual Studio Code provides an integrated data editor. It is akin to a hex editor, but tuned specifically for challenging Daffodil use cases. As an editor designed for Daffodil developers by Daffodil developers, features of the tool will evolve quickly to address the specific needs of the Daffodil community.
This guide assumes VS Code and a Java Runtime Environment (Java 8 or greater) are installed.
The Apache Daffodil™ Extension for Visual Studio Code can be installed using one of two methods.
The Apache Daffodil™ Extension for Visual Studio Code is available in the Visual Studio Code Extension Marketplace. This option is recommended for most users.
The latest .vsix file (the file extension used for VS Code extensions) can also be downloaded from the Apache Daffodil™ Extension for Visual Studio Code releases page and installed by either:
* running code --install-extension <path-to-downloaded-vsix-file> from the command line; or
* opening the Command Palette in VS Code, typing vsix to bring up the install command, and pointing it at the downloaded .vsix file.
Since DFDL Schema files end with .xsd (XML Schema Definition or XSD), the editor needs to be informed specifically that DFDL mode is desired over the more general XML mode. The mode is selected in the status bar at the bottom of the editor window.
Auto suggest is triggered using Ctrl+Space or by typing the beginning characters of an item. Typing one or more unique characters will further limit the results.
📝 NOTE: Intellisense is context aware, so there is no need to begin a block with <. Just start typing the tag name and code completion will automatically handle it as appropriate.
Code completion can be used to add a schema block with just a couple of keystrokes. Code completion can make short work of completing a DFDL Format Block, offering context-sensitive suggestions for attribute values.
The > or / characters are used to close XML tags. Use tab to select an item from the drop down and to exit double quotes.
Code completion supports creating self-defined dfdl:complextypes and dfdl:simpleTypes.
The tab key can be used to complete an auto-complete item within an XML tag. After auto-complete is triggered, typing the initial character or characters will limit the suggestion results.
Inside an XML tag, a space or carriage return will trigger a list of context-sensitive attribute suggestions.
XPath expressions can be code completed.
Debugging a DFDL Schema needs both the DFDL Schema to use and a data file to parse. Instead of having to select the DFDL Schema and the data file each time from a file picker, a “launch configuration” can be created, which is a JSON description of the debugging session.
To create the launch profile:
Select Run -> Open Configurations from the VS Code menubar. This will load a launch.json file into the editor. There may be existing configurations, or it may be empty.
Press Add Configuration... and select the Daffodil Debug - Launch option.
Once the launch.json file has been created, it will look something like this:
{
  "type": "dfdl",
  "request": "launch",
  "name": "Ask for file name",
  "program": "${command:AskForProgramName}",
  "stopOnEntry": true,
  "data": "${command:AskForDataName}",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
This default configuration will prompt the user to select the DFDL Schema and data files. If desired, the “program” and “data” elements can be mapped specifically to the user’s files to avoid being prompted each time.
📝 Note: Use ${workspaceFolder} for files in the VS Code workspace, and use absolute paths for files outside of the workspace.
{
  "type": "dfdl",
  "request": "launch",
  "name": "DFDL parse: My Data",
  "program": "${workspaceFolder}/schema.dfdl.xsd",
  "stopOnEntry": true,
  "data": "/path/to/my/data",
  "infosetOutput": {
    "type": "file",
    "path": "${workspaceFolder}/infoset.xml"
  },
  "debugServer": 4711
}
Using the launch profile above, a DFDL parse: My Data menu item will display at the top of the Run and Debug pane (Command-Shift-D). Then press the play button to start the debugging session.
In the Terminal, log output from the DFDL debugger backend service will display. If something is not working as expected, check the output in this Terminal window for hints.
The DFDL Schema file will also be loaded in VS Code and there should be a visible marking at the beginning where the debugger has paused upon entry to the debugging session. Control the debugger using the available VS Code debugger controls such as setting breakpoints, removing breakpoints, continue, step over, step into, and step out.
Daffodil Debug:
* Daffodil Debug: Debug File - This will allow the user to fully step through the DFDL Schema. Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Daffodil Debug: Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml.
* Debug File - This will allow the user to fully step through the schema (WIP). Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
Find the infoset tools from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
Find the hex view from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
When uploading files to the mailing list, it may be easier to upload a zip file containing a TDML file, the DFDL Schema file, the input data file, and, optionally, the infoset file. Sending this file to the mailing list will allow other users to unpack your zip file and run your test case. It becomes even easier if you have multiple test cases.
To Generate a TDML file, use similar steps for Launching a DFDL Parse
Debugging Session: * Open the DFDL Schema file * From inside the file,
open the Command Palette (Mac = Command+Shift+P, Windows/Linux =
Ctrl+Shift+P) * Once the Command Palette is opened, select the
Daffodil Debug: Generate TDML
command * From there, you
will be asked to provide the input data file, the TDML test case name,
the TDML test case description, and the location/name for the TDML
file.
Once the Daffodil Parse has finished, an infoset and a TDML file will be created. The TDML file contains relative paths to the DFDL Schema file, input data file, and infoset file. When creating an archive for these files, preserve the directory structure in the archive.
To Append a new test case to an existing TDML file, use similar steps
for Generating a TDML file: * Open the DFDL Schema file * From inside
the file, open the Command Palette (Mac = Command+Shift+P, Windows/Linux
= Ctrl+Shift+P) * Once the Command Palette is opened, select the
Daffodil Debug: Append TDML
command * From there, you will
be asked to provide the input data file, the TDML test case name, the
TDML test case description, and the TDML file
Once the Daffodil Parse has finished, an infoset will be created, and a test case will be added to the existing TDML file. The TDML test case name OR description can be shared between test cases, but no two test cases should share TDML test case names and descriptions. To create an archive for a TDML file with multiple test cases, the same guidelines for creating an archive from a TDML file created from a ‘Generate TDML’ operation should be followed. All DFDL schema files, input data files, the TDML file, and, optionally, the infosets should be added to the archive. Additionally, any directory structure should be preserved in the archive to allow for the relative paths in the TDML file to be resolved.
When running a zip archive created from another user, extract the archive into your workspace folder. If there is an infoset in the zip archive that you wish to compare with your infoset, make sure that the infoset from the zip archive is not located at the same place as the default infoset for the Daffodil Parse that will be run when executing a test case from the TDML file. This is because the Daffodil Parse run by executing the TDML test case uses the default location for its infoset and will overwrite anything that already exists there.
To Execute a test case from a TDML file, use the following steps: *
Open a DFDL Schema file * From inside the file, open the Command Palette
(Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P) * Once the Command
Palette is opened, select the Daffodil Debug: Execute TDML
command * From there, you will be asked to provide the TDML file, TDML
test case name, and TDML test case description
A Daffodil Parse will then be launched. The DFDL Schema file and input data file to be used are determined by the selected test case in the TDML file. The infoset that is generated from this parse can optionally be compared to an infoset included in the zip archive the TDML file was extracted from.
This version of the Apache Daffodil™ Extension for Visual Studio Code
includes a new Data Editor. To use the Data Editor, open the VS Code
command palette and select Daffodil Debug: Data Editor
.
A notification message will appear that informs where the Data Editor will write its logs to. If problems happen, check this log file for clues.
Once the extension is connected to the server, the bottom left corner of the Data Editor shows the version of the Ωedit™ server powering the editor and the port it is connected to. Hovering over the filled circle shows the CPU load average, the memory usage of the server in bytes, the server session count, the server uptime measured in seconds, and the round trip latency measured in milliseconds.
After selecting a file to edit, there will be a table with controls at the top of the Data Editor.
The first section of the table is called File Metrics
and it contains the path of the file being edited, its initial size in
bytes, the size as the file is being edited, and the detected Content
Type. When changes are committed, the Save
button will
become enabled, allowing the changes to be saved to file. The
Redo
and Undo
buttons will redo and undo edit
change transactions that have been applied. The Revert All
button will revert all edit changes that have been applied since the
file was opened. The Profile
button will open the
Data Profiler
and allow profiling of all or a portion of
the edited file.
The Data Profiler
allows for byte frequency profiling of
all or a section of the file starting at an editable start offset and
ending at an editable end offset, or an editable length of bytes. The
offsets and lengths will use the chosen Address Radix
. The
frequency scale can be either Linear
or
Logarithmic
. The graph can have either an ASCII
overlay that appears behind the graph, or None
for no
overlay behind the graph. Hover over the bars to see the byte frequency
and value. The frequency data can be downloaded as a Comma Separated
Value (CSV) file using the Profile as CSV
button. Click
anywhere outside the Data Profiler
to close it.
📝 Note: The maximum length of bytes that can be profiled in this version is capped at 10,000,000 (10M).
The second section of the table is called Search
, and it
allows for seeking to a desired offset and searching of byte sequences
in the given Edit Encoding
in the edited file. The
Seek
input box uses the selected Address Radix
as the seek radix. If the Edit Encoding
can be
case-insensitive, a Case Insensitive
toggle (located inside
the Search
input box) will be displayed allowing for that
option to be enabled. The found sequences can be examined using the
First
, Prev
, Next
, and
Last
buttons found in this section. The search can be
canceled using the Cancel
button.
Found sequences can also be replaced in the given
Edit Encoding
by filling in a replacement sequence and
clicking the Replace...
button.
The third section of the table is called Settings
, and
it allows for setting the Display Radix
,
Edit Encoding
, and Editing
mode.
The Display Radix
can be one of Hexadecimal,
Decimal, Octal, or Binary, and will affect
the bytes displayed in the Physical
viewport.
The Edit Encoding
can be one of Hexadecimal,
Binary, ASCII (7-Bit), Latin-1 (8-bit),
UTF-8, or UTF-16LE and will affect the selected bytes
being edited in the Edit
viewport.
In Single Byte Edit Mode
, individual bytes may be
deleted, inserted (to the left or to the right of the
selected byte), and overwritten in the
Single Byte Edit Window
that appears when a byte in the
Physical
or Logical
viewports is clicked.
Mouseover the buttons of the Ephemeral Edit Window
to
determine what each button does. Mouseover the Input Box
and it will show the byte offset position in the
Address Radix
selected radix. Buttons will become enabled
or disabled depending on whether there is valid input in the
Input Box
or not. Values entered in the
Input Box
must match the format set by the byte
Display Radix
when editing bytes in the
Physical
viewport or be in Latin-1 (8-bit ASCII)
format when editing bytes in the Logical
viewport.
When clicking on a single byte in either the Physical
or
Logical
viewports, the Data Inspector
will
populate giving the value of the byte in Latin-1, and various integer
formats with respect to the selected endianness. The
Data Inspector
will also show the byte offset position in
the Address Radix
selected radix. All of the values in the
Data Inspector
are editable by clicking on the value and
entering a new value.
In Multiple Byte Edit Mode
, a segment of bytes is
selected from either the Physical
or Logical
viewports, then the selected segment of bytes is edited in the
Edit
viewport using the selected
Edit Encoding
.
Now changes are made in the selected Edit Encoding
.
When valid changes have been made to the segment of bytes in the
Edit
viewport, the Apply
button will become
enabled.
Once editing of the selected segment is completed and is valid, the
Apply
button is pressed, and the edited segment replaces
the selected segment. As with changes made in
Single Byte Mode
, changes in
Multiple Byte Edit Mode
are also applied as edit
transactions that can be undone and redone.
Byte addresses can be expressed in hexadecimal,
decimal, or octal. The selected
Address Radix
is also used when entering an offset into
the Offset
input and for offsets and length in the
Data Profiler
. If an offset was entered in the
Offset
input and the Address Radix
is changed,
the offset will automatically be converted into the selected radix.
The Data Editor supports light and dark modes. The mode is determined by the VSCode theme. If the VSCode theme is set to a light theme, the Data Editor will be in light mode. If the VSCode theme is set to a dark theme, the Data Editor will be in dark mode.
The Data Editor can be navigated using the mouse or keyboard.
Clicking on the File Progress Indicator Bar
will
navigate to the position in the file that corresponds to the position
clicked.
Below the File Progress Indicator Bar
are a series of
buttons that allow for navigating the file. The Home
button
will take you to the beginning of the file, the Page Up
button will take you to the previous page of the file, the
Page Down
button will take you to the next page of the
file, and the End
button will take you to the end of the
file. The Line Up
button will take you to the previous line
of the file, and the Line Down
button will take you to the
next line of the file.
The following keyboard shortcuts are available in the Data Editor:
For any input box, including the input box for
Single Byte Editing Mode
, ENTER
will submit
the input, and ESC
will cancel the input.
When using Single Byte Editing Mode
,
CTRL-ENTER
will insert a byte to the left of the selected
byte, SHIFT-ENTER
will insert a byte to the right of the
selected byte, and DELETE
will delete the selected
byte.
When browsing the data in the Physical
or
Logical
viewports, Home
will take you to the
top of the edited file, End
will take you to the end of the
edited file, Page-Up
will give you the previous page of the
edited file, Page-Down
will give you the next page of the
edited file, Arrow-Up
will give you the previous line of
the edited file, and Arrow-Down
will give you the next line
of the edited file.
In Single Byte Editing Mode
, there is no
Insert Left
button when the cursor is at the
beginning of the file, and there is no
Insert Right
button when the cursor is at the end
of the file. There are three workarounds for this limitation:
Instead of using the Single Byte Editing Mode
buttons, use the keybindings (CTRL-ENTER
for
Insert Left
and SHIFT-ENTER
for
Insert Right
).
Use Insert-Right
to insert a byte next to the start
of the file, then move the cursor back to the start of the file and edit
the byte. Use Insert-Left
to insert a byte next to the end
of the file, then move the cursor to the end of the file and edit the
byte.
Use Multiple Byte Editing Mode
to insert bytes at
the beginning or end of the file.
In Windows, both Windows 10 & 11, if a file of size
<=
1 is selected to be loaded into the data editor it
will cause a backend server failure. This server failure will not
properly present the file’s data and the server will not properly
terminate when closing the data editor instance associated with this
file.
See Issue #824 for failure resolutions and more information
If problems are encountered or new features are desired, create tickets here.
If additional help or guidance on using Daffodil and its tooling is needed, please engage with the community on mailing lists and/or review the archives.
The Apache Daffodil™ Extension for Visual Studio Code is an extension to the Microsoft® Visual Studio Code (VS Code) editor which enables Data Format Description Language (DFDL) syntax highlighting, code completion, and the interactive debugging of DFDL Schema parsing operations using Apache Daffodil™.
DFDL is a data modeling language used to describe file formats. The DFDL language is a subset of eXtensible Markup Language (XML) Schema Definition (XSD). Just as file formats are rich and complex, so is the modeling language to describe them. Developing DFDL Schemas can be challenging, requiring a lot of iterative development, and testing.
The purpose of Apache Daffodil™ Extension for Visual Studio Code is to ease the burden on DFDL Schema developers, enabling them to develop high-quality DFDL Schemas in less time. VS Code is free, open source, cross-platform, well-maintained, extensible, and ubiquitous in the developer community. These attributes align well with the Apache Daffodil™ project and the Apache Daffodil™ Extension for Visual Studio Code.
DFDL is rich and complex. Developers using modern code editors expect some degree of built-in language support for the language in which they are developing, and DFDL should be no different. The Apache Daffodil™ Extension for Visual Studio Code provides syntax highlighting to improve the readability and context of the text. In addition, the syntax highlighting provides feedback to the developer indicating the structure and code appear syntactically correct.
The Apache Daffodil™ Extension for Visual Studio Code provides code completion, also known as “Intellisense”, offering context-aware code segment predictions that can dramatically speed up DFDL Schema development by reducing keyboard input, memorization by the developer, and typos.
Hovering over a DFDL schema element will provide information about that DFDL element.
The Apache Daffodil™ Extension for Visual Studio Code provides a Daffodil Data Parse Debugger which enables the developer to carefully control the execution of Apache Daffodil™ parse operations. Given a DFDL Schema and a target data file, the developer can step through the execution of a parse line by line, or until the parse reaches some developer-defined location, known as a break point, in the DFDL Schema. What is particularly helpful is that the developer can watch the parsed output, known as the “infoset”, as it’s being created by the parser, and see where the parser is parsing in the data file. This enables the developer to quickly discover and correct issues, improving DFDL Schema development and testing cycles.
The Apache Daffodil™ Extension for Visual Studio Code provides an integrated data editor. It is akin to a hex editor, but tuned specifically for challenging Daffodil use cases. As an editor designed for Daffodil developers by Daffodil developers, features of the tool will evolve quickly to address the specific needs of the Daffodil community.
The Apache Daffodil™ Extension for Visual Studio Code provides TDML support. TDML is a way of specifying a DFDL schema, input test data, and expected result or expected error/diagnostic messages, all self-contained in an XML file. A TDML file is often useful just to ask a question about how something in DFDL works. For example, when uploading files to the Daffodil users mailing list, it may be easier to upload a zip file containing a TDML file, the DFDL Schema file, the input data file, and, optionally, the infoset file. Sending this file to the users mailing list will allow other users to unpack your zip file and run your test case. It becomes even easier if you have multiple test cases. It allows for a level of precision that is often lacking, but also often required when discussing complex data format issues. As such, providing a TDML file along with a bug report is the absolute best way to demonstrate a problem. You can read more about TDML here on the Apache Daffodil™ website.
This guide assumes VS Code and a Java Runtime Environment (Java 8 or greater) are installed.
The Apache Daffodil™ Extension for Visual Studio Code can be installed using one of two methods.
The Apache Daffodil™ Extension for Visual Studio Code is available in the Visual Studio Code Extension Marketplace. This option is recommended for most users.
The latest .vsix
(the file extension used for VS Code
extensions) file can also be downloaded from the Apache Daffodil™
Extension for Visual Studio Code releases
page and installed by either:
* Running code --install-extension <path-to-downloaded-vsix-file>; or
* Typing vsix in the VS Code Command Palette to bring up the command and pointing it at the downloaded .vsix file.
Since DFDL Schema files end with .xsd
(XML Schema
Definition or XSD), the editor needs to be informed specifically that
DFDL mode is desired over the more general XML mode. The mode is
selected in the status bar at the bottom of the editor window.
Auto suggest is triggered using control space
or typing
the beginning characters of an item. Typing one or more unique
characters will further limit the results.
📝 NOTE: Intellisense is context aware, so
there is no need to begin a block with <
, just start
typing the tag name and code completion will automatically handle it as
appropriate.
Code completion can be used to add a schema block with just a couple of keystrokes. Code completion can make short work of completing a DFDL Format Block, offering context-sensitive suggestions for attribute values.
The >
or /
characters are used to close
XML tags. Use tab
to select an item from the drop down and
to exit double quotes.
Code completion supports creating self-defined
dfdl:complexTypes
and dfdl:simpleTypes
.
The tab
key can be used to complete an auto-complete
item within an XML tag. After auto-complete is triggered, typing the
initial character or characters will limit the suggestion results.
Inside an XML tag a space
or carriage return
will trigger a list of context sensitive attribute suggestions.
XPath expressions can be code completed.
Debugging a DFDL Schema needs both the DFDL Schema to use and a data file to parse. Instead of having to select the DFDL Schema and the data file each time from a file picker, a “launch configuration” can be created, which is a JSON description of the debugging session.
To create the launch profile:
Select Run -> Open Configurations
from the VS
Code menubar. This will load a launch.json
file into the
editor. There may be existing configurations
, or it may be
empty.
Press Add Configuration...
and select the
Daffodil Debug - Launch
option.
Once the launch.json
file has been created it will look
something like this
{
"type": "dfdl",
"request": "launch",
"name": "Ask for file name",
"program": "${command:AskForProgramName}",
"stopOnEntry": true,
"data": "${command:AskForDataName}",
"infosetOutput": {
"type": "file",
"path": "${workspaceFolder}/infoset.xml"
},
"debugServer": 4711
}
This default configuration will prompt the user to select the DFDL Schema and data files. If desired, the “program” and “data” elements can be mapped specifically to the user’s files to avoid being prompted each time.
📝 Note: Use ${workspaceFolder}
for files in the VS Code
workspace, and use absolute paths for files outside of the
workspace.
{
"type": "dfdl",
"request": "launch",
"name": "DFDL parse: My Data",
"program": "${workspaceFolder}/schema.dfdl.xsd",
"stopOnEntry": true,
"data": "/path/to/my/data",
"infosetOutput": {
"type": "file",
"path": "${workspaceFolder}/infoset.xml"
},
"debugServer": 4711
}
Using the launch profile above, a DFDL parse: My Data menu item will display at the top of the Run and Debug pane (Command-Shift-D). Then press the play button to start the debugging session.
In the Terminal, log output from the DFDL debugger backend service will display. If something is not working as expected, check the output in this Terminal window for hints.
The DFDL Schema file will also be loaded in VS Code and there should
be a visible marking at the beginning where the debugger has paused upon
entry to the debugging session. Control the debugger using the available
VS Code debugger controls such as setting breakpoints
,
removing breakpoints
, continue
,
step over
, step into
, and
step out
.
Daffodil Debug:
* Daffodil Debug: Debug File - This will allow the user to fully step through the DFDL Schema. Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Daffodil Debug: Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml.
* Debug File - This will allow the user to fully step through the schema (WIP). Once fully completed, it will produce an infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
* Run File - This will run the DFDL Schema, producing the infoset to a file named SCHEMA-infoset.xml, which it then opens as well.
Find the infoset tools from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
Find the hex view from the command menu (Mac = Command+Shift+P, Windows/Linux = Ctrl+Shift+P)
When uploading files to the mailing list, it may be easier to upload a zip file containing a TDML file, the DFDL Schema file, the input data file, and, optionally, the infoset file. Sending this file to the mailing list will allow other users to unpack your zip file and run your test case. It becomes even easier if you have multiple test cases.
To Generate a TDML file, use similar steps for Launching a DFDL Parse
Debugging Session: * Open the DFDL Schema file * From inside the file,
open the Command Palette (Mac = Command+Shift+P, Windows/Linux =
Ctrl+Shift+P) * Once the Command Palette is opened, select the
Daffodil Debug: Generate TDML
command * From there, you
will be asked to provide the input data file, the TDML test case name,
the TDML test case description, and the location/name for the TDML
file.
Configure launch.json to generate a TDML file.
Run the debug extension, choose a dfdl schema and data file. Make sure the language mode is DFDL.
Press the continue button to produce the infoset.
When the infoset generates, a temporary TDML schema will generate.
Once the Daffodil Parse has finished, an infoset and a TDML file will be created. The TDML file contains relative paths to the DFDL Schema file, input data file, and infoset file. When creating an archive for these files, preserve the directory structure in the archive.
Close all windows except the dfdl schema window. Click “Copy TDML
File” in the dropdown.
Enter a name for the TDML file, then click “Save TDML File”.
To Append a new test case to an existing TDML file, use similar steps
for Generating a TDML file: * Open a TDML file * From inside the file,
open the Command Palette (Mac = Command+Shift+P, Windows/Linux =
Ctrl+Shift+P) * Once the Command Palette is opened, select the
Daffodil Debug: Append TDML
command * Or select the Append
TDML option at the top right of window * From there, you will be asked
to provide the input data file, the TDML test case name, the TDML test
case description, and the TDML file * The Append option will append the
TDML from the temp directory to the currently open TDML if the two files
are different
To append to the existing TDML file, open the TDML file and click the
button in the upper right corner to open in a text editor.
Change the test case name and save the file.
Select append from the TDML dropdown menu at the upper right.
The original default test case from the temp directory will be appended to the saved TDML file with the renamed new test case.
Once the Daffodil Parse has finished, an infoset will be created, and a test case will be added to the existing TDML file. The TDML test case name OR description can be shared between test cases, but no two test cases should share TDML test case names and descriptions. To create an archive for a TDML file with multiple test cases, the same guidelines for creating an archive from a TDML file created from a ‘Generate TDML’ operation should be followed. All DFDL schema files, input data files, the TDML file, and, optionally, the infosets should be added to the archive. Additionally, any directory structure should be preserved in the archive to allow for the relative paths in the TDML file to be resolved.
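The archiving guidance above can be sketched in a few lines of Python. This is only an illustration of "preserve the directory structure", not a tool shipped with the extension; the function name `archive_test_case` and the file names (`schema.dfdl.xsd`, `data/input.bin`, `tests/example.tdml`) are hypothetical placeholders.

```python
import tempfile
import zipfile
from pathlib import Path


def archive_test_case(workspace, relative_files, archive):
    """Zip the given files, storing each under its workspace-relative path."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for rel in relative_files:
            rel = Path(rel)
            # arcname keeps the relative path, so the relative references
            # inside the TDML file still resolve after extraction.
            zf.write(Path(workspace) / rel, arcname=rel.as_posix())


# Demonstration with placeholder files in a temporary "workspace".
files = ["schema.dfdl.xsd", "data/input.bin", "tests/example.tdml"]
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    for rel in files:
        target = workspace / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(b"placeholder")
    archive = workspace / "testcase.zip"
    archive_test_case(workspace, files, archive)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)
```

Extracting such an archive into a workspace folder recreates the same relative layout, which is what the relative paths in the TDML file depend on.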
When running a zip archive created from another user, extract the archive into your workspace folder. If there is an infoset in the zip archive that you wish to compare with your infoset, make sure that the infoset from the zip archive is not located at the same place as the default infoset for the Daffodil Parse that will be run when executing a test case from the TDML file. This is because the Daffodil Parse run by executing the TDML test case uses the default location for its infoset and will overwrite anything that already exists there.
To Execute a test case from a TDML file, use the following steps: *
Open a TDML file * From inside the file, open the Command Palette (Mac =
Command+Shift+P, Windows/Linux = Ctrl+Shift+P) * Once the Command
Palette is opened, select the Daffodil Debug: Execute TDML
command or select Execute TDML from the dropdown menu * From there, you
will be asked to select the TDML test case name and TDML test case
description
Click on the Explorer tab to display the file view. Select a TDML file.
After the TDML file opens, select the “Execute TDML” option from the dropdown.
Quickly select a test case and description.
The DFDL schema and a new infoset will be loaded using the values from the TDML file. A Daffodil Parse will then be launched. The DFDL Schema file and input data file to be used are determined by the selected test case in the TDML file. The infoset that is generated from this parse can optionally be compared to an infoset included in the zip archive the TDML file was extracted from.
This version of the Apache Daffodil™ Extension for Visual Studio Code
includes a new Data Editor. To use the Data Editor, open the VS Code
command palette and select Daffodil Debug: Data Editor
.
A notification message will appear that informs where the Data Editor will write its logs to. If problems happen, check this log file for clues.
Once the extension is connected to the server, the bottom left corner of the Data Editor shows the version of the Ωedit™ server powering the editor and the port it is connected to. Hovering over the filled circle shows the CPU load average, the memory usage of the server in bytes, the server session count, the server uptime measured in seconds, and the round trip latency measured in milliseconds.
After selecting a file to edit, there will be a table with controls at the top of the Data Editor.
The first section of the table is called File Metrics
and it contains the path of the file being edited, its initial size in
bytes, the size as the file is being edited, and the detected Content
Type. When changes are committed, the Save
button will
become enabled, allowing the changes to be saved to file. The
Redo
and Undo
buttons will redo and undo edit
change transactions that have been applied. The Revert All
button will revert all edit changes that have been applied since the
file was opened. The Profile
button will open the
Data Profiler
and allow profiling of all or a portion of
the edited file.
The Data Profiler
allows for byte frequency profiling of
all or a section of the file starting at an editable start offset and
ending at an editable end offset, or an editable length of bytes. The
offsets and lengths will use the chosen Address Radix
. The
frequency scale can be either Linear
or
Logarithmic
. The graph can have either an ASCII
overlay that appears behind the graph, or None
for no
overlay behind the graph. Hover over the bars to see the byte frequency
and value. The frequency data can be downloaded as a Comma Separated
Value (CSV) file using the Profile as CSV
button. Click
anywhere outside the Data Profiler
to close it.
📝 Note: The maximum length of bytes that can be profiled in this version is capped at 10,000,000 (10M).
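To make "byte frequency profiling" concrete, here is a minimal Python sketch of the idea. It is not the Data Editor's implementation: the function names and the CSV column names are assumptions chosen for illustration, and only the 10,000,000-byte cap comes from the note above.

```python
import csv
import io
from collections import Counter

MAX_PROFILE_LENGTH = 10_000_000  # cap stated in the note above


def profile_bytes(data, start=0, length=None):
    """Return a 256-entry list where entry b counts occurrences of byte value b."""
    if length is None:
        length = len(data) - start
    length = max(0, min(length, MAX_PROFILE_LENGTH))
    counts = Counter(data[start:start + length])
    return [counts.get(value, 0) for value in range(256)]


def profile_as_csv(counts):
    """Serialize a profile as CSV text (column names are hypothetical)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["byte", "frequency"])
    for value, count in enumerate(counts):
        writer.writerow([f"0x{value:02X}", count])
    return out.getvalue()


# Profile only the first five bytes ("hello") of the buffer.
counts = profile_bytes(b"hello world", start=0, length=5)
print(counts[ord("l")])  # 2
```

A logarithmic scale only changes how the counts are drawn, for example by plotting the logarithm of each count; the underlying frequencies are the same.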
The second section of the table is called Search
, and it
allows for seeking to a desired offset and searching of byte sequences
in the given Edit Encoding
in the edited file. The
Seek
input box uses the selected Address Radix
as the seek radix. If the Edit Encoding
can be
case-insensitive, a Case Insensitive
toggle (located inside
the Search
input box) will be displayed allowing for that
option to be enabled. The found sequences can be examined using the
First
, Prev
, Next
, and
Last
buttons found in this section. The search can be
canceled using the Cancel
button.
Found sequences can also be replaced in the given
Edit Encoding
by filling in a replacement sequence and
clicking the Replace...
button.
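As a rough illustration of searching for a byte sequence in a given encoding, the following Python sketch encodes the search text and collects the offsets of all matches. The function `find_all` is hypothetical and is not the Ωedit™ search API. This sketch reports overlapping matches, which may differ from how the Data Editor behaves.

```python
def find_all(data, needle, case_insensitive=False):
    """Return the offsets of every match of needle in data."""
    if not needle:
        return []
    if case_insensitive:
        # bytes.lower() folds ASCII letters only, which hints at why the
        # toggle is offered only for encodings that can be case-insensitive.
        data, needle = data.lower(), needle.lower()
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


needle = "Tag".encode("utf-8")  # search text expressed in the chosen encoding
print(find_all(b"tag TAG Tag", needle, case_insensitive=True))  # [0, 4, 8]
print(find_all(b"tag TAG Tag", needle))                         # [8]
```

The First, Prev, Next, and Last buttons can be thought of as moving an index through a list of offsets like the one returned here.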
The third section of the table is called Settings
, and
it allows for setting the Display Radix
,
Edit Encoding
, and Editing
mode.
The Display Radix
can be one of Hexadecimal,
Decimal, Octal, or Binary, and will affect
the bytes displayed in the Physical
viewport.
The Edit Encoding
can be one of Hexadecimal,
Binary, ASCII (7-Bit), Latin-1 (8-bit),
UTF-8, or UTF-16LE and will affect the selected bytes
being edited in the Edit
viewport.
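The effect of the Edit Encoding can be illustrated with a short Python sketch that turns the same typed text into different bytes depending on the encoding. The function `encode_for_edit` and its label-to-codec mapping are assumptions for illustration, not the Data Editor's internals; only the encoding names come from the list above.

```python
CODECS_BY_LABEL = {
    "ASCII (7-Bit)": "ascii",
    "Latin-1 (8-bit)": "latin-1",
    "UTF-8": "utf-8",
    "UTF-16LE": "utf-16-le",
}


def encode_for_edit(text, edit_encoding):
    """Convert text typed in the Edit viewport into the bytes it represents."""
    if edit_encoding == "Hexadecimal":
        return bytes.fromhex(text)  # whitespace between byte pairs is ignored
    if edit_encoding == "Binary":
        bits = "".join(text.split())
        if len(bits) % 8 != 0:
            raise ValueError("binary input must be a whole number of bytes")
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return text.encode(CODECS_BY_LABEL[edit_encoding])


e_acute = "\u00e9"  # the character é
for label in ("Latin-1 (8-bit)", "UTF-8", "UTF-16LE"):
    print(label, encode_for_edit(e_acute, label).hex(" "))
```

The same character occupies one byte in Latin-1 and two bytes in UTF-8 or UTF-16LE, and it cannot be encoded in 7-bit ASCII at all, which is why the chosen encoding matters when editing a selected segment.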
In Single Byte Edit Mode
, individual bytes may be
deleted, inserted (to the left or to the right of the
selected byte), and overwritten in the
Single Byte Edit Window
that appears when a byte in the
Physical
or Logical
viewports is clicked.
Mouseover the buttons of the Ephemeral Edit Window
to
determine what each button does. Mouseover the Input Box
and it will show the byte offset position in the
Address Radix
selected radix. Buttons will become enabled
or disabled depending on whether there is valid input in the
Input Box
or not. Values entered in the
Input Box
must match the format set by the byte
Display Radix
when editing bytes in the
Physical
viewport or be in Latin-1 (8-bit ASCII)
format when editing bytes in the Logical
viewport.
When clicking on a single byte in either the Physical
or
Logical
viewports, the Data Inspector
will
populate giving the value of the byte in Latin-1, and various integer
formats with respect to the selected endianness. The
Data Inspector
will also show the byte offset position in
the Address Radix
selected radix. All of the values in the
Data Inspector
are editable by clicking on the value and
entering a new value.
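The following Python sketch shows how the same bytes yield different integer values depending on width, signedness, and endianness. It is an assumption-laden illustration: the function `inspect_bytes` and the field names (`uint16`, `int16`, and so on) are hypothetical and may not match the labels shown in the Data Inspector.

```python
def inspect_bytes(data, offset, byteorder="little"):
    """Interpret the bytes starting at offset in several integer formats."""
    view = {"latin-1": data[offset:offset + 1].decode("latin-1")}
    for width in (1, 2, 4, 8):
        chunk = data[offset:offset + width]
        if len(chunk) < width:
            break  # not enough bytes remain for this width
        bits = width * 8
        view[f"uint{bits}"] = int.from_bytes(chunk, byteorder, signed=False)
        view[f"int{bits}"] = int.from_bytes(chunk, byteorder, signed=True)
    return view


sample = bytes([0xFF, 0x01, 0x00, 0x00])
little = inspect_bytes(sample, 0, "little")
big = inspect_bytes(sample, 0, "big")
print(little["uint16"], big["uint16"])  # 511 65281
print(little["int8"], big["int16"])     # -1 -255
```

Switching endianness does not change the bytes in the file, only how multi-byte values are read from them.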
In Multiple Byte Edit Mode
, a segment of bytes is
selected from either the Physical
or Logical
viewports, then the selected segment of bytes is edited in the
Edit
viewport using the selected
Edit Encoding
.
Now changes are made in the selected Edit Encoding
.
When valid changes have been made to the segment of bytes in the
Edit
viewport, the Apply
button will become
enabled.
Once editing of the selected segment is completed and is valid, the
Apply
button is pressed, and the edited segment replaces
the selected segment. As with changes made in
Single Byte Mode
, changes in
Multiple Byte Edit Mode
are also applied as edit
transactions that can be undone and redone.
Byte addresses can be expressed in hexadecimal,
decimal, or octal. The selected
Address Radix
is also used when entering an offset into
the Offset
input and for offsets and length in the
Data Profiler
. If an offset was entered in the
Offset
input and the Address Radix
is changed,
the offset will automatically be converted into the selected radix.
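The automatic conversion amounts to parsing the offset in the old radix and printing it in the new one. A minimal sketch, assuming a hypothetical `convertOffset` helper and offsets small enough to be represented exactly as JavaScript numbers (below 2^53):

```typescript
// The three address radices named in the text: octal, decimal, hexadecimal.
type AddressRadix = 8 | 10 | 16;

const OFFSET_DIGITS: Record<AddressRadix, RegExp> = {
  8: /^[0-7]+$/,
  10: /^[0-9]+$/,
  16: /^[0-9a-fA-F]+$/,
};

// Hypothetical helper: re-express an offset string in another radix.
function convertOffset(text: string, from: AddressRadix, to: AddressRadix): string {
  // parseInt is lenient (it stops at the first bad digit), so validate first.
  if (!OFFSET_DIGITS[from].test(text)) {
    throw new RangeError(`"${text}" is not a valid base-${from} offset`);
  }
  return parseInt(text, from).toString(to);
}
```

For example, `convertOffset("ff", 16, 10)` yields `"255"`, which is what the Offset input would show after switching from hexadecimal to decimal.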
The Data Editor supports light and dark modes, determined by the VS Code theme: a light theme puts the Data Editor in light mode, and a dark theme puts it in dark mode.
The Data Editor can be navigated using the mouse or keyboard.
Clicking on the File Progress Indicator Bar
will
navigate to the position in the file that corresponds to the position
clicked.
Below the File Progress Indicator Bar
are a series of
buttons that allow for navigating the file. The Home
button
will take you to the beginning of the file, the Page Up
button will take you to the previous page of the file, the
Page Down
button will take you to the next page of the
file, and the End
button will take you to the end of the
file. The Line Up
button will take you to the previous line
of the file, and the Line Down
button will take you to the
next line of the file.
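The navigation controls described above can be thought of as simple arithmetic on the viewport's starting offset. The following sketch is illustrative only: the function names, the `NavAction` type, and the parameters `bytesPerLine` and `linesPerPage` are assumptions, not the extension's API.

```typescript
type NavAction = "home" | "end" | "pageUp" | "pageDown" | "lineUp" | "lineDown";

// Compute the new viewport start offset for a navigation action.
function navigate(
  action: NavAction,
  current: number,
  fileSize: number,
  bytesPerLine: number,
  linesPerPage: number,
): number {
  // Start offset of the last line that contains data.
  const lastLineStart =
    fileSize === 0 ? 0 : Math.floor((fileSize - 1) / bytesPerLine) * bytesPerLine;
  const page = bytesPerLine * linesPerPage;
  let next = current;
  switch (action) {
    case "home": next = 0; break;
    case "end": next = lastLineStart; break;
    case "pageUp": next = current - page; break;
    case "pageDown": next = current + page; break;
    case "lineUp": next = current - bytesPerLine; break;
    case "lineDown": next = current + bytesPerLine; break;
  }
  // Clamp so the viewport never moves outside the file.
  return Math.min(Math.max(next, 0), lastLineStart);
}

// Map a click on the progress bar to a proportional file offset.
function offsetFromBarClick(clickX: number, barWidth: number, fileSize: number): number {
  const fraction = Math.min(Math.max(clickX / barWidth, 0), 1);
  return Math.min(Math.floor(fraction * fileSize), Math.max(fileSize - 1, 0));
}
```

Clamping at both ends means that Page Up near the start of the file and Line Down at the end simply stop at the boundary.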
The following keyboard shortcuts are available in the Data Editor:
For any input box, including the input box for
Single Byte Editing Mode
, ENTER
will submit
the input, and ESC
will cancel the input.
When using Single Byte Editing Mode, CTRL-ENTER will insert a byte to the left of the selected byte, SHIFT-ENTER will insert a byte to the right of the selected byte, and DELETE will delete the selected byte.
When browsing the data in the Physical
or
Logical
viewports, Home
will take you to the
top of the edited file, End
will take you to the end of the
edited file, Page-Up
will give you the previous page of the
edited file, Page-Down
will give you the next page of the
edited file, Arrow-Up
will give you the previous line of
the edited file, and Arrow-Down
will give you the next line
of the edited file.
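The single-byte editing shortcuts listed above can be summarized as a mapping from key presses to actions. This is a hedged sketch: `KeyPress` mirrors the relevant fields of the standard DOM `KeyboardEvent`, while `ByteEditAction` and `actionForKey` are hypothetical names that do not come from the extension.

```typescript
// Subset of the DOM KeyboardEvent fields needed for the mapping.
interface KeyPress {
  key: string;
  ctrlKey: boolean;
  shiftKey: boolean;
}

type ByteEditAction = "insertLeft" | "insertRight" | "delete" | "submit" | "cancel";

// Translate a key press in the single-byte input box into an edit action.
function actionForKey(e: KeyPress): ByteEditAction | null {
  switch (e.key) {
    case "Enter":
      if (e.ctrlKey) return "insertLeft";   // CTRL-ENTER
      if (e.shiftKey) return "insertRight"; // SHIFT-ENTER
      return "submit";                      // plain ENTER
    case "Escape":
      return "cancel";
    case "Delete":
      return "delete";
    default:
      return null; // not a shortcut; let normal typing proceed
  }
}
```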
Debugger
* Ubuntu 24.04 (release date 04/25/2024): When stepping through a DFDL schema in the debugger with the step over action, the step over action triggers DFDL IntelliSense to display a list of suggestions whenever a line in the schema is reached that results in output to the infoset. This problem can be mitigated by disabling Wayland: uncomment the line “#WaylandEnable=false” in the /etc/gdm3/custom.conf configuration file (so that it reads “WaylandEnable=false”) and reboot the system.
* Ubuntu 20.04 (release date 04/23/2020): When stepping through a DFDL schema in the debugger with the step over action, the step over action triggers DFDL IntelliSense to display a list of suggestions whenever a line in the schema is reached that results in output to the infoset. A cause and solution have not yet been discovered. Note that the mitigation listed above for Ubuntu 24.04 was not found to be effective for Ubuntu 20.04.
* Ubuntu 22.04 (release date 04/21/2022): The step over problem that occurs in Ubuntu 20.04 and Ubuntu 24.04 has not been observed in Ubuntu 22.04.
* At this time the debugger step into and step out actions have no code behind them; using either button results in an unrecoverable error. We have not found a way to disable the step into and step out buttons. This problem occurs on all operating systems.
If problems are encountered or new features are desired, create tickets here.
If additional help or guidance on using Daffodil and its tooling is needed, please engage with the community on mailing lists and/or review the archives.
If you would like to contribute to the project, please check out our Development.md for instructions on how to get started.
The Apache Daffodil™ Extension for Visual Studio Code is an extension to the Microsoft® Visual Studio Code (VS Code) editor, designed for Data Format Description Language¹ (DFDL) Schema developers. The purpose of the Apache Daffodil™ Extension for Visual Studio Code is to ease the burden on DFDL Schema developers by enabling rapid development of high-quality DFDL Schemas, with syntax highlighting, code completion, data file editing, and debugging of DFDL Schema parsing operations using Apache Daffodil™.
The Apache Daffodil™ Extension for Visual Studio Code provides syntax highlighting to improve the readability and context of the text and provide instant feedback to the developer indicating the structure and code are syntactically correct.
The Apache Daffodil™ Extension for Visual Studio Code provides code completion offering context-aware code segment predictions that can dramatically speed up DFDL Schema development by reducing keyboard input, memorization by the developer, and typos.
The Apache Daffodil™ Extension for Visual Studio Code provides a Daffodil parse debugger enabling the developer to control the execution of Daffodil parse operations. Given a DFDL Schema and a target data file, the developer can step through the execution of parse operations line by line, or until the parse reaches some developer-defined location, known as a breakpoint, in the DFDL Schema or the data being parsed. What is particularly helpful is that the developer can watch the parsed output, known as the “Infoset”, as it is being created by the parser, and watch where the parser is parsing in the data file. This enables the developer to quickly discover and correct DFDL Schema issues, making development and testing cycles more efficient.
The Apache Daffodil™ Extension for Visual Studio Code provides an integrated data editor that is tuned specifically for challenging Daffodil use cases. It is designed to support large files, of any type, that are well beyond the limits of the standard text editor in VS Code. The Data Editor allows for editing of single or multiple bytes in different encodings. The Data Editor can seek to file offsets, search and replace byte sequences, profile data, and determine a file’s content type. Features of the Data Editor will evolve to address the specific needs of the Daffodil community.
The Data Editor component can be configured to run alongside the debugger and open the data file designated by the debugger. While it does so, whenever the debug session steps to a new byte position or stops at a breakpoint, the Data Editor indicates the current byte location in the data.
If additional help or guidance on using Apache Daffodil™, Apache Daffodil™ Extension for Visual Studio Code, or DFDL development in general is needed, please engage with the Daffodil user and developer communities on mailing lists (https://daffodil.apache.org/community/) and/or review the list archives (https://lists.apache.org/list.html?users@daffodil.apache.org).
Apache Daffodil™ and the Apache Daffodil™ Extension for Visual Studio Code are Apache Software Foundation (ASF) projects, are free open-source software, and under active development. Feedback and contributions are welcome.
Apache, Apache Feather Logo, Apache Daffodil, Daffodil, and the Apache Daffodil logo are trademarks of The Apache Software Foundation. Visual Studio Code and VS Code are trademarks of Microsoft® Corporation. All rights reserved.
¹ Data Format Description Language (DFDL) is a standard from the Open Grid Forum (www.ogf.org), available here (https://ogf.org/documents/GFD.240.pdf).
Copyright © 2023 The Apache
Software Foundation. Licensed under the Apache License,
Version 2.0.