This repository implements the data-stuffer command line tool, designed to parse structured workbooks containing test patient vital signs and post them as Observations to an EHR's FHIR server.
You will need Python 3.11 or greater and a few dependencies in order to run the data-stuffer. Fortunately, this repository is implemented as a package which can easily install most of these dependencies for you.
We recommend installing the package in a virtual environment, which keeps all dependencies, including Python, local to the project. To create a virtual environment for your project, run the command:
$ python3 -m venv venv
where the second venv in the command above is an arbitrary name for the virtual environment. The name "venv" is already included in .gitignore; if you choose a different name, be sure to add it to your .gitignore file, as a directory with this name will be created in the project.
You should now see a venv directory in the current location. If you look inside venv/bin you'll see some profiles and python/pip binaries. For Bash, load the activate profile using the source command:
$ source venv/bin/activate
This is only active for the current session. If you open another terminal window/tab you will need to source activate there as well. You may want to add an alias.
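To confirm that activation took effect and that the interpreter meets the version requirement, a quick check along the following lines can be run with python3. This is a minimal sketch and not part of the repository; the function names are illustrative.

```python
import sys
from typing import Tuple


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def meets_minimum_version(minimum: Tuple[int, int] = (3, 11)) -> bool:
    """Return True when the interpreter is at least the given (major, minor) version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print(f"virtual environment active: {in_virtual_environment()}")
    print(f"Python 3.11 or greater: {meets_minimum_version()}")
```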
Many of the tools in this repository directly communicate with your EHR's FHIR server and, as such, must have a few configuration parameters specified during initial setup.
First, create a new directory, src/config/secrets, and copy the contents of src/config/sample/ into that new directory. Review the fields in these newly copied files and populate the values for your local institution.
This configuration is not strictly necessary if you only use the data-stuffer generate .. command. This is the only command that does not interact with the remote FHIR server, and it can be quite useful for testing configuration changes.
Information related to configuring these values is available in the Configuration README. Likewise, information related to the commands is available in the Command README.
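The copy step described above can also be scripted. The following is a minimal sketch using only the standard library; the helper name copy_sample_config is hypothetical and is not provided by this repository.

```python
import shutil
from pathlib import Path


def copy_sample_config(project_root: Path) -> Path:
    """Copy src/config/sample/ into src/config/secrets/ and return the new directory."""
    sample_dir = project_root / "src" / "config" / "sample"
    secrets_dir = project_root / "src" / "config" / "secrets"
    # copytree creates secrets/ itself and raises FileExistsError if it is
    # already present, so populated secrets are never silently overwritten.
    shutil.copytree(sample_dir, secrets_dir)
    return secrets_dir
```

Run from the repository root, copy_sample_config(Path(".")) would create src/config/secrets. The copied files still need to be reviewed and populated by hand.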
The following files exist in the src/config/ directory, and can be modified to tailor the application for your institution's data requirements:
For more information about custom configuration, see the configuration README.
Now that the venv virtual environment is active and you've set up your configuration files, you can use Python to install the package defined in setup.py. To install the package for research or operational use, run the following command:
$ python3 -m pip install .
Note: this step packages a snapshot of the current state of the package configuration files, so updates to the configuration files will require that the package be re-installed. This may not be desirable during initial setup, as the configuration parameters may require iteration before the tools function as expected. For instructions on installing the tools in "editable mode", see the Development section.
This package includes command line entry points for two command line tools:
- test-auth: used to test the generation of access tokens for the FHIR server
- data-stuffer: used to process case data and publish as FHIR Resources to an EHR FHIR server
Each of these command line tools is defined as a discrete Typer application, located in corresponding files under src/flcr/commands/. While the virtual environment is active, these commands are in the executable path and can be called by name from any directory:
$ data-stuffer --help
Usage: data-stuffer [OPTIONS] COMMAND [ARGS]...
Parses Observation workbook and posts data to EHR
Options:
--install-completion [bash|zsh|fish|powershell|pwsh]
Install completion for the specified shell.
--show-completion [bash|zsh|fish|powershell|pwsh]
Show completion for the specified shell, to
copy it or customize the installation.
--help Show this message and exit.
Commands:
generate Produces a JSON representation of FHIR...
generate-and-push This generates Observation resources...
generate-and-push-preauthorized
This generates Observation resources...
Information related to the command interfaces is available in the Command README.
The primary input to all of the data-stuffer commands is an Excel file that resembles the following formatting:
Note that some elements in the example above are required, while others are not. Format requirements for the Excel workbook include:
- Data be placed in a sheet named "Vitals", within the Excel workbook
- Patient ID (e.g. MRN) for the spreadsheet be placed in the second column of the first row. The data-stuffer requires that this value be an integer, and pads the value from the left with zeroes (up to 8 total digits). If this is not compatible with your institution's patient ID format, you can consider altering the FHIR query code, located in fhir_queries.py
- Encounter ID (e.g. CSN) for the spreadsheet be placed in the second column of the second row
- Row labels in the first column, beginning on the third row
  - A "Date" row is required, and should follow mm/DD/YY formatting
  - A "Time" row is required, and should follow 4-digit military time (e.g. 1430 for 2:30pm)
    - Note: "Time" is assumed to be in UTC time (which data-stuffer formats as Zulu time), unless input_timezone is provided in the configuration files.
- Subsequent row labels are treated by the application as potential Observation types
  - For each Observation label (e.g. "Temp", "SpO2"), the application determines whether it can process that data type based on the configuration of observation_configuration.yaml
- The remaining cells in an Observation row should contain the values that represent each Observation instance
  - In cases where a value is "compound" (e.g. Blood Pressure), and contains multiple discrete values embedded in a single format, the configuration defined in observation_configuration.yaml must specify each component element, and how compound values are to be split.
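The patient ID padding and the Date/Time handling described in the list above can be illustrated as follows. This is a sketch under stated assumptions, not the tool's actual code: the function names are hypothetical, the cell values are assumed to arrive as text, and an ISO 8601 timestamp with a trailing Z is assumed for Zulu time.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def pad_patient_id(value: int, width: int = 8) -> str:
    """Left-pad an integer patient ID with zeroes up to `width` digits."""
    # int() rejects non-integer input, mirroring the integer requirement.
    return str(int(value)).zfill(width)


def to_zulu(date_text: str, time_text: str, input_timezone: Optional[str] = None) -> str:
    """Combine a mm/DD/YY date and 4-digit military time into a Zulu timestamp."""
    naive = datetime.strptime(f"{date_text} {time_text}", "%m/%d/%y %H%M")
    # Without an input timezone, the values are taken to be UTC already.
    source_zone = ZoneInfo(input_timezone) if input_timezone else timezone.utc
    as_utc = naive.replace(tzinfo=source_zone).astimezone(timezone.utc)
    return as_utc.strftime("%Y-%m-%dT%H:%M:%SZ")
```

For example, pad_patient_id(12345) returns "00012345", and to_zulu("01/15/24", "1430") returns "2024-01-15T14:30:00Z". Passing an IANA zone name such as "America/New_York" as input_timezone shifts the value to UTC before formatting.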
An example Excel workbook is provided in the test data, at src/tests/data/Vitals.xlsx.
For more information about configuring the data-stuffer to parse different types of data, please see the Configuration README.
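As an illustration of what splitting a compound value involves, the sketch below separates a blood pressure reading into named components. The delimiter, the component names, and the function itself are assumptions made for this example; the real splitting rules are whatever observation_configuration.yaml specifies.

```python
from typing import Dict, List


def split_compound(raw: str, components: List[str], delimiter: str = "/") -> Dict[str, float]:
    """Split a compound cell value into one numeric value per named component."""
    parts = [part.strip() for part in str(raw).split(delimiter)]
    if len(parts) != len(components):
        raise ValueError(
            f"expected {len(components)} components in {raw!r}, found {len(parts)}"
        )
    # Pair each configured component name with its numeric value, in order.
    return {name: float(part) for name, part in zip(components, parts)}
```

With these assumptions, split_compound("120/80", ["systolic", "diastolic"]) returns {"systolic": 120.0, "diastolic": 80.0}, while a cell holding only "120" raises a ValueError instead of producing a partial Observation.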
If your institutional context prevents the application from generating an access token at runtime, data-stuffer allows users to provide a pre-authorized access token using the following command:
data-stuffer generate-and-push-preauthorized /path/to/observations.xlsx --override-token=<YOUR_AUTH_TOKEN_STRING>
Notes:
- This prevents the need to provide the following configuration elements in fhir_access_configuration.yaml, which can all be left as null:
  - client_id
  - token_auth_url
  - private_key_path
  - public_key_path
- Use of a pre-authorized access token does not currently support tracking of token expiration. As such, it is important to ensure that your token will remain valid through the expected runtime of the FLCR commands. Expiration of the token during runtime will not trigger failure of the tool, but may result in failed writes to the FHIR server with unpredictable error messages.
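Because token expiration is not tracked, it can help to check the remaining lifetime of a token before starting a long run. The sketch below assumes the access token is a JWT carrying an exp claim, which may not be true for your FHIR server (opaque tokens cannot be inspected this way). The function names are hypothetical and the signature is not verified.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[int]:
    """Return the `exp` claim (seconds since the epoch) of a JWT, or None if unavailable."""
    parts = token.split(".")
    if len(parts) != 3:
        return None  # not a JWT; an opaque token cannot be inspected
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        # Covers malformed base64, invalid UTF-8, and invalid JSON.
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return int(exp) if isinstance(exp, (int, float)) else None


def seconds_remaining(token: str, now: Optional[float] = None) -> Optional[float]:
    """Return seconds until the token expires, or None when expiry is unknown."""
    exp = jwt_expiry(token)
    if exp is None:
        return None
    return exp - (time.time() if now is None else now)
```

A result of None means the expiry could not be determined, in which case the token's lifetime should be confirmed with whoever issued it.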
To run the automated tests included in this repository, we will make use of the pytest package. However, we will need to provide additional options to ensure that every test passes.
Certain tests prompt the user to provide a valid access token for the FHIR server. In order to respond to the prompt, we must invoke our tests with the -s option (e.g. pytest -s). Otherwise, tests marked with the requires_token pytest marker will fail.
If you wish to skip these tests, you can omit the -s option and, rather, invoke the tests as follows:
pytest -m "not requires_token"
By default, this tool generates an access token, under the assumption that the remote FHIR server will provide appropriate authorization. When the tool cannot be configured for automatic token authorization, users should skip the tests marked with the generates_token pytest marker, e.g.:
pytest -m "not generates_token"
If you wish to skip all tests related to access tokens, invoke the tests as follows:
pytest -m "not requires_token and not generates_token"
For those looking to modify the code in this package, there are some things that should be kept in mind.
Implementing changes to code that is installed as a package can be tedious, but one method to reduce that tedium is to install your package in "editable" mode, passing the -e option to the install command:
$ python3 -m pip install -e .
This is particularly useful when iteratively changing the configuration files used by these tools.
