Various converters to convert value sets from CSV to JSON, etc.

Health Open Terminology Ecosystem

Last update: Sep 8, 2022


Overview

ValueSet Converters

Tools for converting value sets between formats, such as converting extensional value sets in CSV format to JSON that can be uploaded to a FHIR server.

Set up / installation

1. You must have Python3 installed.
2. Clone the repo: git clone https://github.com/HOT-Ecosystem/ValueSet-Converters.git
3. Change directory: cd ValueSet-Converters
4. Make & use a virtual environment: virtualenv env; source env/bin/activate
5. Install dependencies: pip install -r requirements.txt
6. To use the "VSAC to OMOP/FHIR JSON" tool, which fetches from Google Sheets, you'll need the following:
6.a. Access to the Google Sheet linked under "2. VSAC to OMOP/FHIR JSON".
6.b. credentials.json and token.json placed inside the env/ directory. These can be obtained from Joe (will upload them to a Google Drive folder later).
7. Create an env/.env file based on env/.env.example, replacing VSAC_API_KEY with your own VSAC API key as shown in your profile. More instructions on getting an API key can be found in "Step 1" on this page.
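
For readers unfamiliar with .env files, here is a minimal sketch of how a KEY=VALUE file such as env/.env could be read. This is an illustration only: the repository may use a library for this, and the helper name parse_env_text and the placeholder value are assumptions, not part of the project.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content standing in for a real env/.env file.
example = "# copied from env/.env.example\nVSAC_API_KEY=replace-with-your-key\n"
settings = parse_env_text(example)
print(settings["VSAC_API_KEY"])
```

With a real file, the text could be read with pathlib.Path("env/.env").read_text() before parsing.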

Tools

First, cd into the directory where this repository was cloned.

1. CSV to FHIR JSON

First, convert your CSV to have column names like the example below. Then you can run these commands.

Syntax

python3 -m value_set_csv_to_fhir_json path/to/FILE.csv

Example

python3 -m value_set_csv_to_fhir_json examples/1/input/n3cLikeExtensionalValueSetExample.csv

Before:

valueSet.id,valueSet.name,valueSet.description,valueSet.status,valueSet.codeSystem,valueSet.codeSystemVersion,concept.code,concept.display
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1234,mama bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1235,papa bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1236,baby bear

After:

{
    "resourceType": "ValueSet",
    "id": 1,
    "meta": {
        "profile": [
            "http://hl7.org/fhir/StructureDefinition/shareablevalueset"
        ]
    },
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\">\n\t\t\t<p>A family of bears.</p>\n\t\t</div>"
    },
    "name": "bear family",
    "title": "bear family",
    "status": "draft",
    "description": "A family of bears.",
    "compose": {
        "include": [
            {
                "system": "http://loinc.org",
                "version": 2.36,
                "concept": [
                    {
                        "code": 1234,
                        "display": "mama bear"
                    },
                    {
                        "code": 1235,
                        "display": "papa bear"
                    },
                    {
                        "code": 1236,
                        "display": "baby bear"
                    }
                ]
            }
        ]
    }
}
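
The mapping from CSV columns to the JSON structure above can be sketched as follows. This is a simplified illustration, not the tool's actual code: the function name rows_to_value_set is hypothetical, it assumes all rows belong to one value set and one code system, it omits the meta and text blocks, and it leaves every value as a string (the tool's output above shows some values as numbers).

```python
import csv
import io
import json


def rows_to_value_set(rows):
    """Build one ValueSet-shaped dict from CSV rows sharing a valueSet.id."""
    first = rows[0]
    return {
        "resourceType": "ValueSet",
        "id": first["valueSet.id"],
        "name": first["valueSet.name"],
        "title": first["valueSet.name"],
        "status": first["valueSet.status"],
        "description": first["valueSet.description"],
        "compose": {
            "include": [
                {
                    "system": first["valueSet.codeSystem"],
                    "version": first["valueSet.codeSystemVersion"],
                    # One concept entry per CSV row.
                    "concept": [
                        {"code": r["concept.code"], "display": r["concept.display"]}
                        for r in rows
                    ],
                }
            ]
        },
    }


csv_text = """valueSet.id,valueSet.name,valueSet.description,valueSet.status,valueSet.codeSystem,valueSet.codeSystemVersion,concept.code,concept.display
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1234,mama bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1235,papa bear
1,bear family,A family of bears.,draft,http://loinc.org,2.36,1236,baby bear
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
value_set = rows_to_value_set(rows)
print(json.dumps(value_set, indent=4))
```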

2. VSAC to OMOP/FHIR JSON

This will fetch from the following google sheet: https://docs.google.com/spreadsheets/d/1jzGrVELQz5L4B_-DqPflPIcpBaTfJOUTrVJT5nS_j18/edit#gid=1335629675

Syntax

With default options: python3 -m value_set_vsac_to_json
Choosing an output format: python3 -m value_set_vsac_to_json -f omop

Options:

Short flag	Long flag	Options	Default	Description
`-f`	`--format`	`['omop', 'fhir']`	'omop'	Output format.
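
For illustration, the option in the table could be declared with argparse roughly like this. It is a sketch of the documented flags only; how the tool actually defines its command line is not shown in this README, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the -f/--format option described in the table above."""
    parser = argparse.ArgumentParser(prog="value_set_vsac_to_json")
    parser.add_argument(
        "-f",
        "--format",
        choices=["omop", "fhir"],
        default="omop",
        help="Output format.",
    )
    return parser


# Equivalent to running: python3 -m value_set_vsac_to_json -f fhir
args = build_parser().parse_args(["-f", "fhir"])
print(args.format)
```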

Comments

Generate a set of concept set data files to manage the bulk import of concept sets to the enclave.
Concept sets can be created (edit: Joe: in the future, when they have added this feature) via a bulk import process that bypasses the concept set editor, so that externally authorized concept sets can be imported in bulk. Write a detailed specification to handle this process.

Tables needed to generate (edit: Joe: Added links):

[x] 1. concept set container: (concept_set_container_edited table)

[x] 2. concept set version: (code_sets table)

[x] 3. concept set metadata: (concept_set_version_item_rv_edited table)

More details will be uploaded. This is a placeholder issue for Joe and Steph.
new feature
opened by stephanieshong 11
Value set name collisions
Description

(@stephanieshong Please feel free to edit this and change this text if I got anything wrong.) There are some edge cases where we have value sets with the same name. And sometimes there is a single OID which represents a grouping of those value sets. And, in a smaller subset of those edge cases, there is no such OID / value set on UMLS.

In those cases, we should create a new grouping value set:

Create it for upload to the enclave. We don't need to create it in UMLS.

It will not have an OID.

Members of the concept set should be all the members of all of the value sets with the same name.

Provenance should include the OIDs of those value sets.
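
The rules above can be sketched as a small function. This is an illustration under assumed data shapes, not the project's code: build_grouping_set, the dict keys, and the placeholder OIDs and codes are all hypothetical.

```python
def build_grouping_set(name, value_sets):
    """Combine same-named value sets into one new grouping set.

    value_sets: list of dicts with an 'oid' and a 'concepts' list of
    (code_system, code) tuples (an assumed shape for this sketch).
    """
    members = []
    seen = set()
    for vs in value_sets:
        for concept in vs["concepts"]:
            # Keep each concept once, preserving first-seen order.
            if concept not in seen:
                seen.add(concept)
                members.append(concept)
    return {
        "name": name,
        "oid": None,  # the new grouping set will not have an OID
        "concepts": members,
        "provenance": {"source_oids": [vs["oid"] for vs in value_sets]},
    }


# Placeholder OIDs and codes, for illustration only.
snomed_set = {"oid": "placeholder-oid-1", "concepts": [("SNOMED", "A1"), ("SNOMED", "A2")]}
icd_set = {"oid": "placeholder-oid-2", "concepts": [("ICD10CM", "B1"), ("SNOMED", "A2")]}
grouping = build_grouping_set("blood transfusion", [snomed_set, icd_set])
print(grouping["concepts"])
```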

Tasks

[x] 1. Update our code to revert the changes we made (the appending of code systems to the value set names).

[ ] 2. Handle collision cases

[ ] 2.a. Use existing grouping sets that exist.

[ ] 2.b. Handle cases where there isn't an existing grouping set.

Task 2 Options

Option (d) seems to be the best, based on our discussion at the 2022/02/17 BIDS data meeting.

~a. Merge the value sets~

I think this assumes that the value sets that would be merged are of entirely different code systems. I suppose we can work with that assumption for now, but if that isn't the case, we'll need to think about what to do. This one seems preferred.

~b. Give the value sets different names~

This is what we've already done. We decided to append code system names to the value set. But it seems this is not preferred.

~c. Only upload 1 of the value sets~

If this is the case, we'd need to store information about the value set collisions in some data storage layer (file or DB). Then we'd need a curation process to determine which set is best.

d. Create a new set (when grouping not available)

Optional: Create new set in VSAC and get OID: https://www.nlm.nih.gov/vsac/support/authorguidelines/createvs.html (edit: We decided this doesn't need to be done)

Optional: Add new OID to cset.csv (only if we do (1))

Update code to programmatically create new sets in these collision cases where no grouping already exists.

Upload to enclave

Optional: Register IDs? (options: a, b) (only if we do (1))

Davera mentioned this, but I'm not sure it's necessary, as VSAC autogenerates new OIDs (step 1 above). HL7 also charges $500 to register an OID.

e. Use previously created combination value sets / OIDs (when grouping set available)

A "grouping set" is when a value set is defined as a "grouping" in UMLS VSAC, and its members are OIDs that point to its composite value sets.

Apparently, for some (or perhaps all?) instances of value sets which have more than 1 instance in VSAC, there exist other value sets which have grouped these multiple instances together. I believe this is mainly to account for different code systems. So, for example, there may be several value sets for blood transfusion. There may be (i) a set in one code system (e.g. SNOMED) and (ii) another that uses a different code system (e.g. ICD10CM). Then, there may be (iii) a 3rd value set which combines the codes from both. In this case, what we want to do is use and upload (iii) to the enclave, and not upload (i) and (ii).

We should use this option (e) as much as possible, and only use option (d) (creating a new value set) if VSAC doesn't already have such a grouped value set.
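
The preference for option (e) over option (d) can be sketched as a planning step. This is a hedged illustration: the function plan_uploads, the is_grouping flag, and the assumption that a grouping set shares its name with its member value sets are all introduced here for the example. If several grouping sets share a name, this sketch simply takes the first.

```python
from collections import defaultdict


def plan_uploads(value_sets):
    """Decide, per value set name, what to upload.

    value_sets: list of dicts with 'name', 'oid', and 'is_grouping'
    (an assumed shape for this sketch).
    """
    by_name = defaultdict(list)
    for vs in value_sets:
        by_name[vs["name"]].append(vs)

    plan = {}
    for name, group in by_name.items():
        if len(group) == 1:
            # No name collision: upload the value set unchanged.
            plan[name] = ("upload_as_is", [group[0]["oid"]])
            continue
        groupings = [vs for vs in group if vs["is_grouping"]]
        if groupings:
            # Option (e): reuse the grouping set that VSAC already has.
            plan[name] = ("use_existing_grouping", [groupings[0]["oid"]])
        else:
            # Option (d): create a new set from all same-named value sets.
            plan[name] = ("create_new_grouping", [vs["oid"] for vs in group])
    return plan


# Placeholder names and OIDs, for illustration only.
example = [
    {"name": "blood transfusion", "oid": "oid-i", "is_grouping": False},
    {"name": "blood transfusion", "oid": "oid-ii", "is_grouping": False},
    {"name": "blood transfusion", "oid": "oid-iii", "is_grouping": True},
    {"name": "other set", "oid": "oid-x", "is_grouping": False},
    {"name": "other set", "oid": "oid-y", "is_grouping": False},
]
plan = plan_uploads(example)
print(plan)
```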
ux issue
opened by joeflack4 10
New outputs: GRAVITY value sets

Description

Allow vsac_wrangler to fetch from the Lisa2 VSAC GRAVITY sheet in this GoogleSheets spreadsheet: https://docs.google.com/spreadsheets/d/1jzGrVELQz5L4B_-DqPflPIcpBaTfJOUTrVJT5nS_j18/edit#gid=1272275514

Related

#42
update

opened by joeflack4 10
Get BIDS license for UMLS so we can share credentials for API access

Oh, @DaveraGabriel, you're talking about VSAC access, not JHU OMOP access. We should move that to a different issue. I already made the mistake of confusing the two in this thread.

Originally posted by @Sigfried in https://github.com/HOT-Ecosystem/ValueSet-Tools/issues/14#issuecomment-1011233079
security

opened by Sigfried 6
new enclave_wrangler bug

I'm getting a new error while trying to run enclave_wrangler for Lisa6, which has a single oid. I put the error message in the commit comment for https://github.com/HOT-Ecosystem/ValueSet-Tools/commit/fdfe3cfc0e47a2ba0e662797857e6ce7c4e8a213
bug

opened by Sigfried 5
Performance: Browser caching & global state
Overview

We want front-end experience to be fast, and we don't want to make repeated, unnecessary requests.

Options

Can do one or all of the following

[ ] 1. Global state for a given user session

[ ] 2. Browser caching (for multiple user sessions on same machine)

ux issue urgency:2/3 ease:2/3
opened by joeflack4 4
problem with checkboxes on rows in ComparisonDataTable

Getting errors with codeset_id 419757429, which is currently included in the Example Comparison set of cset ids. Need to figure out why. This is occurring in both the develop branch and the db branch. I don't remember it being an issue before we switched to working on the db branch, so maybe the problem is arising from some change in the data rather than a problem with code. Or maybe if we go to an older commit on the development branch the problem will disappear.

opened by Sigfried 3

Hierarchy

Updates

    Backend
    - Finished off hierarchy portion of route cr-hierarchy
    - Added functions for building hierarchy to utils
    - Misc: Codestyle updates

update

opened by joeflack4 3

Modify/create cset locally and upload to enclave
User experience & code flow description

Updating an existing cset

Frontend: User clicks on header / button to edit concept set.

Frontend: reacts to click
~a. Frontend: They get a page which shows: (i) list of concept names~ (can do this in the future)
b. Frontend: A new column.

Frontend: The label should say (draft) instead of (v#). The user can check/uncheck the concepts they want / don't want

Frontend: They click a button to 'commit changes'
a. Will need their user RID in order to set the created_by fields for concept_set_container and code_set
b. What other metadata do we need (for MVP)?

Backend: Hit route (new route?)

Backend: Persist changes (multiple files? which files?)
a. Persist by writing to prepped files? yes...and maybe use git diff and patch

Backend: Push to enclave?

Frontend: User sees confirmation message that commit succeeded?

Creating a new cset

TODO

Concerns

How to handle production server updates to termhub-csets vs local development?

Maybe have them commit to different branches, e.g. if env variable is production, commit to main, else develop.
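
A minimal sketch of that idea, assuming the environment is signalled by a variable named TERMHUB_ENV (a hypothetical name; the issue does not say which variable would be used):

```python
import os


def target_branch(environ=os.environ):
    """Pick the branch to commit to, based on an environment variable."""
    # TERMHUB_ENV is an assumed variable name for this illustration.
    env_name = environ.get("TERMHUB_ENV", "development")
    return "main" if env_name == "production" else "develop"


print(target_branch({"TERMHUB_ENV": "production"}))
```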

new feature
opened by Sigfried 3
Need to expand concept subset in datasets.py

Having problems with missing concepts and missing links between concepts... can explain later.

What I think we should do is expand the subset of concepts in prepped files to: all of the concepts that appear in the concept_ancestor table in rows where either the ancestor_concept_id or the descendant_concept_id is included in the concept_set_members table. Right
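
One reading of that proposal, sketched with plain Python data structures rather than the project's actual dataset code (expand_concept_subset and the row shape are assumptions for the example). Note that under this reading, a member concept that never appears in concept_ancestor would not be in the result.

```python
def expand_concept_subset(concept_ancestor_rows, member_concept_ids):
    """Return all concept ids in concept_ancestor rows that touch a member.

    concept_ancestor_rows: iterable of dicts with 'ancestor_concept_id'
    and 'descendant_concept_id'.
    member_concept_ids: concept ids found in concept_set_members.
    """
    members = set(member_concept_ids)
    expanded = set()
    for row in concept_ancestor_rows:
        ancestor = row["ancestor_concept_id"]
        descendant = row["descendant_concept_id"]
        # Keep both ends of any row where either end is a member.
        if ancestor in members or descendant in members:
            expanded.update((ancestor, descendant))
    return expanded


# Made-up ids, for illustration only.
rows = [
    {"ancestor_concept_id": 1, "descendant_concept_id": 2},
    {"ancestor_concept_id": 2, "descendant_concept_id": 3},
    {"ancestor_concept_id": 4, "descendant_concept_id": 5},
]
subset = expand_concept_subset(rows, member_concept_ids=[2])
print(sorted(subset))
```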

opened by Sigfried 3

Datasets download updates

Updates

WIP:
- Update: remove "Unnamed: x" columns from datasets
- Update: move jupyter transform code over to enclave_wrangler
- Update: datasets: filter concept sets that have no container

opened by joeflack4 3

Add check for auth token validity

Overview

Check token validity early and often.

Sub-tasks

[x] 1. If enclave_wrangler request fails, check if token is dead, and print err about that if so.
[x] 2. If token will expire soon (e.g. 2 weeks), print warning.
[ ] 3. Check on server start
[ ] 4. Optional: Check on schedule (e.g. daily) (harder)

Additional details

Here's how:

➜ curl -XGET https://unite.nih.gov/multipass/api/me -H "Authorization: Bearer $PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN"
{
  "id": "6387db50-9f12-48d2-b7dc-e8e88fdf51e3",
  "username": "[email protected]",
  "attributes": {
    "multipass:organization": [
      "NIH"
    ],
    "multipass:email:primary": [
      "[email protected]"
    ],
    "multipass:organization-rid": [
      "ri.multipass..organization.73f45502-dee1-46e9-ab49-64a738b13971"
    ],
    "upn": [
      "[email protected]"
    ],
    "multipass:realm": [
      "nih-adfs"
    ],
    "multipass:realm-name": [
      "NIH Auth"
    ]
  }
}

➜ curl -XGET https://unite.nih.gov/multipass/api/token/ttl -H "Authorization: Bearer $PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN"
12654423

That last number is time-to-live in seconds.

I'm not sure where to run these checks. When the time is getting close, we need to ask Mariam Deacy to generate a new one for us.
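
The TTL check from the second curl command could be wrapped roughly as below. This is a sketch, not the enclave_wrangler implementation: the function names are hypothetical, the two-week threshold comes from sub-task 2, and it assumes the endpoint returns the bare number of seconds as shown above. A dead token would more likely surface as an HTTP error raised by urlopen than as a TTL of zero.

```python
import urllib.request

TTL_URL = "https://unite.nih.gov/multipass/api/token/ttl"
TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


def fetch_token_ttl(token, url=TTL_URL):
    """Ask the enclave for the token's remaining time-to-live in seconds."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return int(response.read().decode("utf-8").strip())


def token_status(ttl_seconds, warn_below=TWO_WEEKS_SECONDS):
    """Classify a TTL as 'expired', 'expiring_soon', or 'ok'."""
    if ttl_seconds <= 0:
        return "expired"
    if ttl_seconds < warn_below:
        return "expiring_soon"
    return "ok"


# Using the TTL value from the example response above (no network call).
print(token_status(12654423))
```

In practice, fetch_token_ttl would be called with the value of PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN and its result passed to token_status.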

urgency:1/3 ease:3/3

opened by Sigfried 2

SAML Login

Overview

We would like to allow users to log in using NCATS unified authentication (example), or something like it.

We met w/ the JHU cloud infra team today, and Patrick Le mentioned to us that Hopkins is part of some federated network. It uses "shibboleth" along w/ SAML. There is a different team that can help us work through this, but this would require some programming on our end.
urgency:1/3 ease:1/3

opened by joeflack4 1
RDBMS setup
Overview

Set up RDBMS to serve data to the REST API, rather than loading datasets as global variables. Hopefully PostgreSQL. But given that the JHU hosting team cannot provide special services for that, we may consider another option.

Subtask list

[x] 1. Complete local setup

[ ] 2. Setup documentation

[ ] 3. Complete setup on server, and update documentation if changes

[x] 4. Optimize schema: e.g. TEXT --> VARCHAR(n)

[x] 5. Load dataset data

[ ] 6. Create pipeline for loading/refreshing dataset data

[ ] #186

[ ] 8. Build functions or whatever to populate mySQL tables with data from N3C ontology API. (Could we just do dataset data once and then keep up-to-date using ontology API?)

[ ] #187

urgency:2/3
opened by Sigfried 1
CRUD: (i) update: `concepts` on `concept_set_version`, (ii) create: `concept_set_version`
Sub-tasks

[ ] 1. Backend (@joeflack4)

[ ] 2. Frontend (@sigfried)

[ ] 3. Allow for other users (@amin)

Sub-task details

1. Backend

[x] enclave_wrangler functions

[ ] Unit tests (still need completion w/ teardowns)

[x] app.py routes (Joe: I think they're done)

2. Frontend

Pending backend.

3. Allow for other users

Amin needs to change permissions so that other users, using their TOKENs, I believe, can make calls.
new feature urgency:3/3 ease:3/3
opened by Sigfried 5

