Part 3 ETL Tool
The ETL tool is the command-line tool carrot etl. It takes a YAML file that configures how the ETL is run.
!carrot etl --help
Usage: carrot etl [OPTIONS] COMMAND [ARGS]...

  Command group for running the full ETL of a dataset

Options:
  --config, --config-file TEXT  specify a yaml configuration file
  -d, --daemon                  run the ETL as a daemon process
  -l, --log-file TEXT           specify the log file to write to
  --help                        Show this message and exit.

Commands:
  check-tables   check tables
  clean-table    clean (delete all rows) of a given table name
  clean-tables   clean (delete all rows) in the tables defined in the...
  create-tables  create new bclink tables
  delete-tables  delete some tables
A very basic configuration for running locally (effectively just running the T-Tool, carrot run map, on one input):
definition = """
transform:
  settings: &settings
    output: output/
    rules: ../data/rules.json
  data:
    - input: ../data/part1
      <<: *settings
"""
with open('config.yml','w') as f:
    f.write(definition)
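To make the anchor (&settings) and merge key (<<: *settings) in this configuration more concrete, here is a minimal sketch that builds the equivalent structure with plain Python dictionaries. It does not call carrot or a YAML parser; the function name expand_settings is hypothetical and exists only for illustration.

```python
def expand_settings(settings, data_entries):
    """Mimic YAML's `<<: *settings` merge key.

    Each data entry inherits the shared settings; keys set explicitly
    on an entry take precedence over the inherited ones.
    """
    return [{**settings, **entry} for entry in data_entries]


# Values copied from the configuration written to config.yml.
settings = {"output": "output/", "rules": "../data/rules.json"}
data = expand_settings(settings, [{"input": "../data/part1"}])

# Equivalent of the parsed 'transform' section (illustration only).
config = {"transform": {"settings": settings, "data": data}}
print(config["transform"]["data"])
```

Defining the shared settings once and merging them into each entry keeps the configuration short when more inputs are added under data.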
!carrot etl --config config.yml
2022-06-17 14:48:53 - run_etl - INFO - running etl on config.yml (last modified: 1655473730.8621333)
2022-06-17 14:48:53 - _run_data - WARNING - output/ exists!
2022-06-17 14:48:53 - LocalDataCollection - INFO - DataCollection Object Created
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering Demographics.csv [<carrot.io.common.DataBrick object at 0x1273f1130>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering GP_Records.csv [<carrot.io.common.DataBrick object at 0x1273f12e0>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering Hospital_Visit.csv [<carrot.io.common.DataBrick object at 0x1273f1550>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering Serology.csv [<carrot.io.common.DataBrick object at 0x1273f17c0>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering Symptoms.csv [<carrot.io.common.DataBrick object at 0x1273f19d0>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - Registering Vaccinations.csv [<carrot.io.common.DataBrick object at 0x1273f1c10>]
2022-06-17 14:48:54 - LocalDataCollection - INFO - DataCollection Object Created
2022-06-17 14:48:54 - CommonDataModel - INFO - CommonDataModel (5.3.1) created with co-connect-tools version 0.0.0
2022-06-17 14:48:54 - CommonDataModel - INFO - Running with an DataCollection object
2022-06-17 14:48:54 - CommonDataModel - INFO - Turning on automatic cdm column filling
2022-06-17 14:48:54 - LocalDataCollection - WARNING - Loading existing person ids from...
2022-06-17 14:48:54 - LocalDataCollection - WARNING - ['output/person_ids.0x124cef850.2022-06-17T133735.tsv', 'output/person_ids.0x124cef970.2022-06-17T133735.tsv']
2022-06-17 14:48:54 - CommonDataModel - INFO - Added MALE 3025 of type person
2022-06-17 14:48:54 - CommonDataModel - INFO - Added FEMALE 3026 of type person
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Antibody 3027 of type observation
2022-06-17 14:48:54 - CommonDataModel - INFO - Added H/O: heart failure 3043 of type observation
2022-06-17 14:48:54 - CommonDataModel - INFO - Added 2019-nCoV 3044 of type observation
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Cancer 3045 of type observation
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Headache 3028 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Fatigue 3029 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Dizziness 3030 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Cough 3031 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Fever 3032 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Muscle pain 3033 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Pneumonia 3042 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Mental health problem 3046 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Mental disorder 3047 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Type 2 diabetes mellitus 3048 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Ischemic heart disease 3049 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added Hypertensive disorder 3050 of type condition_occurrence
2022-06-17 14:48:54 - CommonDataModel - INFO - Added COVID-19 vaccine 3034 of type drug_exposure
2022-06-17 14:48:54 - CommonDataModel - INFO - Added COVID-19 vaccine 3035 of type drug_exposure
2022-06-17 14:48:54 - CommonDataModel - INFO - Added COVID-19 vaccine 3036 of type drug_exposure
2022-06-17 14:48:54 - CommonDataModel - INFO - Added SARS-CoV-2 (COVID-19) vaccine, mRNA-1273 0.2 MG/ML Injectable Suspension 3040 of type drug_exposure
2022-06-17 14:48:54 - CommonDataModel - INFO - Added SARS-CoV-2 (COVID-19) vaccine, mRNA-BNT162b2 0.1 MG/ML Injectable Suspension 3041 of type drug_exposure
2022-06-17 14:48:54 - CommonDataModel - INFO - Starting processing in order: ['person', 'observation', 'condition_occurrence', 'drug_exposure']
2022-06-17 14:48:54 - CommonDataModel - INFO - Number of objects to process for each table...
{
"person": 2,
"observation": 4,
"condition_occurrence": 12,
"drug_exposure": 5
}
2022-06-17 14:48:54 - CommonDataModel - INFO - for person: found 2 objects
2022-06-17 14:48:54 - CommonDataModel - INFO - working on person
2022-06-17 14:48:54 - CommonDataModel - INFO - starting on MALE 3025
2022-06-17 14:48:54 - Person - INFO - Called apply_rules
2022-06-17 14:48:54 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Demographics.csv' for the first time
could not convert string to float: 'na'
2022-06-17 14:48:54 - Person - INFO - Mapped birth_datetime
2022-06-17 14:48:54 - Person - INFO - Mapped gender_concept_id
2022-06-17 14:48:54 - Person - INFO - Mapped gender_source_concept_id
2022-06-17 14:48:54 - Person - INFO - Mapped gender_source_value
2022-06-17 14:48:54 - Person - INFO - Mapped person_id
2022-06-17 14:48:54 - Person - WARNING - Requiring non-null values in gender_concept_id removed 438 rows, leaving 562 rows.
2022-06-17 14:48:54 - Person - WARNING - Requiring non-null values in birth_datetime removed 1 rows, leaving 561 rows.
2022-06-17 14:48:54 - Person - INFO - Automatically formatting data columns.
2022-06-17 14:48:54 - Person - INFO - created df (0x1274e2c40)[MALE_3025]
2022-06-17 14:48:54 - CommonDataModel - INFO - finished MALE 3025 (0x1274e2c40) ... 1/2 completed, 561 rows
2022-06-17 14:48:54 - CommonDataModel - ERROR - 'pk1' already found in the person_id_masker
2022-06-17 14:48:54 - CommonDataModel - ERROR - '1' assigned to this already
2022-06-17 14:48:54 - CommonDataModel - ERROR - was trying to set '997'
2022-06-17 14:48:54 - CommonDataModel - ERROR - Most likely cause is this is duplicate data!
2022-06-17 14:48:54 - _run_data - ERROR - Duplicate person found!
2022-06-17 14:48:54 - _run_data - ERROR - failed to map ['../data/part1/Blood_Test.csv', '../data/part1/Demographics.csv', '../data/part1/GP_Records.csv', '../data/part1/Hospital_Visit.csv', '../data/part1/Serology.csv', '../data/part1/Symptoms.csv', '../data/part1/Vaccinations.csv', '../data/part1/pks.csv'] because there were people
2022-06-17 14:48:54 - _run_data - ERROR - already processed and present in existing data. Check the person_id map/lookup!
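The run above stops with "Duplicate person found!" after warning that output/ already exists and that existing person ids were loaded from it, so the failure appears to come from files left behind by an earlier run. A minimal sketch for listing those files before rerunning, assuming the person_ids.*.tsv naming pattern seen in the log; the helper find_person_id_files is hypothetical and not part of carrot.

```python
from pathlib import Path


def find_person_id_files(output_dir):
    """Return the person_ids TSV files found directly in output_dir, sorted by name.

    Returns an empty list if output_dir does not exist or is not a directory.
    """
    folder = Path(output_dir)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("person_ids.*.tsv"))


# List any person id lookups a previous run left in the output folder.
for path in find_person_id_files("output"):
    print(path)
```

Whether to remove these files or point the configuration at a fresh output folder depends on whether the earlier results are still needed.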
Next, change the configuration to add a load section that configures the output for bclink:
definition = """
load: &load-bclink
  cache: ./output/cache/
  bclink:
    dry_run: true
transform:
  settings: &settings
    output: *load-bclink
    rules: ../data/rules.json
  data:
    - input: ../data/part1
      <<: *settings
"""
with open('config.yml','w') as f:
    f.write(definition)
!carrot etl --config config.yml
2022-06-17 14:48:57 - run_etl - INFO - running etl on config.yml (last modified: 1655473734.7268007)
2022-06-17 14:48:57 - LocalDataCollection - INFO - DataCollection Object Created
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering Demographics.csv [<carrot.io.common.DataBrick object at 0x114023130>]
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering GP_Records.csv [<carrot.io.common.DataBrick object at 0x1140232e0>]
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering Hospital_Visit.csv [<carrot.io.common.DataBrick object at 0x114023550>]
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering Serology.csv [<carrot.io.common.DataBrick object at 0x1140237c0>]
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering Symptoms.csv [<carrot.io.common.DataBrick object at 0x1140239d0>]
2022-06-17 14:48:57 - LocalDataCollection - INFO - Registering Vaccinations.csv [<carrot.io.common.DataBrick object at 0x114023c10>]
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - setup bclink collection
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'condition_occurrence' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - condition_occurrence (condition_occurrence) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'death' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - death (death) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'drug_exposure' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - drug_exposure (drug_exposure) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'measurement' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - measurement (measurement) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'observation' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - observation (observation) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'person' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - person (person) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'procedure_occurrence' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - procedure_occurrence (procedure_occurrence) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'specimen' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - specimen (specimen) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'visit_occurrence' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - visit_occurrence (visit_occurrence) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'person_ids' ) bclink
2022-06-17 14:48:57 - BCLinkHelpers - INFO - person_ids (person_ids) already exists --> all good
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM condition_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM death bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM drug_exposure bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM measurement bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM observation bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM person bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM procedure_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM specimen bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM visit_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM person_ids bclink
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - DataCollection Object Created
2022-06-17 14:48:57 - CommonDataModel - INFO - CommonDataModel (5.3.1) created with co-connect-tools version 0.0.0
2022-06-17 14:48:57 - CommonDataModel - INFO - Running with an DataCollection object
2022-06-17 14:48:57 - CommonDataModel - INFO - Turning on automatic cdm column filling
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT * FROM person_ids bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM condition_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'condition_occurrence' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM condition_occurrence ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM death bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'death' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM death ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM drug_exposure bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'drug_exposure' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM drug_exposure ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM measurement bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'measurement' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM measurement ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM observation bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'observation' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM observation ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM person bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'person' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM person ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM procedure_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'procedure_occurrence' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM procedure_occurrence ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM specimen bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'specimen' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM specimen ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT count(*) FROM visit_occurrence bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT column_name FROM INFORMATION_SCHEMA. COLUMNS WHERE table_name = 'visit_occurrence' LIMIT 1 bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - bc_sqlselect --user=bclink --query=SELECT person_id FROM visit_occurrence ORDER BY -person_id LIMIT 1; bclink
2022-06-17 14:48:57 - CommonDataModel - INFO - Added MALE 3025 of type person
2022-06-17 14:48:57 - CommonDataModel - INFO - Added FEMALE 3026 of type person
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Antibody 3027 of type observation
2022-06-17 14:48:57 - CommonDataModel - INFO - Added H/O: heart failure 3043 of type observation
2022-06-17 14:48:57 - CommonDataModel - INFO - Added 2019-nCoV 3044 of type observation
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Cancer 3045 of type observation
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Headache 3028 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Fatigue 3029 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Dizziness 3030 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Cough 3031 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Fever 3032 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Muscle pain 3033 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Pneumonia 3042 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Mental health problem 3046 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Mental disorder 3047 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Type 2 diabetes mellitus 3048 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Ischemic heart disease 3049 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added Hypertensive disorder 3050 of type condition_occurrence
2022-06-17 14:48:57 - CommonDataModel - INFO - Added COVID-19 vaccine 3034 of type drug_exposure
2022-06-17 14:48:57 - CommonDataModel - INFO - Added COVID-19 vaccine 3035 of type drug_exposure
2022-06-17 14:48:57 - CommonDataModel - INFO - Added COVID-19 vaccine 3036 of type drug_exposure
2022-06-17 14:48:57 - CommonDataModel - INFO - Added SARS-CoV-2 (COVID-19) vaccine, mRNA-1273 0.2 MG/ML Injectable Suspension 3040 of type drug_exposure
2022-06-17 14:48:57 - CommonDataModel - INFO - Added SARS-CoV-2 (COVID-19) vaccine, mRNA-BNT162b2 0.1 MG/ML Injectable Suspension 3041 of type drug_exposure
2022-06-17 14:48:57 - CommonDataModel - INFO - Starting processing in order: ['person', 'observation', 'condition_occurrence', 'drug_exposure']
2022-06-17 14:48:57 - CommonDataModel - INFO - Number of objects to process for each table...
{
"person": 2,
"observation": 4,
"condition_occurrence": 12,
"drug_exposure": 5
}
2022-06-17 14:48:57 - CommonDataModel - INFO - for person: found 2 objects
2022-06-17 14:48:57 - CommonDataModel - INFO - working on person
2022-06-17 14:48:57 - CommonDataModel - INFO - starting on MALE 3025
2022-06-17 14:48:57 - Person - INFO - Called apply_rules
2022-06-17 14:48:57 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Demographics.csv' for the first time
could not convert string to float: 'na'
2022-06-17 14:48:57 - Person - INFO - Mapped birth_datetime
2022-06-17 14:48:57 - Person - INFO - Mapped gender_concept_id
2022-06-17 14:48:57 - Person - INFO - Mapped gender_source_concept_id
2022-06-17 14:48:57 - Person - INFO - Mapped gender_source_value
2022-06-17 14:48:57 - Person - INFO - Mapped person_id
2022-06-17 14:48:57 - Person - WARNING - Requiring non-null values in gender_concept_id removed 438 rows, leaving 562 rows.
2022-06-17 14:48:57 - Person - WARNING - Requiring non-null values in birth_datetime removed 1 rows, leaving 561 rows.
2022-06-17 14:48:57 - Person - INFO - Automatically formatting data columns.
2022-06-17 14:48:57 - Person - INFO - created df (0x1141040a0)[MALE_3025]
2022-06-17 14:48:57 - CommonDataModel - INFO - finished MALE 3025 (0x1141040a0) ... 1/2 completed, 561 rows
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - saving person_ids.0x1140ee7f0.2022-06-17T134857 to ./output/cache//person_ids.0x1140ee7f0.2022-06-17T134857.tsv
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - dataset_tool --load --table=person_ids --user=data --data_file=./output/cache//person_ids.0x1140ee7f0.2022-06-17T134857.tsv --support --bcqueue bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=person_ids --user=data --database=bclink
2022-06-17 14:48:57 - CommonDataModel - INFO - saving dataframe (0x1141040a0) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - saving person.MALE_3025.0x1141040a0.2022-06-17T134857 to ./output/cache//person.MALE_3025.0x1141040a0.2022-06-17T134857.tsv
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - dataset_tool --load --table=person --user=data --data_file=./output/cache//person.MALE_3025.0x1141040a0.2022-06-17T134857.tsv --support --bcqueue bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=person --user=data --database=bclink
2022-06-17 14:48:57 - CommonDataModel - INFO - starting on FEMALE 3026
2022-06-17 14:48:57 - Person - INFO - Called apply_rules
could not convert string to float: 'na'
2022-06-17 14:48:57 - Person - INFO - Mapped birth_datetime
2022-06-17 14:48:57 - Person - INFO - Mapped gender_concept_id
2022-06-17 14:48:57 - Person - INFO - Mapped gender_source_concept_id
2022-06-17 14:48:57 - Person - INFO - Mapped gender_source_value
2022-06-17 14:48:57 - Person - INFO - Mapped person_id
2022-06-17 14:48:57 - Person - WARNING - Requiring non-null values in gender_concept_id removed 565 rows, leaving 435 rows.
2022-06-17 14:48:57 - Person - INFO - Automatically formatting data columns.
2022-06-17 14:48:57 - Person - INFO - created df (0x114104340)[FEMALE_3026]
2022-06-17 14:48:57 - CommonDataModel - INFO - finished FEMALE 3026 (0x114104340) ... 2/2 completed, 435 rows
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - saving person_ids.0x1140ee7c0.2022-06-17T134857 to ./output/cache//person_ids.0x1140ee7c0.2022-06-17T134857.tsv
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - dataset_tool --load --table=person_ids --user=data --data_file=./output/cache//person_ids.0x1140ee7c0.2022-06-17T134857.tsv --support --bcqueue bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=person_ids --user=data --database=bclink
2022-06-17 14:48:57 - CommonDataModel - INFO - saving dataframe (0x114104340) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - saving person.FEMALE_3026.0x114104340.2022-06-17T134857 to ./output/cache//person.FEMALE_3026.0x114104340.2022-06-17T134857.tsv
2022-06-17 14:48:57 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - dataset_tool --load --table=person --user=data --data_file=./output/cache//person.FEMALE_3026.0x114104340.2022-06-17T134857.tsv --support --bcqueue bclink
2022-06-17 14:48:57 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=person --user=data --database=bclink
2022-06-17 14:48:57 - CommonDataModel - INFO - finalised person on iteration 0 producing 996 rows from 2 tables
2022-06-17 14:48:57 - LocalDataCollection - INFO - Getting next chunk of data
2022-06-17 14:48:57 - LocalDataCollection - INFO - All input files for this object have now been used.
2022-06-17 14:48:57 - LocalDataCollection - INFO - resetting used bricks
2022-06-17 14:48:57 - CommonDataModel - INFO - for observation: found 4 objects
2022-06-17 14:48:57 - CommonDataModel - INFO - working on observation
2022-06-17 14:48:57 - CommonDataModel - INFO - starting on Antibody 3027
2022-06-17 14:48:57 - Observation - INFO - Called apply_rules
2022-06-17 14:48:57 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Serology.csv' for the first time
2022-06-17 14:48:57 - Observation - INFO - Mapped observation_concept_id
2022-06-17 14:48:57 - Observation - INFO - Mapped observation_datetime
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_value
2022-06-17 14:48:58 - Observation - INFO - Mapped person_id
2022-06-17 14:48:58 - Observation - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - Observation - INFO - created df (0x114177d00)[Antibody_3027]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished Antibody 3027 (0x114177d00) ... 1/4 completed, 413 rows
2022-06-17 14:48:58 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:48:58 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:48:58 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:48:58 - CommonDataModel - ERROR - 410/413 were good, 3 studies are removed.
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x114177d00) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving observation.Antibody_3027.0x114177d00.2022-06-17T134858 to ./output/cache//observation.Antibody_3027.0x114177d00.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=observation --user=data --data_file=./output/cache//observation.Antibody_3027.0x114177d00.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=observation --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on H/O: heart failure 3043
2022-06-17 14:48:58 - Observation - INFO - Called apply_rules
2022-06-17 14:48:58 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Hospital_Visit.csv' for the first time
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_datetime
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_value
2022-06-17 14:48:58 - Observation - INFO - Mapped person_id
2022-06-17 14:48:58 - Observation - WARNING - Requiring non-null values in observation_concept_id removed 937 rows, leaving 263 rows.
2022-06-17 14:48:58 - Observation - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - Observation - INFO - created df (0x114145a60)[H_O_heart_failure_3043]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished H/O: heart failure 3043 (0x114145a60) ... 2/4 completed, 263 rows
2022-06-17 14:48:58 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:48:58 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:48:58 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:48:58 - CommonDataModel - ERROR - 262/263 were good, 1 studies are removed.
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x114145a60) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving observation.H_O_heart_failure_3043.0x114145a60.2022-06-17T134858 to ./output/cache//observation.H_O_heart_failure_3043.0x114145a60.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=observation --user=data --data_file=./output/cache//observation.H_O_heart_failure_3043.0x114145a60.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=observation --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on 2019-nCoV 3044
2022-06-17 14:48:58 - Observation - INFO - Called apply_rules
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_datetime
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_value
2022-06-17 14:48:58 - Observation - INFO - Mapped person_id
2022-06-17 14:48:58 - Observation - WARNING - Requiring non-null values in observation_concept_id removed 1023 rows, leaving 177 rows.
2022-06-17 14:48:58 - Observation - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - Observation - INFO - created df (0x114191d90)[2019_nCoV_3044]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished 2019-nCoV 3044 (0x114191d90) ... 3/4 completed, 177 rows
2022-06-17 14:48:58 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:48:58 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:48:58 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:48:58 - CommonDataModel - ERROR - 176/177 were good, 1 studies are removed.
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x114191d90) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving observation.2019_nCoV_3044.0x114191d90.2022-06-17T134858 to ./output/cache//observation.2019_nCoV_3044.0x114191d90.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=observation --user=data --data_file=./output/cache//observation.2019_nCoV_3044.0x114191d90.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=observation --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on Cancer 3045
2022-06-17 14:48:58 - Observation - INFO - Called apply_rules
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_datetime
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_concept_id
2022-06-17 14:48:58 - Observation - INFO - Mapped observation_source_value
2022-06-17 14:48:58 - Observation - INFO - Mapped person_id
2022-06-17 14:48:58 - Observation - WARNING - Requiring non-null values in observation_concept_id removed 851 rows, leaving 349 rows.
2022-06-17 14:48:58 - Observation - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - Observation - INFO - created df (0x114191160)[Cancer_3045]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished Cancer 3045 (0x114191160) ... 4/4 completed, 349 rows
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x114191160) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving observation.Cancer_3045.0x114191160.2022-06-17T134858 to ./output/cache//observation.Cancer_3045.0x114191160.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=observation --user=data --data_file=./output/cache//observation.Cancer_3045.0x114191160.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=observation --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - finalised observation on iteration 0 producing 1197 rows from 4 tables
2022-06-17 14:48:58 - LocalDataCollection - INFO - Getting next chunk of data
2022-06-17 14:48:58 - LocalDataCollection - INFO - All input files for this object have now been used.
2022-06-17 14:48:58 - LocalDataCollection - INFO - resetting used bricks
2022-06-17 14:48:58 - CommonDataModel - INFO - for condition_occurrence: found 12 objects
2022-06-17 14:48:58 - CommonDataModel - INFO - working on condition_occurrence
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on Headache 3028
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:58 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Symptoms.csv' for the first time
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 55 rows, leaving 275 rows.
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_start_datetime removed 1 rows, leaving 274 rows.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - created df (0x114191df0)[Headache_3028]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished Headache 3028 (0x114191df0) ... 1/12 completed, 274 rows
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x114191df0) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving condition_occurrence.Headache_3028.0x114191df0.2022-06-17T134858 to ./output/cache//condition_occurrence.Headache_3028.0x114191df0.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Headache_3028.0x114191df0.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on Fatigue 3029
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 95 rows, leaving 235 rows.
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_start_datetime removed 1 rows, leaving 234 rows.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - created df (0x11419ca30)[Fatigue_3029]
2022-06-17 14:48:58 - CommonDataModel - INFO - finished Fatigue 3029 (0x11419ca30) ... 2/12 completed, 234 rows
2022-06-17 14:48:58 - CommonDataModel - INFO - saving dataframe (0x11419ca30) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - saving condition_occurrence.Fatigue_3029.0x11419ca30.2022-06-17T134858 to ./output/cache//condition_occurrence.Fatigue_3029.0x11419ca30.2022-06-17T134858.tsv
2022-06-17 14:48:58 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Fatigue_3029.0x11419ca30.2022-06-17T134858.tsv --support --bcqueue bclink
2022-06-17 14:48:58 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:58 - CommonDataModel - INFO - starting on Dizziness 3030
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 195 rows, leaving 135 rows.
2022-06-17 14:48:58 - ConditionOccurrence - WARNING - Requiring non-null values in condition_start_datetime removed 1 rows, leaving 134 rows.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:58 - ConditionOccurrence - INFO - created df (0x114191b80)[Dizziness_3030]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Dizziness 3030 (0x114191b80) ... 3/12 completed, 134 rows
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x114191b80) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Dizziness_3030.0x114191b80.2022-06-17T134859 to ./output/cache//condition_occurrence.Dizziness_3030.0x114191b80.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Dizziness_3030.0x114191b80.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Cough 3031
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 100 rows, leaving 230 rows.
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_start_datetime removed 1 rows, leaving 229 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x1141b5520)[Cough_3031]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Cough 3031 (0x1141b5520) ... 4/12 completed, 229 rows
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x1141b5520) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Cough_3031.0x1141b5520.2022-06-17T134859 to ./output/cache//condition_occurrence.Cough_3031.0x1141b5520.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Cough_3031.0x1141b5520.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Fever 3032
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 265 rows, leaving 65 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x1141b5a60)[Fever_3032]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Fever 3032 (0x1141b5a60) ... 5/12 completed, 65 rows
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x1141b5a60) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Fever_3032.0x1141b5a60.2022-06-17T134859 to ./output/cache//condition_occurrence.Fever_3032.0x1141b5a60.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Fever_3032.0x1141b5a60.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Muscle pain 3033
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 295 rows, leaving 35 rows.
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_start_datetime removed 1 rows, leaving 34 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x1141c3f10)[Muscle_pain_3033]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Muscle pain 3033 (0x1141c3f10) ... 6/12 completed, 34 rows
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x1141c3f10) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Muscle_pain_3033.0x1141c3f10.2022-06-17T134859 to ./output/cache//condition_occurrence.Muscle_pain_3033.0x1141c3f10.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Muscle_pain_3033.0x1141c3f10.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Pneumonia 3042
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Hospital_Visit.csv' for the first time
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1029 rows, leaving 171 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x1141a7f10)[Pneumonia_3042]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Pneumonia 3042 (0x1141a7f10) ... 7/12 completed, 171 rows
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x1141a7f10) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Pneumonia_3042.0x1141a7f10.2022-06-17T134859 to ./output/cache//condition_occurrence.Pneumonia_3042.0x1141a7f10.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Pneumonia_3042.0x1141a7f10.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Mental health problem 3046
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - LocalDataCollection - INFO - Retrieving initial dataframe for 'GP_Records.csv' for the first time
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1508 rows, leaving 444 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x1141ce160)[Mental_health_problem_3046]
2022-06-17 14:48:59 - CommonDataModel - INFO - finished Mental health problem 3046 (0x1141ce160) ... 8/12 completed, 444 rows
2022-06-17 14:48:59 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:48:59 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:48:59 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:48:59 - CommonDataModel - ERROR - 441/444 were good, 3 studies are removed.
2022-06-17 14:48:59 - CommonDataModel - INFO - saving dataframe (0x1141ce160) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - saving condition_occurrence.Mental_health_problem_3046.0x1141ce160.2022-06-17T134859 to ./output/cache//condition_occurrence.Mental_health_problem_3046.0x1141ce160.2022-06-17T134859.tsv
2022-06-17 14:48:59 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Mental_health_problem_3046.0x1141ce160.2022-06-17T134859.tsv --support --bcqueue bclink
2022-06-17 14:48:59 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:48:59 - CommonDataModel - INFO - starting on Mental disorder 3047
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:48:59 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1508 rows, leaving 444 rows.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:48:59 - ConditionOccurrence - INFO - created df (0x114191e80)[Mental_disorder_3047]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished Mental disorder 3047 (0x114191e80) ... 9/12 completed, 444 rows
2022-06-17 14:49:00 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:49:00 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:49:00 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:49:00 - CommonDataModel - ERROR - 441/444 were good, 3 studies are removed.
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x114191e80) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving condition_occurrence.Mental_disorder_3047.0x114191e80.2022-06-17T134900 to ./output/cache//condition_occurrence.Mental_disorder_3047.0x114191e80.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Mental_disorder_3047.0x114191e80.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on Type 2 diabetes mellitus 3048
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:49:00 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1688 rows, leaving 264 rows.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - created df (0x1141ed970)[Type_2_diabetes_mellitus_3048]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished Type 2 diabetes mellitus 3048 (0x1141ed970) ... 10/12 completed, 264 rows
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x1141ed970) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving condition_occurrence.Type_2_diabetes_mellitus_3048.0x1141ed970.2022-06-17T134900 to ./output/cache//condition_occurrence.Type_2_diabetes_mellitus_3048.0x1141ed970.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Type_2_diabetes_mellitus_3048.0x1141ed970.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on Ischemic heart disease 3049
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:49:00 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1738 rows, leaving 214 rows.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - created df (0x1141e4c70)[Ischemic_heart_disease_3049]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished Ischemic heart disease 3049 (0x1141e4c70) ... 11/12 completed, 214 rows
2022-06-17 14:49:00 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:49:00 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:49:00 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:49:00 - CommonDataModel - ERROR - 213/214 were good, 1 studies are removed.
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x1141e4c70) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving condition_occurrence.Ischemic_heart_disease_3049.0x1141e4c70.2022-06-17T134900 to ./output/cache//condition_occurrence.Ischemic_heart_disease_3049.0x1141e4c70.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Ischemic_heart_disease_3049.0x1141e4c70.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on Hypertensive disorder 3050
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Called apply_rules
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_end_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_concept_id
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_source_value
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped condition_start_datetime
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Mapped person_id
2022-06-17 14:49:00 - ConditionOccurrence - WARNING - Requiring non-null values in condition_concept_id removed 1822 rows, leaving 130 rows.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - ConditionOccurrence - INFO - created df (0x1141cea00)[Hypertensive_disorder_3050]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished Hypertensive disorder 3050 (0x1141cea00) ... 12/12 completed, 130 rows
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x1141cea00) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving condition_occurrence.Hypertensive_disorder_3050.0x1141cea00.2022-06-17T134900 to ./output/cache//condition_occurrence.Hypertensive_disorder_3050.0x1141cea00.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=condition_occurrence --user=data --data_file=./output/cache//condition_occurrence.Hypertensive_disorder_3050.0x1141cea00.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=condition_occurrence --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - finalised condition_occurrence on iteration 0 producing 2630 rows from 12 tables
2022-06-17 14:49:00 - LocalDataCollection - INFO - Getting next chunk of data
2022-06-17 14:49:00 - LocalDataCollection - INFO - All input files for this object have now been used.
2022-06-17 14:49:00 - LocalDataCollection - INFO - resetting used bricks
2022-06-17 14:49:00 - CommonDataModel - INFO - for drug_exposure: found 5 objects
2022-06-17 14:49:00 - CommonDataModel - INFO - working on drug_exposure
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on COVID-19 vaccine 3034
2022-06-17 14:49:00 - DrugExposure - INFO - Called apply_rules
2022-06-17 14:49:00 - LocalDataCollection - INFO - Retrieving initial dataframe for 'Vaccinations.csv' for the first time
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_end_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_start_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_value
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped person_id
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 475 rows, leaving 245 rows.
2022-06-17 14:49:00 - DrugExposure - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - DrugExposure - INFO - created df (0x114215eb0)[COVID_19_vaccine_3034]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished COVID-19 vaccine 3034 (0x114215eb0) ... 1/5 completed, 245 rows
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x114215eb0) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving drug_exposure.COVID_19_vaccine_3034.0x114215eb0.2022-06-17T134900 to ./output/cache//drug_exposure.COVID_19_vaccine_3034.0x114215eb0.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=drug_exposure --user=data --data_file=./output/cache//drug_exposure.COVID_19_vaccine_3034.0x114215eb0.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=drug_exposure --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on COVID-19 vaccine 3035
2022-06-17 14:49:00 - DrugExposure - INFO - Called apply_rules
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_end_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_start_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_value
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped person_id
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 494 rows, leaving 226 rows.
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_exposure_start_datetime removed 1 rows, leaving 225 rows.
2022-06-17 14:49:00 - DrugExposure - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - DrugExposure - INFO - created df (0x1141ed2b0)[COVID_19_vaccine_3035]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished COVID-19 vaccine 3035 (0x1141ed2b0) ... 2/5 completed, 225 rows
2022-06-17 14:49:00 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:49:00 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:49:00 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:49:00 - CommonDataModel - ERROR - 224/225 were good, 1 studies are removed.
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x1141ed2b0) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving drug_exposure.COVID_19_vaccine_3035.0x1141ed2b0.2022-06-17T134900 to ./output/cache//drug_exposure.COVID_19_vaccine_3035.0x1141ed2b0.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=drug_exposure --user=data --data_file=./output/cache//drug_exposure.COVID_19_vaccine_3035.0x1141ed2b0.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=drug_exposure --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on COVID-19 vaccine 3036
2022-06-17 14:49:00 - DrugExposure - INFO - Called apply_rules
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_end_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_start_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_value
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped person_id
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 471 rows, leaving 249 rows.
2022-06-17 14:49:00 - DrugExposure - INFO - Automatically formatting data columns.
2022-06-17 14:49:00 - DrugExposure - INFO - created df (0x1141ff310)[COVID_19_vaccine_3036]
2022-06-17 14:49:00 - CommonDataModel - INFO - finished COVID-19 vaccine 3036 (0x1141ff310) ... 3/5 completed, 249 rows
2022-06-17 14:49:00 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:49:00 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:49:00 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:49:00 - CommonDataModel - ERROR - 248/249 were good, 1 studies are removed.
2022-06-17 14:49:00 - CommonDataModel - INFO - saving dataframe (0x1141ff310) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - saving drug_exposure.COVID_19_vaccine_3036.0x1141ff310.2022-06-17T134900 to ./output/cache//drug_exposure.COVID_19_vaccine_3036.0x1141ff310.2022-06-17T134900.tsv
2022-06-17 14:49:00 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - dataset_tool --load --table=drug_exposure --user=data --data_file=./output/cache//drug_exposure.COVID_19_vaccine_3036.0x1141ff310.2022-06-17T134900.tsv --support --bcqueue bclink
2022-06-17 14:49:00 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=drug_exposure --user=data --database=bclink
2022-06-17 14:49:00 - CommonDataModel - INFO - starting on SARS-CoV-2 (COVID-19) vaccine, mRNA-1273 0.2 MG/ML Injectable Suspension 3040
2022-06-17 14:49:00 - DrugExposure - INFO - Called apply_rules
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_end_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_exposure_start_datetime
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_concept_id
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped drug_source_value
2022-06-17 14:49:00 - DrugExposure - INFO - Mapped person_id
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 475 rows, leaving 245 rows.
2022-06-17 14:49:00 - DrugExposure - INFO - Automatically formatting data columns.
2022-06-17 14:49:01 - DrugExposure - INFO - created df (0x1141ffa00)[SARS_CoV_2_COVID_19_vaccine_mRNA_1273_0_2_MG_ML_Injectable_Suspension_3040]
2022-06-17 14:49:01 - CommonDataModel - INFO - finished SARS-CoV-2 (COVID-19) vaccine, mRNA-1273 0.2 MG/ML Injectable Suspension 3040 (0x1141ffa00) ... 4/5 completed, 245 rows
2022-06-17 14:49:01 - CommonDataModel - INFO - saving dataframe (0x1141ffa00) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - saving drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_1273_0_2_MG_ML_Injectable_Suspension_3040.0x1141ffa00.2022-06-17T134901 to ./output/cache//drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_1273_0_2_MG_ML_Injectable_Suspension_3040.0x1141ffa00.2022-06-17T134901.tsv
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:01 - BCLinkHelpers - NOTICE - dataset_tool --load --table=drug_exposure --user=data --data_file=./output/cache//drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_1273_0_2_MG_ML_Injectable_Suspension_3040.0x1141ffa00.2022-06-17T134901.tsv --support --bcqueue bclink
2022-06-17 14:49:01 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=drug_exposure --user=data --database=bclink
2022-06-17 14:49:01 - CommonDataModel - INFO - starting on SARS-CoV-2 (COVID-19) vaccine, mRNA-BNT162b2 0.1 MG/ML Injectable Suspension 3041
2022-06-17 14:49:01 - DrugExposure - INFO - Called apply_rules
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped drug_concept_id
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped drug_exposure_end_datetime
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped drug_exposure_start_datetime
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped drug_source_concept_id
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped drug_source_value
2022-06-17 14:49:01 - DrugExposure - INFO - Mapped person_id
2022-06-17 14:49:01 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 471 rows, leaving 249 rows.
2022-06-17 14:49:01 - DrugExposure - INFO - Automatically formatting data columns.
2022-06-17 14:49:01 - DrugExposure - INFO - created df (0x1141ff190)[SARS_CoV_2_COVID_19_vaccine_mRNA_BNT162b2_0_1_MG_ML_Injectable_Suspension_3041]
2022-06-17 14:49:01 - CommonDataModel - INFO - finished SARS-CoV-2 (COVID-19) vaccine, mRNA-BNT162b2 0.1 MG/ML Injectable Suspension 3041 (0x1141ff190) ... 5/5 completed, 249 rows
2022-06-17 14:49:01 - CommonDataModel - ERROR - There are person_ids in this table that are not in the output person table!
2022-06-17 14:49:01 - CommonDataModel - ERROR - Either they are not in the original data, or while creating the person table,
2022-06-17 14:49:01 - CommonDataModel - ERROR - studies have been removed due to lack of required fields, such as birthdate.
2022-06-17 14:49:01 - CommonDataModel - ERROR - 248/249 were good, 1 studies are removed.
2022-06-17 14:49:01 - CommonDataModel - INFO - saving dataframe (0x1141ff190) to <carrot.io.plugins.bclink.BCLinkDataCollection object at 0x113fdee20>
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - saving drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_BNT162b2_0_1_MG_ML_Injectable_Suspension_3041.0x1141ff190.2022-06-17T134901 to ./output/cache//drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_BNT162b2_0_1_MG_ML_Injectable_Suspension_3041.0x1141ff190.2022-06-17T134901.tsv
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - finished save to file
2022-06-17 14:49:01 - BCLinkHelpers - NOTICE - dataset_tool --load --table=drug_exposure --user=data --data_file=./output/cache//drug_exposure.SARS_CoV_2_COVID_19_vaccine_mRNA_BNT162b2_0_1_MG_ML_Injectable_Suspension_3041.0x1141ff190.2022-06-17T134901.tsv --support --bcqueue bclink
2022-06-17 14:49:01 - BCLinkHelpers - NOTICE - datasettool2 list-updates --dataset=drug_exposure --user=data --database=bclink
2022-06-17 14:49:01 - CommonDataModel - INFO - finalised drug_exposure on iteration 0 producing 1210 rows from 5 tables
2022-06-17 14:49:01 - LocalDataCollection - INFO - Getting next chunk of data
2022-06-17 14:49:01 - LocalDataCollection - INFO - All input files for this object have now been used.
2022-06-17 14:49:01 - CommonDataModel - INFO - {
    "version": "0.0.0",
    "created_by": "calummacdonald",
    "created_at": "2022-06-17T134857",
    "dataset": "CommonDataModel::FAILED: ExampleV4",
    "total_data_processed": {
        "person": 996,
        "observation": 1197,
        "condition_occurrence": 2630,
        "drug_exposure": 1210
    }
}
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - finalising, waiting for jobs to finish
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - job_ids to wait for: []
2022-06-17 14:49:01 - BCLinkDataCollection - INFO - done!
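
The log above is long, and the WARNING lines reporting dropped rows are easy to miss among the INFO messages. As a rough sketch (not part of carrot itself), the snippet below tallies how many rows were removed per table and column. It assumes the warning wording stays exactly as printed above; the function name `summarise_removed_rows` and the `sample` text are hypothetical and only for illustration. In practice you could pass in the contents of the file written with `--log-file`.

```python
import re

# Matches the WARNING lines emitted when a required column contains nulls.
# The pattern is an assumption based only on the log wording shown above.
REMOVED = re.compile(
    r"- (?P<table>\w+) - WARNING - Requiring non-null values in "
    r"(?P<column>\w+) removed (?P<removed>\d+) rows, leaving (?P<left>\d+) rows"
)


def summarise_removed_rows(log_text):
    """Return the total rows removed per (table, column) found in the log text."""
    totals = {}
    for match in REMOVED.finditer(log_text):
        key = (match["table"], match["column"])
        totals[key] = totals.get(key, 0) + int(match["removed"])
    return totals


# Three warning lines copied from the log output above.
sample = """\
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 494 rows, leaving 226 rows.
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_exposure_start_datetime removed 1 rows, leaving 225 rows.
2022-06-17 14:49:00 - DrugExposure - WARNING - Requiring non-null values in drug_concept_id removed 471 rows, leaving 249 rows.
"""

summary = summarise_removed_rows(sample)
for (table, column), removed in summary.items():
    print(f"{table}.{column}: {removed} rows removed")
```

A large count for a concept column such as `drug_concept_id` is expected here, because each rule only applies to the rows matching its source value; a non-zero count for a date column is more likely to point at a problem in the input data.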