Running DRIVE phenomewide

This section illustrates how to run DRIVE for multiple phenotypes or diseases of interest in the same analysis. This process will be referred to as a PheWES (Phenome-wide Enrichment Study) throughout the documentation.

The process is very similar to a standard single-phenotype DRIVE run and consists of the following steps:

  1. Identify pairwise IBD segments in the cohort of interest

  2. Generate a phenotype file containing multiple phenotypes of interest

  3. Run DRIVE over the locus of interest

For this example we are going to assume that IBD segments have already been detected using a program such as hap-IBD or iLASH.

Format of the Phenotype File:

The biggest difference when running DRIVE phenomewide is that the input phenotype file takes the form of a matrix: the first column contains the cohort IDs, and every other column is a phenotype of interest. This file should be tab-separated. Individuals excluded for a given phenotype can be represented by N/A or -1.

Example of a phenotype file with multiple phenotypes (completely made up)

| GRID | status_1 | status_2 | status_3 |
| --- | --- | --- | --- |
| ID1 | 1 | 0 | -1 |
| ID2 | 0 | 1 | 0 |
| ID3 | 1 | 1 | -1 |
| ID4 | 0 | 0 | 1 |

An example file, “test_phenotype_file_withNAs.txt”, can also be found in the “tests/test_inputs” folder of the repository.
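The matrix layout described above can be produced with a few lines of Python. This is a minimal sketch, not part of DRIVE: the `write_phenotype_matrix` helper and the `statuses` dictionary are hypothetical, and the coding of 1 for cases and 0 for controls is an assumption based on the made-up example table. Excluded individuals are written as -1, one of the two representations mentioned above.

```python
import csv
import io

# Hypothetical per-phenotype statuses. Assumed coding: 1 = case, 0 = control,
# None = excluded for that phenotype.
statuses = {
    "ID1": {"status_1": 1, "status_2": 0, "status_3": None},
    "ID2": {"status_1": 0, "status_2": 1, "status_3": 0},
    "ID3": {"status_1": 1, "status_2": 1, "status_3": None},
    "ID4": {"status_1": 0, "status_2": 0, "status_3": 1},
}
phenotypes = ["status_1", "status_2", "status_3"]


def write_phenotype_matrix(statuses, phenotypes, handle,
                           id_column="GRID", excluded="-1"):
    """Write one row per individual and one tab-separated column per phenotype."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([id_column, *phenotypes])
    for sample_id, values in statuses.items():
        row = [sample_id]
        for phenotype in phenotypes:
            value = values.get(phenotype)
            # Missing or None entries are treated as excluded individuals.
            row.append(excluded if value is None else str(value))
        writer.writerow(row)


# Write to an in-memory buffer here; pass an open file handle to write to disk.
buffer = io.StringIO()
write_phenotype_matrix(statuses, phenotypes, buffer)
phenotype_text = buffer.getvalue()
print(phenotype_text)
```

To write a real file, open it with `open(path, "w", newline="")` and pass the handle in place of the buffer.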

Command to run DRIVE phenome-wide:

When you provide a phenotype file with multiple phenotype columns, DRIVE automatically runs the analysis phenome-wide; no additional flags are needed.

drive cluster \
  -i tests/test_inputs/simulated_ibd_test_data_v2_chr20.ibd.gz \
  -f hapibd \
  -t 20:4666882-4682236 \
  -o ./test_drive_phenomewide_output \
  --min-cm 3 \
  --cases tests/test_inputs/test_phenotype_file_withNAs.txt \
  --segment-overlap overlaps \
  --min-network-size 2 \
  --recluster \
  --log-to-console \
  --log-filename test_drive_phenomewide_output.log

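When finished, DRIVE creates an output file that, in addition to the columns describing each network, has five columns for each phenotype: the case count, the case IDs, the excluded count, the excluded IDs, and the p-value for that phenotype in the network. An example output file can be found at “tests/test_outputs/test_drive_phenomewide_output.drive_networks.txt”. The first few rows of this output are shown below: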

DRIVE Phenomewide Output Example

| clstID | n.total | n.haplotype | true.positive.n | true.positive | false.positive | IDs | ID.haplotype | min_pvalue | min_phenotype | min_phenotype_description | CV_414_case_count_in_network | CV_414_cases_in_network | CV_414_excluded_count_in_network | CV_414_excluded_in_network | CV_414_pvalue | NS_324.11_case_count_in_network | NS_324.11_cases_in_network | NS_324.11_excluded_count_in_network | NS_324.11_excluded_in_network | NS_324.11_pvalue | phenoC_case_count_in_network | phenoC_cases_in_network | phenoC_excluded_count_in_network | phenoC_excluded_in_network | phenoC_pvalue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 4 | 4 | 4 | 0.6667 | 0 | 842,130,30,861 | 130.1,842.1,30.2,861.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 1 | 2 | 2 | 1 | 1.0000 | 0 | 223,443 | 223.1,443.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 2 | 2 | 2 | 1 | 1.0000 | 0 | 253,957 | 253.2,957.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 1 | 253 | 1 | 1 | 253 | 0 | None | 1 |
| 3 | 3 | 3 | 3 | 1.0000 | 0 | 244,531,231 | 231.1,244.1,531.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 1 | 531 | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 4 | 5 | 5 | 7 | 0.7000 | 0 | 574,676,210,535,94 | 535.1,574.2,94.1,210.1,676.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 5 | 3 | 3 | 3 | 1.0000 | 0 | 600,962,591 | 591.2,600.2,962.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 6 | 2 | 2 | 1 | 1.0000 | 0 | 610,895 | 610.1,895.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 7 | 4 | 4 | 5 | 0.8333 | 0 | 342,295,211,969 | 211.1,969.2,342.1,295.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 1 | 211 | 0 | None | 1 | 0 | N/A | 1 | 295 | 1 |
| 8 | 3 | 3 | 3 | 1.0000 | 0 | 133,622,941 | 622.2,941.1,133.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 1 | 941 | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 9 | 6 | 6 | 15 | 1.0000 | 0 | 484,577,946,343,10,609 | 343.2,609.1,946.2,484.2,10.1,577.2 | N/A | N/A | N/A | 0 | N/A | 1 | 10 | 1 | 1 | 10 | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 10 | 6 | 6 | 12 | 0.8000 | 0 | 786,259,533,199,575,811 | 533.1,786.1,199.1,575.2,811.1,259.2 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 11 | 2 | 2 | 1 | 1.0000 | 0 | 801,938 | 801.2,938.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 1 | 801 | 1 |
| 12 | 2 | 2 | 1 | 1.0000 | 0 | 871,895 | 871.2,895.1 | N/A | N/A | N/A | 1 | 871 | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 13 | 5 | 5 | 8 | 0.8000 | 0 | 773,108,751,970,326 | 108.2,970.2,326.2,773.2,751.1 | N/A | N/A | N/A | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 | 0 | N/A | 0 | None | 1 |
| 14 | 5 | 5 | 10 | 1.0000 | 0 | 385,711,14,615,507 | 14.2,615.2,385.1,507.1,711.2 | 0.34684049957692414 | NS_324.11 | Parkinson’s disease (Primary) | 0 | N/A | 0 | None | 1 | 2 | 615, 385 | 0 | None | 0.34684049957692414 | 0 | N/A | 0 | None | 1 |
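
Because most networks in the example have no reported minimum p-value (N/A), it is often convenient to filter the output down to the networks that do. The sketch below is illustrative rather than part of DRIVE: `networks_with_pvalue` is a hypothetical helper, the tab delimiter is an assumption about the output file layout, the default threshold is arbitrary, and the inline sample is abbreviated to a handful of columns taken from the example output.

```python
import csv
import io

# Two abbreviated rows modelled on the example output (subset of columns only).
# Assumption: the real file is tab-separated with the same column names.
sample_output = (
    "clstID\tmin_pvalue\tmin_phenotype\t"
    "NS_324.11_case_count_in_network\tNS_324.11_pvalue\n"
    "13\tN/A\tN/A\t0\t1\n"
    "14\t0.34684049957692414\tNS_324.11\t2\t0.34684049957692414\n"
)


def networks_with_pvalue(handle, threshold=1.0):
    """Return (clstID, min_phenotype, min_pvalue) tuples with min_pvalue below threshold."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        raw = row["min_pvalue"]
        if raw in ("N/A", ""):
            # No minimum p-value was reported for this network.
            continue
        pvalue = float(raw)
        if pvalue < threshold:
            hits.append((row["clstID"], row["min_phenotype"], pvalue))
    # Smallest p-values first.
    return sorted(hits, key=lambda hit: hit[2])


hits = networks_with_pvalue(io.StringIO(sample_output))
print(hits)
```

For a real run, replace the in-memory sample with an open handle to the `.drive_networks.txt` file and choose a threshold appropriate for the number of phenotypes tested.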