Structure of the Data container
===============================

The clustering module of DRIVE creates a dataclass called ``RuntimeState`` and passes it to each plugin (see the code block below). The class has no methods; it exists only to convey and share application state, both data and configuration options. This section describes each attribute so that you know what information a plugin can access by default.


.. code:: python

    @dataclass
    class RuntimeState:
        """main class to hold the data from the network analysis and the different pvalues"""

        networks: List[Network_Interface]
        output_path: Path
        carriers: Dict[str, Dict[str, List[str]]]
        phenotype_descriptions: Dict[str, Dict[str, str]]
        config_options: dict[str, Any] = field(default_factory=dict)

.. admonition:: Changing the Data class attributes

    The attributes described in this section are what DRIVE comes with by default. Users can add additional attributes through custom plugins. The advantages/disadvantages of customizing the class attributes will be discussed in further detail in a future section on creating custom plugins.

Data class attributes:
----------------------
- **networks**: This is a list of all the networks identified by the DRIVE clustering analysis. Each element is a Network object, defined in the code block below. These objects hold the individuals in the cluster, the haplotypes in the cluster, and the size of the cluster. The Network class also has ``pvalues`` and ``min_pvalue_str`` attributes, but these stay empty until the "pvalues" plugin calculates them.

.. code:: python

    @dataclass
    class Network:
        clst_id: float
        true_positive_count: int
        true_positive_percent: float
        false_negative_edges: List[int]
        false_negative_count: int
        members: Union[Set[int], Set[str]]
        haplotypes: Union[List[int], List[str]]
        min_pvalue_str: str = ""
        pvalues: Dict[str, Dict[str, Any]] = field(default_factory=dict)

        def print_members_list(self) -> str:
            """Returns a string that has all of the members' IDs separated by a comma

            Returns
            -------
            str
                returns a string where the members list attribute
                is formatted as a string for the output file. Individual strings are joined by comma.
            """
            return ", ".join(list(map(str, self.members)))

        def __lt__(self, comp_class: T) -> bool:
            """Override the less than method so that objects can be sorted in
            ascending numeric order based on cluster id.

            Parameters
            ----------
            comp_class : Network
                Network object to compare.

            Returns
            -------
            bool
                returns True if the self cluster ID is less than the
                comp_class cluster ID.
            """

            return self.clst_id < comp_class.clst_id
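
Because ``__lt__`` compares cluster IDs, a list of Network objects can be passed straight to :python:`sorted()`. The following is a minimal sketch of that behavior, not DRIVE's own code: the class is a trimmed stand-in for the one above, and the member and haplotype IDs are made up for illustration.

.. code:: python

    from dataclasses import dataclass, field
    from typing import Any, Dict, List, Set, Union


    @dataclass
    class Network:
        """Trimmed stand-in for the Network class above (illustration only)."""

        clst_id: float
        members: Union[Set[int], Set[str]]
        haplotypes: Union[List[int], List[str]]
        min_pvalue_str: str = ""
        pvalues: Dict[str, Dict[str, Any]] = field(default_factory=dict)

        def print_members_list(self) -> str:
            return ", ".join(list(map(str, self.members)))

        def __lt__(self, other: "Network") -> bool:
            return self.clst_id < other.clst_id


    # Hypothetical clusters; the IDs are invented for this example.
    networks = [
        Network(clst_id=2.0, members={"id3"}, haplotypes=["id3.1"]),
        Network(clst_id=0.0, members={"id1"}, haplotypes=["id1.1"]),
        Network(clst_id=1.0, members={"id2"}, haplotypes=["id2.2"]),
    ]

    # sorted() relies on __lt__, so clusters come back in ascending clst_id order.
    ordered = sorted(networks)
    ordered_ids = [network.clst_id for network in ordered]

    # Sets are unordered, so a single-member set keeps this string predictable.
    first_members = ordered[0].print_members_list()

    # pvalues is empty until something (the "pvalues" plugin in DRIVE) fills it.
    has_pvalues = bool(ordered[0].pvalues)

Note that ``members`` is a set, so when a cluster has several members the order of IDs in the joined string is not guaranteed.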

.. role:: python(code)
   :language: python

- **output_path**: This attribute is a path object that describes where DRIVE writes its output. The path ends in a file prefix, and DRIVE appends a suffix to that prefix whenever it writes a file. To get the parent directory, use the ``parent`` attribute of the path, for example :python:`runtime_state.output_path.parent`, where ``runtime_state`` is the RuntimeState object passed to the plugin.

- **carriers**: This attribute is a dictionary of dictionaries that records which individuals are cases, controls, or exclusions for each phenotype. The outer key is the phenotype ID. The inner dictionary has three keys: "cases", "controls", and "excluded". The value for each key is a list of the individual IDs that are cases, controls, or excluded individuals, respectively.

- **phenotype_descriptions**: This attribute is another dictionary of dictionaries that holds a description of each phenotype. The outer key is the phenotype ID. The inner key is the string "phenotype", and its value is the phenotype description.

- **config_options**: This attribute is a dictionary where each key is a runtime option that a plugin can use and each value is the state of that option. For example, the network writer plugin checks whether there is a "compress" key and uses its value to decide whether to compress the output file.
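
The attributes above can be read like ordinary dictionaries and paths. The sketch below is an illustration under stated assumptions rather than DRIVE's actual plugin interface: ``RuntimeState`` is re-declared as a stand-in, ``summarize_phenotypes`` is a hypothetical helper, and every value (phenotype ID, individual IDs, output prefix, and the "phenotype" inner key) is invented for the example.

.. code:: python

    from dataclasses import dataclass, field
    from pathlib import Path
    from typing import Any, Dict, List


    @dataclass
    class RuntimeState:
        """Stand-in mirroring the documented attributes (illustration only)."""

        networks: List[Any]
        output_path: Path
        carriers: Dict[str, Dict[str, List[str]]]
        phenotype_descriptions: Dict[str, Dict[str, str]]
        config_options: Dict[str, Any] = field(default_factory=dict)


    def summarize_phenotypes(data: RuntimeState) -> Dict[str, Dict[str, int]]:
        """Hypothetical helper: count individuals per status for each phenotype."""
        return {
            phenotype: {status: len(ids) for status, ids in groups.items()}
            for phenotype, groups in data.carriers.items()
        }


    # All values below are made up for the example.
    state = RuntimeState(
        networks=[],
        output_path=Path("results/drive_run"),
        carriers={
            "phenoA": {"cases": ["id1", "id2"], "controls": ["id3"], "excluded": []}
        },
        phenotype_descriptions={"phenoA": {"phenotype": "example phenotype"}},
        config_options={"compress": True},
    )

    counts = summarize_phenotypes(state)

    # .get() avoids a KeyError when an option such as "compress" was never set.
    compress = state.config_options.get("compress", False)

    # Parent directory of the output prefix.
    output_dir = state.output_path.parent

Reading options with :python:`dict.get()` and a default keeps a plugin usable even when the option it looks for is absent from ``config_options``.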
