Data API
The Data API lets you read the preprocessed signal data that Nanocompore uses for its analysis, for example for custom plotting. In general, you load the configuration file used for the analysis with the load_config function and then query the data with get_references, get_reads, and get_pos.
For example:
>>> from nanocompore.api import load_config, get_pos
# Load the YAML configuration file to a Config object.
>>> config = load_config('analysis.yaml')
# Get the signal data for a given position:
>>> ref_id = 'ENST00000464651.1|ENSG00000166136.16|OTTHUMG00000019346.4|OTTHUMT00000051221.1|NDUFB8-204|NDUFB8|390|retained_intron|'
>>> get_pos(config, ref_id, 243)
    condition sample                                  read  intensity  dwell
0          WT   WT_2  a6f3e188-6288-4215-acdc-fe28beba411f    -1624.0   27.0
1          WT   WT_2  09923db6-eccc-497f-8621-8adeea9b1bfb     4072.0   20.0
2          WT   WT_2  f65926cc-bf13-4396-ba92-7f2f690b71d9    -2571.0    5.0
3          WT   WT_2  aebabd0a-5260-41c4-b38b-1ebb117dc0fb      586.0   16.0
4          WT   WT_2  994256e9-afab-4b54-94ff-cc37ae4cbe08     5229.0   16.0
..        ...    ...                                   ...        ...    ...
383        WT   WT_1  79df3c74-a4c6-4335-93c5-a0ca7e3aec78    -1067.0   25.0
384        WT   WT_1  f7dad9c6-d3d9-4501-85fb-6c6246a03719    -2225.0   56.0
385        WT   WT_1  8653efdc-943f-48f8-b6f1-174cc4bb1ad5     2837.0   12.0
386        WT   WT_1  fdf524f0-5bb5-45fc-a783-7e3a592eb149      462.0   30.0
387        WT   WT_1  b05c004e-5f58-4bfa-896b-ce28b4225ab2     -469.0   27.0
[388 rows x 5 columns]
Reference
get_metadata(db)
Returns the metadata from the given SQLite database.
The metadata contains information such as input files, resquiggler used, and data types for the binary encoded fields.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| db | str | Path to the SQLite database produced by the preprocessing command of Nanocompore. | required |
Returns:
| Type | Description |
|---|---|
| dict | Dictionary containing the metadata. |
Source code in nanocompore/api.py
get_pos(config, reference_id, pos)
Returns the signal data from all reads, across all samples, for a specific position of the given reference transcript. Note that the position is a 0-based index of the first nucleotide of a k-mer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Config | Nanocompore configuration object, as returned by load_config. | required |
| reference_id | str | ID for a reference sequence (transcript). | required |
| pos | int | Position on the transcript for which to get data. A 0-based index is assumed. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with the columns condition, sample, read, intensity, and dwell (one row per read). |
Examples:
>>> from nanocompore.api import load_config, get_pos
>>> config = load_config('analysis.yaml')
>>> get_pos(config, 'ENST00000674681.1|ENSG00000075624.17|OTTHUMG00000023268|-|ACTB-219|ACTB|2554|protein_coding|', 532)
   condition sample                                  read  intensity  dwell
0         WT    WT1  a4395b0d-dd3b-48e3-8afb-4085374b1147     3800.0    7.0
1         WT    WT1  f9733448-6e6b-47ba-9501-01eda2f5ea26     4865.0  126.0
2         WT    WT1  6f5e3b2e-f27b-47ef-b3c6-2ab4fdefd20a     3272.0   42.0
3         WT    WT2  2da07406-70c2-40a1-835a-6a7a2c914d49     6241.0   44.0
4         WT    WT2  54fc1d38-5e3d-4d77-a717-2d41b4785af6     4047.0    9.0
5         WT    WT2  3cfa90d1-7dfb-4398-a224-c75a3ab99873     3709.0   70.0
6         KD    KD1  3f46f499-8ce4-4817-8177-8ad61b784f27     4807.0   57.0
7         KD    KD1  73d62df4-f04a-4207-a4bc-7b9739b3c3b2     4336.0  132.0
8         KD    KD1  b7bc9a36-318e-4be2-a90f-74a5aa6439bf     -861.0    7.0
9         KD    KD2  ac486e16-15be-47a8-902c-2cfa2887c534     2706.0   45.0
10        KD    KD2  797fd991-570e-42d4-8292-0a7557b192d7     5450.0   24.0
11        KD    KD2  4e1ad358-ec2b-40b4-8e9a-54db28a40551      206.0   47.0
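The DataFrame shown above lends itself to a per-condition summary. The following is a minimal sketch, assuming pandas is available; the frame is hand-built to mimic the documented columns (with placeholder read IDs and a subset of the values above) so that the snippet runs without a Nanocompore dataset, and summarize_by_condition is a hypothetical helper, not part of the Nanocompore API.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by get_pos(). In practice:
#   df = get_pos(config, reference_id, pos)
df = pd.DataFrame({
    "condition": ["WT", "WT", "WT", "KD", "KD", "KD"],
    "sample": ["WT1", "WT1", "WT2", "KD1", "KD1", "KD2"],
    "read": ["r1", "r2", "r3", "r4", "r5", "r6"],  # placeholder read IDs
    "intensity": [3800.0, 4865.0, 6241.0, 4807.0, 4336.0, 2706.0],
    "dwell": [7.0, 126.0, 44.0, 57.0, 132.0, 45.0],
})


def summarize_by_condition(pos_df: pd.DataFrame) -> pd.DataFrame:
    """Return the read count and median intensity/dwell per condition.

    Hypothetical helper for illustration only.
    """
    grouped = pos_df.groupby("condition")
    summary = grouped[["intensity", "dwell"]].median()
    # Put the number of reads per condition in the first column.
    summary.insert(0, "n_reads", grouped.size())
    return summary


summary = summarize_by_condition(df)
print(summary)
```

Medians are used here because dwell times tend to be skewed; any other aggregation supported by pandas could be substituted.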
Source code in nanocompore/api.py
get_reads(config, reference_id, selected_reads=None)
Get the data for all reads mapping to the given reference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Config | Nanocompore configuration object, as returned by load_config. | required |
| reference_id | str | ID for a reference sequence (transcript). | required |
| selected_reads | Optional[list[str]] | Optional list of UUIDs of the reads for which to get data. Defaults to None, which returns all reads. | None |
Returns:
| Type | Description |
|---|---|
| tuple[Float[np.ndarray, ["reads positions variables"]], list[str], list[str], list[str]] | A tuple with (signal_data, reads, samples, conditions). |
Raises:
| Type | Description |
|---|---|
| KeyError | If the reference_id is not found in the data sources. |
Examples:
>>> from nanocompore.api import load_config, get_reads
>>> config = load_config('analysis.yaml')
>>> get_reads(config, 'ENST00000674681.1|ENSG00000075624.17|OTTHUMG00000023268|-|ACTB-219|ACTB|2554|protein_coding|')
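Since get_reads returns the signal array together with per-read labels, a common next step is to separate the reads by condition. The sketch below assumes that the reads, samples, and conditions lists are aligned with the first (reads) axis of signal_data, and that there are two variables per position; both are assumptions rather than documented guarantees. The array is synthetic so the snippet runs on its own, and split_by_condition is a hypothetical helper.

```python
import numpy as np

# Synthetic stand-in for the tuple returned by get_reads(). In practice:
#   signal_data, reads, samples, conditions = get_reads(config, reference_id)
signal_data = np.zeros((4, 10, 2))  # 4 reads x 10 positions x 2 variables (assumed)
reads = ["read-a", "read-b", "read-c", "read-d"]  # placeholder read IDs
samples = ["WT1", "WT2", "KD1", "KD2"]
conditions = ["WT", "WT", "KD", "KD"]


def split_by_condition(signal_data, conditions):
    """Group the rows (reads) of signal_data by each read's condition label.

    Hypothetical helper; assumes conditions[i] describes signal_data[i].
    """
    labels = np.asarray(conditions)
    return {cond: signal_data[labels == cond] for cond in sorted(set(conditions))}


per_condition = split_by_condition(signal_data, conditions)
for cond, data in per_condition.items():
    print(cond, data.shape)
```

The same boolean-mask pattern works with the samples list if a per-sample split is needed.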
Source code in nanocompore/api.py
get_references(config, has_data=True)
Returns a list of all references found in the list of samples defined in the configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Config | Nanocompore configuration object, as returned by load_config. | required |
| has_data | bool | If True (default), return only references for which there are mapped reads. | True |
Returns:
| Type | Description |
|---|---|
| list | List of transcript reference ID strings. |
Examples:
>>> from nanocompore.api import load_config, get_references
>>> config = load_config('analysis.yaml')
>>> get_references(config)
['ENST00000674681.1|ENSG00000075624.17|OTTHUMG00000023268|-|ACTB-219|ACTB|2554|protein_coding|', 'ENST00000642480.2|ENSG00000075624.17|OTTHUMG00000023268|OTTHUMT00000495153.1|ACTB-213|ACTB|2021|protein_coding|']
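The reference IDs in the output above are pipe-delimited, so they can be filtered by gene before being passed to get_pos or get_reads. The sketch below assumes the IDs follow the GENCODE transcript FASTA header layout; the field names and both helper functions are illustrative and not part of the Nanocompore API.

```python
# Field names follow the GENCODE transcript FASTA header layout. This is an
# assumption about how the reference IDs were generated.
GENCODE_FIELDS = (
    "transcript_id", "gene_id", "havana_gene", "havana_transcript",
    "transcript_name", "gene_name", "length", "biotype",
)


def parse_reference_id(reference_id: str) -> dict:
    """Split a pipe-delimited reference ID into named fields (hypothetical helper)."""
    values = reference_id.rstrip("|").split("|")
    return dict(zip(GENCODE_FIELDS, values))


def references_for_gene(references, gene_name):
    """Return the reference IDs whose gene_name field matches gene_name."""
    return [
        ref for ref in references
        if parse_reference_id(ref).get("gene_name") == gene_name
    ]


# In practice: references = get_references(config)
references = [
    "ENST00000674681.1|ENSG00000075624.17|OTTHUMG00000023268|-|ACTB-219|ACTB|2554|protein_coding|",
    "ENST00000642480.2|ENSG00000075624.17|OTTHUMG00000023268|OTTHUMT00000495153.1|ACTB-213|ACTB|2021|protein_coding|",
]

actb_refs = references_for_gene(references, "ACTB")
print(len(actb_refs))
```

The full, unmodified ID string should still be used when calling the API, since that is the form get_references returns.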
Source code in nanocompore/api.py
load_config(config_path)
Load a configuration file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the Nanocompore configuration file. | required |
Returns:
| Type | Description |
|---|---|
| Config | A configuration object. |
Source code in nanocompore/api.py