# API Reference ## `HLA_vae` (in `themap.model`) ### `train_model(train_loader, lr, epochs, device)` Train the HLA VAE model using MSE loss and KL divergence. #### Parameters | Name | Type | Description | |---------------|----------------------------------|-----------------------------------------------------| | `train_loader`| `torch.utils.data.DataLoader` | Batches of input HLA tensors. | | `lr` | `float` | Learning rate. | | `epochs` | `int` | Number of training epochs. | | `device` | `torch.device` | Device to run training (e.g., `cuda:0` or `cpu`). | #### Returns None. --- ### `embed_hla(hla_loader, device)` Generate latent embeddings for HLA sequences using the trained encoder. #### Parameters | Name | Type | Description | |---------------|----------------------------------|----------------------------------------------------| | `hla_loader` | `torch.utils.data.DataLoader` | Batches of input HLA sequences. | | `device` | `torch.device` | Computation device. | #### Returns `Tuple[np.ndarray, np.ndarray]` - `z_hla_mean`: latent mean vectors - `z_conv`: intermediate convolutional features --- ## `PEP_vae` (in `themap.model`) ### `train_model(train_loader, lr, epochs, device)` Train peptide VAE using reconstruction + KL + alignment loss. #### Parameters | Name | Type | Description | |---------------|----------------------------------|---------------------------------------------------| | `train_loader`| `torch.utils.data.DataLoader` | Batches of `(peptide, z_hla)` inputs. | | `lr` | `float` | Learning rate. | | `epochs` | `int` | Number of training epochs. | | `device` | `torch.device` | CUDA or CPU device. | #### Returns None. --- ### `embed_pep(data_loader, device)` Embed peptides using the trained VAE encoder. #### Parameters | Name | Type | Description | |---------------|----------------------------------|---------------------------------------------------| | `data_loader` | `torch.utils.data.DataLoader` | Peptide batches. | | `device` | `torch.device` | Device for inference. | #### Returns `Tuple[np.ndarray, np.ndarray]` - `z_pep_mean`: latent mean embeddings - `z_conv`: intermediate conv features --- ## `THE` (in `themap.model`) ### `train_model(...)` Train the full THE model using labeled TCR-target binding pairs. #### Parameters | Name | Type | Description | |----------------|-------------------|-------------| | `df` | `pd.DataFrame` | Training set with TCR, target, and label. | | `num_epochs` | `int` | Number of training epochs. | | `device` | `torch.device` | Training device. | | `batch_size` | `int` | Batch size (default: 256). | | `lr` | `float` | Learning rate (default: 1e-4). | | `chain` | `str` | One of `'a'`, `'b'`, or `'ab'`. | | `targets` | `str` | `'peptide'` or `'hla'`. | | `hla_dict` | `dict` or `None` | Required if `targets='hla'`. | | `df_add` | `pd.DataFrame` or `None` | Additional data to append. | | `neg_sampling` | `bool` | Enable negative sampling (default: `True`). | | `neg_resample` | `bool` | Resample negatives each epoch (default: `True`). | | `max_alpha` | `int` | Max alpha CDR3 length. | | `max_beta` | `int` | Max beta CDR3 length. | | `max_pep` | `int` | Max peptide length. | #### Returns None. --- ### `test_model(...)` Evaluate trained THE model on a test set or test loader. #### Parameters | Name | Type | Description | |----------------|-------------------|-------------| | `df_test` | `pd.DataFrame` or `None` | Test set DataFrame. | | `batch_size` | `int` | Batch size. | | `test_loader` | `DataLoader` or `None` | If provided, skip DataFrame. | | `hla_dict` | `dict` or `None` | Required for HLA targets. | | `device` | `str` or `torch.device` | CPU or CUDA. | | `chain` | `str` | `'a'`, `'b'`, or `'ab'`. | | `targets` | `str` | `'peptide'` or `'hla'`. | | `max_alpha` | `int` | Max alpha CDR3 length. | | `max_beta` | `int` | Max beta CDR3 length. | | `max_pep` | `int` | Max peptide length. | #### Returns `Tuple[Tensor, Tensor, Tensor]` - `scores`: predicted binding scores - `alpha_attn`: attention for alpha - `beta_attn`: attention for beta