L2 Reconstruction loss
L2 Reconstruction loss.
L2ReconstructionLoss
Bases: Metric
L2 Reconstruction loss (MSE).
L2 reconstruction loss is calculated as the mean squared error between each input vector and its corresponding decoded vector. The original paper found that models trained with some loss functions, such as cross-entropy loss, generally prefer to represent features polysemantically, whereas models trained with L2 may achieve the same loss for both polysemantic and monosemantic representations of true features.
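Concretely, as a sketch of the standard MSE definition this metric appears to follow (consistent with the worked example below; how the batch and component dimensions are averaged is an assumption), the loss for a single input vector $x \in \mathbb{R}^n$ with reconstruction $\hat{x}$ is

$$L(x, \hat{x}) = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2,$$

and the reported metric is the mean of this quantity over all activation vectors.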
Example
>>> import torch
>>> loss = L2ReconstructionLoss(num_components=1)
>>> source_activations = torch.tensor([
...     [  # Batch 1
...         [4., 2.]  # Component 1
...     ],
...     [  # Batch 2
...         [2., 0.]  # Component 1
...     ]
... ])
>>> decoded_activations = torch.tensor([
...     [  # Batch 1
...         [2., 0.]  # Component 1 (MSE of 4)
...     ],
...     [  # Batch 2
...         [0., 0.]  # Component 1 (MSE of 2)
...     ]
... ])
>>> loss.forward(
...     decoded_activations=decoded_activations, source_activations=source_activations
... )
tensor(3.)
Source code in sparse_autoencoder/metrics/loss/l2_reconstruction_loss.py
keep_batch_dim: bool
property writable
Whether to keep the batch dimension in the loss output.
__init__(num_components=1, *, keep_batch_dim=False)
Initialise the L2 reconstruction loss.
Source code in sparse_autoencoder/metrics/loss/l2_reconstruction_loss.py
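As a rough usage sketch (not taken from the library's own docs): with keep_batch_dim=True the metric should report one MSE per activation vector rather than a single mean, although the exact output shape is an assumption.

import torch
from sparse_autoencoder.metrics.loss.l2_reconstruction_loss import L2ReconstructionLoss

# Keep the batch dimension so compute() returns one MSE per activation vector
# instead of a single mean over the batch (output shape is an assumption).
loss = L2ReconstructionLoss(num_components=1, keep_batch_dim=True)

source_activations = torch.tensor([[[4.0, 2.0]], [[2.0, 0.0]]])   # (batch, component, feature)
decoded_activations = torch.tensor([[[2.0, 0.0]], [[0.0, 0.0]]])

loss.update(decoded_activations=decoded_activations, source_activations=source_activations)
per_item_mse = loss.compute()  # per-item MSEs (4. and 2. for the vectors above)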
calculate_mse(decoded_activations, source_activations)
staticmethod
Calculate the MSE.
Source code in sparse_autoencoder/metrics/loss/l2_reconstruction_loss.py
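The body of calculate_mse is not reproduced here; a minimal re-implementation consistent with the worked example above (MSE taken over the feature dimension) could look like the following, where the function name and exact reduction are assumptions:

import torch
from torch import Tensor


def calculate_mse_sketch(decoded_activations: Tensor, source_activations: Tensor) -> Tensor:
    """Hypothetical re-implementation: MSE over the feature (last) dimension."""
    return (decoded_activations - source_activations).pow(2).mean(dim=-1)

Applied to the example tensors above, this gives 4.0 and 2.0 for the two batch items, whose mean is the 3.0 returned by forward.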
compute()
Compute the metric.
Source code in sparse_autoencoder/metrics/loss/l2_reconstruction_loss.py
update(decoded_activations, source_activations, **kwargs)
Update the metric state.
If we're keeping the batch dimension, we take the MSE of the activations (over the feature dimension) and append the resulting tensor to a list; during compute we simply concatenate this list and return it. This is useful, for example, for getting the loss per batch item when resampling neurons (see the neuron resampler for details).
By contrast, if we're averaging over the batch dimension, we sum the per-item MSE values over the batch dimension during update (on each process) and then divide by the total number of activation vectors during compute to get the mean.
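A minimal sketch of that state handling, using hypothetical attribute names rather than the real implementation in the linked source file:

import torch


class _L2StateSketch:
    """Illustrates the two update/compute modes described above (not the real class)."""

    def __init__(self, keep_batch_dim: bool = False) -> None:
        self.keep_batch_dim = keep_batch_dim
        self.mse_per_item: list[torch.Tensor] = []  # used when keeping the batch dimension
        self.sum_mse = torch.tensor(0.0)            # running sum when averaging
        self.num_activation_vectors = torch.tensor(0)

    def update(self, decoded_activations: torch.Tensor, source_activations: torch.Tensor) -> None:
        # MSE over the feature dimension: one value per activation vector.
        mse = (decoded_activations - source_activations).pow(2).mean(dim=-1)
        if self.keep_batch_dim:
            self.mse_per_item.append(mse)
        else:
            self.sum_mse = self.sum_mse + mse.sum()
            self.num_activation_vectors = self.num_activation_vectors + mse.numel()

    def compute(self) -> torch.Tensor:
        if self.keep_batch_dim:
            return torch.cat(self.mse_per_item)  # one loss value per batch item
        return self.sum_mse / self.num_activation_vectors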
Parameters:
Name | Type | Description | Default |
---|---|---|---|
decoded_activations | Float[Tensor, names(BATCH, COMPONENT_OPTIONAL, INPUT_OUTPUT_FEATURE)] | The decoded activations from the autoencoder. | required |
source_activations | Float[Tensor, names(BATCH, COMPONENT_OPTIONAL, INPUT_OUTPUT_FEATURE)] | The source activations from the autoencoder. | required |
**kwargs | Any | Ignored keyword arguments (to allow use with other metrics in a collection). | {} |
Source code in sparse_autoencoder/metrics/loss/l2_reconstruction_loss.py