I'm trying to use a single-task GP model to find optimal values of some independent inputs based on their ability to maximize some output. Using domain knowledge, I'm able to losslessly compute some additional features that I know will yield stronger model signal than the raw input values alone. There may be N features constructed from M inputs such that N > M. I've been trying to add this featurization layer at the top level of a SingleTaskGP model, but none of the approaches I've tried optimize over the raw top-level inputs rather than over the GP model's feature space.

I've tried to reproduce a simple example below. Here I define two inputs X1 and X2, then add a third "feature" defined as X3 = X1 * X2. When I add this as an input transform, the optimizer expects 3 bounds, presumably because it wants to optimize over the feature space rather than the raw inputs. I've also tried subclassing SingleTaskGP and augmenting its forward method, but to no avail.

One thing I should add: the featurization is not readily invertible, so I can't just convert candidates from the model's feature space back to the original input dimensions. Any suggestions would be greatly appreciated. Thank you!
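(The runnable example referenced above didn't survive in this scrape. The sketch below reconstructs the kind of featurization described, two raw inputs X1 and X2 with an appended feature X3 = X1 * X2, using NumPy arrays to stand in for the torch tensors; the function name and shapes are illustrative, not from the original post.)

```python
import numpy as np

def featurize(X):
    """Append X3 = X1 * X2 as an extra feature on the last axis.

    Works for a plain (n, 2) design matrix, but note that the column
    indexing X[:, 0] assumes exactly two dimensions -- which is exactly
    where trouble starts once batch dimensions appear.
    """
    x3 = X[:, 0] * X[:, 1]
    return np.concatenate([X, x3[:, None]], axis=-1)

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(featurize(X))
# (2, 3) array whose last column is [2., 12.]
```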
Replies: 1 comment 1 reply
You almost got it right. The `@t_batch_mode_transform` decorator that is applied in `ProbabilityOfImprovement` here: https://github.com/pytorch/botorch/blob/main/botorch/acquisition/analytic.py#L162 and in most other acquisition functions means that the tensor seen by the `forward` methods of the transforms is at least three-dimensional / has at least one batch dimension. That means that your indexing is off. If you change your method as follows, this works as you intended:

Pro tip: Since BoTorch and GPyTorch make heavy use of batching and in principle work with arbitrary batch dimensions, it is usually best to index "from the back".
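(The corrected snippet from this reply didn't survive in the scrape either. The sketch below illustrates the kind of change described, replacing front-indexing like `X[:, 0]` with indexing from the back (`X[..., 0]`) so the featurization also works on batched inputs. NumPy stands in for torch tensors here, since the `...` indexing semantics are the same; shapes are illustrative.)

```python
import numpy as np

def featurize(X):
    """Append X3 = X1 * X2, indexing from the back so any number of
    leading batch dimensions is supported."""
    x3 = X[..., 0] * X[..., 1]
    return np.concatenate([X, x3[..., None]], axis=-1)

# Unbatched (n, d) input: last column is X1 * X2.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
print(featurize(X).shape)   # (2, 3)

# Batched (b, q, d) input, like the tensors acquisition functions see
# after the t-batch transform: the same code works unchanged.
Xb = np.broadcast_to(X, (5, 2, 2))
print(featurize(Xb).shape)  # (5, 2, 3)

# The front-indexed version X[:, 0] * X[:, 1] would instead slice the
# batch dimension of Xb and compute the wrong thing.
```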