One likely approach in vLLM V0 is to add a flag to SamplingParams and branch on it inside the Sampler class:
- Add is_sigmoid to SamplingParams (sampling_params.py):
    class SamplingParams(...):
        ...
        is_sigmoid: bool = False
- Modify Sampler.forward in sampler.py:
    import torch
    from torch import nn

    class Sampler(nn.Module):
        ...
        def forward(
            self,
            logits: torch.Tensor,
            sampling_metadata,
            # ... other args
        ):
            # Example: batch over sequence groups
            outputs = []
            for i, seq_group in enumerate(sampling_metadata.seq_groups):
                sampling_params = seq_group.sampling_params
                # Assumes exactly one logits row per sequence group. If a group
                # owns several rows, select that group's rows instead of logits[i].
                group_logits = logits[i]
                if getattr(sampling_params, "is_sigmoid", False):
                    # Independent per-token probabilities; they do not sum to 1.
                    probs = torch.sigmoid(group_logits)
                    outputs.append({"sigmoid_probs": probs})
                else:
                    # Normalized log-probabilities over the vocabulary.
                    logprobs = torch.log_softmax(group_logits, dim=-1)
                    outputs.append({"logprobs": logprobs})
            return outputs
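This shows only the key logic: check is_sigmoid for each sequence group and apply sigmoid or log_softmax accordingly. You will need to adapt it to your actual output structure, because the stock Sampler.forward returns a SamplerOutput rather than a list of dicts, so the code that consumes the sampler result has to handle the new values as well.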
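To make the difference between the two branches concrete, here is a minimal, torch-free sketch of the same per-group dispatch. It is not vLLM code: `StubSamplingParams`, `StubSeqGroup`, and `score_groups` are hypothetical stand-ins invented for illustration, and plain Python lists take the place of logits tensors. It assumes one logits row per sequence group.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StubSamplingParams:
    # Hypothetical stand-in for SamplingParams with the proposed flag.
    is_sigmoid: bool = False


@dataclass
class StubSeqGroup:
    # Hypothetical stand-in for one entry of sampling_metadata.seq_groups.
    sampling_params: StubSamplingParams


def sigmoid(row: List[float]) -> List[float]:
    # Element-wise logistic function, written to avoid overflow for large |x|.
    out = []
    for x in row:
        if x >= 0:
            out.append(1.0 / (1.0 + math.exp(-x)))
        else:
            e = math.exp(x)
            out.append(e / (1.0 + e))
    return out


def log_softmax(row: List[float]) -> List[float]:
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def score_groups(
    logits: List[List[float]], seq_groups: List[StubSeqGroup]
) -> List[Dict[str, List[float]]]:
    # Assumes one logits row per sequence group, as in the snippet above.
    outputs = []
    for row, group in zip(logits, seq_groups):
        if getattr(group.sampling_params, "is_sigmoid", False):
            outputs.append({"sigmoid_probs": sigmoid(row)})
        else:
            outputs.append({"logprobs": log_softmax(row)})
    return outputs


# Same logits for both groups; only the flag differs.
logits = [[2.0, 0.0, -2.0], [2.0, 0.0, -2.0]]
groups = [
    StubSeqGroup(StubSamplingParams(is_sigmoid=True)),
    StubSeqGroup(StubSamplingParams()),
]
results = score_groups(logits, groups)
```

With identical logits, the sigmoid branch scores each token independently, so its values need not sum to 1, while exponentiating the log_softmax branch gives a distribution that does. That is worth keeping in mind if downstream code expects normalized probabilities.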
Would you like more detail on integrating this with the rest of the vLLM pipeline?