I am running vLLM on a remote host and attempting to test the v1/embeddings endpoint of an OpenAI-style API served from that host. I have confirmed that the server is running and that models are reachable from the client. My code for a simple embedding call is below:
import numpy as np
import openai  # AsyncOpenAI client and its error types
from typing import List

async def _embed_single(self) -> List[float]:
    """
    Generate an embedding for a single text (with caching).

    Returns:
        Normalized embedding vector
    """
    try:
        responses = await self.openai_client.embeddings.create(
            input=[
                "Hello my name is",
                "The best thing about vLLM is that it supports many different models"
            ],
            model=self.model_name,
            dimensions=self.dimension
        )
        # Collect and L2-normalize the embedding for each input
        embeddings = []
        for data in responses.data:
            embedding = np.array(data.embedding, dtype=np.float32)
            norm = np.linalg.norm(embedding)
            if norm > 0:
                embedding = embedding / norm
            embeddings.append(embedding.tolist())
        return embeddings[0]
    except openai.APIError as e:
        # The OpenAI client raises openai.APIError, not requests.RequestException
        logger.error(f"Embedding request failed: {e}")
        return [0.0] * self.dimension
    except Exception as e:
        logger.error(f"Embedding processing failed: {e}")
        return [0.0] * self.dimension
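As a sanity check, the L2-normalization step in the loop above can be exercised on its own; a minimal sketch with a hypothetical raw vector standing in for `data.embedding`:

```python
import numpy as np

# Hypothetical raw embedding returned by the API (stand-in values)
raw = [3.0, 4.0]

vec = np.array(raw, dtype=np.float32)
norm = np.linalg.norm(vec)  # Euclidean length of the vector
if norm > 0:
    vec = vec / norm        # scale to unit length

print(vec.tolist())                 # roughly [0.6, 0.8]
print(float(np.linalg.norm(vec)))   # roughly 1.0 after normalization
```

After this step, cosine similarity between two vectors reduces to a plain dot product, which is why the normalization is done up front.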
Problems
Every time I step through this code, I hit the same error before I can iterate through responses.data: No Embedding data received
Granted, this is loosely based on the repository's documented OpenAI embeddings code example.
There has to be something simple I am doing wrong here. Surely this is something incredibly obvious; I am a novice to this whole setup, so any kind guidance would be highly appreciated.
After messing around a bit, I think the model might not support embeddings. Is there a list of models that support embeddings somewhere in the vLLM documentation?
model: Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8
Embedding processing failed: The model does not support Embeddings API
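That error is consistent with the model being the problem: Qwen3-Coder-30B-A3B-Instruct-FP8 is a generative (chat/completions) model, not a pooling/embedding model, so vLLM rejects it on /v1/embeddings. Serving a dedicated embedding model should make the endpoint work; a sketch, where the `--task embed` flag and the `intfloat/e5-mistral-7b-instruct` model choice are assumptions to check against your vLLM version's docs:

```shell
# Launch vLLM with an embedding (pooling) model instead of a generative one.
# NOTE: flag name and model choice are assumptions -- verify in your vLLM docs.
vllm serve intfloat/e5-mistral-7b-instruct --task embed

# Then the same client call should return embedding data:
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "intfloat/e5-mistral-7b-instruct", "input": ["Hello my name is"]}'
```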
