Is there any working Colab notebook using vLLM with TPU v5e?

I can connect to a TPU, but I've never seen a working notebook for a model like Gemma or Llama.

Do these instructions not work? Other AI accelerators — vLLM

I don't actually have access to v5e through Colab, so I can't test this, but it should be very similar to running on a TPU VM.
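For anyone following along, here's a minimal sketch of the offline-inference flow those docs describe, assuming a TPU VM with vLLM installed with TPU support (the model choice and sampling settings here are just illustrative, not from the docs):

```python
# Minimal sketch, assuming a TPU VM with vLLM installed per the docs above.
# The model and sampling settings are illustrative assumptions.
from vllm import LLM, SamplingParams

# A small model keeps things comfortable on v5e's 16 GB HBM per chip.
llm = LLM(model="google/gemma-2b", max_model_len=1024)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```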

You can also try something like this: tpu-recipes/inference/trillium/vLLM at main · AI-Hypercomputer/tpu-recipes · GitHub

Just keep in mind it's using v6e, not v5e (HBM per chip on v6e is 32 GB, whereas v5e is 16 GB).
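That HBM figure is the main thing deciding what fits per chip. A rough back-of-envelope (my own, not from the recipe): bf16 weights take about 2 bytes per parameter, so a 7B model is ~14 GB of weights before the KV cache gets any room:

```python
# Back-of-envelope HBM check (assumptions: bf16 weights at 2 bytes/param;
# KV cache and activations still need headroom on top of the weights).
MODELS = {"Gemma 2B": 2.5e9, "Llama 2 7B": 7.0e9}  # approx. parameter counts
CHIPS = {"v5e": 16, "v6e": 32}  # HBM per chip, GB

for model, n_params in MODELS.items():
    weights_gb = n_params * 2 / 1e9  # 2 bytes per bf16 parameter
    for chip, hbm_gb in CHIPS.items():
        verdict = "fits" if weights_gb < hbm_gb else "needs sharding"
        print(f"{model}: ~{weights_gb:.0f} GB weights on {chip} ({hbm_gb} GB HBM) -> {verdict}")
```

On a single v5e chip, a 7B model technically fits but leaves only ~2 GB for the KV cache, so expect short context lengths or tensor parallelism across chips.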

Please file a bug if any of these don't work - thanks!


Oh wow, I'm not sure it would be easy to do via Colab, but I ran those scripts on Google Cloud and everything worked like butter!

Any idea whether it's possible to test v7? I'm looking to make a video for the YouTube.com/@TrelisResearch channel. Cheers!