I can connect to a TPU but have never seen a working notebook for a model like Gemma or Llama.
Do these instructions not work? Other AI accelerators — vLLM
I don’t actually have access to v5e through Colab, so I can’t test this, but it should be very similar to running on a TPU VM.
you can also try something like this: tpu-recipes/inference/trillium/vLLM at main · AI-Hypercomputer/tpu-recipes · GitHub
just keep in mind it’s using v6e, not v5e (v6e has 32 GB of HBM per chip, whereas v5e has 16 GB)
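That HBM difference is often what decides whether a given model fits on one chip at all. Here's a rough back-of-the-envelope sketch: the 16 GB / 32 GB figures come from the post above, but the model sizes and the 2-bytes-per-parameter (bf16) assumption are illustrative, and real serving also needs headroom for the KV cache and activations.

```python
# Rough check: do raw bf16 model weights fit in a single TPU chip's HBM?
# HBM sizes are from the thread (v5e: 16 GB, v6e: 32 GB); everything else
# here is a simplifying assumption, not an official sizing guide.

HBM_GB = {"v5e": 16, "v6e": 32}

def weights_fit(params_billion: float, chip: str, bytes_per_param: int = 2) -> bool:
    """True if the weights alone fit in one chip's HBM.

    1e9 params * 2 bytes (bf16) ~= 2 GB per billion parameters.
    Ignores KV cache, activations, and runtime overhead, so this is
    an optimistic lower bound on what you actually need.
    """
    weights_gb = params_billion * bytes_per_param
    return weights_gb < HBM_GB[chip]

for model, size_b in [("Gemma-2B", 2), ("Llama-8B", 8), ("Llama-70B", 70)]:
    for chip in ("v5e", "v6e"):
        print(f"{model} on {chip}: fits = {weights_fit(size_b, chip)}")
```

By this estimate an 8B model in bf16 (~16 GB of weights) already overflows a single v5e chip but fits on v6e, which is why swapping chip generations in a recipe isn't always a drop-in change; larger models need sharding across chips either way.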
Please file a bug if any of these don’t work - thanks!
Oh wow, I’m not sure if it would be easy to do via Colab, but I ran those scripts on a cloud VM and everything worked like butter!
Any idea whether it’s possible to test v7? I’m looking to make a video for the YouTube.com/@TrelisResearch channel. Cheers!