Conversation

@Sam-Deciga (Contributor)

REQUIRED: Add a summary of your PR here, typically including why the change is needed and what was changed. Include any design alternatives for discussion purposes.



NVIDIA Llama Nemotron Super 49B v1.5 model launch on Jan 28
b/454127910



REQUIRED: Fill out the below checklists or remove if irrelevant

  1. If you are opening a PR for Official Notebooks under the notebooks/official folder, follow this mandatory checklist:
  • Use the notebook template as a starting point.
  • Follow the style and grammar rules outlined in the above notebook template.
  • Verify the notebook runs successfully in Colab since the automated tests cannot guarantee this even when it passes.
  • Passes all the required automated checks. You can locally test for formatting and linting with these instructions.
  • You have consulted with a tech writer to see if tech writer review is necessary. If so, the notebook has been reviewed by a tech writer, and they have approved it.
  • This notebook has been added to the CODEOWNERS file under the Official Notebooks section, pointing to the author or the author's team.
  • The Jupyter notebook cleans up any artifacts it has created (datasets, ML models, endpoints, etc.) so it does not consume unnecessary resources.

  2. If you are opening a PR for Community Notebooks under the notebooks/community folder:
  • This notebook has been added to the CODEOWNERS file under the Community Notebooks section, pointing to the author or the author's team.
  • Passes all the required formatting and linting checks. You can locally test with these instructions.

  3. If you are opening a PR for Community Content under the community-content folder:
  • Make sure your main Content Directory Name is descriptive and informative, and that it includes some of the key products and attributes of your content, so that it is distinguishable from other content.
  • The main content directory has been added to the CODEOWNERS file under the Community Content section, pointing to the author or the author's team.
  • Passes all the required formatting and linting checks. You can locally test with these instructions.

@Sam-Deciga Sam-Deciga requested a review from a team as a code owner January 27, 2026 23:33
@gemini-code-assist (Contributor)

Summary of Changes

Hello @Sam-Deciga, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new official Jupyter notebook designed to guide users through the process of deploying and interacting with NVIDIA's Llama Nemotron models on Google Cloud's Vertex AI platform. It provides comprehensive instructions and code examples for model registration, endpoint creation, model deployment, and real-time prediction, utilizing both direct API calls and the Vertex AI Python SDK. The notebook focuses on the Llama-3.3-Nemotron-Super-49B-v1.5 model, showcasing its advanced reasoning and efficiency features.

Highlights

  • New Model Integration: Introduced a new official Jupyter notebook (llama_nemotron_intro.ipynb) to demonstrate the deployment and usage of NVIDIA Llama Nemotron models on Google Cloud Vertex AI.
  • Comprehensive Deployment Guide: The notebook provides step-by-step instructions for setting up the Vertex AI environment, uploading the NVIDIA NIM model to the Model Registry, creating and deploying to a Vertex AI Endpoint, and performing real-time predictions.
  • Dual API and SDK Support: Examples are provided for both direct Vertex AI API calls (using curl) and the Vertex AI Python SDK, offering flexibility for users to interact with the deployed models.
  • Prediction Capabilities Showcase: The notebook demonstrates both unary and streaming prediction requests, including the ability to toggle 'reasoning on/off' for the Llama Nemotron model.
  • Model Specifics: Specifically highlights the Llama-3.3-Nemotron-Super-49B-v1.5 model, detailing its capabilities for reasoning, human chat, agentic tasks, 128K token context length, and efficiency gains from a novel Neural Architecture Search (NAS) approach.
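The unary and streaming requests mentioned in the highlights can be sketched as plain request bodies. This is a minimal sketch, not code from the notebook: the model name is a placeholder, and the reasoning toggle strings (`/think`, `/no_think`) are an assumption about how the system prompt controls reasoning and may differ for this model version.

```python
# Sketch: assembling unary vs. streaming prediction payloads for a deployed
# Llama Nemotron endpoint. The toggle strings below are assumptions, and the
# model name is a placeholder.

def build_chat_payload(prompt: str, reasoning: bool, stream: bool = False) -> dict:
    """Assemble an OpenAI-style chat-completions request body."""
    # Reasoning on/off is expressed via the system message (assumed syntax).
    system_msg = "/think" if reasoning else "/no_think"
    payload = {
        "model": "llama-3.3-nemotron-super-49b-v1.5",  # placeholder name
        "messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 512,
    }
    if stream:
        payload["stream"] = True  # request server-sent streaming chunks
    return payload

# Unary request with reasoning off; streaming request with reasoning on.
unary = build_chat_payload("What is 2 + 2?", reasoning=False)
streaming = build_chat_payload("Explain NAS briefly.", reasoning=True, stream=True)
```

Either body could then be sent to the endpoint with curl or the Vertex AI Python SDK, matching the dual API/SDK approach the notebook takes.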
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new Jupyter notebook for deploying NVIDIA Llama Nemotron models on Google Cloud Vertex AI. The notebook demonstrates both API and Python SDK methods for model upload, endpoint creation, deployment, and prediction. Overall, the notebook provides a good starting point for users. However, there are several areas where the robustness and clarity of the code can be improved, particularly concerning resource ID extraction and consistency in naming conventions.
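The review's point about robust resource ID extraction can be illustrated with a small helper. This is a sketch, not code from the notebook; the function name is hypothetical, and the resource-name format follows the standard Vertex AI pattern (projects/…/locations/…/models/ID, optionally with an @version suffix).

```python
import re

# Hypothetical helper: pull the short model ID out of a fully qualified
# Vertex AI resource name, instead of relying on fixed string offsets.
def extract_model_id(resource_name: str) -> str:
    """Return the trailing ID from 'projects/.../models/<id>[@version]'."""
    match = re.search(r"/models/([^/@]+)", resource_name)
    if match is None:
        raise ValueError(f"not a model resource name: {resource_name!r}")
    return match.group(1)

print(extract_model_id("projects/my-proj/locations/us-central1/models/123@1"))
# → 123
```

Anchoring on the `/models/` segment keeps the extraction correct whether or not a version suffix is present.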

@Sam-Deciga (Contributor, Author)

Checked with the partner, and they confirmed they are OK with it.

@r2xd889zyn-dot

Thanks for the clear PR template and checklists.

The required summary (“why the change is needed,” “what was changed,” and design alternatives), along with CODEOWNERS attribution and environment verification (e.g., Colab runs), creates a strong authorship and accountability signal that’s easy for reviewers to reason about.

The explicit cleanup requirements for generated artifacts are also helpful for maintaining long-term repo and resource hygiene.

Overall, this structure makes review intent, ownership, and execution boundaries much clearer up front. Appreciate the rigor.

@r2xd889zyn-dot

Since MODEL_ID already contains the fully qualified resource path, re-concatenating it with projects/{PROJECT_ID}/locations/{LOCATION}/models/ would indeed produce a malformed identifier.

Using the existing MODEL_ID value directly for the "model" field keeps the payload aligned with the expected resource name format and avoids duplication.

Appreciate the clear explanation in the review comment — it makes the correction unambiguous.
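The fix discussed above can be sketched as follows; PROJECT_ID, LOCATION, and the numeric model ID are placeholders, not values taken from the notebook.

```python
PROJECT_ID = "my-project"   # placeholder
LOCATION = "us-central1"    # placeholder

# MODEL_ID as returned by the Model Registry: already a fully qualified
# resource name.
MODEL_ID = f"projects/{PROJECT_ID}/locations/{LOCATION}/models/1234567890"

# Wrong: re-concatenating the prefix duplicates the resource path.
malformed = f"projects/{PROJECT_ID}/locations/{LOCATION}/models/{MODEL_ID}"
assert malformed.count("projects/") == 2  # the duplication the review flagged

# Right: use the existing MODEL_ID value directly for the "model" field.
deploy_payload = {"deployedModel": {"model": MODEL_ID}}
assert deploy_payload["deployedModel"]["model"].count("projects/") == 1
```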

@gericdong gericdong merged commit 008eb40 into GoogleCloudPlatform:main Jan 29, 2026
4 of 5 checks passed