Generated GPT_OSS model files through porter script. #2384

sachinprasadhs merged 33 commits into keras-team:master from
Conversation
/gemini review
Code Review
This pull request introduces a new GPT-OSS model, seemingly generated by a script. While the overall structure aligns with the repository's conventions, there are several critical issues that need to be addressed. The weight conversion scripts in both utils/transformers and tools/checkpoint_conversion are non-functional due to incorrect layer and weight names, likely a result of being copied from another model's implementation without proper adaptation. Additionally, there's a critical code duplication issue with the GptOssCausalLMPreprocessor class being defined in two separate files. Several style guide violations were also found, including the use of type hints in function signatures, incorrect docstring formatting, and missing docstrings or examples in several new components. These issues must be resolved to ensure the model is correct, maintainable, and usable.
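To illustrate the type-hint and docstring style violations the review refers to, here is a minimal sketch of the kind of fix involved. The function name, signature, and math below are hypothetical examples, not code from this PR:

```python
# Hypothetical example: the style guide disallows type hints in
# signatures and expects Google-style docstrings with an `Args:` section.

# Before (flagged by review): type hints in the signature.
# def compute_rope_scaling(dim: int, base: float = 10000.0) -> float:

def compute_rope_scaling(dim, base=10000.0):
    """Compute a rotary embedding inverse frequency.

    Args:
        dim: int. The head dimension.
        base: float. The RoPE frequency base. Defaults to `10000.0`.

    Returns:
        The inverse frequency for the last rotary dimension pair.
    """
    return base ** (-(dim - 2) / dim)
```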
/gemini review
Code Review
This pull request introduces the GPT_OSS model, including its backbone, causal language model task, preprocessor, and tokenizer. The overall structure aligns well with the project's conventions. However, there are several areas that need attention.
Most critically, the Hugging Face weight conversion script in convert_gpt_oss.py is incorrect and incompatible with the implemented model architecture, particularly for the Mixture-of-Experts (MoE) layers. This will prevent loading pre-trained weights correctly. I've provided a corrected implementation for the convert_weights function.
Additionally, there are multiple instances of incorrect docstring formatting across the new files, which violates the project's style guide. I've pointed these out with suggestions for correction. A minor but important clarification is also needed in a test file comment to improve maintainability.
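As an illustration of the kind of MoE weight-mapping fix such a review points at, here is a minimal sketch of splitting fused expert tensors into per-expert kernels. The fused layout (contiguous gate and up halves) and every name here are assumptions for illustration only, not the actual gpt-oss checkpoint format or the corrected `convert_weights` from the review:

```python
import numpy as np


def convert_moe_expert_weights(hf_gate_up, hf_down, num_experts):
    """Split fused expert tensors into per-expert kernel lists.

    Hypothetical shapes, assuming the checkpoint stores experts fused:
    `hf_gate_up` of shape `(num_experts, hidden_dim, 2 * intermediate_dim)`
    with the gate projection in the first half of the last axis, and
    `hf_down` of shape `(num_experts, intermediate_dim, hidden_dim)`.
    """
    gate_kernels, up_kernels, down_kernels = [], [], []
    for e in range(num_experts):
        fused = hf_gate_up[e]  # (hidden_dim, 2 * intermediate_dim)
        half = fused.shape[-1] // 2
        gate_kernels.append(fused[:, :half])
        up_kernels.append(fused[:, half:])
        down_kernels.append(hf_down[e])
    return gate_kernels, up_kernels, down_kernels
```

The real converter would then assign each per-expert kernel to the matching Keras layer variable; getting those target names right is exactly the part the review flags as error-prone.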
/gemini review
Code Review
This pull request introduces a new GptOss model, including its backbone, causal language model, preprocessor, and associated tests and conversion scripts. The code is script-generated, and this review focuses on its quality, correctness, and adherence to the project's style guide.
The model definition and tests are generally well-structured. However, there are a few issues in the GptOssBackbone implementation regarding initialization and configuration that need to be addressed. The most critical issue lies in the Hugging Face conversion script (convert_gpt_oss.py), which appears to contain significant copy-paste errors from another model's converter. This will prevent correct weight loading and needs a substantial revision. My detailed comments provide specific suggestions to fix these issues.
@laxmareddyp does it work? Looks like it probably doesn't even output valid code yet, right? Going by the test output. The overall code style looks ok, so this will come down to how accurate the code is. I would view scripts like this as a developer tool rather than an automated workflow. So maybe the thing to do here is to try getting this code working end to end. Then you can have some first-hand experience on how useful this was at saving time. If there are tons of hard-to-find errors in the code, this might be slower than doing it yourself. If it's pretty accurate, maybe this is saving time and worth putting forward as a tool for contributors to use.
/gemini review
Code Review
This pull request introduces a new GptOss model, which is a significant contribution. The code is mostly well-structured and follows the project's conventions. However, I've found a few critical issues, primarily related to the tokenizer usage in tests and a bug in the weight conversion logic for the MoE layers. There are also some inconsistencies in the checkpoint conversion script and a minor point of confusion in a test file's comments. Addressing these points will be crucial for the model's correctness and maintainability.
keras_hub/src/models/gpt_oss/gpt_oss_causal_lm_preprocessor_test.py
sachinprasadhs left a comment
Thanks, overall the code and results look good. I've suggested some minor changes to the code comments.
* Test GPT_OSS files through porter
* Generate API and moved files to respective folders
* Fix format issues
* Add gpt_oss to preset loader and fix format issues
* Add gpt_oss to preset loader
* Generated files through 2.5-pro model
* Format fix
* Add converter, RoPE update
* Fix format
* Fix BPE tests
* Update converter
* Fix converter, checkpoint conversion and attention
* Fix the parameter count and debug code
* Add dequantization logic to converter
* Add YaRN support, fix serialisation, fix dequantization
* Fixed several pytest tests
* Address gpt_oss_causal_lm tests
* Fix format issues
* Address review comments
* Set start token id to None to match the HF output
* Fix test cases
* Fix test
* Fix error
* Fix
* Address all comments
@divyashreepathihalli @mattdangerw @abheesht17 Could you please review and share your feedback on the quality of the code generated through the script?
I estimate that 80-85% of the code matches; the backbone files import successfully and it's possible to instantiate a backbone model. There were still some errors, which might be alleviated with a stronger model.
The converter and weight conversion scripts are still in development. Generating a workable solution is complex because the model needs a comprehensive understanding of the entire architectural layout to handle the intricate dependencies among the model's layers and weights.
Numeric verification:
Matching outputs of both models:
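A minimal sketch of how such a numeric check might look, assuming both models' final logits are available as numpy arrays. The function name and tolerance are illustrative; the actual verification script used here is not shown in this thread:

```python
import numpy as np


def assert_outputs_match(keras_logits, hf_logits, atol=1e-4):
    """Compare final logits from the KerasHub and HF models.

    Both inputs are expected as numpy arrays of shape
    `(batch, seq_len, vocab_size)` computed on the same input ids.
    """
    # Elementwise closeness within an absolute tolerance.
    np.testing.assert_allclose(keras_logits, hf_logits, atol=atol)
    # Token-level agreement of the argmax predictions is a coarser
    # but more forgiving sanity check.
    assert (keras_logits.argmax(-1) == hf_logits.argmax(-1)).all()
```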
Checklist