{
  "clusters": {},
  "flaky": [
    {
      "model": "audioflamingo3",
      "gpu": "single",
      "test": "tests/models/audioflamingo3/test_modeling_audioflamingo3.py::AudioFlamingo3ForConditionalGenerationIntegrationTest::test_fixture_batched_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "audioflamingo3",
      "gpu": "single",
      "test": "tests/models/audioflamingo3/test_modeling_audioflamingo3.py::AudioFlamingo3ForConditionalGenerationIntegrationTest::test_fixture_single_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "aya_vision",
      "gpu": "single",
      "test": "tests/models/aya_vision/test_modeling_aya_vision.py::AyaVisionIntegrationTest::test_small_model_integration_generate_chat_template",
      "trace": "(line 355)  AssertionError: 'The [29 chars] two cats resting on a bright pink blanket spread across a red' != 'The [29 chars] two cats resting on a bright pink blanket. The cats,'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 355)  AssertionError: 'The [29 chars] two cats resting on a bright pink blanket spread across a red' != 'The [29 chars] two cats resting on a bright pink blanket. The cats,'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bamba",
      "gpu": "single",
      "test": "tests/models/bamba/test_modeling_bamba.py::BambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 780)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.50 GiB is free. Process 114588 has 18.80 GiB memory in use. Of the allocated memory 18.42 GiB is allocated by PyTorch, and 17.60 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 779)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.50 GiB is free. Process 131469 has 18.80 GiB memory in use. Of the allocated memory 18.42 GiB is allocated by PyTorch, and 17.60 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bamba",
      "gpu": "single",
      "test": "tests/models/bamba/test_modeling_bamba.py::BambaModelIntegrationTest::test_simple_generate",
      "trace": "(line 780)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.42 GiB is free. Process 114588 has 18.88 GiB memory in use. Of the allocated memory 18.50 GiB is allocated by PyTorch, and 20.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 779)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.42 GiB is free. Process 131469 has 18.88 GiB memory in use. Of the allocated memory 18.50 GiB is allocated by PyTorch, and 20.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "big_bird",
      "gpu": "single",
      "test": "tests/models/big_bird/test_modeling_big_bird.py::BigBirdModelIntegrationTest::test_fill_mask",
      "trace": "(line 906)  AssertionError: '' != 'happiness'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 906)  AssertionError: '' != 'happiness'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bitnet",
      "gpu": "single",
      "test": "tests/models/bitnet/test_modeling_bitnet.py::BitNetIntegrationTest::test_model_generation",
      "trace": "(line 309)  RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bitnet",
      "gpu": "single",
      "test": "tests/models/bitnet/test_modeling_bitnet.py::BitNetIntegrationTest::test_model_logits",
      "trace": "(line 309)  RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "blip_2",
      "gpu": "single",
      "test": "tests/models/blip_2/test_modeling_blip_2.py::Blip2ModelIntegrationTest::test_inference_t5",
      "trace": "(line 1616)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1616)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "blip_2",
      "gpu": "single",
      "test": "tests/models/blip_2/test_modeling_blip_2.py::Blip2ModelIntegrationTest::test_inference_t5_batched_beam_search",
      "trace": "(line 1671)  AssertionError: Lists differ: [0, 3, 9, 2335, 19, 3823, 30, 8, 2608, 28, 160, 1782, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1671)  AssertionError: Lists differ: [0, 3, 9, 2335, 19, 3823, 30, 8, 2608, 28, 160, 1782, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "single",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_batch_generated_text",
      "trace": "(line 621)  AssertionError: Lists differ: ['Hello what is', 'Running a quick test with the followi[54 chars]the'] != ['Hello what is the best way to get the data from the se[127 chars]on2']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 621)  AssertionError: Lists differ: ['Hello what is', 'Running a quick test with the followi[54 chars]the'] != ['Hello what is the best way to get the data from the se[127 chars]on2']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "single",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_batch_generation_padding",
      "trace": "(line 586)  AssertionError: Lists differ: [5941[15 chars]632, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,[82 chars]0, 0] != [5941[15 chars]632, 419, 682, 15, 473, 912, 267, 40704, 15, 1[186 chars] 912]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 586)  AssertionError: Lists differ: [5941[15 chars]632, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,[82 chars]0, 0] != [5941[15 chars]632, 419, 682, 15, 473, 912, 267, 40704, 15, 1[186 chars] 912]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "single",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_simple_generation",
      "trace": "(line 539)  AssertionError: 'I en[58 chars] play. I am a very active person, and I am a v[75 chars]am a' != 'I en[58 chars] play with the kids. I am a very active person[86 chars]nd I'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 539)  AssertionError: 'I en[58 chars] play. I am a very active person, and I am a v[75 chars]am a' != 'I en[58 chars] play with the kids. I am a very active person[86 chars]nd I'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "single",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_constrastive_learning",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-large-itm-mlm-itc. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-large-itm-mlm-itc. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "single",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_image_and_text_retrieval",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "single",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_masked_language_modeling",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "single",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b",
      "trace": "(line 399)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[92 chars]ted'] != ['Des[115 chars] dot in the center representing the North Star[99 chars]the']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 399)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[92 chars]ted'] != ['Des[115 chars] dot in the center representing the North Star[99 chars]the']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "single",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b_batched",
      "trace": "(line 445)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[309 chars]on.'] != ['Des[115 chars] dot in the center representing the star Alpha[154 chars]The']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 445)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[309 chars]on.'] != ['Des[115 chars] dot in the center representing the star Alpha[154 chars]The']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "single",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b_multi_image",
      "trace": "(line 469)  AssertionError: Lists differ: ['Wha[74 chars]een the night sky and the internet. The first [115 chars]The'] != ['Wha[74 chars]een two celestial objects, the stars and the c[113 chars]map']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  AssertionError: Lists differ: ['Wha[74 chars]een the night sky and the internet. The first [115 chars]The'] != ['Wha[74 chars]een two celestial objects, the stars and the c[113 chars]map']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "clvp",
      "gpu": "single",
      "test": "tests/models/clvp/test_modeling_clvp.py::ClvpIntegrationTest::test_conditional_encoder",
      "trace": "(line 552)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 552)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "clvp",
      "gpu": "single",
      "test": "tests/models/clvp/test_modeling_clvp.py::ClvpIntegrationTest::test_full_model_integration",
      "trace": "(line 1310)  RuntimeError: The expanded size of the tensor (2) must match the existing size (3) at non-singleton dimension 0.  Target sizes: [2].  Tensor sizes: [3]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1310)  RuntimeError: The expanded size of the tensor (2) must match the existing size (3) at non-singleton dimension 0.  Target sizes: [2].  Tensor sizes: [3]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "single",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2IntegrationTest::test_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([2.3711, 1.6689, 1.8389, 1.9785, 1.9121], dtype=torch.float16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([2.3711, 1.6689, 1.8389, 1.9785, 1.9121], dtype=torch.float16)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "single",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2IntegrationTest::test_model_integration_generate_chat_template",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.86 GiB. GPU 0 has a total capacity of 22.30 GiB of which 720.69 MiB is free. Process 247374 has 21.59 GiB memory in use. Of the allocated memory 15.37 GiB is allocated by PyTorch, and 5.83 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.86 GiB. GPU 0 has a total capacity of 22.30 GiB of which 720.69 MiB is free. Process 823684 has 21.59 GiB memory in use. Of the allocated memory 15.37 GiB is allocated by PyTorch, and 5.83 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "single",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2MoeVisionIntegrationTest::test_model_forward_vision",
      "trace": "(line 473)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "first_failure_day": "2026-05-21",
      "last_green_day": "2026-05-20",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "single",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2MoeVisionIntegrationTest::test_model_generate_vision",
      "trace": "(line 473)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "first_failure_day": "2026-05-21",
      "last_green_day": "2026-05-20",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "colqwen2",
      "gpu": "single",
      "test": "tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ModelIntegrationTest::test_model_integration_test",
      "trace": "(line 110)  ValueError: images must be an image, list of images or list of list of images",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 110)  ValueError: images must be an image, list of images or list of list of images",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "colqwen2",
      "gpu": "single",
      "test": "tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ModelIntegrationTest::test_model_integration_test_2",
      "trace": "(line 400)  AssertionError: Expected scores tensor([[16.3750, 10.9375, 14.7500],",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 400)  AssertionError: Expected scores tensor([[16.3750, 10.9375, 14.7500],",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "convnextv2",
      "gpu": "single",
      "test": "tests/models/convnextv2/test_modeling_convnextv2.py::ConvNextV2ModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 308)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 308)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cvt",
      "gpu": "single",
      "test": "tests/models/cvt/test_modeling_cvt.py::CvtModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 271)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 271)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cwm",
      "gpu": "single",
      "test": "tests/models/cwm/test_modeling_cwm.py::CwmIntegrationTest::test_cwm_integration",
      "trace": "(line 1968)  AttributeError: 'CwmDecoderLayer' object has no attribute 'attention_type'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1968)  AttributeError: 'CwmDecoderLayer' object has no attribute 'attention_type'",
      "first_failure_day": "2026-03-21",
      "last_green_day": "2026-03-20",
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "cwm",
      "gpu": "single",
      "test": "tests/models/cwm/test_modeling_cwm.py::CwmIntegrationTest::test_cwm_sliding_window_long_sequence",
      "trace": "(line 182)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 182)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dab_detr",
      "gpu": "single",
      "test": "tests/models/dab_detr/test_modeling_dab_detr.py::DabDetrModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 805)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 805)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_0_dac_16khz",
      "trace": "(line 819)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 819)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_1_dac_24khz",
      "trace": "(line 813)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 813)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_2_dac_44khz",
      "trace": "(line 825)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 825)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_0_dac_16khz",
      "trace": "(line 870)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 870)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_1_dac_24khz",
      "trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "single",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_2_dac_44khz",
      "trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dbrx",
      "gpu": "single",
      "test": "tests/models/dbrx/test_modeling_dbrx.py::DbrxModelIntegrationTest::test_tiny_model_logits",
      "trace": "(line 146)  huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'moe_jitter_eps':",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 146)  huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'moe_jitter_eps':",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": true
    },
    {
      "model": "deepseek_v2",
      "gpu": "single",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_batch_fa2",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 2.69 MiB is free. Process 225485 has 22.29 GiB memory in use. Of the allocated memory 21.66 GiB is allocated by PyTorch, and 50.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 2.69 MiB is free. Process 332120 has 22.29 GiB memory in use. Of the allocated memory 21.66 GiB is allocated by PyTorch, and 50.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_v2",
      "gpu": "single",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_deepseek_v2_lite",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 222.69 MiB is free. Process 225485 has 22.08 GiB memory in use. Of the allocated memory 21.47 GiB is allocated by PyTorch, and 22.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 222.69 MiB is free. Process 332120 has 22.08 GiB memory in use. Of the allocated memory 21.47 GiB is allocated by PyTorch, and 22.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_v2",
      "gpu": "single",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_logits_eager",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 222.69 MiB is free. Process 225485 has 22.08 GiB memory in use. Of the allocated memory 21.47 GiB is allocated by PyTorch, and 22.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 222.69 MiB is free. Process 332120 has 22.08 GiB memory in use. Of the allocated memory 21.47 GiB is allocated by PyTorch, and 22.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_v3",
      "gpu": "single",
      "test": "tests/models/deepseek_v3/test_modeling_deepseek_v3.py::DeepseekV3IntegrationTest::test_compile_static_cache",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_vl",
      "gpu": "single",
      "test": "tests/models/deepseek_vl/test_modeling_deepseek_vl.py::DeepseekVLIntegrationTest::test_model_text_generation_batched",
      "trace": "(line 147)  AssertionError: Lists differ: ['You[222 chars]tant:The image depicts a snowy landscape with [367 chars]the'] != ['You[222 chars]tant:What is a cat, a cat, a cat, a cat, a cat[329 chars]the']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 147)  AssertionError: Lists differ: ['You[222 chars]tant:The image depicts a snowy landscape with [367 chars]the'] != ['You[222 chars]tant:What is a cat, a cat, a cat, a cat, a cat[329 chars]the']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_vl_hybrid",
      "gpu": "single",
      "test": "tests/models/deepseek_vl_hybrid/test_modeling_deepseek_vl_hybrid.py::DeepseekVLHybridIntegrationTest::test_model_text_generation_batched",
      "trace": "(line 370)  AssertionError: Lists differ: ['You[224 chars]nt:\\nThe image depicts a fluffy, light brown a[371 chars]he '] != ['You[224 chars]nt:\\nA fluffy animal in a fluffyThe image,The [329 chars]he ']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  AssertionError: Lists differ: ['You[224 chars]nt:\\nThe image depicts a fluffy, light brown a[371 chars]he '] != ['You[224 chars]nt:\\nA fluffy animal in a fluffyThe image,The [329 chars]he ']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "deepseek_vl_hybrid",
      "gpu": "single",
      "test": "tests/models/deepseek_vl_hybrid/test_modeling_deepseek_vl_hybrid.py::DeepseekVLHybridIntegrationTest::test_model_text_generation_with_multi_image",
      "trace": "(line 468)  RuntimeError: You can't move a model that has some modules offloaded to cpu or disk.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 468)  RuntimeError: You can't move a model that has some modules offloaded to cpu or disk.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "depth_anything",
      "gpu": "single",
      "test": "tests/models/depth_anything/test_modeling_depth_anything.py::DepthAnythingModelIntegrationTest::test_inference",
      "trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "dia",
      "gpu": "single",
      "test": "tests/models/dia/test_modeling_dia.py::DiaForConditionalGenerationIntegrationTest::test_dia_model_integration_generate_audio_context",
      "trace": "(line 732)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 732)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "diffllama",
      "gpu": "single",
      "test": "tests/models/diffllama/test_modeling_diffllama.py::DiffLlamaIntegrationTest::test_compile_static_cache",
      "trace": "(line 484)  AssertionError: Lists differ: ['Sim[41 chars]that 1) the speed of light is constant in all [301 chars]y p'] != ['Sim[41 chars]that 2.5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 '[133 chars]a a']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 484)  AssertionError: Lists differ: ['Sim[41 chars]that 1) the speed of light is constant in all [301 chars]y p'] != ['Sim[41 chars]that 2.5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 '[133 chars]a a']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_batched_images_batched_boxes",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_batched_images_batched_points_multi_points",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_batched_images_multi_points",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_from_existing_points_and_mask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_one_point_multimask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "single",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_one_point_no_multimask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "efficientnet",
      "gpu": "single",
      "test": "tests/models/efficientnet/test_modeling_efficientnet.py::EfficientNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "single",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generate_images",
      "trace": "(line 1968)  AttributeError: 'Emu3ForConditionalGeneration' object has no attribute 'vocabulary_mapping'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1968)  AttributeError: 'Emu3ForConditionalGeneration' object has no attribute 'vocabulary_mapping'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "single",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation",
      "trace": "(line 363)  AssertionError: Lists differ: ['USE[85 chars]ANT: The image captures a moment of tranquilit[145 chars] in'] != ['USE[85 chars]ANT: 1. The image is a 1.\u4f60\u597d!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 363)  AssertionError: Lists differ: ['USE[85 chars]ANT: The image captures a moment of tranquilit[145 chars] in'] != ['USE[85 chars]ANT: 1. The image is a 1.\u4f60\u597d!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "single",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_batched",
      "trace": "(line 2397)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 458.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 378.69 MiB is free. Process 449348 has 21.93 GiB memory in use. Of the allocated memory 21.40 GiB is allocated by PyTorch, and 145.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2397)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 458.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 378.69 MiB is free. Process 1275216 has 21.93 GiB memory in use. Of the allocated memory 21.40 GiB is allocated by PyTorch, and 145.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "single",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_multi_image",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.66 GiB. GPU 0 has a total capacity of 22.30 GiB of which 376.69 MiB is free. Process 449348 has 21.93 GiB memory in use. Of the allocated memory 21.40 GiB is allocated by PyTorch, and 145.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.66 GiB. GPU 0 has a total capacity of 22.30 GiB of which 376.69 MiB is free. Process 1275216 has 21.93 GiB memory in use. Of the allocated memory 21.40 GiB is allocated by PyTorch, and 145.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "eomt_dinov3",
      "gpu": "single",
      "test": "tests/models/eomt_dinov3/test_modeling_eomt_dinov3.py::EomtDinov3ForUniversalSegmentationIntegrationTest::test_inference_bf16",
      "trace": "(line 310)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 310)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "evolla",
      "gpu": "single",
      "test": "tests/models/evolla/test_modeling_evolla.py::EvollaModelIntegrationTest::test_inference_natural_language_protein_reasoning",
      "trace": "(line 364)  AssertionError: 'This protein' not found in 'systemYouareanAIexpertthatcanansweranyquestionsaboutprotein.userWhatisthefunctionofthisprotein?assistant\u010aThis\u0120protein\u0120is\u0120a\u0120critical\u0120enzyme\u0120involved\u0120in\u0120the\u0120metabolic\u0120pathway\u0120of\u0120purine\u0120metabolism,\u0120specifically\u0120in\u0120the\u0120salvage\u0120pathway\u0120of\u0120IMP\u0120(inosine\u0120monophosphate)\u0120biosynthesis.\u0120Its\u0120primary\u0120function\u0120is\u0120to\u0120catalyze\u0120the\u0120conversion\u0120of\u0120hypoxanthine\u0120and\u0120guanine\u0120into\u0120their\u0120respective\u0120nucleotide\u0120monophosphates,\u0120which\u0120are\u0120essential\u0120building\u0120blocks\u0120for\u0120nucleic\u0120acids.\u010a\u010aThe\u0120protein\u0120is\u0120annotated\u0120with\u0120several\u0120molecular\u0120functions,\u0120including\u0120guanine\u0120phosphoribosyltransferase\u0120activity\u0120and\u0120hypoxanthine\u0120phosphorib'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 364)  AssertionError: 'This protein' not found in 'systemYouareanAIexpertthatcanansweranyquestionsaboutprotein.userWhatisthefunctionofthisprotein?assistant\u010aThis\u0120protein\u0120is\u0120a\u0120critical\u0120enzyme\u0120involved\u0120in\u0120the\u0120metabolic\u0120pathway\u0120of\u0120purine\u0120metabolism,\u0120specifically\u0120in\u0120the\u0120salvage\u0120pathway\u0120of\u0120IMP\u0120(inosine\u0120monophosphate)\u0120biosynthesis.\u0120Its\u0120primary\u0120function\u0120is\u0120to\u0120catalyze\u0120the\u0120conversion\u0120of\u0120hypoxanthine\u0120and\u0120guanine\u0120into\u0120their\u0120respective\u0120nucleotide\u0120monophosphates,\u0120which\u0120are\u0120essential\u0120building\u0120blocks\u0120for\u0120nucleic\u0120acids.\u010a\u010aThe\u0120protein\u0120is\u0120annotated\u0120with\u0120several\u0120molecular\u0120functions,\u0120including\u0120guanine\u0120phosphoribosyltransferase\u0120activity\u0120and\u0120hypoxanthine\u0120phosphorib'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "single",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_generation_beyond_sliding_window",
      "trace": "(line 160)  AssertionError: \" Thi[46 chars] and the atmosphere is so relaxing. I'm gratef[47 chars]. It\" != \" Thi[46 chars] and I'm grateful for the opportunity to exper[26 chars]reak\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 160)  AssertionError: \" Thi[46 chars] and the atmosphere is so relaxing. I'm gratef[47 chars]. It\" != \" Thi[46 chars] and I'm grateful for the opportunity to exper[26 chars]reak\"",
      "first_failure_day": "2026-03-28",
      "last_green_day": "2026-03-27",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "single",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_logits",
      "trace": "(line 99)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 99)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-28",
      "last_green_day": "2026-03-27",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "exaone_moe",
      "gpu": "single",
      "test": "tests/models/exaone_moe/test_modeling_exaone_moe.py::ExaoneMoeIntegrationTest::test_model_logits",
      "trace": "(line 120)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 120)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "falcon_h1",
      "gpu": "single",
      "test": "tests/models/falcon_h1/test_modeling_falcon_h1.py::FalconH1ModelIntegrationTest::test_falcon_h1_hard",
      "trace": "(line 470)  AssertionError: 'user\\nTell me about the french revolutio[1920 chars]ct**' != \"user\\nTell me about the french revolutio[1929 chars]n6. \"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 470)  AssertionError: 'user\\nTell me about the french revolutio[1920 chars]ct**' != \"user\\nTell me about the french revolutio[1929 chars]n6. \"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "single",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_batched_generation",
      "trace": "(line 488)  AssertionError: Lists differ: ['Hello today I will be talking about the \u201cTheory of Rela[161 chars]bal'] != ['Hello today I am going to talk about the \u201cTheory of Rel[159 chars]bal']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  AssertionError: Lists differ: ['Hello today I will be talking about the \u201cTheory of Rela[161 chars]bal'] != ['Hello today I am going to talk about the \u201cTheory of Rel[159 chars]bal']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "single",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_4bit",
      "trace": "(line 438)  AssertionError: 'Hello today I\\'m going to be talking about the \"A\" in the \"A-B' != \"Hello today Iava,\\n\\nI'm sorry to hear that you're having trouble with the \"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 438)  AssertionError: 'Hello today I\\'m going to be talking about the \"A\" in the \"A-B' != \"Hello today Iava,\\n\\nI'm sorry to hear that you're having trouble with the \"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "single",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_fp16",
      "trace": "(line 423)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 423)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "first_failure_day": "2026-03-19",
      "last_green_day": "2026-03-18",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "single",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_torch_compile",
      "trace": "(line 451)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 451)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "first_failure_day": "2026-03-19",
      "last_green_day": "2026-03-18",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "fastspeech2_conformer",
      "gpu": "single",
      "test": "tests/models/fastspeech2_conformer/test_modeling_fastspeech2_conformer.py::FastSpeech2ConformerModelIntegrationTest::test_training_integration",
      "trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "flava",
      "gpu": "single",
      "test": "tests/models/flava/test_modeling_flava.py::FlavaModelIntegrationTest::test_inference",
      "trace": "(line 899)  AssertionError: -1352.535400390625 != -1352.4685 within 4 places (0.06690039062505093 difference)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 899)  AssertionError: -1352.535400390625 != -1352.4685 within 4 places (0.06690039062505093 difference)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "flava",
      "gpu": "single",
      "test": "tests/models/flava/test_modeling_flava.py::FlavaForPreTrainingIntegrationTest::test_inference_with_itm_labels",
      "trace": "(line 1223)  AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 2]) != torch.Size([2, 2]).",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1223)  AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 2]) != torch.Size([2, 2]).",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "flex_olmo",
      "gpu": "single",
      "test": "tests/models/flex_olmo/test_modeling_flex_olmo.py::FlexOlmoIntegrationTest::test_model_7b_logits",
      "trace": "(line 87)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 87)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "florence2",
      "gpu": "single",
      "test": "tests/models/florence2/test_modeling_florence2.py::Florence2ForConditionalGenerationIntegrationTest::test_large_model_inference_eager",
      "trace": "(line 470)  AssertionError: Lists differ: [[2, [144 chars], 5, 2014, 6, 8, 11, 5, 3618, 6, 89, 32, 3980,[51 chars], 2]] != [[2, [144 chars], 5, 921, 6, 8, 11, 5, 3618, 6, 89, 32, 1104, [44 chars], 2]]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 470)  AssertionError: Lists differ: [[2, [144 chars], 5, 2014, 6, 8, 11, 5, 3618, 6, 89, 32, 3980,[51 chars], 2]] != [[2, [144 chars], 5, 921, 6, 8, 11, 5, 3618, 6, 89, 32, 1104, [44 chars], 2]]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "single",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_inference_no_head",
      "trace": "(line 484)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 484)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "single",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_translation_direct_0_en_ru",
      "trace": "(line 517)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 517)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "single",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_translation_direct_1_ru_en",
      "trace": "(line 517)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 517)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "fuyu",
      "gpu": "single",
      "test": "tests/models/fuyu/test_modeling_fuyu.py::FuyuModelIntegrationTest::test_greedy_generation",
      "trace": "(line 295)  AssertionError: '\\x04 A bus parked on the side of a road.' != 'A blue bus parked on the side of a road.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 295)  AssertionError: '\\x04 A bus parked on the side of a road.' != 'A blue bus parked on the side of a road.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_compile_static_cache",
      "trace": "(line 337)  AssertionError: Lists differ: ['Hel[196 chars]tdi 105bhp.\\nI have a problem with the engine [37 chars]the'] != ['Hel[196 chars]tdi 110bhp.\\nI have a problem with the engine [49 chars]ugh']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 337)  AssertionError: Lists differ: ['Hel[196 chars]tdi 105bhp.\\nI have a problem with the engine [37 chars]the'] != ['Hel[196 chars]tdi 110bhp.\\nI have a problem with the engine [49 chars]ugh']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_export_static_cache",
      "trace": "(line 414)  AssertionError: Lists differ: ['Hel[87 chars] in the 1990s. I have been looking on the internet and I have'] != ['Hel[87 chars] in the 1990s. I have looked on the internet and I have found']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 414)  AssertionError: Lists differ: ['Hel[87 chars] in the 1990s. I have been looking on the internet and I have'] != ['Hel[87 chars] in the 1990s. I have looked on the internet and I have found']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_2b_4bit",
      "trace": "(line 190)  AssertionError: Lists differ: ['Hel[118 chars] you a few of my favorite and most used brushes.\\n\\nI\"] != ['Hel[118 chars] you my experience with the new wattpad wattpa[38 chars]pad\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 190)  AssertionError: Lists differ: ['Hel[118 chars] you a few of my favorite and most used brushes.\\n\\nI\"] != ['Hel[118 chars] you my experience with the new wattpad wattpa[38 chars]pad\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_4bit",
      "trace": "(line 317)  AssertionError: Lists differ: ['Hel[59 chars]ke a \"self balancing\" robot. I have', 'Hi toda[76 chars] of'] != ['Hel[59 chars]ke a program that will take a number and then'[93 chars]!:)']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 317)  AssertionError: Lists differ: ['Hel[59 chars]ke a \"self balancing\" robot. I have', 'Hi toda[76 chars] of'] != ['Hel[59 chars]ke a program that will take a number and then'[93 chars]!:)']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_bf16",
      "trace": "(line 258)  AssertionError: Lists differ: ['Hel[59 chars]ke a small game. I have a few questions', 'Hi [86 chars]and'] != ['Hel[59 chars]ke a game in which you have to get a', 'Hi tod[83 chars]and']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 258)  AssertionError: Lists differ: ['Hel[59 chars]ke a small game. I have a few questions', 'Hi [86 chars]and'] != ['Hel[59 chars]ke a game in which you have to get a', 'Hi tod[83 chars]and']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_fp16",
      "trace": "(line 228)  AssertionError: Lists differ: ['Hel[27 chars]a 1995 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 228)  AssertionError: Lists differ: ['Hel[27 chars]a 1995 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "single",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_fp16_static_cache",
      "trace": "(line 288)  AssertionError: Lists differ: ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1995 3000gt SL. I have a', 'Hi today I am go[59 chars] 3D']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 288)  AssertionError: Lists differ: ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1995 3000gt SL. I have a', 'Hi today I am go[59 chars] 3D']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma2",
      "gpu": "single",
      "test": "tests/models/gemma2/test_modeling_gemma2.py::Gemma2IntegrationTest::test_model_2b_pipeline_bf16_flex_attention",
      "trace": "(line 2860)  Failed: (subprocess) AssertionError: \"Hi t[26 chars]ng about the 10 best anime of all time.\\n\\n1\" != \"Hi t[26 chars]ng about the 10 most powerful characters in the Naruto series.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2876)  Failed: (subprocess) AssertionError: \"Hi t[26 chars]ng about the 10 best anime of all time.\\n\\n1\" != \"Hi t[26 chars]ng about the 10 most powerful characters in the Naruto series.\"",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "status": "flaky: test passed in the previous run (commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da) but failed (on the same commit) during the check of the current run.",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma2",
      "gpu": "single",
      "test": "tests/models/gemma2/test_modeling_gemma2.py::Gemma2IntegrationTest::test_model_2b_pipeline_bf16_flex_attention",
      "trace": "Cannot retrieve error message.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "Cannot retrieve error message.",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "single",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_dynamic_sliding_window_is_default",
      "trace": "(line 874)  AssertionError: 'DynamicSlidingWindowLayer' unexpectedly found in 'DynamicCache(layers=[DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer])'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 874)  AssertionError: 'DynamicSlidingWindowLayer' unexpectedly found in 'DynamicCache(layers=[DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer])'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "single",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_1b_text_only",
      "trace": "(line 728)  AssertionError: Lists differ: ['Wri[48 chars]data streams, a boundless flow,\\nA silent worl[63 chars]ing'] != ['Wri[48 chars]data flows, a silent stream,\\nInto the neural [51 chars],\\n']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 728)  AssertionError: Lists differ: ['Wri[48 chars]data streams, a boundless flow,\\nA silent worl[63 chars]ing'] != ['Wri[48 chars]data flows, a silent stream,\\nInto the neural [51 chars],\\n']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "single",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch",
      "trace": "(line 548)  AssertionError: Lists differ: ['use[149 chars]with turquoise water and a blue sky in the bac[227 chars]own\"] != ['use[149 chars]with clear turquoise water and a blue sky in t[231 chars]own\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 548)  AssertionError: Lists differ: ['use[149 chars]with turquoise water and a blue sky in the bac[227 chars]own\"] != ['use[149 chars]with clear turquoise water and a blue sky in t[231 chars]own\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "single",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops",
      "trace": "(line 663)  AssertionError: Lists differ: ['user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a'] != [\"user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 663)  AssertionError: Lists differ: ['user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a'] != [\"user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "single",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_crops",
      "trace": "(line 590)  AssertionError: Lists differ: [\"user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the\"] != ['user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 590)  AssertionError: Lists differ: [\"user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the\"] != ['user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "single",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_generation_beyond_sliding_window",
      "trace": "(line 1196)  AssertionError: Lists differ: [' and I find it very relaxing. I also lik[112 chars]re'\"] != [\" and the people are so friendly. I'm so [93 chars]re'\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1196)  AssertionError: Lists differ: [' and I find it very relaxing. I also lik[112 chars]re'\"] != [\" and the people are so friendly. I'm so [93 chars]re'\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "single",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_batch",
      "trace": "(line 1083)  AssertionError: Lists differ: ['use[196 chars]ewer and has its tongue', \"user\\nYou are a hel[193 chars]cow\"] != ['use[196 chars]ewer with its head slightly', \"user\\nYou are a[197 chars]cow\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1083)  AssertionError: Lists differ: ['use[196 chars]ewer and has its tongue', \"user\\nYou are a hel[193 chars]cow\"] != ['use[196 chars]ewer with its head slightly', \"user\\nYou are a[197 chars]cow\"]",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "single",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_bf16",
      "trace": "(line 998)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 998)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "single",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_image",
      "trace": "(line 1110)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1110)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "single",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_multiimage",
      "trace": "(line 1151)  AssertionError: Lists differ: ['use[140 chars]n district. Here are the key elements:\\n\\n* **A prominent red'] != ['use[140 chars]n district. Here are some of the key elements:\\n\\n* **A']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1151)  AssertionError: Lists differ: ['use[140 chars]n district. Here are the key elements:\\n\\n* **A prominent red'] != ['use[140 chars]n district. Here are some of the key elements:\\n\\n* **A']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "single",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_export_text_only",
      "trace": "(line 2301)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.38 GiB. GPU 0 has a total capacity of 22.30 GiB of which 2.80 GiB is free. Process 1467578 has 19.49 GiB memory in use. Of the allocated memory 19.04 GiB is allocated by PyTorch, and 56.81 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2301)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.38 GiB. GPU 0 has a total capacity of 22.30 GiB of which 2.80 GiB is free. Process 504134 has 19.49 GiB memory in use. Of the allocated memory 19.04 GiB is allocated by PyTorch, and 56.81 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-04-09",
      "last_green_day": "2026-04-08",
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "single",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_multiimage",
      "trace": "(line 742)  AssertionError: Lists differ: ['Bas[66 chars]und & Street Scene:**\\n* **Roadway:** There is an'] != ['Bas[66 chars]und & Street Scene:**\\n* **Traffic Sign:** The most prominent']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 742)  AssertionError: Lists differ: ['Bas[66 chars]und & Street Scene:**\\n* **Roadway:** There is an'] != ['Bas[66 chars]und & Street Scene:**\\n* **Traffic Sign:** The most prominent']",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "single",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_with_image",
      "trace": "(line 655)  AssertionError: Lists differ: ['Thi[61 chars] beach** with the **ocean** in the background under a **clear'] != ['Thi[61 chars] beach** with the **ocean and a blue sky** in the background']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 655)  AssertionError: Lists differ: ['Thi[61 chars] beach** with the **ocean** in the background under a **clear'] != ['Thi[61 chars] beach** with the **ocean and a blue sky** in the background']",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "single",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_with_image_batch",
      "trace": "(line 706)  AssertionError: Lists differ: ['Thi[81 chars]ocean** in the background under a **clear', \"N[102 chars] on\"] != ['Thi[81 chars]ocean and a blue sky** in the background', 'No[127 chars]lue']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 706)  AssertionError: Lists differ: ['Thi[81 chars]ocean** in the background under a **clear', \"N[102 chars] on\"] != ['Thi[81 chars]ocean and a blue sky** in the background', 'No[127 chars]lue']",
      "first_failure_day": "2026-04-29",
      "last_green_day": "2026-04-28",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "git",
      "gpu": "single",
      "test": "tests/models/git/test_modeling_git.py::GitModelIntegrationTest::test_inference_image_captioning",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "single",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_eager",
      "trace": "(line 133)  AssertionError: Lists differ: ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper flower.'] != ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper lantern.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 133)  AssertionError: Lists differ: ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper flower.'] != ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper lantern.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "single",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_fp16",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 214.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 76.69 MiB is free. Process 487783 has 22.22 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 8.15 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 214.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 76.69 MiB is free. Process 612952 has 22.22 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 8.15 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "OOM",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "single",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_sdpa",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.16 GiB. GPU 0 has a total capacity of 22.30 GiB of which 76.69 MiB is free. Process 487783 has 22.22 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 8.15 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.16 GiB. GPU 0 has a total capacity of 22.30 GiB of which 76.69 MiB is free. Process 612952 has 22.22 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 8.15 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "OOM",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_expand",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "single",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm4_moe",
      "gpu": "single",
      "test": "tests/models/glm4_moe/test_modeling_glm4_moe.py::Glm4MoeIntegrationTest::test_compile_static_cache",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 120.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 50.69 MiB is free. Process 539381 has 22.25 GiB memory in use. Of the allocated memory 19.49 GiB is allocated by PyTorch, and 2.38 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 120.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 58.69 MiB is free. Process 1190179 has 22.24 GiB memory in use. Of the allocated memory 21.49 GiB is allocated by PyTorch, and 379.35 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm4_moe_lite",
      "gpu": "single",
      "test": "tests/models/glm4_moe_lite/test_modeling_glm4_moe_lite.py::Glm4MoeIntegrationTest::test_compile_static_cache",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 515821 has 22.28 GiB memory in use. Of the allocated memory 21.65 GiB is allocated by PyTorch, and 54.71 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 468727 has 22.28 GiB memory in use. Of the allocated memory 21.64 GiB is allocated by PyTorch, and 58.85 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_image",
      "gpu": "single",
      "test": "tests/models/glm_image/test_modeling_glm_image.py::GlmImageIntegrationTest::test_image_to_image_generation",
      "trace": "(line 687)  AssertionError: False is not true : Expected first 30 tokens:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Expected first 30 tokens:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test",
      "trace": "(line 456)  assert [151331, 1513..., 151343, ...] == [59248, 59250...6, 59280, ...]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 456)  assert [151331, 1513..., 151343, ...] == [59248, 59250...6, 59280, ...]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 503)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14885 chars]ia.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [256 chars]t's\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14885 chars]ia.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [256 chars]t's\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 631)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[10983 chars]at.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [258 chars]but\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 631)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[10983 chars]at.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [258 chars]but\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 603)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[7469 chars]Ai.\"] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]ion']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 603)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[7469 chars]Ai.\"] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]ion']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_expand",
      "trace": "(line 575)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14840 chars]d a'] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]lly\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 575)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14840 chars]d a'] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]lly\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "single",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 541)  AssertionError: Lists differ: ['\\n<|begin_of_video|><|image|><|image|><|[50804 chars]rt.'] != [\"\\n012345Describe this video.\\n<think>Got[114 chars]irt\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 541)  AssertionError: Lists differ: ['\\n<|begin_of_video|><|image|><|image|><|[50804 chars]rt.'] != [\"\\n012345Describe this video.\\n<think>Got[114 chars]irt\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "got_ocr2",
      "gpu": "single",
      "test": "tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2IntegrationTest::test_small_model_integration_test_got_ocr_format",
      "trace": "(line 210)  AssertionError: 'R\\\\&D' != '\\\\title{\\nR'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 210)  AssertionError: 'R\\\\&D' != '\\\\title{\\nR'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "granite",
      "gpu": "single",
      "test": "tests/models/granite/test_modeling_granite.py::GraniteIntegrationTest::test_model_3b_logits_bf16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "single",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_cross_attention_mask",
      "trace": "(line 787)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 787)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "single",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_grounding_dino_loss",
      "trace": "(line 869)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 869)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "single",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 678)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 678)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "single",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_inference_object_detection_head_equivalence_cpu_accelerator",
      "trace": "(line 745)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 745)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "helium",
      "gpu": "single",
      "test": "tests/models/helium/test_modeling_helium.py::HeliumIntegrationTest::test_model_2b",
      "trace": "(line 73)  AssertionError: Lists differ: ['Hel[51 chars]have been working on a new project for a while now and I have'] != ['Hel[51 chars]have been working on a new project for a while now, and I']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 73)  AssertionError: Lists differ: ['Hel[51 chars]have been working on a new project for a while now and I have'] != ['Hel[51 chars]have been working on a new project for a while now, and I']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "hiera",
      "gpu": "single",
      "test": "tests/models/hiera/test_modeling_hiera.py::HieraModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 560)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 560)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "single",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_batched_inference",
      "trace": "(line 1399)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1399)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "single",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_multi_speaker_smart_voice",
      "trace": "(line 758)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 758)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "single",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_multi_speaker_voice_cloning",
      "trace": "(line 1098)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1098)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "single",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_zero_shot_voice_cloning",
      "trace": "(line 931)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 931)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "hyperclovax",
      "gpu": "single",
      "test": "tests/models/hyperclovax/test_modeling_hyperclovax.py::HyperCLOVAXIntegrationTest::test_model_seed_think_14b_bf16",
      "trace": "(line 1313)  ValueError: There are one or more stop strings, either in the arguments to `generate` or in the model's generation config, but we could not locate a tokenizer. When generating with stop strings, you must pass the model's tokenizer to the `tokenizer` argument of `generate`.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1315)  ValueError: There are one or more stop strings, either in the arguments to `generate` or in the model's generation config, but we could not locate a tokenizer. When generating with stop strings, you must pass the model's tokenizer to the `tokenizer` argument of `generate`.",
      "first_failure_day": "2026-05-29",
      "last_green_day": "2026-05-28",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "instructblip",
      "gpu": "single",
      "test": "tests/models/instructblip/test_modeling_instructblip.py::InstructBlipModelIntegrationTest::test_inference_flant5_xl",
      "trace": "(line 718)  AssertionError: Lists differ: [0, 3[68 chars]459, 9256, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[500 chars]5, 1] != [0, 3[68 chars]459, 4049, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[295 chars]5, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 718)  AssertionError: Lists differ: [0, 3[68 chars]459, 9256, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[500 chars]5, 1] != [0, 3[68 chars]459, 4049, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[295 chars]5, 1]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "instructblipvideo",
      "gpu": "single",
      "test": "tests/models/instructblipvideo/test_modeling_instructblipvideo.py::InstructBlipVideoModelIntegrationTest::test_inference_vicuna_7b",
      "trace": "(line 671)  AssertionError: 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1' != 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1080p'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 671)  AssertionError: 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1' != 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1080p'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "internvl",
      "gpu": "single",
      "test": "tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ -9.8750,  -0.4954,   1.4580, -10.3281, -10.3359], dtype=torch.float16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ -9.8750,  -0.4954,   1.4580, -10.3281, -10.3359], dtype=torch.float16)",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "internvl",
      "gpu": "single",
      "test": "tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_generate_text_only",
      "trace": "(line 714)  AssertionError: \"Autu[14 chars],\\nNature's breath, a season's sigh,\\nSilent woods awake.\" != \"Autu[14 chars],\\nNature's breath, a silent sigh,\\nWinter's chill approaches.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 714)  AssertionError: \"Autu[14 chars],\\nNature's breath, a season's sigh,\\nSilent woods awake.\" != \"Autu[14 chars],\\nNature's breath, a silent sigh,\\nWinter's chill approaches.\"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "jais2",
      "gpu": "single",
      "test": "tests/models/jais2/test_modeling_jais2.py::Jais2IntegrationTest::test_model_generation",
      "trace": "(line 488)  OSError: You are trying to access a gated repo.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  OSError: You are trying to access a gated repo.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "jais2",
      "gpu": "single",
      "test": "tests/models/jais2/test_modeling_jais2.py::Jais2IntegrationTest::test_model_logits",
      "trace": "(line 488)  OSError: You are trying to access a gated repo.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  OSError: You are trying to access a gated repo.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "jamba",
      "gpu": "single",
      "test": "tests/models/jamba/test_modeling_jamba.py::JambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 576)  AssertionError: \"<|startoftext|>Tell me a story<|pad|><|p[50 chars]t I'\" != '<|pad|><|pad|><|pad|><|pad|><|pad|><|pad[76 chars]ates'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 576)  AssertionError: \"<|startoftext|>Tell me a story<|pad|><|p[50 chars]t I'\" != '<|pad|><|pad|><|pad|><|pad|><|pad|><|pad[76 chars]ates'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "janus",
      "gpu": "single",
      "test": "tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_text_generation",
      "trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 1179648",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 1179648",
      "first_failure_day": "2026-04-21",
      "last_green_day": "2026-04-20",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "janus",
      "gpu": "single",
      "test": "tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_text_generation_with_multi_image",
      "trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 2359296",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 2359296",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "single",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_inference_interpolate_pos_encoding",
      "trace": "(line 777)  AttributeError: 'NoneType' object has no attribute 'last_hidden_state'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 777)  AttributeError: 'NoneType' object has no attribute 'last_hidden_state'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "single",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_snowman_image_captioning",
      "trace": "(line 79)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 79)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "single",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_snowman_image_captioning_batch",
      "trace": "(line 712)  AssertionError: Lists differ: ['<gr[35 chars]ail: A snowman is sitting in front of a fire, [575 chars]t>.'] != ['<gr[35 chars]ail: The image features a snowman sitting by<p[836 chars]t>.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 712)  AssertionError: Lists differ: ['<gr[35 chars]ail: A snowman is sitting in front of a fire, [575 chars]t>.'] != ['<gr[35 chars]ail: The image features a snowman sitting by<p[836 chars]t>.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "kosmos2_5",
      "gpu": "single",
      "test": "tests/models/kosmos2_5/test_modeling_kosmos2_5.py::Kosmos2_5ModelIntegrationTest::test_eager",
      "trace": "(line 578)  AssertionError: Lists differ: ['<bb[216 chars]<y_650></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n'] != ['<bb[216 chars]<y_651></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 578)  AssertionError: Lists differ: ['<bb[216 chars]<y_650></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n'] != ['<bb[216 chars]<y_651></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "single",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_1",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS[522 chars]t itc ' s new fmcg businesses are the fastest [829 chars]PAD]\" != \"[CLS[522 chars]t itc's new fmcg businesses are the fastest gr[827 chars]PAD]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS[522 chars]t itc ' s new fmcg businesses are the fastest [829 chars]PAD]\" != \"[CLS[522 chars]t itc's new fmcg businesses are the fastest gr[827 chars]PAD]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "single",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_4",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] 11 : 14 to 11 : 39 a[1108 chars]SEP]\" != \"[CLS] what's his name? [SEP] 11 : 14 to 11 : 39 a. [1106 chars]SEP]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] 11 : 14 to 11 : 39 a[1108 chars]SEP]\" != \"[CLS] what's his name? [SEP] 11 : 14 to 11 : 39 a. [1106 chars]SEP]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "single",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_5",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] hello world [SEP]\" != \"[CLS] what's his name? [SEP] hello world [SEP]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] hello world [SEP]\" != \"[CLS] what's his name? [SEP] hello world [SEP]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "lfm2_moe",
      "gpu": "single",
      "test": "tests/models/lfm2_moe/test_modeling_lfm2_moe.py::Lfm2MoeIntegrationTest::test_model_1a8b_batched_chat_generation",
      "trace": "(line 223)  AssertionError: Lists differ: ['Who are you? (AI) designed to assist?  \\nI am an AI ass[192 chars]ial'] != ['Who are you? (as AI) created by?  \\nI am an artificial [200 chars]ish']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 223)  AssertionError: Lists differ: ['Who are you? (AI) designed to assist?  \\nI am an AI ass[192 chars]ial'] != ['Who are you? (as AI) created by?  \\nI am an artificial [200 chars]ish']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "lfm2_vl",
      "gpu": "single",
      "test": "tests/models/lfm2_vl/test_modeling_lfm2_vl.py::Lfm2VlForConditionalGenerationIntegrationTest::test_integration_test",
      "trace": "(line 246)  AssertionError: 'In t[53 chars]. They are both very relaxed and comfortable. [14 chars]grey' != 'In t[53 chars]. There are also two remote controls on the blanket.\\n\\n\\n\\n'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 246)  AssertionError: 'In t[53 chars]. They are both very relaxed and comfortable. [14 chars]grey' != 'In t[53 chars]. There are also two remote controls on the blanket.\\n\\n\\n\\n'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "lfm2_vl",
      "gpu": "single",
      "test": "tests/models/lfm2_vl/test_modeling_lfm2_vl.py::Lfm2_5VlForConditionalGenerationIntegrationTest::test_integration_test_high_resolution",
      "trace": "(line 354)  AssertionError: 'In t[52 chars]ymbol of freedom and democracy. It stands tall on a small' != 'In t[52 chars]ymbol of freedom and democracy. It stands on Liberty Island in'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 354)  AssertionError: 'In t[52 chars]ymbol of freedom and democracy. It stands tall on a small' != 'In t[52 chars]ymbol of freedom and democracy. It stands on Liberty Island in'",
      "first_failure_day": "2026-04-05",
      "last_green_day": "2026-04-04",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llama",
      "gpu": "single",
      "test": "tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_llama_3_1_hard",
      "trace": "(line 96)  AssertionError: 'Tell[74 chars]ical social and political upheaval in France t[552 chars] the' != 'Tell[74 chars]ical political and social upheaval in France t[558 chars]nshr'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 96)  AssertionError: 'Tell[74 chars]ical social and political upheaval in France t[552 chars] the' != 'Tell[74 chars]ical political and social upheaval in France t[558 chars]nshr'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "single",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation",
      "trace": "(line 566)  AssertionError: Lists differ: [\"\\n [134 chars] one image and a\", '\\nUSER: Describe the image[210 chars]ama'] != [\"\\n [134 chars] one and a yellow\", '\\nUSER: Describe the imag[211 chars]ama']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 566)  AssertionError: Lists differ: [\"\\n [134 chars] one image and a\", '\\nUSER: Describe the image[210 chars]ama'] != [\"\\n [134 chars] one and a yellow\", '\\nUSER: Describe the imag[211 chars]ama']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "single",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 140.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 86.69 MiB is free. Process 339377 has 22.21 GiB memory in use. Of the allocated memory 21.81 GiB is allocated by PyTorch, and 23.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 140.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 86.69 MiB is free. Process 599846 has 22.21 GiB memory in use. Of the allocated memory 21.82 GiB is allocated by PyTorch, and 13.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "single",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral_4bit",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 7.77 GiB. GPU 0 has a total capacity of 22.30 GiB of which 76.69 MiB is free. Process 339377 has 22.22 GiB memory in use. Of the allocated memory 21.83 GiB is allocated by PyTorch, and 12.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 7.77 GiB. GPU 0 has a total capacity of 22.30 GiB of which 36.69 MiB is free. Process 599846 has 22.26 GiB memory in use. Of the allocated memory 21.86 GiB is allocated by PyTorch, and 12.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "single",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral_batched",
      "trace": "(line 724)  AssertionError: Lists differ: ['Wha[97 chars]mage?A narrow dirt path is surrounded by grass[74 chars]ue.'] != ['Wha[97 chars]mage?The image depicts a narrow, winding dirt [175 chars]ere']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 724)  AssertionError: Lists differ: ['Wha[97 chars]mage?A narrow dirt path is surrounded by grass[74 chars]ue.'] != ['Wha[97 chars]mage?The image depicts a narrow, winding dirt [175 chars]ere']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava_next",
      "gpu": "single",
      "test": "tests/models/llava_next/test_modeling_llava_next.py::LlavaNextForConditionalGenerationIntegrationTest::test_small_model_integration_test",
      "trace": "(line 172)  AssertionError: assert False",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 172)  AssertionError: assert False",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava_next_video",
      "gpu": "single",
      "test": "tests/models/llava_next_video/test_modeling_llava_next_video.py::LlavaNextVideoForConditionalGenerationIntegrationTest::test_small_model_integration_test",
      "trace": "(line 388)  AssertionError: 'USER[154 chars]hile wearing a pair of glasses that are too la[24 chars] are' != 'USER[154 chars]hile another child is attempting to read the s[45 chars]eems'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 388)  AssertionError: 'USER[154 chars]hile wearing a pair of glasses that are too la[24 chars] are' != 'USER[154 chars]hile another child is attempting to read the s[45 chars]eems'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "llava_next_video",
      "gpu": "single",
      "test": "tests/models/llava_next_video/test_modeling_llava_next_video.py::LlavaNextVideoForConditionalGenerationIntegrationTest::test_small_model_integration_test_batch_matches_single",
      "trace": "(line 480)  AssertionError: 'USER[154 chars]hile another child is attempting to read the s[96 chars]e to' != 'USER[154 chars]hile wearing a pair of glasses that are too la[69 chars]g it'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 480)  AssertionError: 'USER[154 chars]hile another child is attempting to read the s[96 chars]e to' != 'USER[154 chars]hile wearing a pair of glasses that are too la[69 chars]g it'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "longt5",
      "gpu": "single",
      "test": "tests/models/longt5/test_modeling_longt5.py::LongT5ModelIntegrationTests::test_inference_hidden_states",
      "trace": "(line 1225)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1225)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "longt5",
      "gpu": "single",
      "test": "tests/models/longt5/test_modeling_longt5.py::LongT5ModelIntegrationTests::test_summarization",
      "trace": "(line 1194)  AssertionError: Lists differ: ['background : coronary artery disease ( ca[601 chars]red'] != ['sss thessass:ss andss toss ofss fillssess[171 chars]se,']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1194)  AssertionError: Lists differ: ['background : coronary artery disease ( ca[601 chars]red'] != ['sss thessass:ss andss toss ofss fillssess[171 chars]se,']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "luke",
      "gpu": "single",
      "test": "tests/models/luke/test_modeling_luke.py::LukeModelIntegrationTests::test_inference_base_model",
      "trace": "(line 905)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 905)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "luke",
      "gpu": "single",
      "test": "tests/models/luke/test_modeling_luke.py::LukeModelIntegrationTests::test_inference_large_model",
      "trace": "(line 940)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 940)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "lw_detr",
      "gpu": "single",
      "test": "tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrModelIntegrationTest::test_inference_object_detection_head_tiny",
      "trace": "(line 690)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 690)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "lw_detr",
      "gpu": "single",
      "test": "tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrModelIntegrationTest::test_inference_object_detection_head_xlarge",
      "trace": "(line 766)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 766)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "m2m_100",
      "gpu": "single",
      "test": "tests/models/m2m_100/test_modeling_m2m_100.py::M2M100ModelIntegrationTests::test_seq_to_seq_generation",
      "trace": "(line 397)  AssertionError: assert ['</s>__en__T... France.</s>'] == ['</s> __en__... France.</s>']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 397)  AssertionError: assert ['</s>__en__T... France.</s>'] == ['</s> __en__... France.</s>']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "single",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_batched_equivalence_with_cache",
      "trace": "(line 532)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 8.03 GiB is free. Process 628823 has 14.27 GiB memory in use. Of the allocated memory 13.89 GiB is allocated by PyTorch, and 19.36 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 532)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 8.03 GiB is free. Process 616114 has 14.26 GiB memory in use. Of the allocated memory 13.89 GiB is allocated by PyTorch, and 18.18 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "single",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_batched_equivalence_without_cache",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 628823 has 22.25 GiB memory in use. Of the allocated memory 21.88 GiB is allocated by PyTorch, and 19.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 616114 has 22.25 GiB memory in use. Of the allocated memory 21.88 GiB is allocated by PyTorch, and 19.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "single",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_mamba2_mixer_train_vs_eval_equivalence",
      "trace": "(line 370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 628823 has 22.28 GiB memory in use. Of the allocated memory 21.92 GiB is allocated by PyTorch, and 9.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 616114 has 22.28 GiB memory in use. Of the allocated memory 21.92 GiB is allocated by PyTorch, and 9.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-14",
      "last_green_day": "2026-03-13",
      "failure_mode": "OOM",
      "status": "flaky: test fails on the current CI run (commit: effde20942e3f82a1b97449f60b3a48c5ff96145) but passes during the check.",
      "author": null,
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "single",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_simple_generate",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 628823 has 22.28 GiB memory in use. Of the allocated memory 21.91 GiB is allocated by PyTorch, and 21.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 616114 has 22.28 GiB memory in use. Of the allocated memory 21.91 GiB is allocated by PyTorch, and 21.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mimi",
      "gpu": "single",
      "test": "tests/models/mimi/test_modeling_mimi.py::MimiIntegrationTest::test_integration",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mimi",
      "gpu": "single",
      "test": "tests/models/mimi/test_modeling_mimi.py::MimiIntegrationTest::test_integration_longform",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "minimax",
      "gpu": "single",
      "test": "tests/models/minimax/test_modeling_minimax.py::MiniMaxIntegrationTest::test_small_model_logits",
      "trace": "(line 233)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 233)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "ministral",
      "gpu": "single",
      "test": "tests/models/ministral/test_modeling_ministral.py::MinistralIntegrationTest::test_model_8b_generation",
      "trace": "(line 116)  AssertionError: 'My favourite condiment is 100% natural, 100% organic, 100% free of' != 'Myfavouritecondimentis\u010a\u0120\u0120\u0120\u0120Joined:\u01202018-01-01,\u012012'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 116)  AssertionError: 'My favourite condiment is 100% natural, 100% organic, 100% free of' != 'Myfavouritecondimentis\u010a\u0120\u0120\u0120\u0120Joined:\u01202018-01-01,\u012012'",
      "first_failure_day": "2026-04-21",
      "last_green_day": "2026-04-20",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "ministral",
      "gpu": "single",
      "test": "tests/models/ministral/test_modeling_ministral.py::MinistralIntegrationTest::test_model_8b_logits",
      "trace": "(line 93)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 93)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "ministral3",
      "gpu": "single",
      "test": "tests/models/ministral3/test_modeling_ministral3.py::Ministral3IntegrationTest::test_model_3b_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 130)  AssertionError: 'My favourite condiment is icing sugar. I[47 chars]fles' != \"My favourite condiment is 100% pure oliv[46 chars]t in\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "ministral3",
      "gpu": "single",
      "test": "tests/models/ministral3/test_modeling_ministral3.py::Ministral3IntegrationTest::test_model_3b_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 102)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral",
      "gpu": "single",
      "test": "tests/models/mistral/test_modeling_mistral.py::MistralIntegrationTest::test_model_7b_logits",
      "trace": "(line 112)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 112)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral",
      "gpu": "single",
      "test": "tests/models/mistral/test_modeling_mistral.py::MistralIntegrationTest::test_speculative_generation",
      "trace": "(line 207)  AssertionError: 'My f[18 chars] is 100% ketchup. I\u2019m not a fan of mustard, relish' != 'My f[18 chars] is 100% mayonnaise. I\u2019m not a fan of the fancy stuff with all'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 207)  AssertionError: 'My f[18 chars] is 100% ketchup. I\u2019m not a fan of mustard, relish' != 'My f[18 chars] is 100% mayonnaise. I\u2019m not a fan of the fancy stuff with all'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "single",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate",
      "trace": "(line 362)  AssertionError: ' to write a short story based on this ima[70 chars]e pl' != 'Calm waters reflect\\nWooden path to dista[26 chars]oods'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 362)  AssertionError: ' to write a short story based on this ima[70 chars]e pl' != 'Calm waters reflect\\nWooden path to dista[26 chars]oods'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "single",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate_multi_image",
      "trace": "(line 438)  AssertionError: ' to write a short story based on this im[81 chars]ched' != \"Calm waters reflect\\nWooden path to dist[29 chars]hold\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 438)  AssertionError: ' to write a short story based on this im[81 chars]ched' != \"Calm waters reflect\\nWooden path to dist[29 chars]hold\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "single",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_generate",
      "trace": "(line 309)  AssertionError: 'The [14 chars] two tabby cats lying on a pink surface, which[21 chars]h or' != 'The [14 chars] two cats lying on a pink surface, which appea[21 chars] bed'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  AssertionError: 'The [14 chars] two tabby cats lying on a pink surface, which[21 chars]h or' != 'The [14 chars] two cats lying on a pink surface, which appea[21 chars] bed'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral4",
      "gpu": "single",
      "test": "tests/models/mistral4/test_modeling_mistral4.py::Mistral4IntegrationTest::test_mistral_small_4_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 6741)  RuntimeError: Expected mat_a to be Float32, BFloat16 or Float16 matrix, got Float8_e4m3fn",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mistral4",
      "gpu": "single",
      "test": "tests/models/mistral4/test_modeling_mistral4.py::Mistral4IntegrationTest::test_mistral_small_4_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 6741)  RuntimeError: Expected mat_a to be Float32, BFloat16 or Float16 matrix, got Float8_e4m3fn",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mixtral",
      "gpu": "single",
      "test": "tests/models/mixtral/test_modeling_mixtral.py::MixtralIntegrationTest::test_small_model_logits",
      "trace": "(line 143)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 143)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mixtral",
      "gpu": "single",
      "test": "tests/models/mixtral/test_modeling_mixtral.py::MixtralIntegrationTest::test_small_model_logits_batched",
      "trace": "(line 188)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 188)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "single",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_batched_generate",
      "trace": "(line 643)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 643)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "single",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ 6.5938,  4.4062,  3.0938, -0.3105,  1.8906], dtype=torch.bfloat16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ 6.5938,  4.4062,  3.0938, -0.3105,  1.8906], dtype=torch.bfloat16)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "single",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_generate",
      "trace": "(line 510)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 510)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "single",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_multi_image_generate",
      "trace": "(line 724)  AssertionError: 'The image shows a red octagonal stop sign w[59 chars]to a' != 'This image shows a long wooden dock extendi[67 chars]ling'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 724)  AssertionError: 'The image shows a red octagonal stop sign w[59 chars]to a' != 'This image shows a long wooden dock extendi[67 chars]ling'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "single",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_classification_no_padding_or_truncation",
      "trace": "(line 453)  AssertionError: '<s> Japanese is an<s> East Asian language<s> spoken by about[40 chars]</s>' != '<s> Japanese is an<ent>East Asian language<ent>spoken by abo[42 chars]</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: '<s> Japanese is an<s> East Asian language<s> spoken by about[40 chars]</s>' != '<s> Japanese is an<ent>East Asian language<ent>spoken by abo[42 chars]</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "single",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_pair_classification_no_padding_or_truncation",
      "trace": "(line 507)  AssertionError: '<s><s> Japanese<s> is an East Asian language [64 chars]</s>' != '<s><ent>Japanese<ent>is an East Asian languag[68 chars]</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 507)  AssertionError: '<s><s> Japanese<s> is an East Asian language [64 chars]</s>' != '<s><ent>Japanese<ent>is an East Asian languag[68 chars]</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "single",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_span_classification_no_padding_or_truncation",
      "trace": "(line 572)  AssertionError: '<s> [33 chars]e spoken by about 128 million people, primarily in Japan .</s>' != '<s> [33 chars]e spoken by about 128 million people, primarily in Japan.</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 572)  AssertionError: '<s> [33 chars]e spoken by about 128 million people, primarily in Japan .</s>' != '<s> [33 chars]e spoken by about 128 million people, primarily in Japan.</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "single",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 672)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 672)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "single",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_inference_object_detection_head_equivalence_cpu_gpu",
      "trace": "(line 738)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 738)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "single",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_mm_grounding_dino_loss",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "modernvbert",
      "gpu": "single",
      "test": "tests/models/modernvbert/test_modeling_modernvbert.py::ModernVBertForMaskedLMIntegrationTest::test_masked_lm_inference",
      "trace": "(line 835)  huggingface_hub.errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1baf08-58e2ed3d7b43da767fe30dbc;b8516d29-8d4e-4b10-99db-7d595ca1239d)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 835)  huggingface_hub.errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6a2396b8-42e1ad94215fec363c454441;eddf0e1a-0839-4335-be1b-cdbb4ce2a9d0)",
      "first_failure_day": "2026-04-01",
      "last_green_day": "2026-03-31",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moonshine_streaming",
      "gpu": "single",
      "test": "tests/models/moonshine_streaming/test_modeling_moonshine_streaming.py::MoonshineStreamingModelIntegrationTests::test_medium_logits_batch",
      "trace": "(line 605)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 605)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moonshine_streaming",
      "gpu": "single",
      "test": "tests/models/moonshine_streaming/test_modeling_moonshine_streaming.py::MoonshineStreamingModelIntegrationTests::test_small_logits_batch",
      "trace": "(line 572)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 572)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "single",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshika_greedy_unconditional_fp16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "single",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "single",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp16_eager",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "single",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp32",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "single",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "single",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation_8k",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b-8k is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b-8k is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "single",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation_batched",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "single",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_model_logits",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicflamingo",
      "gpu": "single",
      "test": "tests/models/musicflamingo/test_modeling_musicflamingo.py::MusicFlamingoForConditionalGenerationIntegrationTest::test_fixture_batched_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicflamingo",
      "gpu": "single",
      "test": "tests/models/musicflamingo/test_modeling_musicflamingo.py::MusicFlamingoForConditionalGenerationIntegrationTest::test_fixture_single_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen",
      "gpu": "single",
      "test": "tests/models/musicgen/test_modeling_musicgen.py::MusicgenIntegrationTests::test_generate_text_prompt_sampling",
      "trace": "(line 1262)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1262)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen",
      "gpu": "single",
      "test": "tests/models/musicgen/test_modeling_musicgen.py::MusicgenIntegrationTests::test_generate_unconditional_sampling",
      "trace": "(line 1179)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1179)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_audio_prompt",
      "trace": "(line 1307)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1307)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_greedy",
      "trace": "(line 1219)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1219)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_greedy_with_classifier_free_guidance",
      "trace": "(line 1247)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1247)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_sampling",
      "trace": "(line 1282)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1282)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_unconditional_greedy",
      "trace": "(line 1167)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1167)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_unconditional_sampling",
      "trace": "(line 1192)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1192)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyStereoIntegrationTests::test_generate_text_audio_prompt",
      "trace": "(line 1376)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1376)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "single",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyStereoIntegrationTests::test_generate_unconditional_greedy",
      "trace": "(line 1344)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1344)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "nemotron",
      "gpu": "single",
      "test": "tests/models/nemotron/test_modeling_nemotron.py::NemotronIntegrationTest::test_nemotron_8b_generation_eager",
      "trace": "(line 103)  AssertionError: Lists differ: ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer'] != ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer: What is the name of the 19']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 103)  AssertionError: Lists differ: ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer'] != ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer: What is the name of the 19']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "nemotron",
      "gpu": "single",
      "test": "tests/models/nemotron/test_modeling_nemotron.py::NemotronIntegrationTest::test_nemotron_8b_generation_fa2",
      "trace": "(line 1714)  ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package for FlashAttention2 doesn't seem to be installed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1714)  ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package for FlashAttention2 doesn't seem to be installed.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "nllb_moe",
      "gpu": "single",
      "test": "tests/models/nllb_moe/test_modeling_nllb_moe.py::NllbMoeModelIntegrationTests::test_inference_logits",
      "trace": "(line 399)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 399)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "single",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_export_static_cache",
      "trace": "(line 338)  AssertionError: Lists differ: ['Sim[41 chars]that \\nthe speed of light is the same in all r[35 chars]ght'] != ['Sim[41 chars]that  .1.\\nThe theory of relativity states tha[18 chars] of']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 338)  AssertionError: Lists differ: ['Sim[41 chars]that \\nthe speed of light is the same in all r[35 chars]ght'] != ['Sim[41 chars]that  .1.\\nThe theory of relativity states tha[18 chars] of']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "single",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_model_1b_logits",
      "trace": "(line 2567)  RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__index_select)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__index_select)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "single",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_model_7b_greedy_generation",
      "trace": "(line 242)  AssertionError: 'Simp[40 chars]that \\nthe speed of light is the same for all [232 chars]\\n\\n' != 'Simp[40 chars]that  .1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1[20 chars].1.1'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 242)  AssertionError: 'Simp[40 chars]that \\nthe speed of light is the same for all [232 chars]\\n\\n' != 'Simp[40 chars]that  .1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1[20 chars].1.1'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "single",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_model_7b_logits",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 30.69 MiB is free. Process 207149 has 22.27 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 53.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 30.69 MiB is free. Process 639695 has 22.27 GiB memory in use. Of the allocated memory 21.84 GiB is allocated by PyTorch, and 53.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo2",
      "gpu": "single",
      "test": "tests/models/olmo2/test_modeling_olmo2.py::Olmo2IntegrationTest::test_model_1b_logits_bfloat16",
      "trace": "(line 214)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 214)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmo3",
      "gpu": "single",
      "test": "tests/models/olmo3/test_modeling_olmo3.py::Olmo3IntegrationTest::test_model_7b_logits",
      "trace": "(line 196)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 196)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "olmoe",
      "gpu": "single",
      "test": "tests/models/olmoe/test_modeling_olmoe.py::OlmoeIntegrationTest::test_model_7b_logits",
      "trace": "(line 217)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 217)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "oneformer",
      "gpu": "single",
      "test": "tests/models/oneformer/test_modeling_oneformer.py::OneFormerModelIntegrationTest::test_inference_no_head",
      "trace": "(line 507)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 507)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "oneformer",
      "gpu": "single",
      "test": "tests/models/oneformer/test_modeling_oneformer.py::OneFormerModelIntegrationTest::test_inference_universal_segmentation_head",
      "trace": "(line 549)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 549)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "opt",
      "gpu": "single",
      "test": "tests/models/opt/test_modeling_opt.py::OPTModelIntegrationTests::test_inference_no_head",
      "trace": "(line 357)  AssertionError: tensor([[-0.2883, -1.9219, -0.3079],",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 357)  AssertionError: tensor([[-0.2883, -1.9219, -0.3079],",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "ovis2",
      "gpu": "single",
      "test": "tests/models/ovis2/test_modeling_ovis2.py::Ovis2IntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 355)  AssertionError: Lists differ: ['sys[81 chars]ant\\n', 'system\\nYou are a helpful assistant.\\[139 chars]et.'] != ['sys[81 chars]ant\\nAnswer: I see a brown dog standing on a w[224 chars]et.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 355)  AssertionError: Lists differ: ['sys[81 chars]ant\\n', 'system\\nYou are a helpful assistant.\\[139 chars]et.'] != ['sys[81 chars]ant\\nAnswer: I see a brown dog standing on a w[224 chars]et.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "single",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_interpolate_pos_encoding",
      "trace": "(line 683)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 683)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "single",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_object_detection",
      "trace": "(line 800)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 800)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "single",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_one_shot_object_detection",
      "trace": "(line 843)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 843)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pegasus",
      "gpu": "single",
      "test": "tests/models/pegasus/test_modeling_pegasus.py::PegasusXSUMIntegrationTest::test_pegasus_xsum_summary",
      "trace": "(line 350)  assert torch.Size([2, 422]) == (2, 421)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 350)  assert torch.Size([2, 422]) == (2, 421)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "persimmon",
      "gpu": "single",
      "test": "tests/models/persimmon/test_modeling_persimmon.py::PersimmonIntegrationTest::test_model_8b_chat_greedy_generation",
      "trace": "(line 131)  AssertionError: 'huma[58 chars]ept: The theory of relativity states that the [80 chars]ion.' != 'huma[58 chars]ept: the speed of light in a vacuum is the sam[33 chars]ence'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 131)  AssertionError: 'huma[58 chars]ept: The theory of relativity states that the [80 chars]ion.' != 'huma[58 chars]ept: the speed of light in a vacuum is the sam[33 chars]ence'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "persimmon",
      "gpu": "single",
      "test": "tests/models/persimmon/test_modeling_persimmon.py::PersimmonIntegrationTest::test_model_8b_chat_logits",
      "trace": "(line 99)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 99)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "phi3",
      "gpu": "single",
      "test": "tests/models/phi3/test_modeling_phi3.py::Phi3IntegrationTest::test_export_static_cache",
      "trace": "(line 1318)  torch._dynamo.exc.Unsupported: Data-dependent branching",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1318)  torch._dynamo.exc.Unsupported: Data-dependent branching",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "phimoe",
      "gpu": "single",
      "test": "tests/models/phimoe/test_modeling_phimoe.py::PhimoeIntegrationTest::test_model_phimoe_instruct_logits",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.56 GiB. GPU 0 has a total capacity of 22.30 GiB of which 814.69 MiB is free. Process 197047 has 21.50 GiB memory in use. Of the allocated memory 17.22 GiB is allocated by PyTorch, and 3.91 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 152)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pi0",
      "gpu": "single",
      "test": "tests/models/pi0/test_modeling_pi0.py::PI0ModelIntegrationTest::test_train_pi0_base_libero",
      "trace": "(line 193)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 321344 has 22.29 GiB memory in use. Of the allocated memory 21.50 GiB is allocated by PyTorch, and 477.92 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 193)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 198435 has 22.29 GiB memory in use. Of the allocated memory 21.50 GiB is allocated by PyTorch, and 477.92 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "OOM",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pixio",
      "gpu": "single",
      "test": "tests/models/pixio/test_modeling_pixio.py::PixioModelIntegrationTest::test_inference_no_head",
      "trace": "(line 277)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 277)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "single",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartJavaCsIntegrationTest::test_java_cs_generate_batch",
      "trace": "(line 379)  AssertionError: assert ['public int ...turn a * b *'] == ['public int ...rn a * b * c']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 379)  AssertionError: assert ['public int ...turn a * b *'] == ['public int ...rn a * b * c']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "single",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartJavaCsIntegrationTest::test_java_cs_generate_one",
      "trace": "(line 370)  AssertionError: 'public int maximum(int a, int b, int c){return Math.Max(' != 'public int maximum(int a, int b, int c){return Math.Max(a'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  AssertionError: 'public int maximum(int a, int b, int c){return Math.Max(' != 'public int maximum(int a, int b, int c){return Math.Max(a'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "single",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartBaseIntegrationTest::test_fill_mask",
      "trace": "(line 444)  AssertionError: '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0' != '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 444)  AssertionError: '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0' != '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pvt",
      "gpu": "single",
      "test": "tests/models/pvt/test_modeling_pvt.py::PvtModelIntegrationTest::test_inference_image_classification",
      "trace": "(line 257)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 257)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pvt",
      "gpu": "single",
      "test": "tests/models/pvt/test_modeling_pvt.py::PvtModelIntegrationTest::test_inference_model",
      "trace": "(line 284)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 284)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pvt_v2",
      "gpu": "single",
      "test": "tests/models/pvt_v2/test_modeling_pvt_v2.py::PvtV2ModelIntegrationTest::test_inference_image_classification",
      "trace": "(line 275)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 275)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "pvt_v2",
      "gpu": "single",
      "test": "tests/models/pvt_v2/test_modeling_pvt_v2.py::PvtV2ModelIntegrationTest::test_inference_model",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen2_5_omni",
      "gpu": "single",
      "test": "tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test",
      "trace": "(line 692)  AssertionError: \"syst[108 chars]d is glass shattering, and the dog is a Labrador Retriever.\" != \"syst[108 chars]d is a glass shattering. The dog in the pictur[22 chars]ver.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 692)  AssertionError: \"syst[108 chars]d is glass shattering, and the dog is a Labrador Retriever.\" != \"syst[108 chars]d is a glass shattering. The dog in the pictur[22 chars]ver.\"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen2_5_omni",
      "gpu": "single",
      "test": "tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 734)  AssertionError: Lists differ: [\"sys[109 chars]d is glass shattering, and the dog is a Labrad[185 chars]er.\"] != [\"sys[109 chars]d is a glass shattering. The dog in the pictur[211 chars]er.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 734)  AssertionError: Lists differ: [\"sys[109 chars]d is glass shattering, and the dog is a Labrad[185 chars]er.\"] != [\"sys[109 chars]d is a glass shattering. The dog in the pictur[211 chars]er.\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen2_5_vl",
      "gpu": "single",
      "test": "tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 611)  AssertionError: Lists differ: ['sys[298 chars]en, a large language model created by Alibaba [84 chars]and'] != ['sys[298 chars]en, an AI language model created by Alibaba Cl[96 chars]on,']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 611)  AssertionError: Lists differ: ['sys[298 chars]en, a large language model created by Alibaba [84 chars]and'] != ['sys[298 chars]en, an AI language model created by Alibaba Cl[96 chars]on,']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen2_moe",
      "gpu": "single",
      "test": "tests/models/qwen2_moe/test_modeling_qwen2_moe.py::Qwen2MoeIntegrationTest::test_model_a2_7b_logits",
      "trace": "(line 147)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 147)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen2_moe",
      "gpu": "single",
      "test": "tests/models/qwen2_moe/test_modeling_qwen2_moe.py::Qwen2MoeIntegrationTest::test_speculative_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3",
      "gpu": "single",
      "test": "tests/models/qwen3/test_modeling_qwen3.py::Qwen3IntegrationTest::test_model_600m_logits",
      "trace": "(line 92)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 92)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3",
      "gpu": "single",
      "test": "tests/models/qwen3/test_modeling_qwen3.py::Qwen3IntegrationTest::test_speculative_generation",
      "trace": "(line 198)  AssertionError: 'My f[22 chars]100% beef, 100% beef, 100% beef.' != 'My f[22 chars]100% vegetable oil. It has a rich, creamy text[19 chars]utty'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 198)  AssertionError: 'My f[22 chars]100% beef, 100% beef, 100% beef.' != 'My f[22 chars]100% vegetable oil. It has a rich, creamy text[19 chars]utty'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3_5",
      "gpu": "single",
      "test": "tests/models/qwen3_5/test_modeling_qwen3_5.py::Qwen3_5IntegrationTest::test_model_video_generation",
      "trace": "(line 811)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 811)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3_5",
      "gpu": "single",
      "test": "tests/models/qwen3_5/test_modeling_qwen3_5.py::Qwen3_5IntegrationTest::test_model_video_generation_batch",
      "trace": "(line 863)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 863)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "single",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 823)  AssertionError: Lists differ: [\"use[99 chars]ation, here is a breakdown of what you're hear[187 chars]n\\n\"] != [\"use[99 chars]ation provided:\\n\\nThe sound you hear is the d[191 chars]hed\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 823)  AssertionError: Lists differ: [\"use[99 chars]ation, here is a breakdown of what you're hear[187 chars]n\\n\"] != [\"use[99 chars]ation provided:\\n\\nThe sound you hear is the d[191 chars]hed\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "single",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test_w_audio",
      "trace": "(line 911)  AssertionError: 'syst[223 chars]derstand spoken content, and I can also make inferences about' != 'syst[223 chars]derstand spoken content, and I can also process and respond to'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 911)  AssertionError: 'syst[223 chars]derstand spoken content, and I can also make inferences about' != 'syst[223 chars]derstand spoken content, and I can also process and respond to'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "single",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 446)  AssertionError: Lists differ: [\"use[92 chars]'s a wild cat species native to the grasslands[182 chars]ons\"] != [\"use[92 chars]'s a small wild cat native to the grasslands a[178 chars]ons\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 446)  AssertionError: Lists differ: [\"use[92 chars]'s a wild cat species native to the grasslands[182 chars]ons\"] != [\"use[92 chars]'s a small wild cat native to the grasslands a[178 chars]ons\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "rag",
      "gpu": "single",
      "test": "tests/models/rag/test_modeling_rag.py::RagModelIntegrationTests::test_rag_sequence_generate_batch",
      "trace": "(line 948)  AssertionError: Lists differ: [' michael gross', ' monday 17 , 2018', ' te[96 chars]ndo'] != [' albert einstein', ' june 22 , 2018', ' am[85 chars]' 8']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 948)  AssertionError: Lists differ: [' michael gross', ' monday 17 , 2018', ' te[96 chars]ndo'] != [' albert einstein', ' june 22 , 2018', ' am[85 chars]' 8']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "rag",
      "gpu": "single",
      "test": "tests/models/rag/test_modeling_rag.py::RagModelIntegrationTests::test_rag_sequence_generate_batch_from_context_input_ids",
      "trace": "(line 1000)  AssertionError: Lists differ: [' michael gross', ' monday 17 , 2018', ' te[96 chars]ndo'] != [' albert einstein', ' june 22 , 2018', ' am[85 chars]' 8']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1000)  AssertionError: Lists differ: [' michael gross', ' monday 17 , 2018', ' te[96 chars]ndo'] != [' albert einstein', ' june 22 , 2018', ' am[85 chars]' 8']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "rag",
      "gpu": "single",
      "test": "tests/models/rag/test_modeling_rag.py::RagModelIntegrationTests::test_rag_sequence_generate_beam",
      "trace": "(line 892)  AssertionError: '\" in the United States. \"People Need Love\"[155 chars]hit.' != '\"She\\'s My Kind of Girl\" was released thro[257 chars]nts.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 892)  AssertionError: '\" in the United States. \"People Need Love\"[155 chars]hit.' != '\"She\\'s My Kind of Girl\" was released thro[257 chars]nts.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "rag",
      "gpu": "single",
      "test": "tests/models/rag/test_modeling_rag.py::RagModelIntegrationTests::test_rag_token_generate_beam",
      "trace": "(line 854)  AssertionError: '\"She[14 chars] Girl' != '\"She[14 chars] Girl\" was released through Epic Records in Ja[179 chars]ses\"'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 854)  AssertionError: '\"She[14 chars] Girl' != '\"She[14 chars] Girl\" was released through Epic Records in Ja[179 chars]ses\"'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "single",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_2b_generate",
      "trace": "(line 157)  AssertionError: Lists differ: ['Hel[325 chars]oday the 19th of June 2019, I was in the offic[256 chars] to'] != ['Hel[325 chars]oday is a new app that allows you to make mone[256 chars]app']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 157)  AssertionError: Lists differ: ['Hel[325 chars]oday the 19th of June 2019, I was in the offic[256 chars] to'] != ['Hel[325 chars]oday is a new app that allows you to make mone[256 chars]app']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "single",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_2b_sample",
      "trace": "(line 195)  AssertionError: Lists differ: ['Wha[24 chars]Deep Learning (or deep learning) is one of the[107 chars]ple'] != ['Wha[24 chars]Deep learning is the next frontier in computer[98 chars] is']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 195)  AssertionError: Lists differ: ['Wha[24 chars]Deep Learning (or deep learning) is one of the[107 chars]ple'] != ['Wha[24 chars]Deep learning is the next frontier in computer[98 chars] is']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "single",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_longer_than_window",
      "trace": "(line 243)  AssertionError: Lists differ: [' Jean-Philippe Guillet said, \"We have no[245 chars]eo.'] != [\" Robin's comments follow claims by two m[249 chars]the\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 243)  AssertionError: Lists differ: [' Jean-Philippe Guillet said, \"We have no[245 chars]eo.'] != [\" Robin's comments follow claims by two m[249 chars]the\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "single",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_model_2b_8bit",
      "trace": "(line 222)  AssertionError: Lists differ: ['Hel[26 chars] the effects of the environment on the human b[124 chars]aur\"] != ['Hel[26 chars] the topic of \"The impact of social media on t[102 chars] 3D\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 222)  AssertionError: Lists differ: ['Hel[26 chars] the effects of the environment on the human b[124 chars]aur\"] != ['Hel[26 chars] the topic of \"The impact of social media on t[102 chars] 3D\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "reformer",
      "gpu": "single",
      "test": "tests/models/reformer/test_modeling_reformer.py::ReformerIntegrationTests::test_pretrained_generate_crime_and_punish",
      "trace": "(line 1370)  AssertionError: 'A fe[36 chars]is ideas, so attentively two or three thousand roubles, and' != 'A fe[36 chars]is ideas, at the first entrance. He was positively for an inst'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  AssertionError: 'A fe[36 chars]is ideas, so attentively two or three thousand roubles, and' != 'A fe[36 chars]is ideas, at the first entrance. He was positively for an inst'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "regnet",
      "gpu": "single",
      "test": "tests/models/regnet/test_modeling_regnet.py::RegNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 243)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 243)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "resnet",
      "gpu": "single",
      "test": "tests/models/resnet/test_modeling_resnet.py::ResNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 291)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 291)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "single",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_speech_to_speech_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "single",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_speech_to_text_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "single",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_to_rus_speech",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "single",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_speech_to_speech_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "single",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_speech_to_text_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "single",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_to_rus_speech",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-03-16",
      "last_green_day": "2026-03-15",
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seed_oss",
      "gpu": "single",
      "test": "tests/models/seed_oss/test_modeling_seed_oss.py::SeedOssIntegrationTest::test_model_36b_eager",
      "trace": "(line 95)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 95)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "seed_oss",
      "gpu": "single",
      "test": "tests/models/seed_oss/test_modeling_seed_oss.py::SeedOssIntegrationTest::test_model_36b_sdpa",
      "trace": "(line 114)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 114)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "smollm3",
      "gpu": "single",
      "test": "tests/models/smollm3/test_modeling_smollm3.py::SmolLM3IntegrationTest::test_export_static_cache",
      "trace": "(line 198)  AssertionError: 'Gravity is the force that pulls objects [69 chars] and' != [\"Gravity is the force that pulls objects[85 chars] of\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 198)  AssertionError: 'Gravity is the force that pulls objects [69 chars] and' != [\"Gravity is the force that pulls objects[85 chars] of\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "smollm3",
      "gpu": "single",
      "test": "tests/models/smollm3/test_modeling_smollm3.py::SmolLM3IntegrationTest::test_model_3b_logits",
      "trace": "(line 89)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 89)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "stablelm",
      "gpu": "single",
      "test": "tests/models/stablelm/test_modeling_stablelm.py::StableLmModelIntegrationTest::test_model_stablelm_3b_4e1t_logits",
      "trace": "(line 65)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 65)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "stablelm",
      "gpu": "single",
      "test": "tests/models/stablelm/test_modeling_stablelm.py::StableLmModelIntegrationTest::test_model_tiny_random_stablelm_2_logits",
      "trace": "(line 98)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 98)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "single",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_4bit",
      "trace": "(line 152)  AssertionError: Lists differ: ['Hel[188 chars]of', 'def hello_world():\\n\\treturn \"Hello Worl[95 chars]ute'] != ['Hel[188 chars]of', \"def hello_world(): hello_world():\\n    r[117 chars]'})\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 152)  AssertionError: Lists differ: ['Hel[188 chars]of', 'def hello_world():\\n\\treturn \"Hello Worl[95 chars]ute'] != ['Hel[188 chars]of', \"def hello_world(): hello_world():\\n    r[117 chars]'})\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "single",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_eager",
      "trace": "(line 99)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 99)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "single",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_sdpa",
      "trace": "(line 79)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 79)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "superpoint",
      "gpu": "single",
      "test": "tests/models/superpoint/test_modeling_superpoint.py::SuperPointModelIntegrationTest::test_inference",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "swiftformer",
      "gpu": "single",
      "test": "tests/models/swiftformer/test_modeling_swiftformer.py::SwiftFormerModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 263)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 263)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "swin2sr",
      "gpu": "single",
      "test": "tests/models/swin2sr/test_modeling_swin2sr.py::Swin2SRModelIntegrationTest::test_inference_fp16",
      "trace": "(line 332)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 332)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "swinv2",
      "gpu": "single",
      "test": "tests/models/swinv2/test_modeling_swinv2.py::Swinv2ModelIntegrationTest::test_inference_fp16",
      "trace": "(line 492)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 492)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "t5gemma2",
      "gpu": "single",
      "test": "tests/models/t5gemma2/test_modeling_t5gemma2.py::T5Gemma2IntegrationTest::test_model_generation_batch_270m",
      "trace": "(line 1128)  AssertionError: Lists differ: [' a [83 chars]e UK.\\n\\nThe bumblebee is a species of bee tha[15 chars]the'] != [' a [83 chars]e UK.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1128)  AssertionError: Lists differ: [' a [83 chars]e UK.\\n\\nThe bumblebee is a species of bee tha[15 chars]the'] != [' a [83 chars]e UK.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "table_transformer",
      "gpu": "single",
      "test": "tests/models/table_transformer/test_modeling_table_transformer.py::TableTransformerModelIntegrationTests::test_table_detection",
      "trace": "(line 554)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 554)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "univnet",
      "gpu": "single",
      "test": "tests/models/univnet/test_modeling_univnet.py::UnivNetModelIntegrationTests::test_integration",
      "trace": "(line 330)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 330)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "video_llava",
      "gpu": "single",
      "test": "tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_llama",
      "trace": "(line 491)  AssertionError: 'USER: \\nDescribe the video in details. A[572 chars]ion.' != \"USER: \\nDescribe the video in details. A[675 chars]ing.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 491)  AssertionError: 'USER: \\nDescribe the video in details. A[572 chars]ion.' != \"USER: \\nDescribe the video in details. A[675 chars]ing.\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "video_llava",
      "gpu": "single",
      "test": "tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_mixed_inputs",
      "trace": "(line 464)  AssertionError: Lists differ: ['USE[183 chars]se it shows a baby sitting on a bed and reading a book. The'] != ['USE[183 chars]se it shows a baby sitting on a bed and reading a book, which']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 464)  AssertionError: Lists differ: ['USE[183 chars]se it shows a baby sitting on a bed and reading a book. The'] != ['USE[183 chars]se it shows a baby sitting on a bed and reading a book, which']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "videomae",
      "gpu": "single",
      "test": "tests/models/videomae/test_modeling_videomae.py::VideoMAEModelIntegrationTest::test_inference_for_pretraining",
      "trace": "(line 478)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 478)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "videomae",
      "gpu": "single",
      "test": "tests/models/videomae/test_modeling_videomae.py::VideoMAEModelIntegrationTest::test_inference_for_video_classification",
      "trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-09",
      "last_green_day": "2026-04-08",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vilt",
      "gpu": "single",
      "test": "tests/models/vilt/test_modeling_vilt.py::ViltModelIntegrationTest::test_inference_masked_lm",
      "trace": "(line 575)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 575)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "single",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_cordv2",
      "trace": "(line 1352)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1352)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "single",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_docvqa",
      "trace": "(line 1288)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1288)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "single",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_rvlcdip",
      "trace": "(line 1414)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1414)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "single",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::NougatModelIntegrationTest::test_forward_pass",
      "trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1bbf66-0a191f36732bf9863c3d2f6b;a009d360-1d7f-4772-8614-a4f458cb787e)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a23a787-7c10c6ba6e3008546506008b;9b76207e-a0d9-43bb-8724-acb37931823d)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "single",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::NougatModelIntegrationTest::test_generation",
      "trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1bbf68-4b9c73315f1487c353a41833;2c2a1462-c4c9-4e06-98e0-e4e67e9af54c)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a23a788-170df9245abeefd965d3ab81;8b914188-eb15-479b-a3d2-e96e582e7ceb)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vits",
      "gpu": "single",
      "test": "tests/models/vits/test_modeling_vits.py::VitsModelIntegrationTests::test_forward_fp16",
      "trace": "(line 433)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 433)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "vivit",
      "gpu": "single",
      "test": "tests/models/vivit/test_modeling_vivit.py::VivitModelIntegrationTest::test_inference_for_video_classification",
      "trace": "(line 361)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 361)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "single",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_multi_turn_text_and_audio",
      "trace": "(line 381)  AssertionError: Lists differ: ['Des[790 chars]as a farewell address by a president, reflecti[151 chars]xt.'] != ['Des[790 chars]as a political speech by a president, reflecti[151 chars]xt.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 381)  AssertionError: Lists differ: ['Des[790 chars]as a farewell address by a president, reflecti[151 chars]xt.'] != ['Des[790 chars]as a political speech by a president, reflecti[151 chars]xt.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "single",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_audio_only",
      "trace": "(line 163)  AssertionError: Lists differ: ['The[442 chars]king what A\\'s tattoo says, and A always respo[777 chars]nt.'] != ['The[442 chars]king A what his tattoo says, and A always resp[884 chars]on.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 163)  AssertionError: Lists differ: ['The[442 chars]king what A\\'s tattoo says, and A always respo[777 chars]nt.'] != ['The[442 chars]king A what his tattoo says, and A always resp[884 chars]on.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "single",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_text_and_audio",
      "trace": "(line 203)  AssertionError: Lists differ: [\"Wha[241 chars]. He expresses gratitude for the conversations[429 chars]en.\"] != [\"Wha[241 chars]. He acknowledges the diverse perspectives and[412 chars]es.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 203)  AssertionError: Lists differ: [\"Wha[241 chars]. He expresses gratitude for the conversations[429 chars]en.\"] != [\"Wha[241 chars]. He acknowledges the diverse perspectives and[412 chars]es.\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "single",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_text_and_multiple_audios_batched",
      "trace": "(line 327)  AssertionError: Lists differ: [\"Who[609 chars]m is likely the Seattle Mariners, as the comme[446 chars]me.'] != [\"Who[609 chars]m is the Mariners, and the commentator is exci[414 chars]nt.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 327)  AssertionError: Lists differ: [\"Who[609 chars]m is likely the Seattle Mariners, as the comme[446 chars]me.'] != [\"Who[609 chars]m is the Mariners, and the commentator is exci[414 chars]nt.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "voxtral_realtime",
      "gpu": "single",
      "test": "tests/models/voxtral_realtime/test_modeling_voxtral_realtime.py::VoxtralRealtimeForConditionalGenerationIntegrationTest::test_batched_longform",
      "trace": "(line 349)  AssertionError: Lists differ: [' Come on! Dude. You got a tattoo. So did you, dud[1097 chars]the\"] != [' Come on. Dude. You got a tattoo. So did you, dud[1097 chars]the\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 349)  AssertionError: Lists differ: [' Come on! Dude. You got a tattoo. So did you, dud[1097 chars]the\"] != [' Come on. Dude. You got a tattoo. So did you, dud[1097 chars]the\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_distil_token_timestamp_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_generation_multilingual",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_timestamp_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation",
      "trace": "(line 1882)  KeyError: 0",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1882)  KeyError: 0",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_token_timestamp_generation",
      "trace": "(line 2023)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2023)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_speculative_decoding_distil",
      "trace": "(line 323)  UnboundLocalError: local variable 'is_updated' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 323)  UnboundLocalError: local variable 'is_updated' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_speculative_decoding_non_distil",
      "trace": "(line 2390)  AssertionError: Lists differ: [' Mr[35 chars]dle classes and we are glad to welcome his gospel. Thank you.'] != [' Mr[35 chars]dle classes and we are glad to welcome his gospel.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2390)  AssertionError: Lists differ: [' Mr[35 chars]dle classes and we are glad to welcome his gospel. Thank you.'] != [' Mr[35 chars]dle classes and we are glad to welcome his gospel.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_en_batched_generation",
      "trace": "(line 1541)  AssertionError: The values for attribute 'shape' do not match: torch.Size([4, 18]) != torch.Size([4, 20]).",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1541)  AssertionError: The values for attribute 'shape' do not match: torch.Size([4, 18]) != torch.Size([4, 20]).",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_en_generation",
      "trace": "(line 1383)  AssertionError: ' Mr.[15 chars] apostle of the middle classes, and we are glad to' != ' Mr.[15 chars] apostle of the middle classes, and we are glad to welcome his'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1383)  AssertionError: ' Mr.[15 chars] apostle of the middle classes, and we are glad to' != ' Mr.[15 chars] apostle of the middle classes, and we are glad to welcome his'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_generation",
      "trace": "(line 1399)  AssertionError: ' Mr.[21 chars]le of the middle classes and we are glad' != ' Mr.[21 chars]le of the middle classes and we are glad to welcome his gospel'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1399)  AssertionError: ' Mr.[21 chars]le of the middle classes and we are glad' != ' Mr.[21 chars]le of the middle classes and we are glad to welcome his gospel'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation",
      "trace": "(line 1698)  KeyError: 0",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1698)  KeyError: 0",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_specaugment_librispeech",
      "trace": "(line 2137)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2137)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_static_generation_long_form",
      "trace": "(line 3098)  RuntimeError: The size of tensor a (352) must match the size of tensor b (354) at non-singleton dimension 1",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 3098)  RuntimeError: The size of tensor a (352) must match the size of tensor b (354) at non-singleton dimension 1",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_timestamp_generation",
      "trace": "(line 4160)  IndexError: list index out of range",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4176)  IndexError: list index out of range",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard",
      "trace": "(line 2787)  AssertionError: Lists differ: [\" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!'] != [\" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2787)  AssertionError: Lists differ: [\" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!'] != [\" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond",
      "trace": "(line 2841)  AssertionError: Lists differ: [\" Fo[425 chars]a fischer shows in lip nitskey attack the fisc[5579 chars]y .\"] != [\" Fo[425 chars]a fisher shows in lip-nitsky attack that culmi[7900 chars]le!\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2841)  AssertionError: Lists differ: [\" Fo[425 chars]a fischer shows in lip nitskey attack the fisc[5579 chars]y .\"] != [\" Fo[425 chars]a fisher shows in lip-nitsky attack that culmi[7900 chars]le!\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_no_speech_detection",
      "trace": "(line 2947)  AssertionError: Lists differ: [\" Fo[435 chars]sting And so so so so so so so so so so so so [7329 chars]our\"] != [\" Fo[435 chars]sting\", ' Ladies and gentlemen, you know, I sp[1433 chars]es.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2947)  AssertionError: Lists differ: [\" Fo[435 chars]sting And so so so so so so so so so so so so [7329 chars]our\"] != [\" Fo[435 chars]sting\", ' Ladies and gentlemen, you know, I sp[1433 chars]es.\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_single_batch",
      "trace": "(line 294)  TypeError: '>=' not supported between instances of 'list' and 'int'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 294)  TypeError: '>=' not supported between instances of 'list' and 'int'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "single",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond",
      "trace": "(line 2556)  AssertionError: Lists differ: [\" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.\"] != [\" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2556)  AssertionError: Lists differ: [\" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.\"] != [\" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "zamba",
      "gpu": "single",
      "test": "tests/models/zamba/test_modeling_zamba.py::ZambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 518)  AssertionError: '[PAD[35 chars]me a story about a time when you had to make a difficult' != '[PAD[35 chars]me a story about a time when you were in a difficult situation'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 476)  AssertionError: '<s> [20 chars]g on this lovely evening? I hope you are having a great day. I' != '<s> [20 chars]g on this lovely evening? I hope you are all doing well. I am'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "zamba",
      "gpu": "single",
      "test": "tests/models/zamba/test_modeling_zamba.py::ZambaModelIntegrationTest::test_simple_generate",
      "trace": "(line 501)  AssertionError: The values for attribute 'dtype' do not match: torch.bfloat16 != torch.float32.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 463)  AssertionError: The values for attribute 'dtype' do not match: torch.bfloat16 != torch.float32.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "zamba2",
      "gpu": "single",
      "test": "tests/models/zamba2/test_modeling_zamba2.py::Zamba2ModelIntegrationTest::test_simple_batched_generate_with_padding_0_cuda",
      "trace": "(line 600)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 600)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_TopH_example_integration",
      "trace": "(line 3212)  AssertionError: Lists differ: ['Tel[23 chars]key. Sure, here\\'s one for you:\\n\\nWhy did the[67 chars]s\"!'] != ['Tel[23 chars]key. Why did the monkey go to the doctor? Beca[34 chars]c\"!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 3215)  AssertionError: Lists differ: ['Tel[23 chars]key. Sure, here\\'s one for you:\\n\\nWhy did the[67 chars]s\"!'] != ['Tel[23 chars]key. Why did the monkey go to the doctor? Beca[34 chars]c\"!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_assisted_generation_early_exit",
      "trace": "(line 4074)  AssertionError: Lists differ: ['Ali[20 chars]ng a game of poker. Alice has a pair of 7s and Bob has a pair'] != ['Ali[20 chars]ng a game of poker. Alice has a pair of 8s and Bob has a pair']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4077)  AssertionError: Lists differ: ['Ali[20 chars]ng a game of poker. Alice has a pair of 7s and Bob has a pair'] != ['Ali[20 chars]ng a game of poker. Alice has a pair of 8s and Bob has a pair']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_beam_search_advanced_stopping_criteria",
      "trace": "(line 681)  AssertionError: True is not false",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 681)  AssertionError: True is not false",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_beam_search_early_stop_heuristic",
      "trace": "(line 2962)  AssertionError: \"<|us[317 chars]}\\\\).\\nThe sum of 3 and 5 is \\\\(3 + 5 = 8\\\\).\\[40 chars]\\\\).\" != \"<|us[317 chars]}\\\\).\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2965)  AssertionError: \"<|us[317 chars]}\\\\).\\nThe sum of 3 and 5 is \\\\(3 + 5 = 8\\\\).\\[40 chars]\\\\).\" != \"<|us[317 chars]}\\\\).\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_cache_device_map_with_vision_layer_device_map",
      "trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.vision_tower.embeddings.patch_embedding.weight, model.vision_tower.embeddings.patch_embedding.bias, model.vision_tower.embeddings.position_embedding.weight, model.vision_tower.encoder.layers.0.layer_norm1.weight, model.vision_tower.encoder.layers.0.layer_norm1.bias, model.vision_tower.encoder.layers.0.self_attn.k_proj.weight, model.vision_tower.encoder.layers.0.self_attn.k_proj.bias, model.vision_tower.encoder.layers.0.self_attn.v_proj.weight, model.vision_tower.encoder.layers.0.self_attn.v_proj.bias, model.vision_tower.encoder.layers.0.self_attn.q_proj.weight, model.vision_tower.encoder.layers.0.self_attn.q_proj.bias, model.vision_tower.encoder.layers.0.self_attn.out_proj.weight, model.vision_tower.encoder.layers.0.self_attn.out_proj.bias, model.vision_tower.encoder.layers.0.layer_norm2.weight, model.vision_tower.encoder.layers.0.layer_norm2.bias, model.vision_tower.encoder.layers.0.mlp.fc1.weight, model.vision_tower.encoder.layers.0.mlp.fc1.bias, model.vision_tower.encoder.layers.0.mlp.fc2.weight, model.vision_tower.encoder.layers.0.mlp.fc2.bias, model.vision_tower.encoder.layers.1.layer_norm1.weight, model.vision_tower.encoder.layers.1.layer_norm1.bias, model.vision_tower.encoder.layers.1.self_attn.k_proj.weight, model.vision_tower.encoder.layers.1.self_attn.k_proj.bias, model.vision_tower.encoder.layers.1.self_attn.v_proj.weight, model.vision_tower.encoder.layers.1.self_attn.v_proj.bias, model.vision_tower.encoder.layers.1.self_attn.q_proj.weight, model.vision_tower.encoder.layers.1.self_attn.q_proj.bias, model.vision_tower.encoder.layers.1.self_attn.out_proj.weight, model.vision_tower.encoder.layers.1.self_attn.out_proj.bias, model.vision_tower.encoder.layers.1.layer_norm2.weight, model.vision_tower.encoder.layers.1.layer_norm2.bias, model.vision_tower.encoder.layers.1.mlp.fc1.weight, model.vision_tower.encoder.layers.1.mlp.fc1.bias, 
model.vision_tower.encoder.layers.1.mlp.fc2.weight, model.vision_tower.encoder.layers.1.mlp.fc2.bias, model.vision_tower.encoder.layers.2.layer_norm1.weight, model.vision_tower.encoder.layers.2.layer_norm1.bias, model.vision_tower.encoder.layers.2.self_attn.k_proj.weight, model.vision_tower.encoder.layers.2.self_attn.k_proj.bias, model.vision_tower.encoder.layers.2.self_attn.v_proj.weight, model.vision_tower.encoder.layers.2.self_attn.v_proj.bias, model.vision_tower.encoder.layers.2.self_attn.q_proj.weight, model.vision_tower.encoder.layers.2.self_attn.q_proj.bias, model.vision_tower.encoder.layers.2.self_attn.out_proj.weight, model.vision_tower.encoder.layers.2.self_attn.out_proj.bias, model.vision_tower.encoder.layers.2.layer_norm2.weight, model.vision_tower.encoder.layers.2.layer_norm2.bias, model.vision_tower.encoder.layers.2.mlp.fc1.weight, model.vision_tower.encoder.layers.2.mlp.fc1.bias, model.vision_tower.encoder.layers.2.mlp.fc2.weight, model.vision_tower.encoder.layers.2.mlp.fc2.bias, model.vision_tower.encoder.layers.3.layer_norm1.weight, model.vision_tower.encoder.layers.3.layer_norm1.bias, model.vision_tower.encoder.layers.3.self_attn.k_proj.weight, model.vision_tower.encoder.layers.3.self_attn.k_proj.bias, model.vision_tower.encoder.layers.3.self_attn.v_proj.weight, model.vision_tower.encoder.layers.3.self_attn.v_proj.bias, model.vision_tower.encoder.layers.3.self_attn.q_proj.weight, model.vision_tower.encoder.layers.3.self_attn.q_proj.bias, model.vision_tower.encoder.layers.3.self_attn.out_proj.weight, model.vision_tower.encoder.layers.3.self_attn.out_proj.bias, model.vision_tower.encoder.layers.3.layer_norm2.weight, model.vision_tower.encoder.layers.3.layer_norm2.bias, model.vision_tower.encoder.layers.3.mlp.fc1.weight, model.vision_tower.encoder.layers.3.mlp.fc1.bias, model.vision_tower.encoder.layers.3.mlp.fc2.weight, model.vision_tower.encoder.layers.3.mlp.fc2.bias, model.vision_tower.encoder.layers.4.layer_norm1.weight, 
model.vision_tower.encoder.layers.4.layer_norm1.bias, model.vision_tower.encoder.layers.4.self_attn.k_proj.weight, model.vision_tower.encoder.layers.4.self_attn.k_proj.bias, model.vision_tower.encoder.layers.4.self_attn.v_proj.weight, model.vision_tower.encoder.layers.4.self_attn.v_proj.bias, model.vision_tower.encoder.layers.4.self_attn.q_proj.weight, model.vision_tower.encoder.layers.4.self_attn.q_proj.bias, model.vision_tower.encoder.layers.4.self_attn.out_proj.weight, model.vision_tower.encoder.layers.4.self_attn.out_proj.bias, model.vision_tower.encoder.layers.4.layer_norm2.weight, model.vision_tower.encoder.layers.4.layer_norm2.bias, model.vision_tower.encoder.layers.4.mlp.fc1.weight, model.vision_tower.encoder.layers.4.mlp.fc1.bias, model.vision_tower.encoder.layers.4.mlp.fc2.weight, model.vision_tower.encoder.layers.4.mlp.fc2.bias, model.vision_tower.encoder.layers.5.layer_norm1.weight, model.vision_tower.encoder.layers.5.layer_norm1.bias, model.vision_tower.encoder.layers.5.self_attn.k_proj.weight, model.vision_tower.encoder.layers.5.self_attn.k_proj.bias, model.vision_tower.encoder.layers.5.self_attn.v_proj.weight, model.vision_tower.encoder.layers.5.self_attn.v_proj.bias, model.vision_tower.encoder.layers.5.self_attn.q_proj.weight, model.vision_tower.encoder.layers.5.self_attn.q_proj.bias, model.vision_tower.encoder.layers.5.self_attn.out_proj.weight, model.vision_tower.encoder.layers.5.self_attn.out_proj.bias, model.vision_tower.encoder.layers.5.layer_norm2.weight, model.vision_tower.encoder.layers.5.layer_norm2.bias, model.vision_tower.encoder.layers.5.mlp.fc1.weight, model.vision_tower.encoder.layers.5.mlp.fc1.bias, model.vision_tower.encoder.layers.5.mlp.fc2.weight, model.vision_tower.encoder.layers.5.mlp.fc2.bias, model.vision_tower.encoder.layers.6.layer_norm1.weight, model.vision_tower.encoder.layers.6.layer_norm1.bias, model.vision_tower.encoder.layers.6.self_attn.k_proj.weight, model.vision_tower.encoder.layers.6.self_attn.k_proj.bias, 
model.vision_tower.encoder.layers.6.self_attn.v_proj.weight, model.vision_tower.encoder.layers.6.self_attn.v_proj.bias, model.vision_tower.encoder.layers.6.self_attn.q_proj.weight, model.vision_tower.encoder.layers.6.self_attn.q_proj.bias, model.vision_tower.encoder.layers.6.self_attn.out_proj.weight, model.vision_tower.encoder.layers.6.self_attn.out_proj.bias, model.vision_tower.encoder.layers.6.layer_norm2.weight, model.vision_tower.encoder.layers.6.layer_norm2.bias, model.vision_tower.encoder.layers.6.mlp.fc1.weight, model.vision_tower.encoder.layers.6.mlp.fc1.bias, model.vision_tower.encoder.layers.6.mlp.fc2.weight, model.vision_tower.encoder.layers.6.mlp.fc2.bias, model.vision_tower.encoder.layers.7.layer_norm1.weight, model.vision_tower.encoder.layers.7.layer_norm1.bias, model.vision_tower.encoder.layers.7.self_attn.k_proj.weight, model.vision_tower.encoder.layers.7.self_attn.k_proj.bias, model.vision_tower.encoder.layers.7.self_attn.v_proj.weight, model.vision_tower.encoder.layers.7.self_attn.v_proj.bias, model.vision_tower.encoder.layers.7.self_attn.q_proj.weight, model.vision_tower.encoder.layers.7.self_attn.q_proj.bias, model.vision_tower.encoder.layers.7.self_attn.out_proj.weight, model.vision_tower.encoder.layers.7.self_attn.out_proj.bias, model.vision_tower.encoder.layers.7.layer_norm2.weight, model.vision_tower.encoder.layers.7.layer_norm2.bias, model.vision_tower.encoder.layers.7.mlp.fc1.weight, model.vision_tower.encoder.layers.7.mlp.fc1.bias, model.vision_tower.encoder.layers.7.mlp.fc2.weight, model.vision_tower.encoder.layers.7.mlp.fc2.bias, model.vision_tower.encoder.layers.8.layer_norm1.weight, model.vision_tower.encoder.layers.8.layer_norm1.bias, model.vision_tower.encoder.layers.8.self_attn.k_proj.weight, model.vision_tower.encoder.layers.8.self_attn.k_proj.bias, model.vision_tower.encoder.layers.8.self_attn.v_proj.weight, model.vision_tower.encoder.layers.8.self_attn.v_proj.bias, model.vision_tower.encoder.layers.8.self_attn.q_proj.weight, 
model.vision_tower.encoder.layers.8.self_attn.q_proj.bias, model.vision_tower.encoder.layers.8.self_attn.out_proj.weight, model.vision_tower.encoder.layers.8.self_attn.out_proj.bias, model.vision_tower.encoder.layers.8.layer_norm2.weight, model.vision_tower.encoder.layers.8.layer_norm2.bias, model.vision_tower.encoder.layers.8.mlp.fc1.weight, model.vision_tower.encoder.layers.8.mlp.fc1.bias, model.vision_tower.encoder.layers.8.mlp.fc2.weight, model.vision_tower.encoder.layers.8.mlp.fc2.bias, model.vision_tower.encoder.layers.9.layer_norm1.weight, model.vision_tower.encoder.layers.9.layer_norm1.bias, model.vision_tower.encoder.layers.9.self_attn.k_proj.weight, model.vision_tower.encoder.layers.9.self_attn.k_proj.bias, model.vision_tower.encoder.layers.9.self_attn.v_proj.weight, model.vision_tower.encoder.layers.9.self_attn.v_proj.bias, model.vision_tower.encoder.layers.9.self_attn.q_proj.weight, model.vision_tower.encoder.layers.9.self_attn.q_proj.bias, model.vision_tower.encoder.layers.9.self_attn.out_proj.weight, model.vision_tower.encoder.layers.9.self_attn.out_proj.bias, model.vision_tower.encoder.layers.9.layer_norm2.weight, model.vision_tower.encoder.layers.9.layer_norm2.bias, model.vision_tower.encoder.layers.9.mlp.fc1.weight, model.vision_tower.encoder.layers.9.mlp.fc1.bias, model.vision_tower.encoder.layers.9.mlp.fc2.weight, model.vision_tower.encoder.layers.9.mlp.fc2.bias, model.vision_tower.encoder.layers.10.layer_norm1.weight, model.vision_tower.encoder.layers.10.layer_norm1.bias, model.vision_tower.encoder.layers.10.self_attn.k_proj.weight, model.vision_tower.encoder.layers.10.self_attn.k_proj.bias, model.vision_tower.encoder.layers.10.self_attn.v_proj.weight, model.vision_tower.encoder.layers.10.self_attn.v_proj.bias, model.vision_tower.encoder.layers.10.self_attn.q_proj.weight, model.vision_tower.encoder.layers.10.self_attn.q_proj.bias, model.vision_tower.encoder.layers.10.self_attn.out_proj.weight, 
model.vision_tower.encoder.layers.10.self_attn.out_proj.bias, model.vision_tower.encoder.layers.10.layer_norm2.weight, model.vision_tower.encoder.layers.10.layer_norm2.bias, model.vision_tower.encoder.layers.10.mlp.fc1.weight, model.vision_tower.encoder.layers.10.mlp.fc1.bias, model.vision_tower.encoder.layers.10.mlp.fc2.weight, model.vision_tower.encoder.layers.10.mlp.fc2.bias, model.vision_tower.encoder.layers.11.layer_norm1.weight, model.vision_tower.encoder.layers.11.layer_norm1.bias, model.vision_tower.encoder.layers.11.self_attn.k_proj.weight, model.vision_tower.encoder.layers.11.self_attn.k_proj.bias, model.vision_tower.encoder.layers.11.self_attn.v_proj.weight, model.vision_tower.encoder.layers.11.self_attn.v_proj.bias, model.vision_tower.encoder.layers.11.self_attn.q_proj.weight, model.vision_tower.encoder.layers.11.self_attn.q_proj.bias, model.vision_tower.encoder.layers.11.self_attn.out_proj.weight, model.vision_tower.encoder.layers.11.self_attn.out_proj.bias, model.vision_tower.encoder.layers.11.layer_norm2.weight, model.vision_tower.encoder.layers.11.layer_norm2.bias, model.vision_tower.encoder.layers.11.mlp.fc1.weight, model.vision_tower.encoder.layers.11.mlp.fc1.bias, model.vision_tower.encoder.layers.11.mlp.fc2.weight, model.vision_tower.encoder.layers.11.mlp.fc2.bias, model.vision_tower.encoder.layers.12.layer_norm1.weight, model.vision_tower.encoder.layers.12.layer_norm1.bias, model.vision_tower.encoder.layers.12.self_attn.k_proj.weight, model.vision_tower.encoder.layers.12.self_attn.k_proj.bias, model.vision_tower.encoder.layers.12.self_attn.v_proj.weight, model.vision_tower.encoder.layers.12.self_attn.v_proj.bias, model.vision_tower.encoder.layers.12.self_attn.q_proj.weight, model.vision_tower.encoder.layers.12.self_attn.q_proj.bias, model.vision_tower.encoder.layers.12.self_attn.out_proj.weight, model.vision_tower.encoder.layers.12.self_attn.out_proj.bias, model.vision_tower.encoder.layers.12.layer_norm2.weight, 
model.vision_tower.encoder.layers.12.layer_norm2.bias, model.vision_tower.encoder.layers.12.mlp.fc1.weight, model.vision_tower.encoder.layers.12.mlp.fc1.bias, model.vision_tower.encoder.layers.12.mlp.fc2.weight, model.vision_tower.encoder.layers.12.mlp.fc2.bias, model.vision_tower.encoder.layers.13.layer_norm1.weight, model.vision_tower.encoder.layers.13.layer_norm1.bias, model.vision_tower.encoder.layers.13.self_attn.k_proj.weight, model.vision_tower.encoder.layers.13.self_attn.k_proj.bias, model.vision_tower.encoder.layers.13.self_attn.v_proj.weight, model.vision_tower.encoder.layers.13.self_attn.v_proj.bias, model.vision_tower.encoder.layers.13.self_attn.q_proj.weight, model.vision_tower.encoder.layers.13.self_attn.q_proj.bias, model.vision_tower.encoder.layers.13.self_attn.out_proj.weight, model.vision_tower.encoder.layers.13.self_attn.out_proj.bias, model.vision_tower.encoder.layers.13.layer_norm2.weight, model.vision_tower.encoder.layers.13.layer_norm2.bias, model.vision_tower.encoder.layers.13.mlp.fc1.weight, model.vision_tower.encoder.layers.13.mlp.fc1.bias, model.vision_tower.encoder.layers.13.mlp.fc2.weight, model.vision_tower.encoder.layers.13.mlp.fc2.bias, model.vision_tower.encoder.layers.14.layer_norm1.weight, model.vision_tower.encoder.layers.14.layer_norm1.bias, model.vision_tower.encoder.layers.14.self_attn.k_proj.weight, model.vision_tower.encoder.layers.14.self_attn.k_proj.bias, model.vision_tower.encoder.layers.14.self_attn.v_proj.weight, model.vision_tower.encoder.layers.14.self_attn.v_proj.bias, model.vision_tower.encoder.layers.14.self_attn.q_proj.weight, model.vision_tower.encoder.layers.14.self_attn.q_proj.bias, model.vision_tower.encoder.layers.14.self_attn.out_proj.weight, model.vision_tower.encoder.layers.14.self_attn.out_proj.bias, model.vision_tower.encoder.layers.14.layer_norm2.weight, model.vision_tower.encoder.layers.14.layer_norm2.bias, model.vision_tower.encoder.layers.14.mlp.fc1.weight, 
model.vision_tower.encoder.layers.14.mlp.fc1.bias, model.vision_tower.encoder.layers.14.mlp.fc2.weight, model.vision_tower.encoder.layers.14.mlp.fc2.bias, model.vision_tower.encoder.layers.15.layer_norm1.weight, model.vision_tower.encoder.layers.15.layer_norm1.bias, model.vision_tower.encoder.layers.15.self_attn.k_proj.weight, model.vision_tower.encoder.layers.15.self_attn.k_proj.bias, model.vision_tower.encoder.layers.15.self_attn.v_proj.weight, model.vision_tower.encoder.layers.15.self_attn.v_proj.bias, model.vision_tower.encoder.layers.15.self_attn.q_proj.weight, model.vision_tower.encoder.layers.15.self_attn.q_proj.bias, model.vision_tower.encoder.layers.15.self_attn.out_proj.weight, model.vision_tower.encoder.layers.15.self_attn.out_proj.bias, model.vision_tower.encoder.layers.15.layer_norm2.weight, model.vision_tower.encoder.layers.15.layer_norm2.bias, model.vision_tower.encoder.layers.15.mlp.fc1.weight, model.vision_tower.encoder.layers.15.mlp.fc1.bias, model.vision_tower.encoder.layers.15.mlp.fc2.weight, model.vision_tower.encoder.layers.15.mlp.fc2.bias, model.vision_tower.encoder.layers.16.layer_norm1.weight, model.vision_tower.encoder.layers.16.layer_norm1.bias, model.vision_tower.encoder.layers.16.self_attn.k_proj.weight, model.vision_tower.encoder.layers.16.self_attn.k_proj.bias, model.vision_tower.encoder.layers.16.self_attn.v_proj.weight, model.vision_tower.encoder.layers.16.self_attn.v_proj.bias, model.vision_tower.encoder.layers.16.self_attn.q_proj.weight, model.vision_tower.encoder.layers.16.self_attn.q_proj.bias, model.vision_tower.encoder.layers.16.self_attn.out_proj.weight, model.vision_tower.encoder.layers.16.self_attn.out_proj.bias, model.vision_tower.encoder.layers.16.layer_norm2.weight, model.vision_tower.encoder.layers.16.layer_norm2.bias, model.vision_tower.encoder.layers.16.mlp.fc1.weight, model.vision_tower.encoder.layers.16.mlp.fc1.bias, model.vision_tower.encoder.layers.16.mlp.fc2.weight, 
model.vision_tower.encoder.layers.16.mlp.fc2.bias, model.vision_tower.encoder.layers.17.layer_norm1.weight, model.vision_tower.encoder.layers.17.layer_norm1.bias, model.vision_tower.encoder.layers.17.self_attn.k_proj.weight, model.vision_tower.encoder.layers.17.self_attn.k_proj.bias, model.vision_tower.encoder.layers.17.self_attn.v_proj.weight, model.vision_tower.encoder.layers.17.self_attn.v_proj.bias, model.vision_tower.encoder.layers.17.self_attn.q_proj.weight, model.vision_tower.encoder.layers.17.self_attn.q_proj.bias, model.vision_tower.encoder.layers.17.self_attn.out_proj.weight, model.vision_tower.encoder.layers.17.self_attn.out_proj.bias, model.vision_tower.encoder.layers.17.layer_norm2.weight, model.vision_tower.encoder.layers.17.layer_norm2.bias, model.vision_tower.encoder.layers.17.mlp.fc1.weight, model.vision_tower.encoder.layers.17.mlp.fc1.bias, model.vision_tower.encoder.layers.17.mlp.fc2.weight, model.vision_tower.encoder.layers.17.mlp.fc2.bias, model.vision_tower.encoder.layers.18.layer_norm1.weight, model.vision_tower.encoder.layers.18.layer_norm1.bias, model.vision_tower.encoder.layers.18.self_attn.k_proj.weight, model.vision_tower.encoder.layers.18.self_attn.k_proj.bias, model.vision_tower.encoder.layers.18.self_attn.v_proj.weight, model.vision_tower.encoder.layers.18.self_attn.v_proj.bias, model.vision_tower.encoder.layers.18.self_attn.q_proj.weight, model.vision_tower.encoder.layers.18.self_attn.q_proj.bias, model.vision_tower.encoder.layers.18.self_attn.out_proj.weight, model.vision_tower.encoder.layers.18.self_attn.out_proj.bias, model.vision_tower.encoder.layers.18.layer_norm2.weight, model.vision_tower.encoder.layers.18.layer_norm2.bias, model.vision_tower.encoder.layers.18.mlp.fc1.weight, model.vision_tower.encoder.layers.18.mlp.fc1.bias, model.vision_tower.encoder.layers.18.mlp.fc2.weight, model.vision_tower.encoder.layers.18.mlp.fc2.bias, model.vision_tower.encoder.layers.19.layer_norm1.weight, 
model.vision_tower.encoder.layers.19.layer_norm1.bias, model.vision_tower.encoder.layers.19.self_attn.k_proj.weight, model.vision_tower.encoder.layers.19.self_attn.k_proj.bias, model.vision_tower.encoder.layers.19.self_attn.v_proj.weight, model.vision_tower.encoder.layers.19.self_attn.v_proj.bias, model.vision_tower.encoder.layers.19.self_attn.q_proj.weight, model.vision_tower.encoder.layers.19.self_attn.q_proj.bias, model.vision_tower.encoder.layers.19.self_attn.out_proj.weight, model.vision_tower.encoder.layers.19.self_attn.out_proj.bias, model.vision_tower.encoder.layers.19.layer_norm2.weight, model.vision_tower.encoder.layers.19.layer_norm2.bias, model.vision_tower.encoder.layers.19.mlp.fc1.weight, model.vision_tower.encoder.layers.19.mlp.fc1.bias, model.vision_tower.encoder.layers.19.mlp.fc2.weight, model.vision_tower.encoder.layers.19.mlp.fc2.bias, model.vision_tower.encoder.layers.20.layer_norm1.weight, model.vision_tower.encoder.layers.20.layer_norm1.bias, model.vision_tower.encoder.layers.20.self_attn.k_proj.weight, model.vision_tower.encoder.layers.20.self_attn.k_proj.bias, model.vision_tower.encoder.layers.20.self_attn.v_proj.weight, model.vision_tower.encoder.layers.20.self_attn.v_proj.bias, model.vision_tower.encoder.layers.20.self_attn.q_proj.weight, model.vision_tower.encoder.layers.20.self_attn.q_proj.bias, model.vision_tower.encoder.layers.20.self_attn.out_proj.weight, model.vision_tower.encoder.layers.20.self_attn.out_proj.bias, model.vision_tower.encoder.layers.20.layer_norm2.weight, model.vision_tower.encoder.layers.20.layer_norm2.bias, model.vision_tower.encoder.layers.20.mlp.fc1.weight, model.vision_tower.encoder.layers.20.mlp.fc1.bias, model.vision_tower.encoder.layers.20.mlp.fc2.weight, model.vision_tower.encoder.layers.20.mlp.fc2.bias, model.vision_tower.encoder.layers.21.layer_norm1.weight, model.vision_tower.encoder.layers.21.layer_norm1.bias, model.vision_tower.encoder.layers.21.self_attn.k_proj.weight, 
model.vision_tower.encoder.layers.21.self_attn.k_proj.bias, model.vision_tower.encoder.layers.21.self_attn.v_proj.weight, model.vision_tower.encoder.layers.21.self_attn.v_proj.bias, model.vision_tower.encoder.layers.21.self_attn.q_proj.weight, model.vision_tower.encoder.layers.21.self_attn.q_proj.bias, model.vision_tower.encoder.layers.21.self_attn.out_proj.weight, model.vision_tower.encoder.layers.21.self_attn.out_proj.bias, model.vision_tower.encoder.layers.21.layer_norm2.weight, model.vision_tower.encoder.layers.21.layer_norm2.bias, model.vision_tower.encoder.layers.21.mlp.fc1.weight, model.vision_tower.encoder.layers.21.mlp.fc1.bias, model.vision_tower.encoder.layers.21.mlp.fc2.weight, model.vision_tower.encoder.layers.21.mlp.fc2.bias, model.vision_tower.encoder.layers.22.layer_norm1.weight, model.vision_tower.encoder.layers.22.layer_norm1.bias, model.vision_tower.encoder.layers.22.self_attn.k_proj.weight, model.vision_tower.encoder.layers.22.self_attn.k_proj.bias, model.vision_tower.encoder.layers.22.self_attn.v_proj.weight, model.vision_tower.encoder.layers.22.self_attn.v_proj.bias, model.vision_tower.encoder.layers.22.self_attn.q_proj.weight, model.vision_tower.encoder.layers.22.self_attn.q_proj.bias, model.vision_tower.encoder.layers.22.self_attn.out_proj.weight, model.vision_tower.encoder.layers.22.self_attn.out_proj.bias, model.vision_tower.encoder.layers.22.layer_norm2.weight, model.vision_tower.encoder.layers.22.layer_norm2.bias, model.vision_tower.encoder.layers.22.mlp.fc1.weight, model.vision_tower.encoder.layers.22.mlp.fc1.bias, model.vision_tower.encoder.layers.22.mlp.fc2.weight, model.vision_tower.encoder.layers.22.mlp.fc2.bias, model.vision_tower.encoder.layers.23.layer_norm1.weight, model.vision_tower.encoder.layers.23.layer_norm1.bias, model.vision_tower.encoder.layers.23.self_attn.k_proj.weight, model.vision_tower.encoder.layers.23.self_attn.k_proj.bias, model.vision_tower.encoder.layers.23.self_attn.v_proj.weight, 
model.vision_tower.encoder.layers.23.self_attn.v_proj.bias, model.vision_tower.encoder.layers.23.self_attn.q_proj.weight, model.vision_tower.encoder.layers.23.self_attn.q_proj.bias, model.vision_tower.encoder.layers.23.self_attn.out_proj.weight, model.vision_tower.encoder.layers.23.self_attn.out_proj.bias, model.vision_tower.encoder.layers.23.layer_norm2.weight, model.vision_tower.encoder.layers.23.layer_norm2.bias, model.vision_tower.encoder.layers.23.mlp.fc1.weight, model.vision_tower.encoder.layers.23.mlp.fc1.bias, model.vision_tower.encoder.layers.23.mlp.fc2.weight, model.vision_tower.encoder.layers.23.mlp.fc2.bias, model.vision_tower.encoder.layers.24.layer_norm1.weight, model.vision_tower.encoder.layers.24.layer_norm1.bias, model.vision_tower.encoder.layers.24.self_attn.k_proj.weight, model.vision_tower.encoder.layers.24.self_attn.k_proj.bias, model.vision_tower.encoder.layers.24.self_attn.v_proj.weight, model.vision_tower.encoder.layers.24.self_attn.v_proj.bias, model.vision_tower.encoder.layers.24.self_attn.q_proj.weight, model.vision_tower.encoder.layers.24.self_attn.q_proj.bias, model.vision_tower.encoder.layers.24.self_attn.out_proj.weight, model.vision_tower.encoder.layers.24.self_attn.out_proj.bias, model.vision_tower.encoder.layers.24.layer_norm2.weight, model.vision_tower.encoder.layers.24.layer_norm2.bias, model.vision_tower.encoder.layers.24.mlp.fc1.weight, model.vision_tower.encoder.layers.24.mlp.fc1.bias, model.vision_tower.encoder.layers.24.mlp.fc2.weight, model.vision_tower.encoder.layers.24.mlp.fc2.bias, model.vision_tower.encoder.layers.25.layer_norm1.weight, model.vision_tower.encoder.layers.25.layer_norm1.bias, model.vision_tower.encoder.layers.25.self_attn.k_proj.weight, model.vision_tower.encoder.layers.25.self_attn.k_proj.bias, model.vision_tower.encoder.layers.25.self_attn.v_proj.weight, model.vision_tower.encoder.layers.25.self_attn.v_proj.bias, model.vision_tower.encoder.layers.25.self_attn.q_proj.weight, 
model.vision_tower.encoder.layers.25.self_attn.q_proj.bias, model.vision_tower.encoder.layers.25.self_attn.out_proj.weight, model.vision_tower.encoder.layers.25.self_attn.out_proj.bias, model.vision_tower.encoder.layers.25.layer_norm2.weight, model.vision_tower.encoder.layers.25.layer_norm2.bias, model.vision_tower.encoder.layers.25.mlp.fc1.weight, model.vision_tower.encoder.layers.25.mlp.fc1.bias, model.vision_tower.encoder.layers.25.mlp.fc2.weight, model.vision_tower.encoder.layers.25.mlp.fc2.bias, model.vision_tower.encoder.layers.26.layer_norm1.weight, model.vision_tower.encoder.layers.26.layer_norm1.bias, model.vision_tower.encoder.layers.26.self_attn.k_proj.weight, model.vision_tower.encoder.layers.26.self_attn.k_proj.bias, model.vision_tower.encoder.layers.26.self_attn.v_proj.weight, model.vision_tower.encoder.layers.26.self_attn.v_proj.bias, model.vision_tower.encoder.layers.26.self_attn.q_proj.weight, model.vision_tower.encoder.layers.26.self_attn.q_proj.bias, model.vision_tower.encoder.layers.26.self_attn.out_proj.weight, model.vision_tower.encoder.layers.26.self_attn.out_proj.bias, model.vision_tower.encoder.layers.26.layer_norm2.weight, model.vision_tower.encoder.layers.26.layer_norm2.bias, model.vision_tower.encoder.layers.26.mlp.fc1.weight, model.vision_tower.encoder.layers.26.mlp.fc1.bias, model.vision_tower.encoder.layers.26.mlp.fc2.weight, model.vision_tower.encoder.layers.26.mlp.fc2.bias, model.vision_tower.post_layernorm.weight, model.vision_tower.post_layernorm.bias, model.multi_modal_projector.mm_input_projection_weight, model.multi_modal_projector.mm_soft_emb_norm.weight, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.self_attn.q_norm.weight, model.language_model.layers.0.self_attn.k_norm.weight, 
model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.0.pre_feedforward_layernorm.weight, model.language_model.layers.0.post_feedforward_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.self_attn.q_norm.weight, model.language_model.layers.1.self_attn.k_norm.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.layers.1.pre_feedforward_layernorm.weight, model.language_model.layers.1.post_feedforward_layernorm.weight, model.language_model.layers.2.self_attn.q_proj.weight, model.language_model.layers.2.self_attn.k_proj.weight, model.language_model.layers.2.self_attn.v_proj.weight, model.language_model.layers.2.self_attn.o_proj.weight, model.language_model.layers.2.self_attn.q_norm.weight, model.language_model.layers.2.self_attn.k_norm.weight, model.language_model.layers.2.mlp.gate_proj.weight, model.language_model.layers.2.mlp.up_proj.weight, model.language_model.layers.2.mlp.down_proj.weight, model.language_model.layers.2.input_layernorm.weight, model.language_model.layers.2.post_attention_layernorm.weight, model.language_model.layers.2.pre_feedforward_layernorm.weight, model.language_model.layers.2.post_feedforward_layernorm.weight, model.language_model.layers.3.self_attn.q_proj.weight, model.language_model.layers.3.self_attn.k_proj.weight, 
model.language_model.layers.3.self_attn.v_proj.weight, model.language_model.layers.3.self_attn.o_proj.weight, model.language_model.layers.3.self_attn.q_norm.weight, model.language_model.layers.3.self_attn.k_norm.weight, model.language_model.layers.3.mlp.gate_proj.weight, model.language_model.layers.3.mlp.up_proj.weight, model.language_model.layers.3.mlp.down_proj.weight, model.language_model.layers.3.input_layernorm.weight, model.language_model.layers.3.post_attention_layernorm.weight, model.language_model.layers.3.pre_feedforward_layernorm.weight, model.language_model.layers.3.post_feedforward_layernorm.weight, model.language_model.layers.4.self_attn.q_proj.weight, model.language_model.layers.4.self_attn.k_proj.weight, model.language_model.layers.4.self_attn.v_proj.weight, model.language_model.layers.4.self_attn.o_proj.weight, model.language_model.layers.4.self_attn.q_norm.weight, model.language_model.layers.4.self_attn.k_norm.weight, model.language_model.layers.4.mlp.gate_proj.weight, model.language_model.layers.4.mlp.up_proj.weight, model.language_model.layers.4.mlp.down_proj.weight, model.language_model.layers.4.input_layernorm.weight, model.language_model.layers.4.post_attention_layernorm.weight, model.language_model.layers.4.pre_feedforward_layernorm.weight, model.language_model.layers.4.post_feedforward_layernorm.weight, model.language_model.layers.5.self_attn.q_proj.weight, model.language_model.layers.5.self_attn.k_proj.weight, model.language_model.layers.5.self_attn.v_proj.weight, model.language_model.layers.5.self_attn.o_proj.weight, model.language_model.layers.5.self_attn.q_norm.weight, model.language_model.layers.5.self_attn.k_norm.weight, model.language_model.layers.5.mlp.gate_proj.weight, model.language_model.layers.5.mlp.up_proj.weight, model.language_model.layers.5.mlp.down_proj.weight, model.language_model.layers.5.input_layernorm.weight, model.language_model.layers.5.post_attention_layernorm.weight, 
model.language_model.layers.5.pre_feedforward_layernorm.weight, model.language_model.layers.5.post_feedforward_layernorm.weight, model.language_model.layers.6.self_attn.q_proj.weight, model.language_model.layers.6.self_attn.k_proj.weight, model.language_model.layers.6.self_attn.v_proj.weight, model.language_model.layers.6.self_attn.o_proj.weight, model.language_model.layers.6.self_attn.q_norm.weight, model.language_model.layers.6.self_attn.k_norm.weight, model.language_model.layers.6.mlp.gate_proj.weight, model.language_model.layers.6.mlp.up_proj.weight, model.language_model.layers.6.mlp.down_proj.weight, model.language_model.layers.6.input_layernorm.weight, model.language_model.layers.6.post_attention_layernorm.weight, model.language_model.layers.6.pre_feedforward_layernorm.weight, model.language_model.layers.6.post_feedforward_layernorm.weight, model.language_model.layers.7.self_attn.q_proj.weight, model.language_model.layers.7.self_attn.k_proj.weight, model.language_model.layers.7.self_attn.v_proj.weight, model.language_model.layers.7.self_attn.o_proj.weight, model.language_model.layers.7.self_attn.q_norm.weight, model.language_model.layers.7.self_attn.k_norm.weight, model.language_model.layers.7.mlp.gate_proj.weight, model.language_model.layers.7.mlp.up_proj.weight, model.language_model.layers.7.mlp.down_proj.weight, model.language_model.layers.7.input_layernorm.weight, model.language_model.layers.7.post_attention_layernorm.weight, model.language_model.layers.7.pre_feedforward_layernorm.weight, model.language_model.layers.7.post_feedforward_layernorm.weight, model.language_model.layers.8.self_attn.q_proj.weight, model.language_model.layers.8.self_attn.k_proj.weight, model.language_model.layers.8.self_attn.v_proj.weight, model.language_model.layers.8.self_attn.o_proj.weight, model.language_model.layers.8.self_attn.q_norm.weight, model.language_model.layers.8.self_attn.k_norm.weight, model.language_model.layers.8.mlp.gate_proj.weight, 
model.language_model.layers.8.mlp.up_proj.weight, model.language_model.layers.8.mlp.down_proj.weight, model.language_model.layers.8.input_layernorm.weight, model.language_model.layers.8.post_attention_layernorm.weight, model.language_model.layers.8.pre_feedforward_layernorm.weight, model.language_model.layers.8.post_feedforward_layernorm.weight, model.language_model.layers.9.self_attn.q_proj.weight, model.language_model.layers.9.self_attn.k_proj.weight, model.language_model.layers.9.self_attn.v_proj.weight, model.language_model.layers.9.self_attn.o_proj.weight, model.language_model.layers.9.self_attn.q_norm.weight, model.language_model.layers.9.self_attn.k_norm.weight, model.language_model.layers.9.mlp.gate_proj.weight, model.language_model.layers.9.mlp.up_proj.weight, model.language_model.layers.9.mlp.down_proj.weight, model.language_model.layers.9.input_layernorm.weight, model.language_model.layers.9.post_attention_layernorm.weight, model.language_model.layers.9.pre_feedforward_layernorm.weight, model.language_model.layers.9.post_feedforward_layernorm.weight, model.language_model.layers.10.self_attn.q_proj.weight, model.language_model.layers.10.self_attn.k_proj.weight, model.language_model.layers.10.self_attn.v_proj.weight, model.language_model.layers.10.self_attn.o_proj.weight, model.language_model.layers.10.self_attn.q_norm.weight, model.language_model.layers.10.self_attn.k_norm.weight, model.language_model.layers.10.mlp.gate_proj.weight, model.language_model.layers.10.mlp.up_proj.weight, model.language_model.layers.10.mlp.down_proj.weight, model.language_model.layers.10.input_layernorm.weight, model.language_model.layers.10.post_attention_layernorm.weight, model.language_model.layers.10.pre_feedforward_layernorm.weight, model.language_model.layers.10.post_feedforward_layernorm.weight, model.language_model.layers.11.self_attn.q_proj.weight, model.language_model.layers.11.self_attn.k_proj.weight, model.language_model.layers.11.self_attn.v_proj.weight, 
model.language_model.layers.11.self_attn.o_proj.weight, model.language_model.layers.11.self_attn.q_norm.weight, model.language_model.layers.11.self_attn.k_norm.weight, model.language_model.layers.11.mlp.gate_proj.weight, model.language_model.layers.11.mlp.up_proj.weight, model.language_model.layers.11.mlp.down_proj.weight, model.language_model.layers.11.input_layernorm.weight, model.language_model.layers.11.post_attention_layernorm.weight, model.language_model.layers.11.pre_feedforward_layernorm.weight, model.language_model.layers.11.post_feedforward_layernorm.weight, model.language_model.layers.12.self_attn.q_proj.weight, model.language_model.layers.12.self_attn.k_proj.weight, model.language_model.layers.12.self_attn.v_proj.weight, model.language_model.layers.12.self_attn.o_proj.weight, model.language_model.layers.12.self_attn.q_norm.weight, model.language_model.layers.12.self_attn.k_norm.weight, model.language_model.layers.12.mlp.gate_proj.weight, model.language_model.layers.12.mlp.up_proj.weight, model.language_model.layers.12.mlp.down_proj.weight, model.language_model.layers.12.input_layernorm.weight, model.language_model.layers.12.post_attention_layernorm.weight, model.language_model.layers.12.pre_feedforward_layernorm.weight, model.language_model.layers.12.post_feedforward_layernorm.weight, model.language_model.layers.13.self_attn.q_proj.weight, model.language_model.layers.13.self_attn.k_proj.weight, model.language_model.layers.13.self_attn.v_proj.weight, model.language_model.layers.13.self_attn.o_proj.weight, model.language_model.layers.13.self_attn.q_norm.weight, model.language_model.layers.13.self_attn.k_norm.weight, model.language_model.layers.13.mlp.gate_proj.weight, model.language_model.layers.13.mlp.up_proj.weight, model.language_model.layers.13.mlp.down_proj.weight, model.language_model.layers.13.input_layernorm.weight, model.language_model.layers.13.post_attention_layernorm.weight, model.language_model.layers.13.pre_feedforward_layernorm.weight, 
model.language_model.layers.13.post_feedforward_layernorm.weight, model.language_model.layers.14.self_attn.q_proj.weight, model.language_model.layers.14.self_attn.k_proj.weight, model.language_model.layers.14.self_attn.v_proj.weight, model.language_model.layers.14.self_attn.o_proj.weight, model.language_model.layers.14.self_attn.q_norm.weight, model.language_model.layers.14.self_attn.k_norm.weight, model.language_model.layers.14.mlp.gate_proj.weight, model.language_model.layers.14.mlp.up_proj.weight, model.language_model.layers.14.mlp.down_proj.weight, model.language_model.layers.14.input_layernorm.weight, model.language_model.layers.14.post_attention_layernorm.weight, model.language_model.layers.14.pre_feedforward_layernorm.weight, model.language_model.layers.14.post_feedforward_layernorm.weight, model.language_model.layers.15.self_attn.q_proj.weight, model.language_model.layers.15.self_attn.k_proj.weight, model.language_model.layers.15.self_attn.v_proj.weight, model.language_model.layers.15.self_attn.o_proj.weight, model.language_model.layers.15.self_attn.q_norm.weight, model.language_model.layers.15.self_attn.k_norm.weight, model.language_model.layers.15.mlp.gate_proj.weight, model.language_model.layers.15.mlp.up_proj.weight, model.language_model.layers.15.mlp.down_proj.weight, model.language_model.layers.15.input_layernorm.weight, model.language_model.layers.15.post_attention_layernorm.weight, model.language_model.layers.15.pre_feedforward_layernorm.weight, model.language_model.layers.15.post_feedforward_layernorm.weight, model.language_model.layers.16.self_attn.q_proj.weight, model.language_model.layers.16.self_attn.k_proj.weight, model.language_model.layers.16.self_attn.v_proj.weight, model.language_model.layers.16.self_attn.o_proj.weight, model.language_model.layers.16.self_attn.q_norm.weight, model.language_model.layers.16.self_attn.k_norm.weight, model.language_model.layers.16.mlp.gate_proj.weight, model.language_model.layers.16.mlp.up_proj.weight, 
model.language_model.layers.16.mlp.down_proj.weight, model.language_model.layers.16.input_layernorm.weight, model.language_model.layers.16.post_attention_layernorm.weight, model.language_model.layers.16.pre_feedforward_layernorm.weight, model.language_model.layers.16.post_feedforward_layernorm.weight, model.language_model.layers.17.self_attn.q_proj.weight, model.language_model.layers.17.self_attn.k_proj.weight, model.language_model.layers.17.self_attn.v_proj.weight, model.language_model.layers.17.self_attn.o_proj.weight, model.language_model.layers.17.self_attn.q_norm.weight, model.language_model.layers.17.self_attn.k_norm.weight, model.language_model.layers.17.mlp.gate_proj.weight, model.language_model.layers.17.mlp.up_proj.weight, model.language_model.layers.17.mlp.down_proj.weight, model.language_model.layers.17.input_layernorm.weight, model.language_model.layers.17.post_attention_layernorm.weight, model.language_model.layers.17.pre_feedforward_layernorm.weight, model.language_model.layers.17.post_feedforward_layernorm.weight, model.language_model.layers.18.self_attn.q_proj.weight, model.language_model.layers.18.self_attn.k_proj.weight, model.language_model.layers.18.self_attn.v_proj.weight, model.language_model.layers.18.self_attn.o_proj.weight, model.language_model.layers.18.self_attn.q_norm.weight, model.language_model.layers.18.self_attn.k_norm.weight, model.language_model.layers.18.mlp.gate_proj.weight, model.language_model.layers.18.mlp.up_proj.weight, model.language_model.layers.18.mlp.down_proj.weight, model.language_model.layers.18.input_layernorm.weight, model.language_model.layers.18.post_attention_layernorm.weight, model.language_model.layers.18.pre_feedforward_layernorm.weight, model.language_model.layers.18.post_feedforward_layernorm.weight, model.language_model.layers.19.self_attn.q_proj.weight, model.language_model.layers.19.self_attn.k_proj.weight, model.language_model.layers.19.self_attn.v_proj.weight, 
model.language_model.layers.19.self_attn.o_proj.weight, model.language_model.layers.19.self_attn.q_norm.weight, model.language_model.layers.19.self_attn.k_norm.weight, model.language_model.layers.19.mlp.gate_proj.weight, model.language_model.layers.19.mlp.up_proj.weight, model.language_model.layers.19.mlp.down_proj.weight, model.language_model.layers.19.input_layernorm.weight, model.language_model.layers.19.post_attention_layernorm.weight, model.language_model.layers.19.pre_feedforward_layernorm.weight, model.language_model.layers.19.post_feedforward_layernorm.weight, model.language_model.layers.20.self_attn.q_proj.weight, model.language_model.layers.20.self_attn.k_proj.weight, model.language_model.layers.20.self_attn.v_proj.weight, model.language_model.layers.20.self_attn.o_proj.weight, model.language_model.layers.20.self_attn.q_norm.weight, model.language_model.layers.20.self_attn.k_norm.weight, model.language_model.layers.20.mlp.gate_proj.weight, model.language_model.layers.20.mlp.up_proj.weight, model.language_model.layers.20.mlp.down_proj.weight, model.language_model.layers.20.input_layernorm.weight, model.language_model.layers.20.post_attention_layernorm.weight, model.language_model.layers.20.pre_feedforward_layernorm.weight, model.language_model.layers.20.post_feedforward_layernorm.weight, model.language_model.layers.21.self_attn.q_proj.weight, model.language_model.layers.21.self_attn.k_proj.weight, model.language_model.layers.21.self_attn.v_proj.weight, model.language_model.layers.21.self_attn.o_proj.weight, model.language_model.layers.21.self_attn.q_norm.weight, model.language_model.layers.21.self_attn.k_norm.weight, model.language_model.layers.21.mlp.gate_proj.weight, model.language_model.layers.21.mlp.up_proj.weight, model.language_model.layers.21.mlp.down_proj.weight, model.language_model.layers.21.input_layernorm.weight, model.language_model.layers.21.post_attention_layernorm.weight, model.language_model.layers.21.pre_feedforward_layernorm.weight, 
model.language_model.layers.21.post_feedforward_layernorm.weight, model.language_model.layers.22.self_attn.q_proj.weight, model.language_model.layers.22.self_attn.k_proj.weight, model.language_model.layers.22.self_attn.v_proj.weight, model.language_model.layers.22.self_attn.o_proj.weight, model.language_model.layers.22.self_attn.q_norm.weight, model.language_model.layers.22.self_attn.k_norm.weight, model.language_model.layers.22.mlp.gate_proj.weight, model.language_model.layers.22.mlp.up_proj.weight, model.language_model.layers.22.mlp.down_proj.weight, model.language_model.layers.22.input_layernorm.weight, model.language_model.layers.22.post_attention_layernorm.weight, model.language_model.layers.22.pre_feedforward_layernorm.weight, model.language_model.layers.22.post_feedforward_layernorm.weight, model.language_model.layers.23.self_attn.q_proj.weight, model.language_model.layers.23.self_attn.k_proj.weight, model.language_model.layers.23.self_attn.v_proj.weight, model.language_model.layers.23.self_attn.o_proj.weight, model.language_model.layers.23.self_attn.q_norm.weight, model.language_model.layers.23.self_attn.k_norm.weight, model.language_model.layers.23.mlp.gate_proj.weight, model.language_model.layers.23.mlp.up_proj.weight, model.language_model.layers.23.mlp.down_proj.weight, model.language_model.layers.23.input_layernorm.weight, model.language_model.layers.23.post_attention_layernorm.weight, model.language_model.layers.23.pre_feedforward_layernorm.weight, model.language_model.layers.23.post_feedforward_layernorm.weight, model.language_model.layers.24.self_attn.q_proj.weight, model.language_model.layers.24.self_attn.k_proj.weight, model.language_model.layers.24.self_attn.v_proj.weight, model.language_model.layers.24.self_attn.o_proj.weight, model.language_model.layers.24.self_attn.q_norm.weight, model.language_model.layers.24.self_attn.k_norm.weight, model.language_model.layers.24.mlp.gate_proj.weight, model.language_model.layers.24.mlp.up_proj.weight, 
model.language_model.layers.24.mlp.down_proj.weight, model.language_model.layers.24.input_layernorm.weight, model.language_model.layers.24.post_attention_layernorm.weight, model.language_model.layers.24.pre_feedforward_layernorm.weight, model.language_model.layers.24.post_feedforward_layernorm.weight, model.language_model.layers.25.self_attn.q_proj.weight, model.language_model.layers.25.self_attn.k_proj.weight, model.language_model.layers.25.self_attn.v_proj.weight, model.language_model.layers.25.self_attn.o_proj.weight, model.language_model.layers.25.self_attn.q_norm.weight, model.language_model.layers.25.self_attn.k_norm.weight, model.language_model.layers.25.mlp.gate_proj.weight, model.language_model.layers.25.mlp.up_proj.weight, model.language_model.layers.25.mlp.down_proj.weight, model.language_model.layers.25.input_layernorm.weight, model.language_model.layers.25.post_attention_layernorm.weight, model.language_model.layers.25.pre_feedforward_layernorm.weight, model.language_model.layers.25.post_feedforward_layernorm.weight, model.language_model.layers.26.self_attn.q_proj.weight, model.language_model.layers.26.self_attn.k_proj.weight, model.language_model.layers.26.self_attn.v_proj.weight, model.language_model.layers.26.self_attn.o_proj.weight, model.language_model.layers.26.self_attn.q_norm.weight, model.language_model.layers.26.self_attn.k_norm.weight, model.language_model.layers.26.mlp.gate_proj.weight, model.language_model.layers.26.mlp.up_proj.weight, model.language_model.layers.26.mlp.down_proj.weight, model.language_model.layers.26.input_layernorm.weight, model.language_model.layers.26.post_attention_layernorm.weight, model.language_model.layers.26.pre_feedforward_layernorm.weight, model.language_model.layers.26.post_feedforward_layernorm.weight, model.language_model.layers.27.self_attn.q_proj.weight, model.language_model.layers.27.self_attn.k_proj.weight, model.language_model.layers.27.self_attn.v_proj.weight, 
model.language_model.layers.27.self_attn.o_proj.weight, model.language_model.layers.27.self_attn.q_norm.weight, model.language_model.layers.27.self_attn.k_norm.weight, model.language_model.layers.27.mlp.gate_proj.weight, model.language_model.layers.27.mlp.up_proj.weight, model.language_model.layers.27.mlp.down_proj.weight, model.language_model.layers.27.input_layernorm.weight, model.language_model.layers.27.post_attention_layernorm.weight, model.language_model.layers.27.pre_feedforward_layernorm.weight, model.language_model.layers.27.post_feedforward_layernorm.weight, model.language_model.layers.28.self_attn.q_proj.weight, model.language_model.layers.28.self_attn.k_proj.weight, model.language_model.layers.28.self_attn.v_proj.weight, model.language_model.layers.28.self_attn.o_proj.weight, model.language_model.layers.28.self_attn.q_norm.weight, model.language_model.layers.28.self_attn.k_norm.weight, model.language_model.layers.28.mlp.gate_proj.weight, model.language_model.layers.28.mlp.up_proj.weight, model.language_model.layers.28.mlp.down_proj.weight, model.language_model.layers.28.input_layernorm.weight, model.language_model.layers.28.post_attention_layernorm.weight, model.language_model.layers.28.pre_feedforward_layernorm.weight, model.language_model.layers.28.post_feedforward_layernorm.weight, model.language_model.layers.29.self_attn.q_proj.weight, model.language_model.layers.29.self_attn.k_proj.weight, model.language_model.layers.29.self_attn.v_proj.weight, model.language_model.layers.29.self_attn.o_proj.weight, model.language_model.layers.29.self_attn.q_norm.weight, model.language_model.layers.29.self_attn.k_norm.weight, model.language_model.layers.29.mlp.gate_proj.weight, model.language_model.layers.29.mlp.up_proj.weight, model.language_model.layers.29.mlp.down_proj.weight, model.language_model.layers.29.input_layernorm.weight, model.language_model.layers.29.post_attention_layernorm.weight, model.language_model.layers.29.pre_feedforward_layernorm.weight, 
model.language_model.layers.29.post_feedforward_layernorm.weight, model.language_model.layers.30.self_attn.q_proj.weight, model.language_model.layers.30.self_attn.k_proj.weight, model.language_model.layers.30.self_attn.v_proj.weight, model.language_model.layers.30.self_attn.o_proj.weight, model.language_model.layers.30.self_attn.q_norm.weight, model.language_model.layers.30.self_attn.k_norm.weight, model.language_model.layers.30.mlp.gate_proj.weight, model.language_model.layers.30.mlp.up_proj.weight, model.language_model.layers.30.mlp.down_proj.weight, model.language_model.layers.30.input_layernorm.weight, model.language_model.layers.30.post_attention_layernorm.weight, model.language_model.layers.30.pre_feedforward_layernorm.weight, model.language_model.layers.30.post_feedforward_layernorm.weight, model.language_model.layers.31.self_attn.q_proj.weight, model.language_model.layers.31.self_attn.k_proj.weight, model.language_model.layers.31.self_attn.v_proj.weight, model.language_model.layers.31.self_attn.o_proj.weight, model.language_model.layers.31.self_attn.q_norm.weight, model.language_model.layers.31.self_attn.k_norm.weight, model.language_model.layers.31.mlp.gate_proj.weight, model.language_model.layers.31.mlp.up_proj.weight, model.language_model.layers.31.mlp.down_proj.weight, model.language_model.layers.31.input_layernorm.weight, model.language_model.layers.31.post_attention_layernorm.weight, model.language_model.layers.31.pre_feedforward_layernorm.weight, model.language_model.layers.31.post_feedforward_layernorm.weight, model.language_model.layers.32.self_attn.q_proj.weight, model.language_model.layers.32.self_attn.k_proj.weight, model.language_model.layers.32.self_attn.v_proj.weight, model.language_model.layers.32.self_attn.o_proj.weight, model.language_model.layers.32.self_attn.q_norm.weight, model.language_model.layers.32.self_attn.k_norm.weight, model.language_model.layers.32.mlp.gate_proj.weight, model.language_model.layers.32.mlp.up_proj.weight, 
model.language_model.layers.32.mlp.down_proj.weight, model.language_model.layers.32.input_layernorm.weight, model.language_model.layers.32.post_attention_layernorm.weight, model.language_model.layers.32.pre_feedforward_layernorm.weight, model.language_model.layers.32.post_feedforward_layernorm.weight, model.language_model.layers.33.self_attn.q_proj.weight, model.language_model.layers.33.self_attn.k_proj.weight, model.language_model.layers.33.self_attn.v_proj.weight, model.language_model.layers.33.self_attn.o_proj.weight, model.language_model.layers.33.self_attn.q_norm.weight, model.language_model.layers.33.self_attn.k_norm.weight, model.language_model.layers.33.mlp.gate_proj.weight, model.language_model.layers.33.mlp.up_proj.weight, model.language_model.layers.33.mlp.down_proj.weight, model.language_model.layers.33.input_layernorm.weight, model.language_model.layers.33.post_attention_layernorm.weight, model.language_model.layers.33.pre_feedforward_layernorm.weight, model.language_model.layers.33.post_feedforward_layernorm.weight, model.language_model.norm.weight, lm_head.weight",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.vision_tower.embeddings.patch_embedding.weight, model.vision_tower.embeddings.patch_embedding.bias, model.vision_tower.embeddings.position_embedding.weight, model.vision_tower.encoder.layers.0.layer_norm1.weight, model.vision_tower.encoder.layers.0.layer_norm1.bias, model.vision_tower.encoder.layers.0.self_attn.k_proj.weight, model.vision_tower.encoder.layers.0.self_attn.k_proj.bias, model.vision_tower.encoder.layers.0.self_attn.v_proj.weight, model.vision_tower.encoder.layers.0.self_attn.v_proj.bias, model.vision_tower.encoder.layers.0.self_attn.q_proj.weight, model.vision_tower.encoder.layers.0.self_attn.q_proj.bias, model.vision_tower.encoder.layers.0.self_attn.out_proj.weight, model.vision_tower.encoder.layers.0.self_attn.out_proj.bias, model.vision_tower.encoder.layers.0.layer_norm2.weight, model.vision_tower.encoder.layers.0.layer_norm2.bias, model.vision_tower.encoder.layers.0.mlp.fc1.weight, model.vision_tower.encoder.layers.0.mlp.fc1.bias, model.vision_tower.encoder.layers.0.mlp.fc2.weight, model.vision_tower.encoder.layers.0.mlp.fc2.bias, model.vision_tower.encoder.layers.1.layer_norm1.weight, model.vision_tower.encoder.layers.1.layer_norm1.bias, model.vision_tower.encoder.layers.1.self_attn.k_proj.weight, model.vision_tower.encoder.layers.1.self_attn.k_proj.bias, model.vision_tower.encoder.layers.1.self_attn.v_proj.weight, model.vision_tower.encoder.layers.1.self_attn.v_proj.bias, model.vision_tower.encoder.layers.1.self_attn.q_proj.weight, model.vision_tower.encoder.layers.1.self_attn.q_proj.bias, model.vision_tower.encoder.layers.1.self_attn.out_proj.weight, model.vision_tower.encoder.layers.1.self_attn.out_proj.bias, model.vision_tower.encoder.layers.1.layer_norm2.weight, model.vision_tower.encoder.layers.1.layer_norm2.bias, model.vision_tower.encoder.layers.1.mlp.fc1.weight, 
model.vision_tower.encoder.layers.1.mlp.fc1.bias, model.vision_tower.encoder.layers.1.mlp.fc2.weight, model.vision_tower.encoder.layers.1.mlp.fc2.bias, model.vision_tower.encoder.layers.2.layer_norm1.weight, model.vision_tower.encoder.layers.2.layer_norm1.bias, model.vision_tower.encoder.layers.2.self_attn.k_proj.weight, model.vision_tower.encoder.layers.2.self_attn.k_proj.bias, model.vision_tower.encoder.layers.2.self_attn.v_proj.weight, model.vision_tower.encoder.layers.2.self_attn.v_proj.bias, model.vision_tower.encoder.layers.2.self_attn.q_proj.weight, model.vision_tower.encoder.layers.2.self_attn.q_proj.bias, model.vision_tower.encoder.layers.2.self_attn.out_proj.weight, model.vision_tower.encoder.layers.2.self_attn.out_proj.bias, model.vision_tower.encoder.layers.2.layer_norm2.weight, model.vision_tower.encoder.layers.2.layer_norm2.bias, model.vision_tower.encoder.layers.2.mlp.fc1.weight, model.vision_tower.encoder.layers.2.mlp.fc1.bias, model.vision_tower.encoder.layers.2.mlp.fc2.weight, model.vision_tower.encoder.layers.2.mlp.fc2.bias, model.vision_tower.encoder.layers.3.layer_norm1.weight, model.vision_tower.encoder.layers.3.layer_norm1.bias, model.vision_tower.encoder.layers.3.self_attn.k_proj.weight, model.vision_tower.encoder.layers.3.self_attn.k_proj.bias, model.vision_tower.encoder.layers.3.self_attn.v_proj.weight, model.vision_tower.encoder.layers.3.self_attn.v_proj.bias, model.vision_tower.encoder.layers.3.self_attn.q_proj.weight, model.vision_tower.encoder.layers.3.self_attn.q_proj.bias, model.vision_tower.encoder.layers.3.self_attn.out_proj.weight, model.vision_tower.encoder.layers.3.self_attn.out_proj.bias, model.vision_tower.encoder.layers.3.layer_norm2.weight, model.vision_tower.encoder.layers.3.layer_norm2.bias, model.vision_tower.encoder.layers.3.mlp.fc1.weight, model.vision_tower.encoder.layers.3.mlp.fc1.bias, model.vision_tower.encoder.layers.3.mlp.fc2.weight, model.vision_tower.encoder.layers.3.mlp.fc2.bias, 
model.vision_tower.encoder.layers.4.layer_norm1.weight, model.vision_tower.encoder.layers.4.layer_norm1.bias, model.vision_tower.encoder.layers.4.self_attn.k_proj.weight, model.vision_tower.encoder.layers.4.self_attn.k_proj.bias, model.vision_tower.encoder.layers.4.self_attn.v_proj.weight, model.vision_tower.encoder.layers.4.self_attn.v_proj.bias, model.vision_tower.encoder.layers.4.self_attn.q_proj.weight, model.vision_tower.encoder.layers.4.self_attn.q_proj.bias, model.vision_tower.encoder.layers.4.self_attn.out_proj.weight, model.vision_tower.encoder.layers.4.self_attn.out_proj.bias, model.vision_tower.encoder.layers.4.layer_norm2.weight, model.vision_tower.encoder.layers.4.layer_norm2.bias, model.vision_tower.encoder.layers.4.mlp.fc1.weight, model.vision_tower.encoder.layers.4.mlp.fc1.bias, model.vision_tower.encoder.layers.4.mlp.fc2.weight, model.vision_tower.encoder.layers.4.mlp.fc2.bias, model.vision_tower.encoder.layers.5.layer_norm1.weight, model.vision_tower.encoder.layers.5.layer_norm1.bias, model.vision_tower.encoder.layers.5.self_attn.k_proj.weight, model.vision_tower.encoder.layers.5.self_attn.k_proj.bias, model.vision_tower.encoder.layers.5.self_attn.v_proj.weight, model.vision_tower.encoder.layers.5.self_attn.v_proj.bias, model.vision_tower.encoder.layers.5.self_attn.q_proj.weight, model.vision_tower.encoder.layers.5.self_attn.q_proj.bias, model.vision_tower.encoder.layers.5.self_attn.out_proj.weight, model.vision_tower.encoder.layers.5.self_attn.out_proj.bias, model.vision_tower.encoder.layers.5.layer_norm2.weight, model.vision_tower.encoder.layers.5.layer_norm2.bias, model.vision_tower.encoder.layers.5.mlp.fc1.weight, model.vision_tower.encoder.layers.5.mlp.fc1.bias, model.vision_tower.encoder.layers.5.mlp.fc2.weight, model.vision_tower.encoder.layers.5.mlp.fc2.bias, model.vision_tower.encoder.layers.6.layer_norm1.weight, model.vision_tower.encoder.layers.6.layer_norm1.bias, model.vision_tower.encoder.layers.6.self_attn.k_proj.weight, 
model.vision_tower.encoder.layers.6.self_attn.k_proj.bias, model.vision_tower.encoder.layers.6.self_attn.v_proj.weight, model.vision_tower.encoder.layers.6.self_attn.v_proj.bias, model.vision_tower.encoder.layers.6.self_attn.q_proj.weight, model.vision_tower.encoder.layers.6.self_attn.q_proj.bias, model.vision_tower.encoder.layers.6.self_attn.out_proj.weight, model.vision_tower.encoder.layers.6.self_attn.out_proj.bias, model.vision_tower.encoder.layers.6.layer_norm2.weight, model.vision_tower.encoder.layers.6.layer_norm2.bias, model.vision_tower.encoder.layers.6.mlp.fc1.weight, model.vision_tower.encoder.layers.6.mlp.fc1.bias, model.vision_tower.encoder.layers.6.mlp.fc2.weight, model.vision_tower.encoder.layers.6.mlp.fc2.bias, model.vision_tower.encoder.layers.7.layer_norm1.weight, model.vision_tower.encoder.layers.7.layer_norm1.bias, model.vision_tower.encoder.layers.7.self_attn.k_proj.weight, model.vision_tower.encoder.layers.7.self_attn.k_proj.bias, model.vision_tower.encoder.layers.7.self_attn.v_proj.weight, model.vision_tower.encoder.layers.7.self_attn.v_proj.bias, model.vision_tower.encoder.layers.7.self_attn.q_proj.weight, model.vision_tower.encoder.layers.7.self_attn.q_proj.bias, model.vision_tower.encoder.layers.7.self_attn.out_proj.weight, model.vision_tower.encoder.layers.7.self_attn.out_proj.bias, model.vision_tower.encoder.layers.7.layer_norm2.weight, model.vision_tower.encoder.layers.7.layer_norm2.bias, model.vision_tower.encoder.layers.7.mlp.fc1.weight, model.vision_tower.encoder.layers.7.mlp.fc1.bias, model.vision_tower.encoder.layers.7.mlp.fc2.weight, model.vision_tower.encoder.layers.7.mlp.fc2.bias, model.vision_tower.encoder.layers.8.layer_norm1.weight, model.vision_tower.encoder.layers.8.layer_norm1.bias, model.vision_tower.encoder.layers.8.self_attn.k_proj.weight, model.vision_tower.encoder.layers.8.self_attn.k_proj.bias, model.vision_tower.encoder.layers.8.self_attn.v_proj.weight, model.vision_tower.encoder.layers.8.self_attn.v_proj.bias, 
model.vision_tower.encoder.layers.8.self_attn.q_proj.weight, model.vision_tower.encoder.layers.8.self_attn.q_proj.bias, model.vision_tower.encoder.layers.8.self_attn.out_proj.weight, model.vision_tower.encoder.layers.8.self_attn.out_proj.bias, model.vision_tower.encoder.layers.8.layer_norm2.weight, model.vision_tower.encoder.layers.8.layer_norm2.bias, model.vision_tower.encoder.layers.8.mlp.fc1.weight, model.vision_tower.encoder.layers.8.mlp.fc1.bias, model.vision_tower.encoder.layers.8.mlp.fc2.weight, model.vision_tower.encoder.layers.8.mlp.fc2.bias, model.vision_tower.encoder.layers.9.layer_norm1.weight, model.vision_tower.encoder.layers.9.layer_norm1.bias, model.vision_tower.encoder.layers.9.self_attn.k_proj.weight, model.vision_tower.encoder.layers.9.self_attn.k_proj.bias, model.vision_tower.encoder.layers.9.self_attn.v_proj.weight, model.vision_tower.encoder.layers.9.self_attn.v_proj.bias, model.vision_tower.encoder.layers.9.self_attn.q_proj.weight, model.vision_tower.encoder.layers.9.self_attn.q_proj.bias, model.vision_tower.encoder.layers.9.self_attn.out_proj.weight, model.vision_tower.encoder.layers.9.self_attn.out_proj.bias, model.vision_tower.encoder.layers.9.layer_norm2.weight, model.vision_tower.encoder.layers.9.layer_norm2.bias, model.vision_tower.encoder.layers.9.mlp.fc1.weight, model.vision_tower.encoder.layers.9.mlp.fc1.bias, model.vision_tower.encoder.layers.9.mlp.fc2.weight, model.vision_tower.encoder.layers.9.mlp.fc2.bias, model.vision_tower.encoder.layers.10.layer_norm1.weight, model.vision_tower.encoder.layers.10.layer_norm1.bias, model.vision_tower.encoder.layers.10.self_attn.k_proj.weight, model.vision_tower.encoder.layers.10.self_attn.k_proj.bias, model.vision_tower.encoder.layers.10.self_attn.v_proj.weight, model.vision_tower.encoder.layers.10.self_attn.v_proj.bias, model.vision_tower.encoder.layers.10.self_attn.q_proj.weight, model.vision_tower.encoder.layers.10.self_attn.q_proj.bias, 
model.vision_tower.encoder.layers.10.self_attn.out_proj.weight, model.vision_tower.encoder.layers.10.self_attn.out_proj.bias, model.vision_tower.encoder.layers.10.layer_norm2.weight, model.vision_tower.encoder.layers.10.layer_norm2.bias, model.vision_tower.encoder.layers.10.mlp.fc1.weight, model.vision_tower.encoder.layers.10.mlp.fc1.bias, model.vision_tower.encoder.layers.10.mlp.fc2.weight, model.vision_tower.encoder.layers.10.mlp.fc2.bias, model.vision_tower.encoder.layers.11.layer_norm1.weight, model.vision_tower.encoder.layers.11.layer_norm1.bias, model.vision_tower.encoder.layers.11.self_attn.k_proj.weight, model.vision_tower.encoder.layers.11.self_attn.k_proj.bias, model.vision_tower.encoder.layers.11.self_attn.v_proj.weight, model.vision_tower.encoder.layers.11.self_attn.v_proj.bias, model.vision_tower.encoder.layers.11.self_attn.q_proj.weight, model.vision_tower.encoder.layers.11.self_attn.q_proj.bias, model.vision_tower.encoder.layers.11.self_attn.out_proj.weight, model.vision_tower.encoder.layers.11.self_attn.out_proj.bias, model.vision_tower.encoder.layers.11.layer_norm2.weight, model.vision_tower.encoder.layers.11.layer_norm2.bias, model.vision_tower.encoder.layers.11.mlp.fc1.weight, model.vision_tower.encoder.layers.11.mlp.fc1.bias, model.vision_tower.encoder.layers.11.mlp.fc2.weight, model.vision_tower.encoder.layers.11.mlp.fc2.bias, model.vision_tower.encoder.layers.12.layer_norm1.weight, model.vision_tower.encoder.layers.12.layer_norm1.bias, model.vision_tower.encoder.layers.12.self_attn.k_proj.weight, model.vision_tower.encoder.layers.12.self_attn.k_proj.bias, model.vision_tower.encoder.layers.12.self_attn.v_proj.weight, model.vision_tower.encoder.layers.12.self_attn.v_proj.bias, model.vision_tower.encoder.layers.12.self_attn.q_proj.weight, model.vision_tower.encoder.layers.12.self_attn.q_proj.bias, model.vision_tower.encoder.layers.12.self_attn.out_proj.weight, model.vision_tower.encoder.layers.12.self_attn.out_proj.bias, 
model.vision_tower.encoder.layers.12.layer_norm2.weight, model.vision_tower.encoder.layers.12.layer_norm2.bias, model.vision_tower.encoder.layers.12.mlp.fc1.weight, model.vision_tower.encoder.layers.12.mlp.fc1.bias, model.vision_tower.encoder.layers.12.mlp.fc2.weight, model.vision_tower.encoder.layers.12.mlp.fc2.bias, model.vision_tower.encoder.layers.13.layer_norm1.weight, model.vision_tower.encoder.layers.13.layer_norm1.bias, model.vision_tower.encoder.layers.13.self_attn.k_proj.weight, model.vision_tower.encoder.layers.13.self_attn.k_proj.bias, model.vision_tower.encoder.layers.13.self_attn.v_proj.weight, model.vision_tower.encoder.layers.13.self_attn.v_proj.bias, model.vision_tower.encoder.layers.13.self_attn.q_proj.weight, model.vision_tower.encoder.layers.13.self_attn.q_proj.bias, model.vision_tower.encoder.layers.13.self_attn.out_proj.weight, model.vision_tower.encoder.layers.13.self_attn.out_proj.bias, model.vision_tower.encoder.layers.13.layer_norm2.weight, model.vision_tower.encoder.layers.13.layer_norm2.bias, model.vision_tower.encoder.layers.13.mlp.fc1.weight, model.vision_tower.encoder.layers.13.mlp.fc1.bias, model.vision_tower.encoder.layers.13.mlp.fc2.weight, model.vision_tower.encoder.layers.13.mlp.fc2.bias, model.vision_tower.encoder.layers.14.layer_norm1.weight, model.vision_tower.encoder.layers.14.layer_norm1.bias, model.vision_tower.encoder.layers.14.self_attn.k_proj.weight, model.vision_tower.encoder.layers.14.self_attn.k_proj.bias, model.vision_tower.encoder.layers.14.self_attn.v_proj.weight, model.vision_tower.encoder.layers.14.self_attn.v_proj.bias, model.vision_tower.encoder.layers.14.self_attn.q_proj.weight, model.vision_tower.encoder.layers.14.self_attn.q_proj.bias, model.vision_tower.encoder.layers.14.self_attn.out_proj.weight, model.vision_tower.encoder.layers.14.self_attn.out_proj.bias, model.vision_tower.encoder.layers.14.layer_norm2.weight, model.vision_tower.encoder.layers.14.layer_norm2.bias, 
model.vision_tower.encoder.layers.14.mlp.fc1.weight, model.vision_tower.encoder.layers.14.mlp.fc1.bias, model.vision_tower.encoder.layers.14.mlp.fc2.weight, model.vision_tower.encoder.layers.14.mlp.fc2.bias, model.vision_tower.encoder.layers.15.layer_norm1.weight, model.vision_tower.encoder.layers.15.layer_norm1.bias, model.vision_tower.encoder.layers.15.self_attn.k_proj.weight, model.vision_tower.encoder.layers.15.self_attn.k_proj.bias, model.vision_tower.encoder.layers.15.self_attn.v_proj.weight, model.vision_tower.encoder.layers.15.self_attn.v_proj.bias, model.vision_tower.encoder.layers.15.self_attn.q_proj.weight, model.vision_tower.encoder.layers.15.self_attn.q_proj.bias, model.vision_tower.encoder.layers.15.self_attn.out_proj.weight, model.vision_tower.encoder.layers.15.self_attn.out_proj.bias, model.vision_tower.encoder.layers.15.layer_norm2.weight, model.vision_tower.encoder.layers.15.layer_norm2.bias, model.vision_tower.encoder.layers.15.mlp.fc1.weight, model.vision_tower.encoder.layers.15.mlp.fc1.bias, model.vision_tower.encoder.layers.15.mlp.fc2.weight, model.vision_tower.encoder.layers.15.mlp.fc2.bias, model.vision_tower.encoder.layers.16.layer_norm1.weight, model.vision_tower.encoder.layers.16.layer_norm1.bias, model.vision_tower.encoder.layers.16.self_attn.k_proj.weight, model.vision_tower.encoder.layers.16.self_attn.k_proj.bias, model.vision_tower.encoder.layers.16.self_attn.v_proj.weight, model.vision_tower.encoder.layers.16.self_attn.v_proj.bias, model.vision_tower.encoder.layers.16.self_attn.q_proj.weight, model.vision_tower.encoder.layers.16.self_attn.q_proj.bias, model.vision_tower.encoder.layers.16.self_attn.out_proj.weight, model.vision_tower.encoder.layers.16.self_attn.out_proj.bias, model.vision_tower.encoder.layers.16.layer_norm2.weight, model.vision_tower.encoder.layers.16.layer_norm2.bias, model.vision_tower.encoder.layers.16.mlp.fc1.weight, model.vision_tower.encoder.layers.16.mlp.fc1.bias, 
model.vision_tower.encoder.layers.16.mlp.fc2.weight, model.vision_tower.encoder.layers.16.mlp.fc2.bias, model.vision_tower.encoder.layers.17.layer_norm1.weight, model.vision_tower.encoder.layers.17.layer_norm1.bias, model.vision_tower.encoder.layers.17.self_attn.k_proj.weight, model.vision_tower.encoder.layers.17.self_attn.k_proj.bias, model.vision_tower.encoder.layers.17.self_attn.v_proj.weight, model.vision_tower.encoder.layers.17.self_attn.v_proj.bias, model.vision_tower.encoder.layers.17.self_attn.q_proj.weight, model.vision_tower.encoder.layers.17.self_attn.q_proj.bias, model.vision_tower.encoder.layers.17.self_attn.out_proj.weight, model.vision_tower.encoder.layers.17.self_attn.out_proj.bias, model.vision_tower.encoder.layers.17.layer_norm2.weight, model.vision_tower.encoder.layers.17.layer_norm2.bias, model.vision_tower.encoder.layers.17.mlp.fc1.weight, model.vision_tower.encoder.layers.17.mlp.fc1.bias, model.vision_tower.encoder.layers.17.mlp.fc2.weight, model.vision_tower.encoder.layers.17.mlp.fc2.bias, model.vision_tower.encoder.layers.18.layer_norm1.weight, model.vision_tower.encoder.layers.18.layer_norm1.bias, model.vision_tower.encoder.layers.18.self_attn.k_proj.weight, model.vision_tower.encoder.layers.18.self_attn.k_proj.bias, model.vision_tower.encoder.layers.18.self_attn.v_proj.weight, model.vision_tower.encoder.layers.18.self_attn.v_proj.bias, model.vision_tower.encoder.layers.18.self_attn.q_proj.weight, model.vision_tower.encoder.layers.18.self_attn.q_proj.bias, model.vision_tower.encoder.layers.18.self_attn.out_proj.weight, model.vision_tower.encoder.layers.18.self_attn.out_proj.bias, model.vision_tower.encoder.layers.18.layer_norm2.weight, model.vision_tower.encoder.layers.18.layer_norm2.bias, model.vision_tower.encoder.layers.18.mlp.fc1.weight, model.vision_tower.encoder.layers.18.mlp.fc1.bias, model.vision_tower.encoder.layers.18.mlp.fc2.weight, model.vision_tower.encoder.layers.18.mlp.fc2.bias, 
model.vision_tower.encoder.layers.19.layer_norm1.weight, model.vision_tower.encoder.layers.19.layer_norm1.bias, model.vision_tower.encoder.layers.19.self_attn.k_proj.weight, model.vision_tower.encoder.layers.19.self_attn.k_proj.bias, model.vision_tower.encoder.layers.19.self_attn.v_proj.weight, model.vision_tower.encoder.layers.19.self_attn.v_proj.bias, model.vision_tower.encoder.layers.19.self_attn.q_proj.weight, model.vision_tower.encoder.layers.19.self_attn.q_proj.bias, model.vision_tower.encoder.layers.19.self_attn.out_proj.weight, model.vision_tower.encoder.layers.19.self_attn.out_proj.bias, model.vision_tower.encoder.layers.19.layer_norm2.weight, model.vision_tower.encoder.layers.19.layer_norm2.bias, model.vision_tower.encoder.layers.19.mlp.fc1.weight, model.vision_tower.encoder.layers.19.mlp.fc1.bias, model.vision_tower.encoder.layers.19.mlp.fc2.weight, model.vision_tower.encoder.layers.19.mlp.fc2.bias, model.vision_tower.encoder.layers.20.layer_norm1.weight, model.vision_tower.encoder.layers.20.layer_norm1.bias, model.vision_tower.encoder.layers.20.self_attn.k_proj.weight, model.vision_tower.encoder.layers.20.self_attn.k_proj.bias, model.vision_tower.encoder.layers.20.self_attn.v_proj.weight, model.vision_tower.encoder.layers.20.self_attn.v_proj.bias, model.vision_tower.encoder.layers.20.self_attn.q_proj.weight, model.vision_tower.encoder.layers.20.self_attn.q_proj.bias, model.vision_tower.encoder.layers.20.self_attn.out_proj.weight, model.vision_tower.encoder.layers.20.self_attn.out_proj.bias, model.vision_tower.encoder.layers.20.layer_norm2.weight, model.vision_tower.encoder.layers.20.layer_norm2.bias, model.vision_tower.encoder.layers.20.mlp.fc1.weight, model.vision_tower.encoder.layers.20.mlp.fc1.bias, model.vision_tower.encoder.layers.20.mlp.fc2.weight, model.vision_tower.encoder.layers.20.mlp.fc2.bias, model.vision_tower.encoder.layers.21.layer_norm1.weight, model.vision_tower.encoder.layers.21.layer_norm1.bias, 
model.vision_tower.encoder.layers.21.self_attn.k_proj.weight, model.vision_tower.encoder.layers.21.self_attn.k_proj.bias, model.vision_tower.encoder.layers.21.self_attn.v_proj.weight, model.vision_tower.encoder.layers.21.self_attn.v_proj.bias, model.vision_tower.encoder.layers.21.self_attn.q_proj.weight, model.vision_tower.encoder.layers.21.self_attn.q_proj.bias, model.vision_tower.encoder.layers.21.self_attn.out_proj.weight, model.vision_tower.encoder.layers.21.self_attn.out_proj.bias, model.vision_tower.encoder.layers.21.layer_norm2.weight, model.vision_tower.encoder.layers.21.layer_norm2.bias, model.vision_tower.encoder.layers.21.mlp.fc1.weight, model.vision_tower.encoder.layers.21.mlp.fc1.bias, model.vision_tower.encoder.layers.21.mlp.fc2.weight, model.vision_tower.encoder.layers.21.mlp.fc2.bias, model.vision_tower.encoder.layers.22.layer_norm1.weight, model.vision_tower.encoder.layers.22.layer_norm1.bias, model.vision_tower.encoder.layers.22.self_attn.k_proj.weight, model.vision_tower.encoder.layers.22.self_attn.k_proj.bias, model.vision_tower.encoder.layers.22.self_attn.v_proj.weight, model.vision_tower.encoder.layers.22.self_attn.v_proj.bias, model.vision_tower.encoder.layers.22.self_attn.q_proj.weight, model.vision_tower.encoder.layers.22.self_attn.q_proj.bias, model.vision_tower.encoder.layers.22.self_attn.out_proj.weight, model.vision_tower.encoder.layers.22.self_attn.out_proj.bias, model.vision_tower.encoder.layers.22.layer_norm2.weight, model.vision_tower.encoder.layers.22.layer_norm2.bias, model.vision_tower.encoder.layers.22.mlp.fc1.weight, model.vision_tower.encoder.layers.22.mlp.fc1.bias, model.vision_tower.encoder.layers.22.mlp.fc2.weight, model.vision_tower.encoder.layers.22.mlp.fc2.bias, model.vision_tower.encoder.layers.23.layer_norm1.weight, model.vision_tower.encoder.layers.23.layer_norm1.bias, model.vision_tower.encoder.layers.23.self_attn.k_proj.weight, model.vision_tower.encoder.layers.23.self_attn.k_proj.bias, 
model.vision_tower.encoder.layers.23.self_attn.v_proj.weight, model.vision_tower.encoder.layers.23.self_attn.v_proj.bias, model.vision_tower.encoder.layers.23.self_attn.q_proj.weight, model.vision_tower.encoder.layers.23.self_attn.q_proj.bias, model.vision_tower.encoder.layers.23.self_attn.out_proj.weight, model.vision_tower.encoder.layers.23.self_attn.out_proj.bias, model.vision_tower.encoder.layers.23.layer_norm2.weight, model.vision_tower.encoder.layers.23.layer_norm2.bias, model.vision_tower.encoder.layers.23.mlp.fc1.weight, model.vision_tower.encoder.layers.23.mlp.fc1.bias, model.vision_tower.encoder.layers.23.mlp.fc2.weight, model.vision_tower.encoder.layers.23.mlp.fc2.bias, model.vision_tower.encoder.layers.24.layer_norm1.weight, model.vision_tower.encoder.layers.24.layer_norm1.bias, model.vision_tower.encoder.layers.24.self_attn.k_proj.weight, model.vision_tower.encoder.layers.24.self_attn.k_proj.bias, model.vision_tower.encoder.layers.24.self_attn.v_proj.weight, model.vision_tower.encoder.layers.24.self_attn.v_proj.bias, model.vision_tower.encoder.layers.24.self_attn.q_proj.weight, model.vision_tower.encoder.layers.24.self_attn.q_proj.bias, model.vision_tower.encoder.layers.24.self_attn.out_proj.weight, model.vision_tower.encoder.layers.24.self_attn.out_proj.bias, model.vision_tower.encoder.layers.24.layer_norm2.weight, model.vision_tower.encoder.layers.24.layer_norm2.bias, model.vision_tower.encoder.layers.24.mlp.fc1.weight, model.vision_tower.encoder.layers.24.mlp.fc1.bias, model.vision_tower.encoder.layers.24.mlp.fc2.weight, model.vision_tower.encoder.layers.24.mlp.fc2.bias, model.vision_tower.encoder.layers.25.layer_norm1.weight, model.vision_tower.encoder.layers.25.layer_norm1.bias, model.vision_tower.encoder.layers.25.self_attn.k_proj.weight, model.vision_tower.encoder.layers.25.self_attn.k_proj.bias, model.vision_tower.encoder.layers.25.self_attn.v_proj.weight, model.vision_tower.encoder.layers.25.self_attn.v_proj.bias, 
model.vision_tower.encoder.layers.25.self_attn.q_proj.weight, model.vision_tower.encoder.layers.25.self_attn.q_proj.bias, model.vision_tower.encoder.layers.25.self_attn.out_proj.weight, model.vision_tower.encoder.layers.25.self_attn.out_proj.bias, model.vision_tower.encoder.layers.25.layer_norm2.weight, model.vision_tower.encoder.layers.25.layer_norm2.bias, model.vision_tower.encoder.layers.25.mlp.fc1.weight, model.vision_tower.encoder.layers.25.mlp.fc1.bias, model.vision_tower.encoder.layers.25.mlp.fc2.weight, model.vision_tower.encoder.layers.25.mlp.fc2.bias, model.vision_tower.encoder.layers.26.layer_norm1.weight, model.vision_tower.encoder.layers.26.layer_norm1.bias, model.vision_tower.encoder.layers.26.self_attn.k_proj.weight, model.vision_tower.encoder.layers.26.self_attn.k_proj.bias, model.vision_tower.encoder.layers.26.self_attn.v_proj.weight, model.vision_tower.encoder.layers.26.self_attn.v_proj.bias, model.vision_tower.encoder.layers.26.self_attn.q_proj.weight, model.vision_tower.encoder.layers.26.self_attn.q_proj.bias, model.vision_tower.encoder.layers.26.self_attn.out_proj.weight, model.vision_tower.encoder.layers.26.self_attn.out_proj.bias, model.vision_tower.encoder.layers.26.layer_norm2.weight, model.vision_tower.encoder.layers.26.layer_norm2.bias, model.vision_tower.encoder.layers.26.mlp.fc1.weight, model.vision_tower.encoder.layers.26.mlp.fc1.bias, model.vision_tower.encoder.layers.26.mlp.fc2.weight, model.vision_tower.encoder.layers.26.mlp.fc2.bias, model.vision_tower.post_layernorm.weight, model.vision_tower.post_layernorm.bias, model.multi_modal_projector.mm_input_projection_weight, model.multi_modal_projector.mm_soft_emb_norm.weight, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.self_attn.q_norm.weight, 
model.language_model.layers.0.self_attn.k_norm.weight, model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.0.pre_feedforward_layernorm.weight, model.language_model.layers.0.post_feedforward_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.self_attn.q_norm.weight, model.language_model.layers.1.self_attn.k_norm.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.layers.1.pre_feedforward_layernorm.weight, model.language_model.layers.1.post_feedforward_layernorm.weight, model.language_model.layers.2.self_attn.q_proj.weight, model.language_model.layers.2.self_attn.k_proj.weight, model.language_model.layers.2.self_attn.v_proj.weight, model.language_model.layers.2.self_attn.o_proj.weight, model.language_model.layers.2.self_attn.q_norm.weight, model.language_model.layers.2.self_attn.k_norm.weight, model.language_model.layers.2.mlp.gate_proj.weight, model.language_model.layers.2.mlp.up_proj.weight, model.language_model.layers.2.mlp.down_proj.weight, model.language_model.layers.2.input_layernorm.weight, model.language_model.layers.2.post_attention_layernorm.weight, model.language_model.layers.2.pre_feedforward_layernorm.weight, model.language_model.layers.2.post_feedforward_layernorm.weight, model.language_model.layers.3.self_attn.q_proj.weight, 
model.language_model.layers.3.self_attn.k_proj.weight, model.language_model.layers.3.self_attn.v_proj.weight, model.language_model.layers.3.self_attn.o_proj.weight, model.language_model.layers.3.self_attn.q_norm.weight, model.language_model.layers.3.self_attn.k_norm.weight, model.language_model.layers.3.mlp.gate_proj.weight, model.language_model.layers.3.mlp.up_proj.weight, model.language_model.layers.3.mlp.down_proj.weight, model.language_model.layers.3.input_layernorm.weight, model.language_model.layers.3.post_attention_layernorm.weight, model.language_model.layers.3.pre_feedforward_layernorm.weight, model.language_model.layers.3.post_feedforward_layernorm.weight, model.language_model.layers.4.self_attn.q_proj.weight, model.language_model.layers.4.self_attn.k_proj.weight, model.language_model.layers.4.self_attn.v_proj.weight, model.language_model.layers.4.self_attn.o_proj.weight, model.language_model.layers.4.self_attn.q_norm.weight, model.language_model.layers.4.self_attn.k_norm.weight, model.language_model.layers.4.mlp.gate_proj.weight, model.language_model.layers.4.mlp.up_proj.weight, model.language_model.layers.4.mlp.down_proj.weight, model.language_model.layers.4.input_layernorm.weight, model.language_model.layers.4.post_attention_layernorm.weight, model.language_model.layers.4.pre_feedforward_layernorm.weight, model.language_model.layers.4.post_feedforward_layernorm.weight, model.language_model.layers.5.self_attn.q_proj.weight, model.language_model.layers.5.self_attn.k_proj.weight, model.language_model.layers.5.self_attn.v_proj.weight, model.language_model.layers.5.self_attn.o_proj.weight, model.language_model.layers.5.self_attn.q_norm.weight, model.language_model.layers.5.self_attn.k_norm.weight, model.language_model.layers.5.mlp.gate_proj.weight, model.language_model.layers.5.mlp.up_proj.weight, model.language_model.layers.5.mlp.down_proj.weight, model.language_model.layers.5.input_layernorm.weight, 
model.language_model.layers.5.post_attention_layernorm.weight, model.language_model.layers.5.pre_feedforward_layernorm.weight, model.language_model.layers.5.post_feedforward_layernorm.weight, model.language_model.layers.6.self_attn.q_proj.weight, model.language_model.layers.6.self_attn.k_proj.weight, model.language_model.layers.6.self_attn.v_proj.weight, model.language_model.layers.6.self_attn.o_proj.weight, model.language_model.layers.6.self_attn.q_norm.weight, model.language_model.layers.6.self_attn.k_norm.weight, model.language_model.layers.6.mlp.gate_proj.weight, model.language_model.layers.6.mlp.up_proj.weight, model.language_model.layers.6.mlp.down_proj.weight, model.language_model.layers.6.input_layernorm.weight, model.language_model.layers.6.post_attention_layernorm.weight, model.language_model.layers.6.pre_feedforward_layernorm.weight, model.language_model.layers.6.post_feedforward_layernorm.weight, model.language_model.layers.7.self_attn.q_proj.weight, model.language_model.layers.7.self_attn.k_proj.weight, model.language_model.layers.7.self_attn.v_proj.weight, model.language_model.layers.7.self_attn.o_proj.weight, model.language_model.layers.7.self_attn.q_norm.weight, model.language_model.layers.7.self_attn.k_norm.weight, model.language_model.layers.7.mlp.gate_proj.weight, model.language_model.layers.7.mlp.up_proj.weight, model.language_model.layers.7.mlp.down_proj.weight, model.language_model.layers.7.input_layernorm.weight, model.language_model.layers.7.post_attention_layernorm.weight, model.language_model.layers.7.pre_feedforward_layernorm.weight, model.language_model.layers.7.post_feedforward_layernorm.weight, model.language_model.layers.8.self_attn.q_proj.weight, model.language_model.layers.8.self_attn.k_proj.weight, model.language_model.layers.8.self_attn.v_proj.weight, model.language_model.layers.8.self_attn.o_proj.weight, model.language_model.layers.8.self_attn.q_norm.weight, model.language_model.layers.8.self_attn.k_norm.weight, 
model.language_model.layers.8.mlp.gate_proj.weight, model.language_model.layers.8.mlp.up_proj.weight, model.language_model.layers.8.mlp.down_proj.weight, model.language_model.layers.8.input_layernorm.weight, model.language_model.layers.8.post_attention_layernorm.weight, model.language_model.layers.8.pre_feedforward_layernorm.weight, model.language_model.layers.8.post_feedforward_layernorm.weight, model.language_model.layers.9.self_attn.q_proj.weight, model.language_model.layers.9.self_attn.k_proj.weight, model.language_model.layers.9.self_attn.v_proj.weight, model.language_model.layers.9.self_attn.o_proj.weight, model.language_model.layers.9.self_attn.q_norm.weight, model.language_model.layers.9.self_attn.k_norm.weight, model.language_model.layers.9.mlp.gate_proj.weight, model.language_model.layers.9.mlp.up_proj.weight, model.language_model.layers.9.mlp.down_proj.weight, model.language_model.layers.9.input_layernorm.weight, model.language_model.layers.9.post_attention_layernorm.weight, model.language_model.layers.9.pre_feedforward_layernorm.weight, model.language_model.layers.9.post_feedforward_layernorm.weight, model.language_model.layers.10.self_attn.q_proj.weight, model.language_model.layers.10.self_attn.k_proj.weight, model.language_model.layers.10.self_attn.v_proj.weight, model.language_model.layers.10.self_attn.o_proj.weight, model.language_model.layers.10.self_attn.q_norm.weight, model.language_model.layers.10.self_attn.k_norm.weight, model.language_model.layers.10.mlp.gate_proj.weight, model.language_model.layers.10.mlp.up_proj.weight, model.language_model.layers.10.mlp.down_proj.weight, model.language_model.layers.10.input_layernorm.weight, model.language_model.layers.10.post_attention_layernorm.weight, model.language_model.layers.10.pre_feedforward_layernorm.weight, model.language_model.layers.10.post_feedforward_layernorm.weight, model.language_model.layers.11.self_attn.q_proj.weight, model.language_model.layers.11.self_attn.k_proj.weight, 
model.language_model.layers.11.self_attn.v_proj.weight, model.language_model.layers.11.self_attn.o_proj.weight, model.language_model.layers.11.self_attn.q_norm.weight, model.language_model.layers.11.self_attn.k_norm.weight, model.language_model.layers.11.mlp.gate_proj.weight, model.language_model.layers.11.mlp.up_proj.weight, model.language_model.layers.11.mlp.down_proj.weight, model.language_model.layers.11.input_layernorm.weight, model.language_model.layers.11.post_attention_layernorm.weight, model.language_model.layers.11.pre_feedforward_layernorm.weight, model.language_model.layers.11.post_feedforward_layernorm.weight, model.language_model.layers.12.self_attn.q_proj.weight, model.language_model.layers.12.self_attn.k_proj.weight, model.language_model.layers.12.self_attn.v_proj.weight, model.language_model.layers.12.self_attn.o_proj.weight, model.language_model.layers.12.self_attn.q_norm.weight, model.language_model.layers.12.self_attn.k_norm.weight, model.language_model.layers.12.mlp.gate_proj.weight, model.language_model.layers.12.mlp.up_proj.weight, model.language_model.layers.12.mlp.down_proj.weight, model.language_model.layers.12.input_layernorm.weight, model.language_model.layers.12.post_attention_layernorm.weight, model.language_model.layers.12.pre_feedforward_layernorm.weight, model.language_model.layers.12.post_feedforward_layernorm.weight, model.language_model.layers.13.self_attn.q_proj.weight, model.language_model.layers.13.self_attn.k_proj.weight, model.language_model.layers.13.self_attn.v_proj.weight, model.language_model.layers.13.self_attn.o_proj.weight, model.language_model.layers.13.self_attn.q_norm.weight, model.language_model.layers.13.self_attn.k_norm.weight, model.language_model.layers.13.mlp.gate_proj.weight, model.language_model.layers.13.mlp.up_proj.weight, model.language_model.layers.13.mlp.down_proj.weight, model.language_model.layers.13.input_layernorm.weight, model.language_model.layers.13.post_attention_layernorm.weight, 
model.language_model.layers.13.pre_feedforward_layernorm.weight, model.language_model.layers.13.post_feedforward_layernorm.weight, model.language_model.layers.14.self_attn.q_proj.weight, model.language_model.layers.14.self_attn.k_proj.weight, model.language_model.layers.14.self_attn.v_proj.weight, model.language_model.layers.14.self_attn.o_proj.weight, model.language_model.layers.14.self_attn.q_norm.weight, model.language_model.layers.14.self_attn.k_norm.weight, model.language_model.layers.14.mlp.gate_proj.weight, model.language_model.layers.14.mlp.up_proj.weight, model.language_model.layers.14.mlp.down_proj.weight, model.language_model.layers.14.input_layernorm.weight, model.language_model.layers.14.post_attention_layernorm.weight, model.language_model.layers.14.pre_feedforward_layernorm.weight, model.language_model.layers.14.post_feedforward_layernorm.weight, model.language_model.layers.15.self_attn.q_proj.weight, model.language_model.layers.15.self_attn.k_proj.weight, model.language_model.layers.15.self_attn.v_proj.weight, model.language_model.layers.15.self_attn.o_proj.weight, model.language_model.layers.15.self_attn.q_norm.weight, model.language_model.layers.15.self_attn.k_norm.weight, model.language_model.layers.15.mlp.gate_proj.weight, model.language_model.layers.15.mlp.up_proj.weight, model.language_model.layers.15.mlp.down_proj.weight, model.language_model.layers.15.input_layernorm.weight, model.language_model.layers.15.post_attention_layernorm.weight, model.language_model.layers.15.pre_feedforward_layernorm.weight, model.language_model.layers.15.post_feedforward_layernorm.weight, model.language_model.layers.16.self_attn.q_proj.weight, model.language_model.layers.16.self_attn.k_proj.weight, model.language_model.layers.16.self_attn.v_proj.weight, model.language_model.layers.16.self_attn.o_proj.weight, model.language_model.layers.16.self_attn.q_norm.weight, model.language_model.layers.16.self_attn.k_norm.weight, 
model.language_model.layers.16.mlp.gate_proj.weight, model.language_model.layers.16.mlp.up_proj.weight, model.language_model.layers.16.mlp.down_proj.weight, model.language_model.layers.16.input_layernorm.weight, model.language_model.layers.16.post_attention_layernorm.weight, model.language_model.layers.16.pre_feedforward_layernorm.weight, model.language_model.layers.16.post_feedforward_layernorm.weight, model.language_model.layers.17.self_attn.q_proj.weight, model.language_model.layers.17.self_attn.k_proj.weight, model.language_model.layers.17.self_attn.v_proj.weight, model.language_model.layers.17.self_attn.o_proj.weight, model.language_model.layers.17.self_attn.q_norm.weight, model.language_model.layers.17.self_attn.k_norm.weight, model.language_model.layers.17.mlp.gate_proj.weight, model.language_model.layers.17.mlp.up_proj.weight, model.language_model.layers.17.mlp.down_proj.weight, model.language_model.layers.17.input_layernorm.weight, model.language_model.layers.17.post_attention_layernorm.weight, model.language_model.layers.17.pre_feedforward_layernorm.weight, model.language_model.layers.17.post_feedforward_layernorm.weight, model.language_model.layers.18.self_attn.q_proj.weight, model.language_model.layers.18.self_attn.k_proj.weight, model.language_model.layers.18.self_attn.v_proj.weight, model.language_model.layers.18.self_attn.o_proj.weight, model.language_model.layers.18.self_attn.q_norm.weight, model.language_model.layers.18.self_attn.k_norm.weight, model.language_model.layers.18.mlp.gate_proj.weight, model.language_model.layers.18.mlp.up_proj.weight, model.language_model.layers.18.mlp.down_proj.weight, model.language_model.layers.18.input_layernorm.weight, model.language_model.layers.18.post_attention_layernorm.weight, model.language_model.layers.18.pre_feedforward_layernorm.weight, model.language_model.layers.18.post_feedforward_layernorm.weight, model.language_model.layers.19.self_attn.q_proj.weight, 
model.language_model.layers.19.self_attn.k_proj.weight, model.language_model.layers.19.self_attn.v_proj.weight, model.language_model.layers.19.self_attn.o_proj.weight, model.language_model.layers.19.self_attn.q_norm.weight, model.language_model.layers.19.self_attn.k_norm.weight, model.language_model.layers.19.mlp.gate_proj.weight, model.language_model.layers.19.mlp.up_proj.weight, model.language_model.layers.19.mlp.down_proj.weight, model.language_model.layers.19.input_layernorm.weight, model.language_model.layers.19.post_attention_layernorm.weight, model.language_model.layers.19.pre_feedforward_layernorm.weight, model.language_model.layers.19.post_feedforward_layernorm.weight, model.language_model.layers.20.self_attn.q_proj.weight, model.language_model.layers.20.self_attn.k_proj.weight, model.language_model.layers.20.self_attn.v_proj.weight, model.language_model.layers.20.self_attn.o_proj.weight, model.language_model.layers.20.self_attn.q_norm.weight, model.language_model.layers.20.self_attn.k_norm.weight, model.language_model.layers.20.mlp.gate_proj.weight, model.language_model.layers.20.mlp.up_proj.weight, model.language_model.layers.20.mlp.down_proj.weight, model.language_model.layers.20.input_layernorm.weight, model.language_model.layers.20.post_attention_layernorm.weight, model.language_model.layers.20.pre_feedforward_layernorm.weight, model.language_model.layers.20.post_feedforward_layernorm.weight, model.language_model.layers.21.self_attn.q_proj.weight, model.language_model.layers.21.self_attn.k_proj.weight, model.language_model.layers.21.self_attn.v_proj.weight, model.language_model.layers.21.self_attn.o_proj.weight, model.language_model.layers.21.self_attn.q_norm.weight, model.language_model.layers.21.self_attn.k_norm.weight, model.language_model.layers.21.mlp.gate_proj.weight, model.language_model.layers.21.mlp.up_proj.weight, model.language_model.layers.21.mlp.down_proj.weight, model.language_model.layers.21.input_layernorm.weight, 
model.language_model.layers.21.post_attention_layernorm.weight, model.language_model.layers.21.pre_feedforward_layernorm.weight, model.language_model.layers.21.post_feedforward_layernorm.weight, model.language_model.layers.22.self_attn.q_proj.weight, model.language_model.layers.22.self_attn.k_proj.weight, model.language_model.layers.22.self_attn.v_proj.weight, model.language_model.layers.22.self_attn.o_proj.weight, model.language_model.layers.22.self_attn.q_norm.weight, model.language_model.layers.22.self_attn.k_norm.weight, model.language_model.layers.22.mlp.gate_proj.weight, model.language_model.layers.22.mlp.up_proj.weight, model.language_model.layers.22.mlp.down_proj.weight, model.language_model.layers.22.input_layernorm.weight, model.language_model.layers.22.post_attention_layernorm.weight, model.language_model.layers.22.pre_feedforward_layernorm.weight, model.language_model.layers.22.post_feedforward_layernorm.weight, model.language_model.layers.23.self_attn.q_proj.weight, model.language_model.layers.23.self_attn.k_proj.weight, model.language_model.layers.23.self_attn.v_proj.weight, model.language_model.layers.23.self_attn.o_proj.weight, model.language_model.layers.23.self_attn.q_norm.weight, model.language_model.layers.23.self_attn.k_norm.weight, model.language_model.layers.23.mlp.gate_proj.weight, model.language_model.layers.23.mlp.up_proj.weight, model.language_model.layers.23.mlp.down_proj.weight, model.language_model.layers.23.input_layernorm.weight, model.language_model.layers.23.post_attention_layernorm.weight, model.language_model.layers.23.pre_feedforward_layernorm.weight, model.language_model.layers.23.post_feedforward_layernorm.weight, model.language_model.layers.24.self_attn.q_proj.weight, model.language_model.layers.24.self_attn.k_proj.weight, model.language_model.layers.24.self_attn.v_proj.weight, model.language_model.layers.24.self_attn.o_proj.weight, model.language_model.layers.24.self_attn.q_norm.weight, 
model.language_model.layers.24.self_attn.k_norm.weight, model.language_model.layers.24.mlp.gate_proj.weight, model.language_model.layers.24.mlp.up_proj.weight, model.language_model.layers.24.mlp.down_proj.weight, model.language_model.layers.24.input_layernorm.weight, model.language_model.layers.24.post_attention_layernorm.weight, model.language_model.layers.24.pre_feedforward_layernorm.weight, model.language_model.layers.24.post_feedforward_layernorm.weight, model.language_model.layers.25.self_attn.q_proj.weight, model.language_model.layers.25.self_attn.k_proj.weight, model.language_model.layers.25.self_attn.v_proj.weight, model.language_model.layers.25.self_attn.o_proj.weight, model.language_model.layers.25.self_attn.q_norm.weight, model.language_model.layers.25.self_attn.k_norm.weight, model.language_model.layers.25.mlp.gate_proj.weight, model.language_model.layers.25.mlp.up_proj.weight, model.language_model.layers.25.mlp.down_proj.weight, model.language_model.layers.25.input_layernorm.weight, model.language_model.layers.25.post_attention_layernorm.weight, model.language_model.layers.25.pre_feedforward_layernorm.weight, model.language_model.layers.25.post_feedforward_layernorm.weight, model.language_model.layers.26.self_attn.q_proj.weight, model.language_model.layers.26.self_attn.k_proj.weight, model.language_model.layers.26.self_attn.v_proj.weight, model.language_model.layers.26.self_attn.o_proj.weight, model.language_model.layers.26.self_attn.q_norm.weight, model.language_model.layers.26.self_attn.k_norm.weight, model.language_model.layers.26.mlp.gate_proj.weight, model.language_model.layers.26.mlp.up_proj.weight, model.language_model.layers.26.mlp.down_proj.weight, model.language_model.layers.26.input_layernorm.weight, model.language_model.layers.26.post_attention_layernorm.weight, model.language_model.layers.26.pre_feedforward_layernorm.weight, model.language_model.layers.26.post_feedforward_layernorm.weight, 
model.language_model.layers.27.self_attn.q_proj.weight, model.language_model.layers.27.self_attn.k_proj.weight, model.language_model.layers.27.self_attn.v_proj.weight, model.language_model.layers.27.self_attn.o_proj.weight, model.language_model.layers.27.self_attn.q_norm.weight, model.language_model.layers.27.self_attn.k_norm.weight, model.language_model.layers.27.mlp.gate_proj.weight, model.language_model.layers.27.mlp.up_proj.weight, model.language_model.layers.27.mlp.down_proj.weight, model.language_model.layers.27.input_layernorm.weight, model.language_model.layers.27.post_attention_layernorm.weight, model.language_model.layers.27.pre_feedforward_layernorm.weight, model.language_model.layers.27.post_feedforward_layernorm.weight, model.language_model.layers.28.self_attn.q_proj.weight, model.language_model.layers.28.self_attn.k_proj.weight, model.language_model.layers.28.self_attn.v_proj.weight, model.language_model.layers.28.self_attn.o_proj.weight, model.language_model.layers.28.self_attn.q_norm.weight, model.language_model.layers.28.self_attn.k_norm.weight, model.language_model.layers.28.mlp.gate_proj.weight, model.language_model.layers.28.mlp.up_proj.weight, model.language_model.layers.28.mlp.down_proj.weight, model.language_model.layers.28.input_layernorm.weight, model.language_model.layers.28.post_attention_layernorm.weight, model.language_model.layers.28.pre_feedforward_layernorm.weight, model.language_model.layers.28.post_feedforward_layernorm.weight, model.language_model.layers.29.self_attn.q_proj.weight, model.language_model.layers.29.self_attn.k_proj.weight, model.language_model.layers.29.self_attn.v_proj.weight, model.language_model.layers.29.self_attn.o_proj.weight, model.language_model.layers.29.self_attn.q_norm.weight, model.language_model.layers.29.self_attn.k_norm.weight, model.language_model.layers.29.mlp.gate_proj.weight, model.language_model.layers.29.mlp.up_proj.weight, model.language_model.layers.29.mlp.down_proj.weight, 
model.language_model.layers.29.input_layernorm.weight, model.language_model.layers.29.post_attention_layernorm.weight, model.language_model.layers.29.pre_feedforward_layernorm.weight, model.language_model.layers.29.post_feedforward_layernorm.weight, model.language_model.layers.30.self_attn.q_proj.weight, model.language_model.layers.30.self_attn.k_proj.weight, model.language_model.layers.30.self_attn.v_proj.weight, model.language_model.layers.30.self_attn.o_proj.weight, model.language_model.layers.30.self_attn.q_norm.weight, model.language_model.layers.30.self_attn.k_norm.weight, model.language_model.layers.30.mlp.gate_proj.weight, model.language_model.layers.30.mlp.up_proj.weight, model.language_model.layers.30.mlp.down_proj.weight, model.language_model.layers.30.input_layernorm.weight, model.language_model.layers.30.post_attention_layernorm.weight, model.language_model.layers.30.pre_feedforward_layernorm.weight, model.language_model.layers.30.post_feedforward_layernorm.weight, model.language_model.layers.31.self_attn.q_proj.weight, model.language_model.layers.31.self_attn.k_proj.weight, model.language_model.layers.31.self_attn.v_proj.weight, model.language_model.layers.31.self_attn.o_proj.weight, model.language_model.layers.31.self_attn.q_norm.weight, model.language_model.layers.31.self_attn.k_norm.weight, model.language_model.layers.31.mlp.gate_proj.weight, model.language_model.layers.31.mlp.up_proj.weight, model.language_model.layers.31.mlp.down_proj.weight, model.language_model.layers.31.input_layernorm.weight, model.language_model.layers.31.post_attention_layernorm.weight, model.language_model.layers.31.pre_feedforward_layernorm.weight, model.language_model.layers.31.post_feedforward_layernorm.weight, model.language_model.layers.32.self_attn.q_proj.weight, model.language_model.layers.32.self_attn.k_proj.weight, model.language_model.layers.32.self_attn.v_proj.weight, model.language_model.layers.32.self_attn.o_proj.weight, 
model.language_model.layers.32.self_attn.q_norm.weight, model.language_model.layers.32.self_attn.k_norm.weight, model.language_model.layers.32.mlp.gate_proj.weight, model.language_model.layers.32.mlp.up_proj.weight, model.language_model.layers.32.mlp.down_proj.weight, model.language_model.layers.32.input_layernorm.weight, model.language_model.layers.32.post_attention_layernorm.weight, model.language_model.layers.32.pre_feedforward_layernorm.weight, model.language_model.layers.32.post_feedforward_layernorm.weight, model.language_model.layers.33.self_attn.q_proj.weight, model.language_model.layers.33.self_attn.k_proj.weight, model.language_model.layers.33.self_attn.v_proj.weight, model.language_model.layers.33.self_attn.o_proj.weight, model.language_model.layers.33.self_attn.q_norm.weight, model.language_model.layers.33.self_attn.k_norm.weight, model.language_model.layers.33.mlp.gate_proj.weight, model.language_model.layers.33.mlp.up_proj.weight, model.language_model.layers.33.mlp.down_proj.weight, model.language_model.layers.33.input_layernorm.weight, model.language_model.layers.33.post_attention_layernorm.weight, model.language_model.layers.33.pre_feedforward_layernorm.weight, model.language_model.layers.33.post_feedforward_layernorm.weight, model.language_model.norm.weight, lm_head.weight",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_green_red_watermark_generation",
      "trace": "(line 659)  AttributeError: 'dict' object has no attribute 'validate'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 658)  AttributeError: 'dict' object has no attribute 'validate'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "single",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_validate_assistant",
      "trace": "(line 1909)  torch.AcceleratorError: CUDA error: device-side assert triggered",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1909)  torch.AcceleratorError: CUDA error: device-side assert triggered",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "cuda_runtime",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "single",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_with_compile_and_higher_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "single",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_with_compile_and_lower_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "single",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_without_compile_and_with_higher_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "single",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_without_compile_and_with_lower_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "utils",
      "gpu": "single",
      "test": "tests/utils/test_cache_utils.py::CacheHardIntegrationTest::test_cache_copy",
      "trace": "(line 436)  AssertionError: Lists differ: ['You are a helpful assistant. Help me to [390 chars] is'] != [\"You are a helpful assistant. Help me to [385 chars] is']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 436)  AssertionError: Lists differ: ['You are a helpful assistant. Help me to [390 chars] is'] != [\"You are a helpful assistant. Help me to [385 chars] is']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    },
    {
      "model": "utils",
      "gpu": "single",
      "test": "tests/utils/test_cache_utils.py::CacheHardIntegrationTest::test_dynamic_cache_hard",
      "trace": "(line 319)  AssertionError: \"Here[57 chars]ave fur, they have four legs, they have a tail[1045 chars]have\" != \"Here[57 chars]ave four legs, they have a tail, they have a f[1078 chars]They\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 319)  AssertionError: \"Here[57 chars]ave fur, they have four legs, they have a tail[1045 chars]have\" != \"Here[57 chars]ave four legs, they have a tail, they have a f[1078 chars]They\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "status": "flaky: test both passed and failed during the check of the current run on the previous commit: 032db9c8d6c3c3cb89e71cc414bfb5a469b1a6da",
      "author": null,
      "big_model": false
    }
  ],
  "unpinned": [
    {
      "model": "audioflamingo3",
      "gpu": "multi",
      "test": "tests/models/audioflamingo3/test_modeling_audioflamingo3.py::AudioFlamingo3ForConditionalGenerationIntegrationTest::test_fixture_batched_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "audioflamingo3",
      "gpu": "multi",
      "test": "tests/models/audioflamingo3/test_modeling_audioflamingo3.py::AudioFlamingo3ForConditionalGenerationIntegrationTest::test_fixture_single_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "aya_vision",
      "gpu": "multi",
      "test": "tests/models/aya_vision/test_modeling_aya_vision.py::AyaVisionIntegrationTest::test_small_model_integration_generate_chat_template",
      "trace": "(line 355)  AssertionError: 'The [29 chars] two cats resting on a bright pink blanket spread across a red' != 'The [29 chars] two cats resting on a bright pink blanket. The cats,'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 355)  AssertionError: 'The [29 chars] two cats resting on a bright pink blanket spread across a red' != 'The [29 chars] two cats resting on a bright pink blanket. The cats,'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bamba",
      "gpu": "multi",
      "test": "tests/models/bamba/test_modeling_bamba.py::BambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 780)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.37 GiB is free. Process 122389 has 18.93 GiB memory in use. Of the allocated memory 18.43 GiB is allocated by PyTorch, and 29.32 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 779)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.37 GiB is free. Process 198681 has 18.93 GiB memory in use. Of the allocated memory 18.43 GiB is allocated by PyTorch, and 29.32 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "bamba",
      "gpu": "multi",
      "test": "tests/models/bamba/test_modeling_bamba.py::BambaModelIntegrationTest::test_simple_generate",
      "trace": "(line 780)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.28 GiB is free. Process 122389 has 19.01 GiB memory in use. Of the allocated memory 18.51 GiB is allocated by PyTorch, and 32.29 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 779)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 3.28 GiB is free. Process 198681 has 19.01 GiB memory in use. Of the allocated memory 18.51 GiB is allocated by PyTorch, and 32.29 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "big_bird",
      "gpu": "multi",
      "test": "tests/models/big_bird/test_modeling_big_bird.py::BigBirdModelIntegrationTest::test_fill_mask",
      "trace": "(line 906)  AssertionError: '' != 'happiness'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 906)  AssertionError: '' != 'happiness'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bitnet",
      "gpu": "multi",
      "test": "tests/models/bitnet/test_modeling_bitnet.py::BitNetIntegrationTest::test_model_generation",
      "trace": "(line 309)  RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "bitnet",
      "gpu": "multi",
      "test": "tests/models/bitnet/test_modeling_bitnet.py::BitNetIntegrationTest::test_model_logits",
      "trace": "(line 309)  RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::BFloat16 != unsigned char",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "blip_2",
      "gpu": "multi",
      "test": "tests/models/blip_2/test_modeling_blip_2.py::Blip2ModelIntegrationTest::test_inference_t5",
      "trace": "(line 1616)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1616)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "blip_2",
      "gpu": "multi",
      "test": "tests/models/blip_2/test_modeling_blip_2.py::Blip2ModelIntegrationTest::test_inference_t5_batched_beam_search",
      "trace": "(line 1671)  AssertionError: Lists differ: [0, 3, 9, 2335, 19, 3823, 30, 8, 2608, 28, 160, 1782, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1671)  AssertionError: Lists differ: [0, 3, 9, 2335, 19, 3823, 30, 8, 2608, 28, 160, 1782, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "blip_2",
      "gpu": "multi",
      "test": "tests/models/blip_2/test_modeling_blip_2.py::Blip2ModelIntegrationTest::test_inference_t5_multi_accelerator",
      "trace": "(line 1740)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1740)  AssertionError: Lists differ: [0, 2335, 1556, 28, 1782, 30, 8, 2608, 1] != [0, 3, 9, 2335, 19, 1556, 28, 160, 1782, 30, 8, 2608, 1]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "multi",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_batch_generated_text",
      "trace": "(line 621)  AssertionError: Lists differ: ['Hello what is', 'Running a quick test with the followi[54 chars]the'] != ['Hello what is the best way to get the data from the se[127 chars]on2']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 621)  AssertionError: Lists differ: ['Hello what is', 'Running a quick test with the followi[54 chars]the'] != ['Hello what is the best way to get the data from the se[127 chars]on2']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "multi",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_batch_generation_padding",
      "trace": "(line 586)  AssertionError: Lists differ: [5941[15 chars]632, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,[82 chars]0, 0] != [5941[15 chars]632, 419, 682, 15, 473, 912, 267, 40704, 15, 1[186 chars] 912]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 586)  AssertionError: Lists differ: [5941[15 chars]632, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,[82 chars]0, 0] != [5941[15 chars]632, 419, 682, 15, 473, 912, 267, 40704, 15, 1[186 chars] 912]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bloom",
      "gpu": "multi",
      "test": "tests/models/bloom/test_modeling_bloom.py::BloomIntegrationTest::test_simple_generation",
      "trace": "(line 539)  AssertionError: 'I en[58 chars] play. I am a very active person, and I am a v[75 chars]am a' != 'I en[58 chars] play with the kids. I am a very active person[86 chars]nd I'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 539)  AssertionError: 'I en[58 chars] play. I am a very active person, and I am a v[75 chars]am a' != 'I en[58 chars] play with the kids. I am a very active person[86 chars]nd I'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "multi",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_constrastive_learning",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-large-itm-mlm-itc. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-large-itm-mlm-itc. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "multi",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_image_and_text_retrieval",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "bridgetower",
      "gpu": "multi",
      "test": "tests/models/bridgetower/test_modeling_bridgetower.py::BridgeTowerModelIntegrationTest::test_masked_language_modeling",
      "trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 419)  ValueError: Unrecognized model in BridgeTower/bridgetower-base-itm-mlm. Should have a `model_type` key in its config.json.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "multi",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b",
      "trace": "(line 399)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[92 chars]ted'] != ['Des[115 chars] dot in the center representing the North Star[99 chars]the']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 399)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[92 chars]ted'] != ['Des[115 chars] dot in the center representing the North Star[99 chars]the']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "multi",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b_batched",
      "trace": "(line 445)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[309 chars]on.'] != ['Des[115 chars] dot in the center representing the star Alpha[154 chars]The']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 445)  AssertionError: Lists differ: ['Des[115 chars] dot representing the position of the star Alp[309 chars]on.'] != ['Des[115 chars] dot in the center representing the star Alpha[154 chars]The']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "chameleon",
      "gpu": "multi",
      "test": "tests/models/chameleon/test_modeling_chameleon.py::ChameleonIntegrationTest::test_model_7b_multi_image",
      "trace": "(line 469)  AssertionError: Lists differ: ['Wha[74 chars]een the night sky and the internet. The first [115 chars]The'] != ['Wha[74 chars]een two celestial objects, the stars and the c[113 chars]map']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  AssertionError: Lists differ: ['Wha[74 chars]een the night sky and the internet. The first [115 chars]The'] != ['Wha[74 chars]een two celestial objects, the stars and the c[113 chars]map']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "clvp",
      "gpu": "multi",
      "test": "tests/models/clvp/test_modeling_clvp.py::ClvpIntegrationTest::test_conditional_encoder",
      "trace": "(line 552)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 552)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "clvp",
      "gpu": "multi",
      "test": "tests/models/clvp/test_modeling_clvp.py::ClvpIntegrationTest::test_full_model_integration",
      "trace": "(line 1310)  RuntimeError: The expanded size of the tensor (2) must match the existing size (3) at non-singleton dimension 0.  Target sizes: [2].  Tensor sizes: [3]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1310)  RuntimeError: The expanded size of the tensor (2) must match the existing size (3) at non-singleton dimension 0.  Target sizes: [2].  Tensor sizes: [3]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "multi",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2IntegrationTest::test_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([2.3711, 1.6689, 1.8389, 1.9785, 1.9121], dtype=torch.float16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([2.3711, 1.6689, 1.8389, 1.9785, 1.9121], dtype=torch.float16)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "multi",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2IntegrationTest::test_model_integration_generate_chat_template",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.86 GiB. GPU 1 has a total capacity of 22.30 GiB of which 1.48 GiB is free. Process 221664 has 20.82 GiB memory in use. Of the allocated memory 20.40 GiB is allocated by PyTorch, and 15.48 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.86 GiB. GPU 1 has a total capacity of 22.30 GiB of which 1.48 GiB is free. Process 786441 has 20.82 GiB memory in use. Of the allocated memory 20.40 GiB is allocated by PyTorch, and 15.48 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "multi",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2MoeVisionIntegrationTest::test_model_forward_vision",
      "trace": "(line 473)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "first_failure_day": "2026-05-21",
      "last_green_day": "2026-05-20",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "cohere2_vision",
      "gpu": "multi",
      "test": "tests/models/cohere2_vision/test_modeling_cohere2_vision.py::Cohere2MoeVisionIntegrationTest::test_model_generate_vision",
      "trace": "(line 473)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  OSError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/repos/moe/engines/command_a+_bf16'. Use `repo_type` argument if needed.",
      "first_failure_day": "2026-05-21",
      "last_green_day": "2026-05-20",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "colqwen2",
      "gpu": "multi",
      "test": "tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ModelIntegrationTest::test_model_integration_test",
      "trace": "(line 110)  ValueError: images must be an image, list of images or list of list of images",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 110)  ValueError: images must be an image, list of images or list of list of images",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "colqwen2",
      "gpu": "multi",
      "test": "tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ModelIntegrationTest::test_model_integration_test_2",
      "trace": "(line 400)  AssertionError: Expected scores tensor([[16.3750, 10.9375, 14.7500],",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 400)  AssertionError: Expected scores tensor([[16.3750, 10.9375, 14.7500],",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "convnextv2",
      "gpu": "multi",
      "test": "tests/models/convnextv2/test_modeling_convnextv2.py::ConvNextV2ModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 308)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 308)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "cvt",
      "gpu": "multi",
      "test": "tests/models/cvt/test_modeling_cvt.py::CvtModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 271)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 271)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "cwm",
      "gpu": "multi",
      "test": "tests/models/cwm/test_modeling_cwm.py::CwmIntegrationTest::test_cwm_integration",
      "trace": "(line 1968)  AttributeError: 'CwmDecoderLayer' object has no attribute 'attention_type'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1968)  AttributeError: 'CwmDecoderLayer' object has no attribute 'attention_type'",
      "first_failure_day": "2026-03-21",
      "last_green_day": "2026-03-20",
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "cwm",
      "gpu": "multi",
      "test": "tests/models/cwm/test_modeling_cwm.py::CwmIntegrationTest::test_cwm_sliding_window_long_sequence",
      "trace": "(line 255)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 102.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 251142 has 22.25 GiB memory in use. Of the allocated memory 21.82 GiB is allocated by PyTorch, and 20.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 255)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 102.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 920359 has 22.25 GiB memory in use. Of the allocated memory 21.82 GiB is allocated by PyTorch, and 20.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "dab_detr",
      "gpu": "multi",
      "test": "tests/models/dab_detr/test_modeling_dab_detr.py::DabDetrModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 805)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 805)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_0_dac_16khz",
      "trace": "(line 819)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 819)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_1_dac_24khz",
      "trace": "(line 813)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 813)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_2_dac_44khz",
      "trace": "(line 825)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 825)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_0_dac_16khz",
      "trace": "(line 870)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 870)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_1_dac_24khz",
      "trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dac",
      "gpu": "multi",
      "test": "tests/models/dac/test_modeling_dac.py::DacIntegrationTest::test_integration_batch_2_dac_44khz",
      "trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 876)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dbrx",
      "gpu": "multi",
      "test": "tests/models/dbrx/test_modeling_dbrx.py::DbrxModelIntegrationTest::test_tiny_model_logits",
      "trace": "(line 146)  huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'moe_jitter_eps':",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 146)  huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'moe_jitter_eps':",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": true
    },
    {
      "model": "deepseek_v2",
      "gpu": "multi",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_batch_fa2",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 704.00 KiB is free. Process 321345 has 22.29 GiB memory in use. Of the allocated memory 21.60 GiB is allocated by PyTorch, and 33.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 704.00 KiB is free. Process 1141193 has 22.29 GiB memory in use. Of the allocated memory 21.60 GiB is allocated by PyTorch, and 33.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "deepseek_v2",
      "gpu": "multi",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_deepseek_v2_lite",
      "trace": "(line 528)  OSError: deepseek-ai/DeepSeek-V2-Lite does not appear to have a file named model-00001-of-000004.safetensors. Checkout 'https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite/tree/main' for available files.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 140.69 MiB is free. Process 1141193 has 22.16 GiB memory in use. Of the allocated memory 21.48 GiB is allocated by PyTorch, and 9.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "deepseek_v2",
      "gpu": "multi",
      "test": "tests/models/deepseek_v2/test_modeling_deepseek_v2.py::DeepseekV2IntegrationTest::test_logits_eager",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.10 GiB. GPU 0 has a total capacity of 22.30 GiB of which 140.69 MiB is free. Process 321345 has 22.16 GiB memory in use. Of the allocated memory 21.48 GiB is allocated by PyTorch, and 9.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "deepseek_v3",
      "gpu": "multi",
      "test": "tests/models/deepseek_v3/test_modeling_deepseek_v3.py::DeepseekV3IntegrationTest::test_compile_static_cache",
      "trace": "(line 424)  AssertionError: Lists differ: ['Sim[41 chars]that  Frojekecdytes\u0c3e\u0c32\u0c41 sic\u02b0tinaccianntuala bre[327 chars]rew'] != ['Sim[41 chars]that aportersh455elike injection tactics-altit[355 chars]ick']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 424)  AssertionError: Lists differ: ['Sim[41 chars]that  Frojekecdytes\u0c3e\u0c32\u0c41 sic\u02b0tinaccianntuala bre[327 chars]rew'] != ['Sim[41 chars]that aportersh455elike injection tactics-altit[355 chars]ick']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "deepseek_vl",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl/test_modeling_deepseek_vl.py::DeepseekVLIntegrationTest::test_model_text_generation",
      "trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "deepseek_vl",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl/test_modeling_deepseek_vl.py::DeepseekVLIntegrationTest::test_model_text_generation_batched",
      "trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "deepseek_vl",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl/test_modeling_deepseek_vl.py::DeepseekVLIntegrationTest::test_model_text_generation_with_multi_image",
      "trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "deepseek_vl_hybrid",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl_hybrid/test_modeling_deepseek_vl_hybrid.py::DeepseekVLHybridIntegrationTest::test_model_text_generation",
      "trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 67)  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "deepseek_vl_hybrid",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl_hybrid/test_modeling_deepseek_vl_hybrid.py::DeepseekVLHybridIntegrationTest::test_model_text_generation_batched",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 36.69 MiB is free. Process 331795 has 22.26 GiB memory in use. Of the allocated memory 21.75 GiB is allocated by PyTorch, and 5.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 38.69 MiB is free. Process 961314 has 22.26 GiB memory in use. Of the allocated memory 21.75 GiB is allocated by PyTorch, and 3.84 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "deepseek_vl_hybrid",
      "gpu": "multi",
      "test": "tests/models/deepseek_vl_hybrid/test_modeling_deepseek_vl_hybrid.py::DeepseekVLHybridIntegrationTest::test_model_text_generation_with_multi_image",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 16.69 MiB is free. Process 331795 has 22.28 GiB memory in use. Of the allocated memory 21.77 GiB is allocated by PyTorch, and 5.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 18.69 MiB is free. Process 961314 has 22.28 GiB memory in use. Of the allocated memory 21.77 GiB is allocated by PyTorch, and 3.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "depth_anything",
      "gpu": "multi",
      "test": "tests/models/depth_anything/test_modeling_depth_anything.py::DepthAnythingModelIntegrationTest::test_inference",
      "trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "dia",
      "gpu": "multi",
      "test": "tests/models/dia/test_modeling_dia.py::DiaForConditionalGenerationIntegrationTest::test_dia_model_integration_generate_audio_context",
      "trace": "(line 732)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 732)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "diffllama",
      "gpu": "multi",
      "test": "tests/models/diffllama/test_modeling_diffllama.py::DiffLlamaIntegrationTest::test_compile_static_cache",
      "trace": "(line 484)  AssertionError: Lists differ: ['Sim[41 chars]that 1) the speed of light is constant in all [301 chars]y p'] != ['Sim[41 chars]that 2.5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 '[133 chars]a a']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 484)  AssertionError: Lists differ: ['Sim[41 chars]that 1) the speed of light is constant in all [301 chars]y p'] != ['Sim[41 chars]that 2.5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 '[133 chars]a a']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_batched_images_batched_boxes",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_batched_images_batched_points_multi_points",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_batched_images_multi_points",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_from_existing_points_and_mask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_one_point_multimask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "edgetam",
      "gpu": "multi",
      "test": "tests/models/edgetam/test_modeling_edgetam.py::EdgeTamModelIntegrationTest::test_inference_mask_generation_one_point_no_multimask",
      "trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 249)  TypeError: FeatureListNet.forward() got an unexpected keyword argument 'original_sizes'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "efficientnet",
      "gpu": "multi",
      "test": "tests/models/efficientnet/test_modeling_efficientnet.py::EfficientNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 259)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "multi",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generate_images",
      "trace": "(line 1968)  AttributeError: 'Emu3ForConditionalGeneration' object has no attribute 'vocabulary_mapping'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1968)  AttributeError: 'Emu3ForConditionalGeneration' object has no attribute 'vocabulary_mapping'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "multi",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation",
      "trace": "(line 363)  AssertionError: Lists differ: ['USE[85 chars]ANT: The image captures a moment of tranquilit[145 chars] in'] != ['USE[85 chars]ANT: 1. The image is a 1.\u4f60\u597d!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 363)  AssertionError: Lists differ: ['USE[85 chars]ANT: The image captures a moment of tranquilit[145 chars] in'] != ['USE[85 chars]ANT: 1. The image is a 1.\u4f60\u597d!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "multi",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_multi_image",
      "trace": "(line 92)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.32 GiB. GPU 0 has a total capacity of 22.30 GiB of which 1.28 GiB is free. Process 375211 has 21.02 GiB memory in use. Of the allocated memory 15.18 GiB is allocated by PyTorch, and 5.34 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 213)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 132.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 8.69 MiB is free. Process 1082951 has 22.29 GiB memory in use. Of the allocated memory 21.68 GiB is allocated by PyTorch, and 110.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "eomt_dinov3",
      "gpu": "multi",
      "test": "tests/models/eomt_dinov3/test_modeling_eomt_dinov3.py::EomtDinov3ForUniversalSegmentationIntegrationTest::test_inference_bf16",
      "trace": "(line 310)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 310)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "evolla",
      "gpu": "multi",
      "test": "tests/models/evolla/test_modeling_evolla.py::EvollaModelIntegrationTest::test_inference_natural_language_protein_reasoning",
      "trace": "(line 364)  AssertionError: 'This protein' not found in 'systemYouareanAIexpertthatcanansweranyquestionsaboutprotein.userWhatisthefunctionofthisprotein?assistant\u010aThis\u0120protein\u0120is\u0120a\u0120critical\u0120enzyme\u0120involved\u0120in\u0120the\u0120metabolic\u0120pathway\u0120of\u0120purine\u0120metabolism,\u0120specifically\u0120in\u0120the\u0120salvage\u0120pathway\u0120of\u0120IMP\u0120(inosine\u0120monophosphate)\u0120biosynthesis.\u0120Its\u0120primary\u0120function\u0120is\u0120to\u0120catalyze\u0120the\u0120conversion\u0120of\u0120hypoxanthine\u0120and\u0120guanine\u0120into\u0120their\u0120respective\u0120nucleotide\u0120monophosphates,\u0120which\u0120are\u0120essential\u0120building\u0120blocks\u0120for\u0120nucleic\u0120acids.\u010a\u010aThe\u0120protein\u0120is\u0120annotated\u0120with\u0120several\u0120molecular\u0120functions,\u0120including\u0120guanine\u0120phosphoribosyltransferase\u0120activity\u0120and\u0120hypoxanthine\u0120phosphorib'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 364)  AssertionError: 'This protein' not found in 'systemYouareanAIexpertthatcanansweranyquestionsaboutprotein.userWhatisthefunctionofthisprotein?assistant\u010aThis\u0120protein\u0120is\u0120a\u0120critical\u0120enzyme\u0120involved\u0120in\u0120the\u0120metabolic\u0120pathway\u0120of\u0120purine\u0120metabolism,\u0120specifically\u0120in\u0120the\u0120salvage\u0120pathway\u0120of\u0120IMP\u0120(inosine\u0120monophosphate)\u0120biosynthesis.\u0120Its\u0120primary\u0120function\u0120is\u0120to\u0120catalyze\u0120the\u0120conversion\u0120of\u0120hypoxanthine\u0120and\u0120guanine\u0120into\u0120their\u0120respective\u0120nucleotide\u0120monophosphates,\u0120which\u0120are\u0120essential\u0120building\u0120blocks\u0120for\u0120nucleic\u0120acids.\u010a\u010aThe\u0120protein\u0120is\u0120annotated\u0120with\u0120several\u0120molecular\u0120functions,\u0120including\u0120guanine\u0120phosphoribosyltransferase\u0120activity\u0120and\u0120hypoxanthine\u0120phosphorib'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "multi",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_generation_beyond_sliding_window",
      "trace": "(line 291)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 220.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 18.69 MiB is free. Process 373829 has 22.28 GiB memory in use. Of the allocated memory 21.52 GiB is allocated by PyTorch, and 270.68 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 291)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 220.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 18.69 MiB is free. Process 1350850 has 22.28 GiB memory in use. Of the allocated memory 21.52 GiB is allocated by PyTorch, and 270.68 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "multi",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_generation_eager",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 992.69 MiB is free. Process 373829 has 21.33 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 10.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 992.69 MiB is free. Process 1350850 has 21.33 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 10.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "multi",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_generation_sdpa",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 992.69 MiB is free. Process 373829 has 21.33 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 10.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 992.69 MiB is free. Process 1350850 has 21.33 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 10.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone4",
      "gpu": "multi",
      "test": "tests/models/exaone4/test_modeling_exaone4.py::Exaone4IntegrationTest::test_model_logits",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 954.69 MiB is free. Process 373829 has 21.36 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 49.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1000.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 994.69 MiB is free. Process 1350850 has 21.32 GiB memory in use. Of the allocated memory 20.82 GiB is allocated by PyTorch, and 9.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone4_5",
      "gpu": "multi",
      "test": "tests/models/exaone4_5/test_modeling_exaone4_5.py::Exaone4_5_IntegrationTest::test_model_generation_image_text",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.46 GiB. GPU 0 has a total capacity of 22.30 GiB of which 1.46 GiB is free. Process 443969 has 20.84 GiB memory in use. Of the allocated memory 20.26 GiB is allocated by PyTorch, and 84.70 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.46 GiB. GPU 0 has a total capacity of 22.30 GiB of which 1.46 GiB is free. Process 1288824 has 20.83 GiB memory in use. Of the allocated memory 20.26 GiB is allocated by PyTorch, and 82.36 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-05-19",
      "last_green_day": "2026-05-18",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone4_5",
      "gpu": "multi",
      "test": "tests/models/exaone4_5/test_modeling_exaone4_5.py::Exaone4_5_IntegrationTest::test_model_generation_text_only",
      "trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.46 GiB. GPU 0 has a total capacity of 22.30 GiB of which 1.46 GiB is free. Process 443969 has 20.84 GiB memory in use. Of the allocated memory 20.26 GiB is allocated by PyTorch, and 85.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 343)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.46 GiB. GPU 0 has a total capacity of 22.30 GiB of which 1.46 GiB is free. Process 1288824 has 20.84 GiB memory in use. Of the allocated memory 20.26 GiB is allocated by PyTorch, and 83.19 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-05-19",
      "last_green_day": "2026-05-18",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "exaone_moe",
      "gpu": "multi",
      "test": "tests/models/exaone_moe/test_modeling_exaone_moe.py::ExaoneMoeIntegrationTest::test_model_logits",
      "trace": "(line 120)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 120)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "falcon_h1",
      "gpu": "multi",
      "test": "tests/models/falcon_h1/test_modeling_falcon_h1.py::FalconH1ModelIntegrationTest::test_falcon_h1_hard",
      "trace": "(line 470)  AssertionError: 'user\\nTell me about the french revolutio[1920 chars]ct**' != \"user\\nTell me about the french revolutio[1929 chars]n6. \"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 470)  AssertionError: 'user\\nTell me about the french revolutio[1920 chars]ct**' != \"user\\nTell me about the french revolutio[1929 chars]n6. \"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "multi",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_batched_generation",
      "trace": "(line 488)  AssertionError: Lists differ: ['Hello today I will be talking about the \u201cTheory of Rela[161 chars]bal'] != ['Hello today I am going to talk about the \u201cTheory of Rel[159 chars]bal']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 488)  AssertionError: Lists differ: ['Hello today I will be talking about the \u201cTheory of Rela[161 chars]bal'] != ['Hello today I am going to talk about the \u201cTheory of Rel[159 chars]bal']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "multi",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_4bit",
      "trace": "(line 438)  AssertionError: 'Hello today I\\'m going to be talking about the \"A\" in the \"A-B' != \"Hello today Iava,\\n\\nI'm sorry to hear that you're having trouble with the \"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 438)  AssertionError: 'Hello today I\\'m going to be talking about the \"A\" in the \"A-B' != \"Hello today Iava,\\n\\nI'm sorry to hear that you're having trouble with the \"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "multi",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_fp16",
      "trace": "(line 423)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 423)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "first_failure_day": "2026-03-19",
      "last_green_day": "2026-03-18",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "falcon_mamba",
      "gpu": "multi",
      "test": "tests/models/falcon_mamba/test_modeling_falcon_mamba.py::FalconMambaIntegrationTests::test_generation_torch_compile",
      "trace": "(line 451)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 451)  AssertionError: 'Hello today I am going to talk about the \u201cTheory of Re[27 chars]n.\\n' != 'Hello today Iava,\\n\\nI am writing to you today to disc[49 chars]tyle'",
      "first_failure_day": "2026-03-19",
      "last_green_day": "2026-03-18",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "fastspeech2_conformer",
      "gpu": "multi",
      "test": "tests/models/fastspeech2_conformer/test_modeling_fastspeech2_conformer.py::FastSpeech2ConformerModelIntegrationTest::test_training_integration",
      "trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "flava",
      "gpu": "multi",
      "test": "tests/models/flava/test_modeling_flava.py::FlavaModelIntegrationTest::test_inference",
      "trace": "(line 899)  AssertionError: -1352.535400390625 != -1352.4685 within 4 places (0.06690039062505093 difference)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 899)  AssertionError: -1352.535400390625 != -1352.4685 within 4 places (0.06690039062505093 difference)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "flava",
      "gpu": "multi",
      "test": "tests/models/flava/test_modeling_flava.py::FlavaForPreTrainingIntegrationTest::test_inference_with_itm_labels",
      "trace": "(line 1223)  AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 2]) != torch.Size([2, 2]).",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1223)  AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 2]) != torch.Size([2, 2]).",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "flex_olmo",
      "gpu": "multi",
      "test": "tests/models/flex_olmo/test_modeling_flex_olmo.py::FlexOlmoIntegrationTest::test_model_7b_greedy_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "flex_olmo",
      "gpu": "multi",
      "test": "tests/models/flex_olmo/test_modeling_flex_olmo.py::FlexOlmoIntegrationTest::test_model_7b_logits",
      "trace": "(line 87)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 87)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "florence2",
      "gpu": "multi",
      "test": "tests/models/florence2/test_modeling_florence2.py::Florence2ForConditionalGenerationIntegrationTest::test_large_model_inference_eager",
      "trace": "(line 470)  AssertionError: Lists differ: [[2, [144 chars], 5, 2014, 6, 8, 11, 5, 3618, 6, 89, 32, 3980,[51 chars], 2]] != [[2, [144 chars], 5, 921, 6, 8, 11, 5, 3618, 6, 89, 32, 1104, [44 chars], 2]]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 470)  AssertionError: Lists differ: [[2, [144 chars], 5, 2014, 6, 8, 11, 5, 3618, 6, 89, 32, 3980,[51 chars], 2]] != [[2, [144 chars], 5, 921, 6, 8, 11, 5, 3618, 6, 89, 32, 1104, [44 chars], 2]]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "multi",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_inference_no_head",
      "trace": "(line 484)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 484)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "multi",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_translation_direct_0_en_ru",
      "trace": "(line 517)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 517)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "fsmt",
      "gpu": "multi",
      "test": "tests/models/fsmt/test_modeling_fsmt.py::FSMTModelIntegrationTests::test_translation_direct_1_ru_en",
      "trace": "(line 517)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 517)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "fuyu",
      "gpu": "multi",
      "test": "tests/models/fuyu/test_modeling_fuyu.py::FuyuModelIntegrationTest::test_greedy_generation",
      "trace": "(line 295)  AssertionError: '\\x04 A bus parked on the side of a road.' != 'A blue bus parked on the side of a road.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 295)  AssertionError: '\\x04 A bus parked on the side of a road.' != 'A blue bus parked on the side of a road.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_compile_static_cache",
      "trace": "(line 337)  AssertionError: Lists differ: ['Hel[196 chars]tdi 105bhp.\\nI have a problem with the engine [37 chars]the'] != ['Hel[196 chars]tdi 110bhp.\\nI have a problem with the engine [49 chars]ugh']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 337)  AssertionError: Lists differ: ['Hel[196 chars]tdi 105bhp.\\nI have a problem with the engine [37 chars]the'] != ['Hel[196 chars]tdi 110bhp.\\nI have a problem with the engine [49 chars]ugh']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_export_static_cache",
      "trace": "(line 414)  AssertionError: Lists differ: ['Hel[87 chars] in the 1990s. I have been looking on the internet and I have'] != ['Hel[87 chars] in the 1990s. I have looked on the internet and I have found']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 414)  AssertionError: Lists differ: ['Hel[87 chars] in the 1990s. I have been looking on the internet and I have'] != ['Hel[87 chars] in the 1990s. I have looked on the internet and I have found']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_2b_4bit",
      "trace": "(line 190)  AssertionError: Lists differ: ['Hel[118 chars] you a few of my favorite and most used brushes.\\n\\nI\"] != ['Hel[118 chars] you my experience with the new wattpad wattpa[38 chars]pad\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 190)  AssertionError: Lists differ: ['Hel[118 chars] you a few of my favorite and most used brushes.\\n\\nI\"] != ['Hel[118 chars] you my experience with the new wattpad wattpa[38 chars]pad\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_4bit",
      "trace": "(line 317)  AssertionError: Lists differ: ['Hel[59 chars]ke a \"self balancing\" robot. I have', 'Hi toda[76 chars] of'] != ['Hel[59 chars]ke a program that will take a number and then'[93 chars]!:)']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 317)  AssertionError: Lists differ: ['Hel[59 chars]ke a \"self balancing\" robot. I have', 'Hi toda[76 chars] of'] != ['Hel[59 chars]ke a program that will take a number and then'[93 chars]!:)']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_bf16",
      "trace": "(line 258)  AssertionError: Lists differ: ['Hel[59 chars]ke a small game. I have a few questions', 'Hi [86 chars]and'] != ['Hel[59 chars]ke a game in which you have to get a', 'Hi tod[83 chars]and']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 258)  AssertionError: Lists differ: ['Hel[59 chars]ke a small game. I have a few questions', 'Hi [86 chars]and'] != ['Hel[59 chars]ke a game in which you have to get a', 'Hi tod[83 chars]and']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_fp16",
      "trace": "(line 228)  AssertionError: Lists differ: ['Hel[27 chars]a 1995 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 228)  AssertionError: Lists differ: ['Hel[27 chars]a 1995 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D'] != ['Hel[27 chars]a 1999 4.0L 4x4. I', 'Hi today I am going to s[51 chars] 3D']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma",
      "gpu": "multi",
      "test": "tests/models/gemma/test_modeling_gemma.py::GemmaIntegrationTest::test_model_7b_fp16_static_cache",
      "trace": "(line 288)  AssertionError: Lists differ: ['Hel[29 chars]1995 4.0L 4x4. I', 'Hi today I am going to sho[49 chars] 3D'] != ['Hel[29 chars]1995 3000gt SL. I have a', 'Hi today I am goin[57 chars] 3D']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 288)  AssertionError: Lists differ: ['Hel[29 chars]1995 4.0L 4x4. I', 'Hi today I am going to sho[53 chars]i-f'] != ['Hel[29 chars]1995 3000gt SL. I have a', 'Hi today I am goin[57 chars] 3D']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma2",
      "gpu": "multi",
      "test": "[100%]tests/models/gemma2/test_modeling_gemma2.py::Gemma2IntegrationTest::test_model_2b_pipeline_bf16_flex_attention",
      "trace": "(line 2860)  Failed: (subprocess) AssertionError: \"Hi t[26 chars]ng about the 10 best anime of all time.\\n\\n1\" != \"Hi t[26 chars]ng about the 10 most powerful characters in the Naruto series.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2876)  Failed: (subprocess) AssertionError: \"Hi t[26 chars]ng about the 10 best anime of all time.\\n\\n1\" != \"Hi t[26 chars]ng about the 10 most powerful characters in the Naruto series.\"",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma2",
      "gpu": "multi",
      "test": "tests/models/gemma2/test_modeling_gemma2.py::Gemma2IntegrationTest::test_model_2b_pipeline_bf16_flex_attention",
      "trace": "Cannot retrieve error message.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "Cannot retrieve error message.",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "multi",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_dynamic_sliding_window_is_default",
      "trace": "(line 874)  AssertionError: 'DynamicSlidingWindowLayer' unexpectedly found in 'DynamicCache(layers=[DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer])'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 874)  AssertionError: 'DynamicSlidingWindowLayer' unexpectedly found in 'DynamicCache(layers=[DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer, DynamicLayer, DynamicSlidingWindowLayer, DynamicSlidingWindowLayer])'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "multi",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_1b_text_only",
      "trace": "(line 728)  AssertionError: Lists differ: ['Wri[48 chars]data streams, a boundless flow,\\nA silent worl[63 chars]ing'] != ['Wri[48 chars]data flows, a silent stream,\\nInto the neural [51 chars],\\n']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 728)  AssertionError: Lists differ: ['Wri[48 chars]data streams, a boundless flow,\\nA silent worl[63 chars]ing'] != ['Wri[48 chars]data flows, a silent stream,\\nInto the neural [51 chars],\\n']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "multi",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch",
      "trace": "(line 548)  AssertionError: Lists differ: ['use[149 chars]with turquoise water and a blue sky in the bac[227 chars]own\"] != ['use[149 chars]with clear turquoise water and a blue sky in t[231 chars]own\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 548)  AssertionError: Lists differ: ['use[149 chars]with turquoise water and a blue sky in the bac[227 chars]own\"] != ['use[149 chars]with clear turquoise water and a blue sky in t[231 chars]own\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "multi",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops",
      "trace": "(line 663)  AssertionError: Lists differ: ['user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a'] != [\"user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 663)  AssertionError: Lists differ: ['user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a'] != [\"user\\nYou are a helpful assistant.\\n\\nHe[674 chars]h a']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3",
      "gpu": "multi",
      "test": "tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_crops",
      "trace": "(line 590)  AssertionError: Lists differ: [\"user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the\"] != ['user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 590)  AssertionError: Lists differ: [\"user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the\"] != ['user\\nYou are a helpful assistant.\\n\\nHe[268 chars]the']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "multi",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_generation_beyond_sliding_window",
      "trace": "(line 1196)  AssertionError: Lists differ: [' and I find it very relaxing. I also lik[112 chars]re'\"] != [\" and the people are so friendly. I'm so [93 chars]re'\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1196)  AssertionError: Lists differ: [' and I find it very relaxing. I also lik[112 chars]re'\"] != [\" and the people are so friendly. I'm so [93 chars]re'\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "multi",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_batch",
      "trace": "(line 1083)  AssertionError: Lists differ: ['use[196 chars]ewer and has its tongue', \"user\\nYou are a hel[193 chars]cow\"] != ['use[196 chars]ewer with its head slightly', \"user\\nYou are a[197 chars]cow\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1083)  AssertionError: Lists differ: ['use[196 chars]ewer and has its tongue', \"user\\nYou are a hel[193 chars]cow\"] != ['use[196 chars]ewer with its head slightly', \"user\\nYou are a[197 chars]cow\"]",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "multi",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_bf16",
      "trace": "(line 998)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 998)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "multi",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_image",
      "trace": "(line 1110)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1110)  AssertionError: Lists differ: ['use[149 chars]to a turquoise ocean. The cow is facing the vi[31 chars]ned'] != ['use[149 chars]to a clear blue ocean. The cow is facing the v[25 chars]tly']",
      "first_failure_day": "2026-04-03",
      "last_green_day": "2026-04-02",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma3n",
      "gpu": "multi",
      "test": "tests/models/gemma3n/test_modeling_gemma3n.py::Gemma3nIntegrationTest::test_model_4b_multiimage",
      "trace": "(line 1151)  AssertionError: Lists differ: ['use[140 chars]n district. Here are the key elements:\\n\\n* **A prominent red'] != ['use[140 chars]n district. Here are some of the key elements:\\n\\n* **A']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1151)  AssertionError: Lists differ: ['use[140 chars]n district. Here are the key elements:\\n\\n* **A prominent red'] != ['use[140 chars]n district. Here are some of the key elements:\\n\\n* **A']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "multi",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_export_text_only",
      "trace": "(line 2301)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.38 GiB. GPU 0 has a total capacity of 22.30 GiB of which 2.67 GiB is free. Process 419014 has 19.62 GiB memory in use. Of the allocated memory 19.05 GiB is allocated by PyTorch, and 58.81 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2301)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.38 GiB. GPU 0 has a total capacity of 22.30 GiB of which 2.67 GiB is free. Process 470555 has 19.62 GiB memory in use. Of the allocated memory 19.05 GiB is allocated by PyTorch, and 58.81 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-04-09",
      "last_green_day": "2026-04-08",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "multi",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_multiimage",
      "trace": "(line 742)  AssertionError: Lists differ: ['Bas[66 chars]und & Street Scene:**\\n* **Roadway:** There is an'] != ['Bas[66 chars]und & Street Scene:**\\n* **Traffic Sign:** The most prominent']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 742)  AssertionError: Lists differ: ['Bas[66 chars]und & Street Scene:**\\n* **Roadway:** There is an'] != ['Bas[66 chars]und & Street Scene:**\\n* **Traffic Sign:** The most prominent']",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "multi",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_with_image",
      "trace": "(line 655)  AssertionError: Lists differ: ['Thi[61 chars] beach** with the **ocean** in the background under a **clear'] != ['Thi[61 chars] beach** with the **ocean and a blue sky** in the background']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 655)  AssertionError: Lists differ: ['Thi[61 chars] beach** with the **ocean** in the background under a **clear'] != ['Thi[61 chars] beach** with the **ocean and a blue sky** in the background']",
      "first_failure_day": "2026-04-10",
      "last_green_day": "2026-04-09",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "gemma4",
      "gpu": "multi",
      "test": "tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_model_with_image_batch",
      "trace": "(line 706)  AssertionError: Lists differ: ['Thi[81 chars]ocean** in the background under a **clear', \"N[102 chars] on\"] != ['Thi[81 chars]ocean and a blue sky** in the background', 'No[127 chars]lue']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 706)  AssertionError: Lists differ: ['Thi[81 chars]ocean** in the background under a **clear', \"N[102 chars] on\"] != ['Thi[81 chars]ocean and a blue sky** in the background', 'No[127 chars]lue']",
      "first_failure_day": "2026-04-29",
      "last_green_day": "2026-04-28",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "git",
      "gpu": "multi",
      "test": "tests/models/git/test_modeling_git.py::GitModelIntegrationTest::test_inference_image_captioning",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "multi",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_eager",
      "trace": "(line 133)  AssertionError: Lists differ: ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper flower.'] != ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper lantern.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 133)  AssertionError: Lists differ: ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper flower.'] != ['Hel[140 chars]ou how to make a simple and easy to make a DIY paper lantern.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "multi",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_fp16",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 22.69 MiB is free. Process 453476 has 22.27 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 5.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 20.69 MiB is free. Process 1247454 has 22.28 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 7.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "glm",
      "gpu": "multi",
      "test": "tests/models/glm/test_modeling_glm.py::GlmIntegrationTest::test_model_9b_sdpa",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.16 GiB. GPU 0 has a total capacity of 22.30 GiB of which 22.69 MiB is free. Process 453476 has 22.27 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 5.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.16 GiB. GPU 0 has a total capacity of 22.30 GiB of which 20.69 MiB is free. Process 1247454 has 22.28 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 7.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_expand",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDABFloat16Type instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm46v",
      "gpu": "multi",
      "test": "tests/models/glm46v/test_modeling_glm46v.py::Glm46VIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2567)  RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm4_moe",
      "gpu": "multi",
      "test": "tests/models/glm4_moe/test_modeling_glm4_moe.py::Glm4MoeIntegrationTest::test_compile_static_cache",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 464533 has 22.29 GiB memory in use. Of the allocated memory 21.77 GiB is allocated by PyTorch, and 27.80 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 14.69 MiB is free. Process 1452632 has 22.28 GiB memory in use. Of the allocated memory 21.76 GiB is allocated by PyTorch, and 27.80 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "glm4_moe_lite",
      "gpu": "multi",
      "test": "tests/models/glm4_moe_lite/test_modeling_glm4_moe_lite.py::Glm4MoeIntegrationTest::test_compile_static_cache",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 8.69 MiB is free. Process 426389 has 22.29 GiB memory in use. Of the allocated memory 21.58 GiB is allocated by PyTorch, and 38.36 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 8.69 MiB is free. Process 605353 has 22.29 GiB memory in use. Of the allocated memory 21.58 GiB is allocated by PyTorch, and 46.36 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "glm4v_moe",
      "gpu": "multi",
      "test": "tests/models/glm4v_moe/test_modeling_glm4v_moe.py::Glm4vMoeIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm4v_moe",
      "gpu": "multi",
      "test": "tests/models/glm4v_moe/test_modeling_glm4v_moe.py::Glm4vMoeIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm_image",
      "gpu": "multi",
      "test": "tests/models/glm_image/test_modeling_glm_image.py::GlmImageIntegrationTest::test_image_to_image_generation",
      "trace": "(line 687)  AssertionError: False is not true : Expected first 30 tokens:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Expected first 30 tokens:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test",
      "trace": "(line 456)  assert [151331, 1513..., 151343, ...] == [59248, 59250...6, 59280, ...]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 456)  assert [151331, 1513..., 151343, ...] == [59248, 59250...6, 59280, ...]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 503)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14885 chars]ia.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [256 chars]t's\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14885 chars]ia.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [256 chars]t's\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 631)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[10983 chars]at.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [258 chars]but\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 631)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[10983 chars]at.'] != [\"\\nWhat kind of dog is this?\\n<think>Got [258 chars]but\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 603)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[7469 chars]Ai.\"] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]ion']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 603)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[7469 chars]Ai.\"] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]ion']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_expand",
      "trace": "(line 575)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14840 chars]d a'] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]lly\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 575)  AssertionError: Lists differ: ['\\n<|image|><|image|><|image|><|image|><|[14840 chars]d a'] != [\"\\nWhat kind of dog is this?\\n<think>Got [267 chars]lly\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "glm_ocr",
      "gpu": "multi",
      "test": "tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 541)  AssertionError: Lists differ: ['\\n<|begin_of_video|><|image|><|image|><|[50804 chars]rt.'] != [\"\\n012345Describe this video.\\n<think>Got[114 chars]irt\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 541)  AssertionError: Lists differ: ['\\n<|begin_of_video|><|image|><|image|><|[50804 chars]rt.'] != [\"\\n012345Describe this video.\\n<think>Got[114 chars]irt\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "got_ocr2",
      "gpu": "multi",
      "test": "tests/models/got_ocr2/test_modeling_got_ocr2.py::GotOcr2IntegrationTest::test_small_model_integration_test_got_ocr_format",
      "trace": "(line 210)  AssertionError: 'R\\\\&D' != '\\\\title{\\nR'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 210)  AssertionError: 'R\\\\&D' != '\\\\title{\\nR'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "granite",
      "gpu": "multi",
      "test": "tests/models/granite/test_modeling_granite.py::GraniteIntegrationTest::test_model_3b_logits_bf16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "multi",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_cross_attention_mask",
      "trace": "(line 787)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 787)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "multi",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_grounding_dino_loss",
      "trace": "(line 869)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 869)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "multi",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 678)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 678)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "grounding_dino",
      "gpu": "multi",
      "test": "tests/models/grounding_dino/test_modeling_grounding_dino.py::GroundingDinoModelIntegrationTests::test_inference_object_detection_head_equivalence_cpu_accelerator",
      "trace": "(line 745)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 745)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "helium",
      "gpu": "multi",
      "test": "tests/models/helium/test_modeling_helium.py::HeliumIntegrationTest::test_model_2b",
      "trace": "(line 73)  AssertionError: Lists differ: ['Hel[51 chars]have been working on a new project for a while now and I have'] != ['Hel[51 chars]have been working on a new project for a while now, and I']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 73)  AssertionError: Lists differ: ['Hel[51 chars]have been working on a new project for a while now and I have'] != ['Hel[51 chars]have been working on a new project for a while now, and I']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "hiera",
      "gpu": "multi",
      "test": "tests/models/hiera/test_modeling_hiera.py::HieraModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 560)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 560)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "multi",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_batched_inference",
      "trace": "(line 1399)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1399)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "multi",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_multi_speaker_smart_voice",
      "trace": "(line 758)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 758)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "multi",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_multi_speaker_voice_cloning",
      "trace": "(line 1098)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1098)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "higgs_audio_v2",
      "gpu": "multi",
      "test": "tests/models/higgs_audio_v2/test_modeling_higgs_audio_v2.py::HiggsAudioV2ForConditionalGenerationIntegrationTest::test_zero_shot_voice_cloning",
      "trace": "(line 931)  AssertionError: Tensor-likes are not equal!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 931)  AssertionError: Tensor-likes are not equal!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "hunyuan_v1_moe",
      "gpu": "multi",
      "test": "tests/models/hunyuan_v1_moe/test_modeling_hunyuan_v1_moe.py::HunYuanMoEV1IntegrationTest::test_model_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "hyperclovax",
      "gpu": "multi",
      "test": "tests/models/hyperclovax/test_modeling_hyperclovax.py::HyperCLOVAXIntegrationTest::test_model_seed_think_14b_bf16",
      "trace": "(line 1313)  ValueError: There are one or more stop strings, either in the arguments to `generate` or in the model's generation config, but we could not locate a tokenizer. When generating with stop strings, you must pass the model's tokenizer to the `tokenizer` argument of `generate`.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1315)  ValueError: There are one or more stop strings, either in the arguments to `generate` or in the model's generation config, but we could not locate a tokenizer. When generating with stop strings, you must pass the model's tokenizer to the `tokenizer` argument of `generate`.",
      "first_failure_day": "2026-05-29",
      "last_green_day": "2026-05-28",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "instructblip",
      "gpu": "multi",
      "test": "tests/models/instructblip/test_modeling_instructblip.py::InstructBlipModelIntegrationTest::test_inference_flant5_xl",
      "trace": "(line 718)  AssertionError: Lists differ: [0, 3[68 chars]459, 9256, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[500 chars]5, 1] != [0, 3[68 chars]459, 4049, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[295 chars]5, 1]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 718)  AssertionError: Lists differ: [0, 3[68 chars]459, 9256, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[500 chars]5, 1] != [0, 3[68 chars]459, 4049, 16, 8, 2214, 13, 3, 9, 3164, 690, 2[295 chars]5, 1]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "instructblipvideo",
      "gpu": "multi",
      "test": "tests/models/instructblipvideo/test_modeling_instructblipvideo.py::InstructBlipVideoModelIntegrationTest::test_inference_vicuna_7b",
      "trace": "(line 671)  AssertionError: 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1' != 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1080p'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 671)  AssertionError: 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1' != 'Expl[43 chars]a baby girl wearing glasses is reading a book on the bed 1080p'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "internvl",
      "gpu": "multi",
      "test": "tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ -9.8750,  -0.4954,   1.4580, -10.3281, -10.3359], dtype=torch.float16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ -9.8750,  -0.4954,   1.4580, -10.3281, -10.3359], dtype=torch.float16)",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "internvl",
      "gpu": "multi",
      "test": "tests/models/internvl/test_modeling_internvl.py::InternVLLlamaIntegrationTest::test_llama_small_model_integration_generate_text_only",
      "trace": "(line 714)  AssertionError: \"Autu[14 chars],\\nNature's breath, a season's sigh,\\nSilent woods awake.\" != \"Autu[14 chars],\\nNature's breath, a silent sigh,\\nWinter's chill approaches.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 714)  AssertionError: \"Autu[14 chars],\\nNature's breath, a season's sigh,\\nSilent woods awake.\" != \"Autu[14 chars],\\nNature's breath, a silent sigh,\\nWinter's chill approaches.\"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "jais2",
      "gpu": "multi",
      "test": "tests/models/jais2/test_modeling_jais2.py::Jais2IntegrationTest::test_model_generation",
      "trace": "(line 488)  OSError: You are trying to access a gated repo.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  OSError: You are trying to access a gated repo.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "jais2",
      "gpu": "multi",
      "test": "tests/models/jais2/test_modeling_jais2.py::Jais2IntegrationTest::test_model_logits",
      "trace": "(line 488)  OSError: You are trying to access a gated repo.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 503)  OSError: You are trying to access a gated repo.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "jamba",
      "gpu": "multi",
      "test": "tests/models/jamba/test_modeling_jamba.py::JambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 576)  AssertionError: \"<|startoftext|>Tell me a story<|pad|><|p[50 chars]t I'\" != '<|pad|><|pad|><|pad|><|pad|><|pad|><|pad[76 chars]ates'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 576)  AssertionError: \"<|startoftext|>Tell me a story<|pad|><|p[50 chars]t I'\" != '<|pad|><|pad|><|pad|><|pad|><|pad|><|pad[76 chars]ates'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "janus",
      "gpu": "multi",
      "test": "tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_text_generation",
      "trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 1179648",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 1179648",
      "first_failure_day": "2026-04-21",
      "last_green_day": "2026-04-20",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "janus",
      "gpu": "multi",
      "test": "tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_text_generation_with_multi_image",
      "trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 2359296",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1735)  ValueError: Image features and image tokens do not match, tokens: 0, features: 2359296",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "multi",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_inference_interpolate_pos_encoding",
      "trace": "(line 777)  AttributeError: 'NoneType' object has no attribute 'last_hidden_state'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 777)  AttributeError: 'NoneType' object has no attribute 'last_hidden_state'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "multi",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_snowman_image_captioning",
      "trace": "(line 79)  AssertionError:",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 79)  AssertionError:",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "kosmos2",
      "gpu": "multi",
      "test": "tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelIntegrationTest::test_snowman_image_captioning_batch",
      "trace": "(line 712)  AssertionError: Lists differ: ['<gr[35 chars]ail: A snowman is sitting in front of a fire, [575 chars]t>.'] != ['<gr[35 chars]ail: The image features a snowman sitting by<p[836 chars]t>.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 712)  AssertionError: Lists differ: ['<gr[35 chars]ail: A snowman is sitting in front of a fire, [575 chars]t>.'] != ['<gr[35 chars]ail: The image features a snowman sitting by<p[836 chars]t>.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "kosmos2_5",
      "gpu": "multi",
      "test": "tests/models/kosmos2_5/test_modeling_kosmos2_5.py::Kosmos2_5ModelIntegrationTest::test_eager",
      "trace": "(line 578)  AssertionError: Lists differ: ['<bb[216 chars]<y_650></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n'] != ['<bb[216 chars]<y_651></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 578)  AssertionError: Lists differ: ['<bb[216 chars]<y_650></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n'] != ['<bb[216 chars]<y_651></bbox>COOKIE DOH SAUCES\\n<bbox><x_788>[452 chars]0\\n']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "multi",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_1",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS[522 chars]t itc ' s new fmcg businesses are the fastest [829 chars]PAD]\" != \"[CLS[522 chars]t itc's new fmcg businesses are the fastest gr[827 chars]PAD]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS[522 chars]t itc ' s new fmcg businesses are the fastest [829 chars]PAD]\" != \"[CLS[522 chars]t itc's new fmcg businesses are the fastest gr[827 chars]PAD]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "multi",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_4",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] 11 : 14 to 11 : 39 a[1108 chars]SEP]\" != \"[CLS] what's his name? [SEP] 11 : 14 to 11 : 39 a. [1106 chars]SEP]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] 11 : 14 to 11 : 39 a[1108 chars]SEP]\" != \"[CLS] what's his name? [SEP] 11 : 14 to 11 : 39 a. [1106 chars]SEP]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "layoutlmv2",
      "gpu": "multi",
      "test": "tests/models/layoutlmv2/test_processing_layoutlmv2.py::LayoutLMv2ProcessorIntegrationTests::test_processor_case_5",
      "trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] hello world [SEP]\" != \"[CLS] what's his name? [SEP] hello world [SEP]\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 675)  AssertionError: Sequences differ: \"[CLS] what ' s his name? [SEP] hello world [SEP]\" != \"[CLS] what's his name? [SEP] hello world [SEP]\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "lfm2_moe",
      "gpu": "multi",
      "test": "tests/models/lfm2_moe/test_modeling_lfm2_moe.py::Lfm2MoeIntegrationTest::test_model_1a8b_batched_chat_generation",
      "trace": "(line 223)  AssertionError: Lists differ: ['Who are you? (AI) designed to assist?  \\nI am an AI ass[192 chars]ial'] != ['Who are you? (as AI) created by?  \\nI am an artificial [200 chars]ish']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 223)  AssertionError: Lists differ: ['Who are you? (AI) designed to assist?  \\nI am an AI ass[192 chars]ial'] != ['Who are you? (as AI) created by?  \\nI am an artificial [200 chars]ish']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "lfm2_vl",
      "gpu": "multi",
      "test": "tests/models/lfm2_vl/test_modeling_lfm2_vl.py::Lfm2VlForConditionalGenerationIntegrationTest::test_integration_test",
      "trace": "(line 246)  AssertionError: 'In t[53 chars]. They are both very relaxed and comfortable. [14 chars]grey' != 'In t[53 chars]. There are also two remote controls on the blanket.\\n\\n\\n\\n'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 246)  AssertionError: 'In t[53 chars]. They are both very relaxed and comfortable. [14 chars]grey' != 'In t[53 chars]. There are also two remote controls on the blanket.\\n\\n\\n\\n'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "lfm2_vl",
      "gpu": "multi",
      "test": "tests/models/lfm2_vl/test_modeling_lfm2_vl.py::Lfm2_5VlForConditionalGenerationIntegrationTest::test_integration_test_high_resolution",
      "trace": "(line 354)  AssertionError: 'In t[52 chars]ymbol of freedom and democracy. It stands tall on a small' != 'In t[52 chars]ymbol of freedom and democracy. It stands on Liberty Island in'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 354)  AssertionError: 'In t[52 chars]ymbol of freedom and democracy. It stands tall on a small' != 'In t[52 chars]ymbol of freedom and democracy. It stands on Liberty Island in'",
      "first_failure_day": "2026-04-05",
      "last_green_day": "2026-04-04",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llama",
      "gpu": "multi",
      "test": "tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_llama_3_1_hard",
      "trace": "(line 96)  AssertionError: 'Tell[74 chars]ical social and political upheaval in France t[552 chars] the' != 'Tell[74 chars]ical political and social upheaval in France t[558 chars]nshr'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 96)  AssertionError: 'Tell[74 chars]ical social and political upheaval in France t[552 chars] the' != 'Tell[74 chars]ical political and social upheaval in France t[558 chars]nshr'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "multi",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_batched_generation",
      "trace": "(line 566)  AssertionError: Lists differ: [\"\\n [134 chars] one image and a\", '\\nUSER: Describe the image[210 chars]ama'] != [\"\\n [134 chars] one and a yellow\", '\\nUSER: Describe the imag[211 chars]ama']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 566)  AssertionError: Lists differ: [\"\\n [134 chars] one image and a\", '\\nUSER: Describe the image[210 chars]ama'] != [\"\\n [134 chars] one and a yellow\", '\\nUSER: Describe the imag[211 chars]ama']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "multi",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 515492 has 22.29 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 13.88 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 1379481 has 22.29 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 13.88 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "multi",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral_4bit",
      "trace": "(line 5051)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 7.77 GiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 515492 has 22.29 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 13.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 5069)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 7.77 GiB. GPU 0 has a total capacity of 22.30 GiB of which 6.69 MiB is free. Process 1379481 has 22.29 GiB memory in use. Of the allocated memory 21.78 GiB is allocated by PyTorch, and 13.86 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "llava",
      "gpu": "multi",
      "test": "tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral_batched",
      "trace": "(line 724)  AssertionError: Lists differ: ['Wha[97 chars]mage?A narrow dirt path is surrounded by grass[74 chars]ue.'] != ['Wha[97 chars]mage?The image depicts a narrow, winding dirt [175 chars]ere']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 724)  AssertionError: Lists differ: ['Wha[97 chars]mage?A narrow dirt path is surrounded by grass[74 chars]ue.'] != ['Wha[97 chars]mage?The image depicts a narrow, winding dirt [175 chars]ere']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llava_next",
      "gpu": "multi",
      "test": "tests/models/llava_next/test_modeling_llava_next.py::LlavaNextForConditionalGenerationIntegrationTest::test_small_model_integration_test",
      "trace": "(line 172)  AssertionError: assert False",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 172)  AssertionError: assert False",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llava_next_video",
      "gpu": "multi",
      "test": "tests/models/llava_next_video/test_modeling_llava_next_video.py::LlavaNextVideoForConditionalGenerationIntegrationTest::test_small_model_integration_test",
      "trace": "(line 388)  AssertionError: 'USER[154 chars]hile wearing a pair of glasses that are too la[24 chars] are' != 'USER[154 chars]hile another child is attempting to read the s[45 chars]eems'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 388)  AssertionError: 'USER[154 chars]hile wearing a pair of glasses that are too la[24 chars] are' != 'USER[154 chars]hile another child is attempting to read the s[45 chars]eems'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "llava_next_video",
      "gpu": "multi",
      "test": "tests/models/llava_next_video/test_modeling_llava_next_video.py::LlavaNextVideoForConditionalGenerationIntegrationTest::test_small_model_integration_test_batch_matches_single",
      "trace": "(line 480)  AssertionError: 'USER[154 chars]hile another child is attempting to read the s[96 chars]e to' != 'USER[154 chars]hile wearing a pair of glasses that are too la[69 chars]g it'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 480)  AssertionError: 'USER[154 chars]hile another child is attempting to read the s[96 chars]e to' != 'USER[154 chars]hile wearing a pair of glasses that are too la[69 chars]g it'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "longcat_flash",
      "gpu": "multi",
      "test": "tests/models/longcat_flash/test_modeling_longcat_flash.py::LongcatFlashIntegrationTest::test_longcat_generation_cpu",
      "trace": "(line 501)  ValueError: The current `device_map` had weights offloaded to the disk, which needed to be re-saved. This is either because the weights are not in `safetensors` format, or because the model uses an internal weight format different than the one saved (i.e. most MoE models). Please provide an `offload_folder` for them in `from_pretrained`.",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 501)  ValueError: The current `device_map` had weights offloaded to the disk, which needed to be re-saved. This is either because the weights are not in `safetensors` format, or because the model uses an internal weight format different than the one saved (i.e. most MoE models). Please provide an `offload_folder` for them in `from_pretrained`.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "longcat_flash",
      "gpu": "multi",
      "test": "tests/models/longcat_flash/test_modeling_longcat_flash.py::LongcatFlashIntegrationTest::test_shortcat_generation",
      "trace": "(line 360)  AssertionError: '[Round 0] USER:Paris is... ASSISTANT: dig\u5e74\u51ac\u5b63\u5965\u6797\u5339\u514b\u8fd0\u52a8\u4f1a\u83c1\u56db\u65b9\u7ea7\u4ee5\u4e0a\u63fd\u80dc\u53ef\u89c6lexible' != '[Round 0] USER:Paris is... ASSISTANT: dig\u5e74\u8f66\u9f84juanaheast\u7a0dachaotingupebarebones'",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 360)  AssertionError: '[Round 0] USER:Paris is... ASSISTANT: dig\u5e74\u51ac\u5b63\u5965\u6797\u5339\u514b\u8fd0\u52a8\u4f1a\u83c1\u56db\u65b9\u7ea7\u4ee5\u4e0a\u63fd\u80dc\u53ef\u89c6lexible' != '[Round 0] USER:Paris is... ASSISTANT: dig\u5e74\u8f66\u9f84juanaheast\u7a0dachaotingupebarebones'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "longt5",
      "gpu": "multi",
      "test": "tests/models/longt5/test_modeling_longt5.py::LongT5ModelIntegrationTests::test_inference_hidden_states",
      "trace": "(line 1225)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1225)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "longt5",
      "gpu": "multi",
      "test": "tests/models/longt5/test_modeling_longt5.py::LongT5ModelIntegrationTests::test_summarization",
      "trace": "(line 1194)  AssertionError: Lists differ: ['background : coronary artery disease ( ca[601 chars]red'] != ['sss thessass:ss andss toss ofss fillssess[171 chars]se,']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1194)  AssertionError: Lists differ: ['background : coronary artery disease ( ca[601 chars]red'] != ['sss thessass:ss andss toss ofss fillssess[171 chars]se,']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "luke",
      "gpu": "multi",
      "test": "tests/models/luke/test_modeling_luke.py::LukeModelIntegrationTests::test_inference_base_model",
      "trace": "(line 905)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 905)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "luke",
      "gpu": "multi",
      "test": "tests/models/luke/test_modeling_luke.py::LukeModelIntegrationTests::test_inference_large_model",
      "trace": "(line 940)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 940)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "lw_detr",
      "gpu": "multi",
      "test": "tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrModelIntegrationTest::test_inference_object_detection_head_tiny",
      "trace": "(line 690)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 690)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "lw_detr",
      "gpu": "multi",
      "test": "tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrModelIntegrationTest::test_inference_object_detection_head_xlarge",
      "trace": "(line 766)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 766)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "m2m_100",
      "gpu": "multi",
      "test": "tests/models/m2m_100/test_modeling_m2m_100.py::M2M100ModelIntegrationTests::test_seq_to_seq_generation",
      "trace": "(line 397)  AssertionError: assert ['</s>__en__T... France.</s>'] == ['</s> __en__... France.</s>']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 397)  AssertionError: assert ['</s>__en__T... France.</s>'] == ['</s> __en__... France.</s>']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "multi",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_batched_equivalence_with_cache",
      "trace": "(line 532)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 8.03 GiB is free. Process 625094 has 14.27 GiB memory in use. Of the allocated memory 13.89 GiB is allocated by PyTorch, and 19.36 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 532)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.00 GiB. GPU 0 has a total capacity of 22.30 GiB of which 8.03 GiB is free. Process 1389322 has 14.27 GiB memory in use. Of the allocated memory 13.89 GiB is allocated by PyTorch, and 19.21 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "multi",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_batched_equivalence_without_cache",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 625094 has 22.25 GiB memory in use. Of the allocated memory 21.88 GiB is allocated by PyTorch, and 19.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 46.69 MiB is free. Process 1389322 has 22.25 GiB memory in use. Of the allocated memory 21.88 GiB is allocated by PyTorch, and 19.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "multi",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_mamba2_mixer_train_vs_eval_equivalence",
      "trace": "(line 370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 625094 has 22.28 GiB memory in use. Of the allocated memory 21.92 GiB is allocated by PyTorch, and 9.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 1389322 has 22.28 GiB memory in use. Of the allocated memory 21.92 GiB is allocated by PyTorch, and 9.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "mamba2",
      "gpu": "multi",
      "test": "tests/models/mamba2/test_modeling_mamba2.py::Mamba2IntegrationTest::test_simple_generate",
      "trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 625094 has 22.28 GiB memory in use. Of the allocated memory 21.91 GiB is allocated by PyTorch, and 21.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 12.69 MiB is free. Process 1389322 has 22.28 GiB memory in use. Of the allocated memory 21.91 GiB is allocated by PyTorch, and 21.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "mimi",
      "gpu": "multi",
      "test": "tests/models/mimi/test_modeling_mimi.py::MimiIntegrationTest::test_integration",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mimi",
      "gpu": "multi",
      "test": "tests/models/mimi/test_modeling_mimi.py::MimiIntegrationTest::test_integration_longform",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "minimax",
      "gpu": "multi",
      "test": "tests/models/minimax/test_modeling_minimax.py::MiniMaxIntegrationTest::test_small_model_logits",
      "trace": "(line 233)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 233)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "ministral",
      "gpu": "multi",
      "test": "tests/models/ministral/test_modeling_ministral.py::MinistralIntegrationTest::test_model_8b_generation",
      "trace": "(line 116)  AssertionError: 'My favourite condiment is 100% natural, 100% organic, 100% free of' != 'Myfavouritecondimentis\u010a\u0120\u0120\u0120\u0120Joined:\u01202018-01-01,\u012012'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 116)  AssertionError: 'My favourite condiment is 100% natural, 100% organic, 100% free of' != 'Myfavouritecondimentis\u010a\u0120\u0120\u0120\u0120Joined:\u01202018-01-01,\u012012'",
      "first_failure_day": "2026-04-21",
      "last_green_day": "2026-04-20",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "ministral",
      "gpu": "multi",
      "test": "tests/models/ministral/test_modeling_ministral.py::MinistralIntegrationTest::test_model_8b_logits",
      "trace": "(line 93)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 93)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "ministral3",
      "gpu": "multi",
      "test": "tests/models/ministral3/test_modeling_ministral3.py::Ministral3IntegrationTest::test_model_3b_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 130)  AssertionError: 'My favourite condiment is icing sugar. I[47 chars]fles' != \"My favourite condiment is 100% pure oliv[46 chars]t in\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "ministral3",
      "gpu": "multi",
      "test": "tests/models/ministral3/test_modeling_ministral3.py::Ministral3IntegrationTest::test_model_3b_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 102)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral",
      "gpu": "multi",
      "test": "tests/models/mistral/test_modeling_mistral.py::MistralIntegrationTest::test_model_7b_logits",
      "trace": "(line 112)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 112)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral",
      "gpu": "multi",
      "test": "tests/models/mistral/test_modeling_mistral.py::MistralIntegrationTest::test_speculative_generation",
      "trace": "(line 207)  AssertionError: 'My f[18 chars] is 100% ketchup. I\u2019m not a fan of mustard, relish' != 'My f[18 chars] is 100% mayonnaise. I\u2019m not a fan of the fancy stuff with all'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 207)  AssertionError: 'My f[18 chars] is 100% ketchup. I\u2019m not a fan of mustard, relish' != 'My f[18 chars] is 100% mayonnaise. I\u2019m not a fan of the fancy stuff with all'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "multi",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate",
      "trace": "(line 362)  AssertionError: ' to write a short story based on this ima[70 chars]e pl' != 'Calm waters reflect\\nWooden path to dista[26 chars]oods'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 362)  AssertionError: ' to write a short story based on this ima[70 chars]e pl' != 'Calm waters reflect\\nWooden path to dista[26 chars]oods'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "multi",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_batched_generate_multi_image",
      "trace": "(line 438)  AssertionError: ' to write a short story based on this im[81 chars]ched' != \"Calm waters reflect\\nWooden path to dist[29 chars]hold\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 438)  AssertionError: ' to write a short story based on this im[81 chars]ched' != \"Calm waters reflect\\nWooden path to dist[29 chars]hold\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral3",
      "gpu": "multi",
      "test": "tests/models/mistral3/test_modeling_mistral3.py::Mistral3IntegrationTest::test_mistral3_integration_generate",
      "trace": "(line 309)  AssertionError: 'The [14 chars] two tabby cats lying on a pink surface, which[21 chars]h or' != 'The [14 chars] two cats lying on a pink surface, which appea[21 chars] bed'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 309)  AssertionError: 'The [14 chars] two tabby cats lying on a pink surface, which[21 chars]h or' != 'The [14 chars] two cats lying on a pink surface, which appea[21 chars] bed'",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mistral4",
      "gpu": "multi",
      "test": "tests/models/mistral4/test_modeling_mistral4.py::Mistral4IntegrationTest::test_mistral_small_4_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 6741)  RuntimeError: Expected mat_a to be Float32, BFloat16 or Float16 matrix, got Float8_e4m3fn",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "mistral4",
      "gpu": "multi",
      "test": "tests/models/mistral4/test_modeling_mistral4.py::Mistral4IntegrationTest::test_mistral_small_4_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 6741)  RuntimeError: Expected mat_a to be Float32, BFloat16 or Float16 matrix, got Float8_e4m3fn",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "mixtral",
      "gpu": "multi",
      "test": "tests/models/mixtral/test_modeling_mixtral.py::MixtralIntegrationTest::test_small_model_logits",
      "trace": "(line 143)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 143)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mixtral",
      "gpu": "multi",
      "test": "tests/models/mixtral/test_modeling_mixtral.py::MixtralIntegrationTest::test_small_model_logits_batched",
      "trace": "(line 188)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 188)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "multi",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_batched_generate",
      "trace": "(line 643)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 643)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "multi",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_forward",
      "trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ 6.5938,  4.4062,  3.0938, -0.3105,  1.8906], dtype=torch.bfloat16)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true : Actual logits: tensor([ 6.5938,  4.4062,  3.0938, -0.3105,  1.8906], dtype=torch.bfloat16)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "multi",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_generate",
      "trace": "(line 510)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 510)  AssertionError: 'If I[43 chars]d be: \"I\\'m not a fan of long exposure, but I\\[21 chars]\".\\\\' != 'If I[43 chars]d be:.\\\\nA dock in the lake.\\\\nA mountain in t[27 chars]ure.'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mllama",
      "gpu": "multi",
      "test": "tests/models/mllama/test_modeling_mllama.py::MllamaForConditionalGenerationIntegrationTest::test_11b_model_integration_multi_image_generate",
      "trace": "(line 724)  AssertionError: 'The image shows a red octagonal stop sign w[59 chars]to a' != 'This image shows a long wooden dock extendi[67 chars]ling'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 724)  AssertionError: 'The image shows a red octagonal stop sign w[59 chars]to a' != 'This image shows a long wooden dock extendi[67 chars]ling'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "multi",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_classification_no_padding_or_truncation",
      "trace": "(line 453)  AssertionError: '<s> Japanese is an<s> East Asian language<s> spoken by about[40 chars]</s>' != '<s> Japanese is an<ent>East Asian language<ent>spoken by abo[42 chars]</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: '<s> Japanese is an<s> East Asian language<s> spoken by about[40 chars]</s>' != '<s> Japanese is an<ent>East Asian language<ent>spoken by abo[42 chars]</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "multi",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_pair_classification_no_padding_or_truncation",
      "trace": "(line 507)  AssertionError: '<s><s> Japanese<s> is an East Asian language [64 chars]</s>' != '<s><ent>Japanese<ent>is an East Asian languag[68 chars]</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 507)  AssertionError: '<s><s> Japanese<s> is an East Asian language [64 chars]</s>' != '<s><ent>Japanese<ent>is an East Asian languag[68 chars]</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mluke",
      "gpu": "multi",
      "test": "tests/models/mluke/test_tokenization_mluke.py::MLukeTokenizerIntegrationTests::test_entity_span_classification_no_padding_or_truncation",
      "trace": "(line 572)  AssertionError: '<s> [33 chars]e spoken by about 128 million people, primarily in Japan .</s>' != '<s> [33 chars]e spoken by about 128 million people, primarily in Japan.</s>'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 572)  AssertionError: '<s> [33 chars]e spoken by about 128 million people, primarily in Japan .</s>' != '<s> [33 chars]e spoken by about 128 million people, primarily in Japan.</s>'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "multi",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_inference_object_detection_head",
      "trace": "(line 672)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 672)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "multi",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_inference_object_detection_head_equivalence_cpu_gpu",
      "trace": "(line 738)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 738)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "mm_grounding_dino",
      "gpu": "multi",
      "test": "tests/models/mm_grounding_dino/test_modeling_mm_grounding_dino.py::MMGroundingDinoModelIntegrationTests::test_mm_grounding_dino_loss",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "modernvbert",
      "gpu": "multi",
      "test": "tests/models/modernvbert/test_modeling_modernvbert.py::ModernVBertForMaskedLMIntegrationTest::test_masked_lm_inference",
      "trace": "(line 835)  huggingface_hub.errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1baede-3a4bfea065e2696a6f3f6a42;5de6aaeb-e515-45c8-9584-75e3f43c3b3c)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 835)  huggingface_hub.errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6a23976d-60b228135852f684145b03fd;9604b606-752b-4fec-b9bf-c01ee509072b)",
      "first_failure_day": "2026-04-01",
      "last_green_day": "2026-03-31",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "moonshine_streaming",
      "gpu": "multi",
      "test": "tests/models/moonshine_streaming/test_modeling_moonshine_streaming.py::MoonshineStreamingModelIntegrationTests::test_medium_logits_batch",
      "trace": "(line 605)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 605)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "moonshine_streaming",
      "gpu": "multi",
      "test": "tests/models/moonshine_streaming/test_modeling_moonshine_streaming.py::MoonshineStreamingModelIntegrationTests::test_small_logits_batch",
      "trace": "(line 572)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 572)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "multi",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshika_greedy_unconditional_fp16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "multi",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp16",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "multi",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp16_eager",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 14.69 MiB is free. Process 97411 has 22.28 GiB memory in use. Of the allocated memory 21.83 GiB is allocated by PyTorch, with 22.00 MiB allocated in private pools (e.g., CUDA Graphs), and 11.60 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 687)  AssertionError: False is not true",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "moshi",
      "gpu": "multi",
      "test": "tests/models/moshi/test_modeling_moshi.py::MoshiIntegrationTests::test_moshiko_greedy_unconditional_fp32",
      "trace": "(line 687)  AssertionError: False is not true",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 34.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 16.69 MiB is free. Process 108281 has 22.28 GiB memory in use. Of the allocated memory 21.80 GiB is allocated by PyTorch, with 22.00 MiB allocated in private pools (e.g., CUDA Graphs), and 45.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "multi",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "multi",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation_8k",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b-8k is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b-8k is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "multi",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_generation_batched",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "mpt",
      "gpu": "multi",
      "test": "tests/models/mpt/test_modeling_mpt.py::MptIntegrationTests::test_model_logits",
      "trace": "(line 454)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 469)  OSError: mosaicml/mpt-7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "musicflamingo",
      "gpu": "multi",
      "test": "tests/models/musicflamingo/test_modeling_musicflamingo.py::MusicFlamingoForConditionalGenerationIntegrationTest::test_fixture_batched_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "musicflamingo",
      "gpu": "multi",
      "test": "tests/models/musicflamingo/test_modeling_musicflamingo.py::MusicFlamingoForConditionalGenerationIntegrationTest::test_fixture_single_matches",
      "trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2935)  RuntimeError: expected scalar type Float but found BFloat16",
      "first_failure_day": "2026-05-27",
      "last_green_day": "2026-05-26",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "musicgen",
      "gpu": "multi",
      "test": "tests/models/musicgen/test_modeling_musicgen.py::MusicgenIntegrationTests::test_generate_text_prompt_sampling",
      "trace": "(line 1262)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1262)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen",
      "gpu": "multi",
      "test": "tests/models/musicgen/test_modeling_musicgen.py::MusicgenIntegrationTests::test_generate_unconditional_sampling",
      "trace": "(line 1179)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1179)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_audio_prompt",
      "trace": "(line 1307)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1307)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_greedy",
      "trace": "(line 1219)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1219)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_greedy_with_classifier_free_guidance",
      "trace": "(line 1247)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1247)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_text_prompt_sampling",
      "trace": "(line 1282)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1282)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_unconditional_greedy",
      "trace": "(line 1167)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1167)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyIntegrationTests::test_generate_unconditional_sampling",
      "trace": "(line 1192)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1192)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyStereoIntegrationTests::test_generate_text_audio_prompt",
      "trace": "(line 1376)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1376)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "musicgen_melody",
      "gpu": "multi",
      "test": "tests/models/musicgen_melody/test_modeling_musicgen_melody.py::MusicgenMelodyStereoIntegrationTests::test_generate_unconditional_greedy",
      "trace": "(line 1344)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1344)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "nemotron",
      "gpu": "multi",
      "test": "tests/models/nemotron/test_modeling_nemotron.py::NemotronIntegrationTest::test_nemotron_8b_generation_eager",
      "trace": "(line 103)  AssertionError: Lists differ: ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer'] != ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer: What is the name of the 19']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 103)  AssertionError: Lists differ: ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer'] != ['Wha[46 chars]er: Jupiter\\n\\nWhat is the answer: What is the name of the 19']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "nemotron",
      "gpu": "multi",
      "test": "tests/models/nemotron/test_modeling_nemotron.py::NemotronIntegrationTest::test_nemotron_8b_generation_fa2",
      "trace": "(line 1714)  ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package for FlashAttention2 doesn't seem to be installed.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1714)  ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package for FlashAttention2 doesn't seem to be installed.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "nllb_moe",
      "gpu": "multi",
      "test": "tests/models/nllb_moe/test_modeling_nllb_moe.py::NllbMoeModelIntegrationTests::test_inference_logits",
      "trace": "(line 399)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 399)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "multi",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_export_static_cache",
      "trace": "(line 338)  AssertionError: Lists differ: ['Sim[41 chars]that \\nthe speed of light is the same in all r[35 chars]ght'] != ['Sim[41 chars]that  .1.\\nThe theory of relativity states tha[18 chars] of']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 338)  AssertionError: Lists differ: ['Sim[41 chars]that \\nthe speed of light is the same in all r[35 chars]ght'] != ['Sim[41 chars]that  .1.\\nThe theory of relativity states tha[18 chars] of']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "multi",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_model_7b_greedy_generation",
      "trace": "(line 242)  AssertionError: 'Simp[40 chars]that \\nthe speed of light is the same for all [232 chars]\\n\\n' != 'Simp[40 chars]that  .1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1[20 chars].1.1'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 242)  AssertionError: 'Simp[40 chars]that \\nthe speed of light is the same for all [232 chars]\\n\\n' != 'Simp[40 chars]that  .1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1[20 chars].1.1'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "olmo",
      "gpu": "multi",
      "test": "tests/models/olmo/test_modeling_olmo.py::OlmoIntegrationTest::test_model_7b_logits",
      "trace": "(line 995)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 58.69 MiB is free. Process 126258 has 22.24 GiB memory in use. Of the allocated memory 21.76 GiB is allocated by PyTorch, and 75.75 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 58.69 MiB is free. Process 526265 has 22.24 GiB memory in use. Of the allocated memory 21.76 GiB is allocated by PyTorch, and 75.75 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "olmo2",
      "gpu": "multi",
      "test": "tests/models/olmo2/test_modeling_olmo2.py::Olmo2IntegrationTest::test_model_1b_logits_bfloat16",
      "trace": "(line 214)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 214)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "olmo3",
      "gpu": "multi",
      "test": "tests/models/olmo3/test_modeling_olmo3.py::Olmo3IntegrationTest::test_model_7b_logits",
      "trace": "(line 196)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 196)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "olmoe",
      "gpu": "multi",
      "test": "tests/models/olmoe/test_modeling_olmoe.py::OlmoeIntegrationTest::test_model_7b_logits",
      "trace": "(line 217)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 217)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "oneformer",
      "gpu": "multi",
      "test": "tests/models/oneformer/test_modeling_oneformer.py::OneFormerModelIntegrationTest::test_inference_no_head",
      "trace": "(line 507)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 507)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "oneformer",
      "gpu": "multi",
      "test": "tests/models/oneformer/test_modeling_oneformer.py::OneFormerModelIntegrationTest::test_inference_universal_segmentation_head",
      "trace": "(line 549)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 549)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "opt",
      "gpu": "multi",
      "test": "tests/models/opt/test_modeling_opt.py::OPTModelIntegrationTests::test_inference_no_head",
      "trace": "(line 357)  AssertionError: tensor([[-0.2883, -1.9219, -0.3079],",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 357)  AssertionError: tensor([[-0.2883, -1.9219, -0.3079],",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "ovis2",
      "gpu": "multi",
      "test": "tests/models/ovis2/test_modeling_ovis2.py::Ovis2IntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 355)  AssertionError: Lists differ: ['sys[81 chars]ant\\n', 'system\\nYou are a helpful assistant.\\[139 chars]et.'] != ['sys[81 chars]ant\\nAnswer: I see a brown dog standing on a w[224 chars]et.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 355)  AssertionError: Lists differ: ['sys[81 chars]ant\\n', 'system\\nYou are a helpful assistant.\\[139 chars]et.'] != ['sys[81 chars]ant\\nAnswer: I see a brown dog standing on a w[224 chars]et.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "multi",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_interpolate_pos_encoding",
      "trace": "(line 683)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 683)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "multi",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_object_detection",
      "trace": "(line 800)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 800)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "owlvit",
      "gpu": "multi",
      "test": "tests/models/owlvit/test_modeling_owlvit.py::OwlViTModelIntegrationTest::test_inference_one_shot_object_detection",
      "trace": "(line 843)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 843)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "pegasus",
      "gpu": "multi",
      "test": "tests/models/pegasus/test_modeling_pegasus.py::PegasusXSUMIntegrationTest::test_device_map",
      "trace": "(line 334)  RuntimeError: Expected all tensors to be on the same device, but got other is on cuda:1, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__equal)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 334)  RuntimeError: Expected all tensors to be on the same device, but got other is on cuda:1, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__equal)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "pegasus",
      "gpu": "multi",
      "test": "tests/models/pegasus/test_modeling_pegasus.py::PegasusXSUMIntegrationTest::test_pegasus_xsum_summary",
      "trace": "(line 350)  assert torch.Size([2, 422]) == (2, 421)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 350)  assert torch.Size([2, 422]) == (2, 421)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "persimmon",
      "gpu": "multi",
      "test": "tests/models/persimmon/test_modeling_persimmon.py::PersimmonIntegrationTest::test_model_8b_chat_greedy_generation",
      "trace": "(line 131)  AssertionError: 'huma[58 chars]ept: The theory of relativity states that the [80 chars]ion.' != 'huma[58 chars]ept: the speed of light in a vacuum is the sam[33 chars]ence'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 131)  AssertionError: 'huma[58 chars]ept: The theory of relativity states that the [80 chars]ion.' != 'huma[58 chars]ept: the speed of light in a vacuum is the sam[33 chars]ence'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "persimmon",
      "gpu": "multi",
      "test": "tests/models/persimmon/test_modeling_persimmon.py::PersimmonIntegrationTest::test_model_8b_chat_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "phi3",
      "gpu": "multi",
      "test": "tests/models/phi3/test_modeling_phi3.py::Phi3IntegrationTest::test_export_static_cache",
      "trace": "(line 1318)  torch._dynamo.exc.Unsupported: Data-dependent branching",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1318)  torch._dynamo.exc.Unsupported: Data-dependent branching",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "phimoe",
      "gpu": "multi",
      "test": "tests/models/phimoe/test_modeling_phimoe.py::PhimoeIntegrationTest::test_model_phimoe_instruct_logits",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "phimoe",
      "gpu": "multi",
      "test": "tests/models/phimoe/test_modeling_phimoe.py::PhimoeIntegrationTest::test_phimoe_instruct_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "phimoe",
      "gpu": "multi",
      "test": "tests/models/phimoe/test_modeling_phimoe.py::PhimoeIntegrationTest::test_phimoe_instruct_with_static_cache",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "pi0",
      "gpu": "multi",
      "test": "tests/models/pi0/test_modeling_pi0.py::PI0ModelIntegrationTest::test_train_pi0_base_libero",
      "trace": "(line 785)  torch.OutOfMemoryError: Caught OutOfMemoryError in replica 0 on device 0.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 785)  torch.OutOfMemoryError: Caught OutOfMemoryError in replica 0 on device 0.",
      "first_failure_day": "2026-03-17",
      "last_green_day": "2026-03-16",
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "pixio",
      "gpu": "multi",
      "test": "tests/models/pixio/test_modeling_pixio.py::PixioModelIntegrationTest::test_inference_no_head",
      "trace": "(line 277)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 277)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "multi",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartJavaCsIntegrationTest::test_java_cs_generate_batch",
      "trace": "(line 379)  AssertionError: assert ['public int ...turn a * b *'] == ['public int ...rn a * b * c']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 379)  AssertionError: assert ['public int ...turn a * b *'] == ['public int ...rn a * b * c']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "multi",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartJavaCsIntegrationTest::test_java_cs_generate_one",
      "trace": "(line 370)  AssertionError: 'public int maximum(int a, int b, int c){return Math.Max(' != 'public int maximum(int a, int b, int c){return Math.Max(a'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  AssertionError: 'public int maximum(int a, int b, int c){return Math.Max(' != 'public int maximum(int a, int b, int c){return Math.Max(a'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "plbart",
      "gpu": "multi",
      "test": "tests/models/plbart/test_modeling_plbart.py::PLBartBaseIntegrationTest::test_fill_mask",
      "trace": "(line 444)  AssertionError: '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0' != '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 444)  AssertionError: '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0' != '0 0 the 0 the 0 the 0 the 0 the 0 the 0 the 0 the'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "pvt",
      "gpu": "multi",
      "test": "tests/models/pvt/test_modeling_pvt.py::PvtModelIntegrationTest::test_inference_image_classification",
      "trace": "(line 257)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 257)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "pvt",
      "gpu": "multi",
      "test": "tests/models/pvt/test_modeling_pvt.py::PvtModelIntegrationTest::test_inference_model",
      "trace": "(line 284)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 284)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "pvt_v2",
      "gpu": "multi",
      "test": "tests/models/pvt_v2/test_modeling_pvt_v2.py::PvtV2ModelIntegrationTest::test_inference_image_classification",
      "trace": "(line 275)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 275)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "pvt_v2",
      "gpu": "multi",
      "test": "tests/models/pvt_v2/test_modeling_pvt_v2.py::PvtV2ModelIntegrationTest::test_inference_model",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen2_5_omni",
      "gpu": "multi",
      "test": "tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test",
      "trace": "(line 692)  AssertionError: \"syst[108 chars]d is glass shattering, and the dog is a Labrador Retriever.\" != \"syst[108 chars]d is a glass shattering. The dog in the pictur[22 chars]ver.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 692)  AssertionError: \"syst[108 chars]d is glass shattering, and the dog is a Labrador Retriever.\" != \"syst[108 chars]d is a glass shattering. The dog in the pictur[22 chars]ver.\"",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen2_5_omni",
      "gpu": "multi",
      "test": "tests/models/qwen2_5_omni/test_modeling_qwen2_5_omni.py::Qwen2_5OmniModelIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 734)  AssertionError: Lists differ: [\"sys[109 chars]d is glass shattering, and the dog is a Labrad[185 chars]er.\"] != [\"sys[109 chars]d is a glass shattering. The dog in the pictur[211 chars]er.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 734)  AssertionError: Lists differ: [\"sys[109 chars]d is glass shattering, and the dog is a Labrad[185 chars]er.\"] != [\"sys[109 chars]d is a glass shattering. The dog in the pictur[211 chars]er.\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen2_5_vl",
      "gpu": "multi",
      "test": "tests/models/qwen2_5_vl/test_modeling_qwen2_5_vl.py::Qwen2_5_VLIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 611)  AssertionError: Lists differ: ['sys[298 chars]en, a large language model created by Alibaba [84 chars]and'] != ['sys[298 chars]en, an AI language model created by Alibaba Cl[96 chars]on,']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 611)  AssertionError: Lists differ: ['sys[298 chars]en, a large language model created by Alibaba [84 chars]and'] != ['sys[298 chars]en, an AI language model created by Alibaba Cl[96 chars]on,']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen2_moe",
      "gpu": "multi",
      "test": "tests/models/qwen2_moe/test_modeling_qwen2_moe.py::Qwen2MoeIntegrationTest::test_model_a2_7b_logits",
      "trace": "(line 147)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 147)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen2_moe",
      "gpu": "multi",
      "test": "tests/models/qwen2_moe/test_modeling_qwen2_moe.py::Qwen2MoeIntegrationTest::test_speculative_generation",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3",
      "gpu": "multi",
      "test": "tests/models/qwen3/test_modeling_qwen3.py::Qwen3IntegrationTest::test_model_600m_logits",
      "trace": "(line 92)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 92)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen3",
      "gpu": "multi",
      "test": "tests/models/qwen3/test_modeling_qwen3.py::Qwen3IntegrationTest::test_speculative_generation",
      "trace": "(line 198)  AssertionError: 'My f[22 chars]100% beef, 100% beef, 100% beef.' != 'My f[22 chars]100% vegetable oil. It has a rich, creamy text[19 chars]utty'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 198)  AssertionError: 'My f[22 chars]100% beef, 100% beef, 100% beef.' != 'My f[22 chars]100% vegetable oil. It has a rich, creamy text[19 chars]utty'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen3_5",
      "gpu": "multi",
      "test": "tests/models/qwen3_5/test_modeling_qwen3_5.py::Qwen3_5IntegrationTest::test_model_video_generation",
      "trace": "(line 811)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 811)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen3_5",
      "gpu": "multi",
      "test": "tests/models/qwen3_5/test_modeling_qwen3_5.py::Qwen3_5IntegrationTest::test_model_video_generation_batch",
      "trace": "(line 863)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 863)  AssertionError: Lists differ: [248045, 846, 198, 27, 15, 13, 18, 6283, 29, 248053] != [248045, 846, 198, 248053, 27, 15, 13, 18, 6283, 29]",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "qwen3_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_moe/test_modeling_qwen3_moe.py::Qwen3MoeIntegrationTest::test_model_15b_a2b_generation",
      "trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "qwen3_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_moe/test_modeling_qwen3_moe.py::Qwen3MoeIntegrationTest::test_model_15b_a2b_logits",
      "trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "qwen3_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_moe/test_modeling_qwen3_moe.py::Qwen3MoeIntegrationTest::test_model_15b_a2b_long_prompt_sdpa",
      "trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "qwen3_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_moe/test_modeling_qwen3_moe.py::Qwen3MoeIntegrationTest::test_speculative_generation",
      "trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 74)  ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "load_error",
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test_multiturn",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_omni_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_omni_moe/test_modeling_qwen3_omni_moe.py::Qwen3OmniModelIntegrationTest::test_small_model_integration_test_w_audio",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_batch",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_batch_different_resolutions",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 991)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 768.00 MiB. GPU 1 has a total capacity of 22.30 GiB of which 240.69 MiB is free. Process 974773 has 22.06 GiB memory in use. Of the allocated memory 20.87 GiB is allocated by PyTorch, and 814.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_batch_wo_image",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_expand",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_expand_with_video",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "qwen3_vl_moe",
      "gpu": "multi",
      "test": "tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeIntegrationTest::test_small_model_integration_test_with_video",
      "trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 273)  RuntimeError: We encountered some issues during automatic conversion of the weights. For details look at the `CONVERSION` entries of the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "multi",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_2b_generate",
      "trace": "(line 157)  AssertionError: Lists differ: ['Hel[325 chars]oday the 19th of June 2019, I was in the offic[256 chars] to'] != ['Hel[325 chars]oday is a new app that allows you to make mone[256 chars]app']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 157)  AssertionError: Lists differ: ['Hel[325 chars]oday the 19th of June 2019, I was in the offic[256 chars] to'] != ['Hel[325 chars]oday is a new app that allows you to make mone[256 chars]app']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "multi",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_2b_sample",
      "trace": "(line 195)  AssertionError: Lists differ: ['Wha[24 chars]Deep Learning (or deep learning) is one of the[107 chars]ple'] != ['Wha[24 chars]Deep learning is the next frontier in computer[98 chars] is']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 195)  AssertionError: Lists differ: ['Wha[24 chars]Deep Learning (or deep learning) is one of the[107 chars]ple'] != ['Wha[24 chars]Deep learning is the next frontier in computer[98 chars] is']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "multi",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_longer_than_window",
      "trace": "(line 243)  AssertionError: Lists differ: [' Jean-Philippe Guillet said, \"We have no[245 chars]eo.'] != [\" Robin's comments follow claims by two m[249 chars]the\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 243)  AssertionError: Lists differ: [' Jean-Philippe Guillet said, \"We have no[245 chars]eo.'] != [\" Robin's comments follow claims by two m[249 chars]the\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "recurrent_gemma",
      "gpu": "multi",
      "test": "tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py::RecurrentGemmaIntegrationTest::test_model_2b_8bit",
      "trace": "(line 222)  AssertionError: Lists differ: ['Hel[26 chars] the effects of the environment on the human b[124 chars]aur\"] != ['Hel[26 chars] the topic of \"The impact of social media on t[102 chars] 3D\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 222)  AssertionError: Lists differ: ['Hel[26 chars] the effects of the environment on the human b[124 chars]aur\"] != ['Hel[26 chars] the topic of \"The impact of social media on t[102 chars] 3D\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "reformer",
      "gpu": "multi",
      "test": "tests/models/reformer/test_modeling_reformer.py::ReformerIntegrationTests::test_pretrained_generate_crime_and_punish",
      "trace": "(line 1370)  AssertionError: 'A fe[36 chars]is ideas, so attentively two or three thousand roubles, and' != 'A fe[36 chars]is ideas, at the first entrance. He was positively for an inst'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1370)  AssertionError: 'A fe[36 chars]is ideas, so attentively two or three thousand roubles, and' != 'A fe[36 chars]is ideas, at the first entrance. He was positively for an inst'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "regnet",
      "gpu": "multi",
      "test": "tests/models/regnet/test_modeling_regnet.py::RegNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 243)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 243)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "resnet",
      "gpu": "multi",
      "test": "tests/models/resnet/test_modeling_resnet.py::ResNetModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 291)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 291)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "single",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_batched_images",
      "trace": "(line 1267)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1267)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "single",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_batched_mixed_prompts",
      "trace": "(line 1369)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1369)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "single",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_multi_box_prompt",
      "trace": "(line 1168)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1168)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "single",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_single_box_prompt",
      "trace": "(line 1097)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1097)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "single",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_text_prompt_only",
      "trace": "(line 1025)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1025)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "multi",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_batched_images",
      "trace": "(line 1267)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1267)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "multi",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_batched_mixed_prompts",
      "trace": "(line 1369)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1369)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "multi",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_multi_box_prompt",
      "trace": "(line 1168)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1168)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "multi",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_single_box_prompt",
      "trace": "(line 1097)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1097)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "sam3",
      "gpu": "multi",
      "test": "tests/models/sam3/test_modeling_sam3.py::Sam3ModelIntegrationTest::test_inference_text_prompt_only",
      "trace": "(line 1025)  AssertionError: Tensor-likes are not close!",
      "days_seen": 5,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-04",
      "latest_trace": "(line 1025)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_speech_to_speech_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_speech_to_text_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seamless_m4t",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t/test_modeling_seamless_m4t.py::SeamlessM4TModelIntegrationTest::test_to_rus_speech",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_speech_to_speech_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_speech_to_text_model",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seamless_m4t_v2",
      "gpu": "multi",
      "test": "tests/models/seamless_m4t_v2/test_modeling_seamless_m4t_v2.py::SeamlessM4Tv2ModelIntegrationTest::test_to_rus_speech",
      "trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 281)  ValueError: Invalid input type. Must be a single audio or a list of audio",
      "first_failure_day": "2026-05-20",
      "last_green_day": "2026-05-19",
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "seed_oss",
      "gpu": "multi",
      "test": "tests/models/seed_oss/test_modeling_seed_oss.py::SeedOssIntegrationTest::test_model_36b_eager",
      "trace": "(line 95)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 95)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "seed_oss",
      "gpu": "multi",
      "test": "tests/models/seed_oss/test_modeling_seed_oss.py::SeedOssIntegrationTest::test_model_36b_sdpa",
      "trace": "(line 114)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 114)  AssertionError: Lists differ: ['How[132 chars]ing to use the ByteDance-Seed dataset for my research. I have'] != ['How[132 chars]ing to run the code on the <beginning of the code>seed']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "smollm3",
      "gpu": "multi",
      "test": "tests/models/smollm3/test_modeling_smollm3.py::SmolLM3IntegrationTest::test_export_static_cache",
      "trace": "(line 198)  AssertionError: 'Gravity is the force that pulls objects [69 chars] and' != [\"Gravity is the force that pulls objects[85 chars] of\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 198)  AssertionError: 'Gravity is the force that pulls objects [69 chars] and' != [\"Gravity is the force that pulls objects[85 chars] of\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "smollm3",
      "gpu": "multi",
      "test": "tests/models/smollm3/test_modeling_smollm3.py::SmolLM3IntegrationTest::test_model_3b_logits",
      "trace": "(line 89)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 89)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "stablelm",
      "gpu": "multi",
      "test": "tests/models/stablelm/test_modeling_stablelm.py::StableLmModelIntegrationTest::test_model_stablelm_3b_4e1t_logits",
      "trace": "(line 65)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 65)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "stablelm",
      "gpu": "multi",
      "test": "tests/models/stablelm/test_modeling_stablelm.py::StableLmModelIntegrationTest::test_model_tiny_random_stablelm_2_logits",
      "trace": "(line 98)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 98)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "multi",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_4bit",
      "trace": "(line 152)  AssertionError: Lists differ: ['Hel[188 chars]of', 'def hello_world():\\n\\treturn \"Hello Worl[95 chars]ute'] != ['Hel[188 chars]of', \"def hello_world(): hello_world():\\n    r[117 chars]'})\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 152)  AssertionError: Lists differ: ['Hel[188 chars]of', 'def hello_world():\\n\\treturn \"Hello Worl[95 chars]ute'] != ['Hel[188 chars]of', \"def hello_world(): hello_world():\\n    r[117 chars]'})\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "multi",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_eager",
      "trace": "(line 99)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 99)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "starcoder2",
      "gpu": "multi",
      "test": "tests/models/starcoder2/test_modeling_starcoder2.py::Starcoder2IntegrationTest::test_starcoder2_batched_generation_sdpa",
      "trace": "(line 79)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 79)  AssertionError: Lists differ: ['Hel[223 chars]ld():\\n\\treturn 'Hello World!'\\n\\n@app.route('[72 chars]app\"] != ['Hel[223 chars]ld(): hello_world():\\n    return 'Hello World![87 chars]n\\n\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "superpoint",
      "gpu": "multi",
      "test": "tests/models/superpoint/test_modeling_superpoint.py::SuperPointModelIntegrationTest::test_inference",
      "trace": "(line 4178)  UnboundLocalError: local variable 'output' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4194)  UnboundLocalError: local variable 'output' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "swiftformer",
      "gpu": "multi",
      "test": "tests/models/swiftformer/test_modeling_swiftformer.py::SwiftFormerModelIntegrationTest::test_inference_image_classification_head",
      "trace": "(line 263)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 263)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "swin2sr",
      "gpu": "multi",
      "test": "tests/models/swin2sr/test_modeling_swin2sr.py::Swin2SRModelIntegrationTest::test_inference_fp16",
      "trace": "(line 332)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 332)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "swinv2",
      "gpu": "multi",
      "test": "tests/models/swinv2/test_modeling_swinv2.py::Swinv2ModelIntegrationTest::test_inference_fp16",
      "trace": "(line 492)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 492)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "t5gemma2",
      "gpu": "multi",
      "test": "tests/models/t5gemma2/test_modeling_t5gemma2.py::T5Gemma2IntegrationTest::test_model_generation_batch_270m",
      "trace": "(line 1128)  AssertionError: Lists differ: [' a [83 chars]e UK.\\n\\nThe bumblebee is a species of bee tha[15 chars]the'] != [' a [83 chars]e UK.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1128)  AssertionError: Lists differ: [' a [83 chars]e UK.\\n\\nThe bumblebee is a species of bee tha[15 chars]the'] != [' a [83 chars]e UK.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "table_transformer",
      "gpu": "multi",
      "test": "tests/models/table_transformer/test_modeling_table_transformer.py::TableTransformerModelIntegrationTests::test_table_detection",
      "trace": "(line 554)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 554)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "univnet",
      "gpu": "multi",
      "test": "tests/models/univnet/test_modeling_univnet.py::UnivNetModelIntegrationTests::test_integration",
      "trace": "(line 330)  AssertionError: Scalars are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 330)  AssertionError: Scalars are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "video_llava",
      "gpu": "multi",
      "test": "tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_llama",
      "trace": "(line 491)  AssertionError: 'USER: \\nDescribe the video in details. A[572 chars]ion.' != \"USER: \\nDescribe the video in details. A[675 chars]ing.\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 491)  AssertionError: 'USER: \\nDescribe the video in details. A[572 chars]ion.' != \"USER: \\nDescribe the video in details. A[675 chars]ing.\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "video_llava",
      "gpu": "multi",
      "test": "tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationIntegrationTest::test_small_model_integration_test_mixed_inputs",
      "trace": "(line 464)  AssertionError: Lists differ: ['USE[183 chars]se it shows a baby sitting on a bed and reading a book. The'] != ['USE[183 chars]se it shows a baby sitting on a bed and reading a book, which']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 464)  AssertionError: Lists differ: ['USE[183 chars]se it shows a baby sitting on a bed and reading a book. The'] != ['USE[183 chars]se it shows a baby sitting on a bed and reading a book, which']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "videomae",
      "gpu": "multi",
      "test": "tests/models/videomae/test_modeling_videomae.py::VideoMAEModelIntegrationTest::test_inference_for_pretraining",
      "trace": "(line 478)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 478)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "videomae",
      "gpu": "multi",
      "test": "tests/models/videomae/test_modeling_videomae.py::VideoMAEModelIntegrationTest::test_inference_for_video_classification",
      "trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 453)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-09",
      "last_green_day": "2026-04-08",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vilt",
      "gpu": "multi",
      "test": "tests/models/vilt/test_modeling_vilt.py::ViltModelIntegrationTest::test_inference_masked_lm",
      "trace": "(line 575)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 575)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "multi",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_cordv2",
      "trace": "(line 1352)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1352)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "multi",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_docvqa",
      "trace": "(line 1288)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1288)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "multi",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest::test_inference_rvlcdip",
      "trace": "(line 1414)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1414)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "multi",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::NougatModelIntegrationTest::test_forward_pass",
      "trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1bbed4-3259536e7823a1d340ec553f;368cb945-c528-48da-82ce-6b72569aeb39)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a23a6f2-7e0a49fc3ceba5e06f01b5b4;57eb15be-844c-4768-80b8-bf1677b4269d)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "vision_encoder_decoder",
      "gpu": "multi",
      "test": "tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::NougatModelIntegrationTest::test_generation",
      "trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a1bbed5-51c1d7ee45469dfc424010e3;797e54bd-8796-4160-8e73-9b83066ca1be)",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 781)  huggingface_hub.errors.RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-6a23a6f3-3ed9371b51973d3239249ef9;80845597-4578-4a78-a4b9-bbe0c25a3f5c)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "vits",
      "gpu": "multi",
      "test": "tests/models/vits/test_modeling_vits.py::VitsModelIntegrationTests::test_forward_fp16",
      "trace": "(line 433)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 433)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "vivit",
      "gpu": "multi",
      "test": "tests/models/vivit/test_modeling_vivit.py::VivitModelIntegrationTest::test_inference_for_video_classification",
      "trace": "(line 361)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 361)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-04-06",
      "last_green_day": "2026-04-05",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "multi",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_multi_turn_text_and_audio",
      "trace": "(line 381)  AssertionError: Lists differ: ['Des[790 chars]as a farewell address by a president, reflecti[151 chars]xt.'] != ['Des[790 chars]as a political speech by a president, reflecti[151 chars]xt.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 381)  AssertionError: Lists differ: ['Des[790 chars]as a farewell address by a president, reflecti[151 chars]xt.'] != ['Des[790 chars]as a political speech by a president, reflecti[151 chars]xt.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "multi",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_audio_only",
      "trace": "(line 163)  AssertionError: Lists differ: ['The[442 chars]king what A\\'s tattoo says, and A always respo[777 chars]nt.'] != ['The[442 chars]king A what his tattoo says, and A always resp[884 chars]on.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 163)  AssertionError: Lists differ: ['The[442 chars]king what A\\'s tattoo says, and A always respo[777 chars]nt.'] != ['The[442 chars]king A what his tattoo says, and A always resp[884 chars]on.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "multi",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_text_and_audio",
      "trace": "(line 203)  AssertionError: Lists differ: [\"Wha[241 chars]. He expresses gratitude for the conversations[429 chars]en.\"] != [\"Wha[241 chars]. He acknowledges the diverse perspectives and[412 chars]es.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 203)  AssertionError: Lists differ: [\"Wha[241 chars]. He expresses gratitude for the conversations[429 chars]en.\"] != [\"Wha[241 chars]. He acknowledges the diverse perspectives and[412 chars]es.\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "voxtral",
      "gpu": "multi",
      "test": "tests/models/voxtral/test_modeling_voxtral.py::VoxtralForConditionalGenerationIntegrationTest::test_mini_single_turn_text_and_multiple_audios_batched",
      "trace": "(line 327)  AssertionError: Lists differ: [\"Who[609 chars]m is likely the Seattle Mariners, as the comme[446 chars]me.'] != [\"Who[609 chars]m is the Mariners, and the commentator is exci[414 chars]nt.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 327)  AssertionError: Lists differ: [\"Who[609 chars]m is likely the Seattle Mariners, as the comme[446 chars]me.'] != [\"Who[609 chars]m is the Mariners, and the commentator is exci[414 chars]nt.']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "voxtral_realtime",
      "gpu": "multi",
      "test": "tests/models/voxtral_realtime/test_modeling_voxtral_realtime.py::VoxtralRealtimeForConditionalGenerationIntegrationTest::test_batched_longform",
      "trace": "(line 349)  AssertionError: Lists differ: [' Come on! Dude. You got a tattoo. So did you, dud[1097 chars]the\"] != [' Come on. Dude. You got a tattoo. So did you, dud[1097 chars]the\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 349)  AssertionError: Lists differ: [' Come on! Dude. You got a tattoo. So did you, dud[1097 chars]the\"] != [' Come on. Dude. You got a tattoo. So did you, dud[1097 chars]the\"]",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_distil_token_timestamp_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_batched_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_generation_multilingual",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_large_timestamp_generation",
      "trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 370)  RuntimeError: Input type (float) and bias type (c10::Half) should be the same",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_longform_timestamps_generation",
      "trace": "(line 1882)  KeyError: 0",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1882)  KeyError: 0",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_small_token_timestamp_generation",
      "trace": "(line 2023)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2023)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_speculative_decoding_distil",
      "trace": "(line 323)  UnboundLocalError: local variable 'is_updated' referenced before assignment",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 323)  UnboundLocalError: local variable 'is_updated' referenced before assignment",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_speculative_decoding_non_distil",
      "trace": "(line 2390)  AssertionError: Lists differ: [' Mr[35 chars]dle classes and we are glad to welcome his gospel. Thank you.'] != [' Mr[35 chars]dle classes and we are glad to welcome his gospel.']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2390)  AssertionError: Lists differ: [' Mr[35 chars]dle classes and we are glad to welcome his gospel. Thank you.'] != [' Mr[35 chars]dle classes and we are glad to welcome his gospel.']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_en_batched_generation",
      "trace": "(line 1541)  AssertionError: The values for attribute 'shape' do not match: torch.Size([4, 18]) != torch.Size([4, 20]).",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1541)  AssertionError: The values for attribute 'shape' do not match: torch.Size([4, 18]) != torch.Size([4, 20]).",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_en_generation",
      "trace": "(line 1383)  AssertionError: ' Mr.[15 chars] apostle of the middle classes, and we are glad to' != ' Mr.[15 chars] apostle of the middle classes, and we are glad to welcome his'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1383)  AssertionError: ' Mr.[15 chars] apostle of the middle classes, and we are glad to' != ' Mr.[15 chars] apostle of the middle classes, and we are glad to welcome his'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_generation",
      "trace": "(line 1399)  AssertionError: ' Mr.[21 chars]le of the middle classes and we are glad' != ' Mr.[21 chars]le of the middle classes and we are glad to welcome his gospel'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1399)  AssertionError: ' Mr.[21 chars]le of the middle classes and we are glad' != ' Mr.[21 chars]le of the middle classes and we are glad to welcome his gospel'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_longform_timestamps_generation",
      "trace": "(line 1698)  KeyError: 0",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1698)  KeyError: 0",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_specaugment_librispeech",
      "trace": "(line 2137)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2137)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_static_generation_long_form",
      "trace": "(line 3098)  RuntimeError: The size of tensor a (352) must match the size of tensor b (354) at non-singleton dimension 1",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 3098)  RuntimeError: The size of tensor a (352) must match the size of tensor b (354) at non-singleton dimension 1",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_tiny_timestamp_generation",
      "trace": "(line 4160)  IndexError: list index out of range",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4176)  IndexError: list index out of range",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard",
      "trace": "(line 2787)  AssertionError: Lists differ: [\" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!'] != [\" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2787)  AssertionError: Lists differ: [\" Fo[272 chars]ting of classics, Sicilian, nade door variatio[8147 chars]le!'] != [\" Fo[272 chars]ting a classic Sicilian, nade door variation o[8150 chars]le!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_multi_batch_hard_prev_cond",
      "trace": "(line 2841)  AssertionError: Lists differ: [\" Fo[425 chars]a fischer shows in lip nitskey attack the fisc[5579 chars]y .\"] != [\" Fo[425 chars]a fisher shows in lip-nitsky attack that culmi[7900 chars]le!\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2841)  AssertionError: Lists differ: [\" Fo[425 chars]a fischer shows in lip nitskey attack the fisc[5579 chars]y .\"] != [\" Fo[425 chars]a fisher shows in lip-nitsky attack that culmi[7900 chars]le!\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_no_speech_detection",
      "trace": "(line 2947)  AssertionError: Lists differ: [\" Fo[435 chars]sting And so so so so so so so so so so so so [7329 chars]our\"] != [\" Fo[435 chars]sting\", ' Ladies and gentlemen, you know, I sp[1433 chars]es.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2947)  AssertionError: Lists differ: [\" Fo[435 chars]sting And so so so so so so so so so so so so [7329 chars]our\"] != [\" Fo[435 chars]sting\", ' Ladies and gentlemen, you know, I sp[1433 chars]es.\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_longform_single_batch",
      "trace": "(line 294)  TypeError: '>=' not supported between instances of 'list' and 'int'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 294)  TypeError: '>=' not supported between instances of 'list' and 'int'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "whisper",
      "gpu": "multi",
      "test": "tests/models/whisper/test_modeling_whisper.py::WhisperModelIntegrationTests::test_whisper_shortform_single_batch_prev_cond",
      "trace": "(line 2556)  AssertionError: Lists differ: [\" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.\"] != [\" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.\"]",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2556)  AssertionError: Lists differ: [\" Fo[268 chars]ating, so soft, it would make JD power and her[196 chars]ke.\"] != [\" Fo[268 chars]ating so soft, it would make JD power and her [195 chars]ke.\"]",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "zamba",
      "gpu": "multi",
      "test": "tests/models/zamba/test_modeling_zamba.py::ZambaModelIntegrationTest::test_simple_batched_generate_with_padding",
      "trace": "(line 518)  AssertionError: '[PAD[35 chars]me a story about a time when you had to make a difficult' != '[PAD[35 chars]me a story about a time when you were in a difficult situation'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 476)  AssertionError: '<s> [20 chars]g on this lovely evening? I hope you are having a great day. I' != '<s> [20 chars]g on this lovely evening? I hope you are all doing well. I am'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "zamba",
      "gpu": "multi",
      "test": "tests/models/zamba/test_modeling_zamba.py::ZambaModelIntegrationTest::test_simple_generate",
      "trace": "(line 501)  AssertionError: The values for attribute 'dtype' do not match: torch.bfloat16 != torch.float32.",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 463)  AssertionError: The values for attribute 'dtype' do not match: torch.bfloat16 != torch.float32.",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "zamba2",
      "gpu": "multi",
      "test": "tests/models/zamba2/test_modeling_zamba2.py::Zamba2ModelIntegrationTest::test_simple_batched_generate_with_padding_0_cuda",
      "trace": "(line 600)  AssertionError: Tensor-likes are not close!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 600)  AssertionError: Tensor-likes are not close!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_TopH_example_integration",
      "trace": "(line 3212)  AssertionError: Lists differ: ['Tel[23 chars]key. Sure, here\\'s one for you:\\n\\nWhy did the[67 chars]s\"!'] != ['Tel[23 chars]key. Why did the monkey go to the doctor? Beca[34 chars]c\"!']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 3215)  AssertionError: Lists differ: ['Tel[23 chars]key. Sure, here\\'s one for you:\\n\\nWhy did the[67 chars]s\"!'] != ['Tel[23 chars]key. Why did the monkey go to the doctor? Beca[34 chars]c\"!']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_assisted_generation_early_exit",
      "trace": "(line 4074)  AssertionError: Lists differ: ['Ali[20 chars]ng a game of poker. Alice has a pair of 7s and Bob has a pair'] != ['Ali[20 chars]ng a game of poker. Alice has a pair of 8s and Bob has a pair']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 4077)  AssertionError: Lists differ: ['Ali[20 chars]ng a game of poker. Alice has a pair of 7s and Bob has a pair'] != ['Ali[20 chars]ng a game of poker. Alice has a pair of 8s and Bob has a pair']",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_beam_search_advanced_stopping_criteria",
      "trace": "(line 681)  AssertionError: True is not false",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 681)  AssertionError: True is not false",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_beam_search_early_stop_heuristic",
      "trace": "(line 2962)  AssertionError: \"<|us[317 chars]}\\\\).\\nThe sum of 3 and 5 is \\\\(3 + 5 = 8\\\\).\\[40 chars]\\\\).\" != \"<|us[317 chars]}\\\\).\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2965)  AssertionError: \"<|us[317 chars]}\\\\).\\nThe sum of 3 and 5 is \\\\(3 + 5 = 8\\\\).\\[40 chars]\\\\).\" != \"<|us[317 chars]}\\\\).\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_cache_device_map_with_vision_layer_device_map",
      "trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.vision_tower.embeddings.patch_embedding.weight, model.vision_tower.embeddings.patch_embedding.bias, model.vision_tower.embeddings.position_embedding.weight, model.vision_tower.encoder.layers.0.layer_norm1.weight, model.vision_tower.encoder.layers.0.layer_norm1.bias, model.vision_tower.encoder.layers.0.self_attn.k_proj.weight, model.vision_tower.encoder.layers.0.self_attn.k_proj.bias, model.vision_tower.encoder.layers.0.self_attn.v_proj.weight, model.vision_tower.encoder.layers.0.self_attn.v_proj.bias, model.vision_tower.encoder.layers.0.self_attn.q_proj.weight, model.vision_tower.encoder.layers.0.self_attn.q_proj.bias, model.vision_tower.encoder.layers.0.self_attn.out_proj.weight, model.vision_tower.encoder.layers.0.self_attn.out_proj.bias, model.vision_tower.encoder.layers.0.layer_norm2.weight, model.vision_tower.encoder.layers.0.layer_norm2.bias, model.vision_tower.encoder.layers.0.mlp.fc1.weight, model.vision_tower.encoder.layers.0.mlp.fc1.bias, model.vision_tower.encoder.layers.0.mlp.fc2.weight, model.vision_tower.encoder.layers.0.mlp.fc2.bias, model.vision_tower.encoder.layers.1.layer_norm1.weight, model.vision_tower.encoder.layers.1.layer_norm1.bias, model.vision_tower.encoder.layers.1.self_attn.k_proj.weight, model.vision_tower.encoder.layers.1.self_attn.k_proj.bias, model.vision_tower.encoder.layers.1.self_attn.v_proj.weight, model.vision_tower.encoder.layers.1.self_attn.v_proj.bias, model.vision_tower.encoder.layers.1.self_attn.q_proj.weight, model.vision_tower.encoder.layers.1.self_attn.q_proj.bias, model.vision_tower.encoder.layers.1.self_attn.out_proj.weight, model.vision_tower.encoder.layers.1.self_attn.out_proj.bias, model.vision_tower.encoder.layers.1.layer_norm2.weight, model.vision_tower.encoder.layers.1.layer_norm2.bias, model.vision_tower.encoder.layers.1.mlp.fc1.weight, model.vision_tower.encoder.layers.1.mlp.fc1.bias, 
model.vision_tower.encoder.layers.1.mlp.fc2.weight, model.vision_tower.encoder.layers.1.mlp.fc2.bias, model.vision_tower.encoder.layers.2.layer_norm1.weight, model.vision_tower.encoder.layers.2.layer_norm1.bias, model.vision_tower.encoder.layers.2.self_attn.k_proj.weight, model.vision_tower.encoder.layers.2.self_attn.k_proj.bias, model.vision_tower.encoder.layers.2.self_attn.v_proj.weight, model.vision_tower.encoder.layers.2.self_attn.v_proj.bias, model.vision_tower.encoder.layers.2.self_attn.q_proj.weight, model.vision_tower.encoder.layers.2.self_attn.q_proj.bias, model.vision_tower.encoder.layers.2.self_attn.out_proj.weight, model.vision_tower.encoder.layers.2.self_attn.out_proj.bias, model.vision_tower.encoder.layers.2.layer_norm2.weight, model.vision_tower.encoder.layers.2.layer_norm2.bias, model.vision_tower.encoder.layers.2.mlp.fc1.weight, model.vision_tower.encoder.layers.2.mlp.fc1.bias, model.vision_tower.encoder.layers.2.mlp.fc2.weight, model.vision_tower.encoder.layers.2.mlp.fc2.bias, model.vision_tower.encoder.layers.3.layer_norm1.weight, model.vision_tower.encoder.layers.3.layer_norm1.bias, model.vision_tower.encoder.layers.3.self_attn.k_proj.weight, model.vision_tower.encoder.layers.3.self_attn.k_proj.bias, model.vision_tower.encoder.layers.3.self_attn.v_proj.weight, model.vision_tower.encoder.layers.3.self_attn.v_proj.bias, model.vision_tower.encoder.layers.3.self_attn.q_proj.weight, model.vision_tower.encoder.layers.3.self_attn.q_proj.bias, model.vision_tower.encoder.layers.3.self_attn.out_proj.weight, model.vision_tower.encoder.layers.3.self_attn.out_proj.bias, model.vision_tower.encoder.layers.3.layer_norm2.weight, model.vision_tower.encoder.layers.3.layer_norm2.bias, model.vision_tower.encoder.layers.3.mlp.fc1.weight, model.vision_tower.encoder.layers.3.mlp.fc1.bias, model.vision_tower.encoder.layers.3.mlp.fc2.weight, model.vision_tower.encoder.layers.3.mlp.fc2.bias, model.vision_tower.encoder.layers.4.layer_norm1.weight, 
model.vision_tower.encoder.layers.4.layer_norm1.bias, model.vision_tower.encoder.layers.4.self_attn.k_proj.weight, model.vision_tower.encoder.layers.4.self_attn.k_proj.bias, model.vision_tower.encoder.layers.4.self_attn.v_proj.weight, model.vision_tower.encoder.layers.4.self_attn.v_proj.bias, model.vision_tower.encoder.layers.4.self_attn.q_proj.weight, model.vision_tower.encoder.layers.4.self_attn.q_proj.bias, model.vision_tower.encoder.layers.4.self_attn.out_proj.weight, model.vision_tower.encoder.layers.4.self_attn.out_proj.bias, model.vision_tower.encoder.layers.4.layer_norm2.weight, model.vision_tower.encoder.layers.4.layer_norm2.bias, model.vision_tower.encoder.layers.4.mlp.fc1.weight, model.vision_tower.encoder.layers.4.mlp.fc1.bias, model.vision_tower.encoder.layers.4.mlp.fc2.weight, model.vision_tower.encoder.layers.4.mlp.fc2.bias, model.vision_tower.encoder.layers.5.layer_norm1.weight, model.vision_tower.encoder.layers.5.layer_norm1.bias, model.vision_tower.encoder.layers.5.self_attn.k_proj.weight, model.vision_tower.encoder.layers.5.self_attn.k_proj.bias, model.vision_tower.encoder.layers.5.self_attn.v_proj.weight, model.vision_tower.encoder.layers.5.self_attn.v_proj.bias, model.vision_tower.encoder.layers.5.self_attn.q_proj.weight, model.vision_tower.encoder.layers.5.self_attn.q_proj.bias, model.vision_tower.encoder.layers.5.self_attn.out_proj.weight, model.vision_tower.encoder.layers.5.self_attn.out_proj.bias, model.vision_tower.encoder.layers.5.layer_norm2.weight, model.vision_tower.encoder.layers.5.layer_norm2.bias, model.vision_tower.encoder.layers.5.mlp.fc1.weight, model.vision_tower.encoder.layers.5.mlp.fc1.bias, model.vision_tower.encoder.layers.5.mlp.fc2.weight, model.vision_tower.encoder.layers.5.mlp.fc2.bias, model.vision_tower.encoder.layers.6.layer_norm1.weight, model.vision_tower.encoder.layers.6.layer_norm1.bias, model.vision_tower.encoder.layers.6.self_attn.k_proj.weight, model.vision_tower.encoder.layers.6.self_attn.k_proj.bias, 
model.vision_tower.encoder.layers.6.self_attn.v_proj.weight, model.vision_tower.encoder.layers.6.self_attn.v_proj.bias, model.vision_tower.encoder.layers.6.self_attn.q_proj.weight, model.vision_tower.encoder.layers.6.self_attn.q_proj.bias, model.vision_tower.encoder.layers.6.self_attn.out_proj.weight, model.vision_tower.encoder.layers.6.self_attn.out_proj.bias, model.vision_tower.encoder.layers.6.layer_norm2.weight, model.vision_tower.encoder.layers.6.layer_norm2.bias, model.vision_tower.encoder.layers.6.mlp.fc1.weight, model.vision_tower.encoder.layers.6.mlp.fc1.bias, model.vision_tower.encoder.layers.6.mlp.fc2.weight, model.vision_tower.encoder.layers.6.mlp.fc2.bias, model.vision_tower.encoder.layers.7.layer_norm1.weight, model.vision_tower.encoder.layers.7.layer_norm1.bias, model.vision_tower.encoder.layers.7.self_attn.k_proj.weight, model.vision_tower.encoder.layers.7.self_attn.k_proj.bias, model.vision_tower.encoder.layers.7.self_attn.v_proj.weight, model.vision_tower.encoder.layers.7.self_attn.v_proj.bias, model.vision_tower.encoder.layers.7.self_attn.q_proj.weight, model.vision_tower.encoder.layers.7.self_attn.q_proj.bias, model.vision_tower.encoder.layers.7.self_attn.out_proj.weight, model.vision_tower.encoder.layers.7.self_attn.out_proj.bias, model.vision_tower.encoder.layers.7.layer_norm2.weight, model.vision_tower.encoder.layers.7.layer_norm2.bias, model.vision_tower.encoder.layers.7.mlp.fc1.weight, model.vision_tower.encoder.layers.7.mlp.fc1.bias, model.vision_tower.encoder.layers.7.mlp.fc2.weight, model.vision_tower.encoder.layers.7.mlp.fc2.bias, model.vision_tower.encoder.layers.8.layer_norm1.weight, model.vision_tower.encoder.layers.8.layer_norm1.bias, model.vision_tower.encoder.layers.8.self_attn.k_proj.weight, model.vision_tower.encoder.layers.8.self_attn.k_proj.bias, model.vision_tower.encoder.layers.8.self_attn.v_proj.weight, model.vision_tower.encoder.layers.8.self_attn.v_proj.bias, model.vision_tower.encoder.layers.8.self_attn.q_proj.weight, 
model.vision_tower.encoder.layers.8.self_attn.q_proj.bias, model.vision_tower.encoder.layers.8.self_attn.out_proj.weight, model.vision_tower.encoder.layers.8.self_attn.out_proj.bias, model.vision_tower.encoder.layers.8.layer_norm2.weight, model.vision_tower.encoder.layers.8.layer_norm2.bias, model.vision_tower.encoder.layers.8.mlp.fc1.weight, model.vision_tower.encoder.layers.8.mlp.fc1.bias, model.vision_tower.encoder.layers.8.mlp.fc2.weight, model.vision_tower.encoder.layers.8.mlp.fc2.bias, model.vision_tower.encoder.layers.9.layer_norm1.weight, model.vision_tower.encoder.layers.9.layer_norm1.bias, model.vision_tower.encoder.layers.9.self_attn.k_proj.weight, model.vision_tower.encoder.layers.9.self_attn.k_proj.bias, model.vision_tower.encoder.layers.9.self_attn.v_proj.weight, model.vision_tower.encoder.layers.9.self_attn.v_proj.bias, model.vision_tower.encoder.layers.9.self_attn.q_proj.weight, model.vision_tower.encoder.layers.9.self_attn.q_proj.bias, model.vision_tower.encoder.layers.9.self_attn.out_proj.weight, model.vision_tower.encoder.layers.9.self_attn.out_proj.bias, model.vision_tower.encoder.layers.9.layer_norm2.weight, model.vision_tower.encoder.layers.9.layer_norm2.bias, model.vision_tower.encoder.layers.9.mlp.fc1.weight, model.vision_tower.encoder.layers.9.mlp.fc1.bias, model.vision_tower.encoder.layers.9.mlp.fc2.weight, model.vision_tower.encoder.layers.9.mlp.fc2.bias, model.vision_tower.encoder.layers.10.layer_norm1.weight, model.vision_tower.encoder.layers.10.layer_norm1.bias, model.vision_tower.encoder.layers.10.self_attn.k_proj.weight, model.vision_tower.encoder.layers.10.self_attn.k_proj.bias, model.vision_tower.encoder.layers.10.self_attn.v_proj.weight, model.vision_tower.encoder.layers.10.self_attn.v_proj.bias, model.vision_tower.encoder.layers.10.self_attn.q_proj.weight, model.vision_tower.encoder.layers.10.self_attn.q_proj.bias, model.vision_tower.encoder.layers.10.self_attn.out_proj.weight, 
model.vision_tower.encoder.layers.10.self_attn.out_proj.bias, model.vision_tower.encoder.layers.10.layer_norm2.weight, model.vision_tower.encoder.layers.10.layer_norm2.bias, model.vision_tower.encoder.layers.10.mlp.fc1.weight, model.vision_tower.encoder.layers.10.mlp.fc1.bias, model.vision_tower.encoder.layers.10.mlp.fc2.weight, model.vision_tower.encoder.layers.10.mlp.fc2.bias, model.vision_tower.encoder.layers.11.layer_norm1.weight, model.vision_tower.encoder.layers.11.layer_norm1.bias, model.vision_tower.encoder.layers.11.self_attn.k_proj.weight, model.vision_tower.encoder.layers.11.self_attn.k_proj.bias, model.vision_tower.encoder.layers.11.self_attn.v_proj.weight, model.vision_tower.encoder.layers.11.self_attn.v_proj.bias, model.vision_tower.encoder.layers.11.self_attn.q_proj.weight, model.vision_tower.encoder.layers.11.self_attn.q_proj.bias, model.vision_tower.encoder.layers.11.self_attn.out_proj.weight, model.vision_tower.encoder.layers.11.self_attn.out_proj.bias, model.vision_tower.encoder.layers.11.layer_norm2.weight, model.vision_tower.encoder.layers.11.layer_norm2.bias, model.vision_tower.encoder.layers.11.mlp.fc1.weight, model.vision_tower.encoder.layers.11.mlp.fc1.bias, model.vision_tower.encoder.layers.11.mlp.fc2.weight, model.vision_tower.encoder.layers.11.mlp.fc2.bias, model.vision_tower.encoder.layers.12.layer_norm1.weight, model.vision_tower.encoder.layers.12.layer_norm1.bias, model.vision_tower.encoder.layers.12.self_attn.k_proj.weight, model.vision_tower.encoder.layers.12.self_attn.k_proj.bias, model.vision_tower.encoder.layers.12.self_attn.v_proj.weight, model.vision_tower.encoder.layers.12.self_attn.v_proj.bias, model.vision_tower.encoder.layers.12.self_attn.q_proj.weight, model.vision_tower.encoder.layers.12.self_attn.q_proj.bias, model.vision_tower.encoder.layers.12.self_attn.out_proj.weight, model.vision_tower.encoder.layers.12.self_attn.out_proj.bias, model.vision_tower.encoder.layers.12.layer_norm2.weight, 
model.vision_tower.encoder.layers.12.layer_norm2.bias, model.vision_tower.encoder.layers.12.mlp.fc1.weight, model.vision_tower.encoder.layers.12.mlp.fc1.bias, model.vision_tower.encoder.layers.12.mlp.fc2.weight, model.vision_tower.encoder.layers.12.mlp.fc2.bias, model.vision_tower.encoder.layers.13.layer_norm1.weight, model.vision_tower.encoder.layers.13.layer_norm1.bias, model.vision_tower.encoder.layers.13.self_attn.k_proj.weight, model.vision_tower.encoder.layers.13.self_attn.k_proj.bias, model.vision_tower.encoder.layers.13.self_attn.v_proj.weight, model.vision_tower.encoder.layers.13.self_attn.v_proj.bias, model.vision_tower.encoder.layers.13.self_attn.q_proj.weight, model.vision_tower.encoder.layers.13.self_attn.q_proj.bias, model.vision_tower.encoder.layers.13.self_attn.out_proj.weight, model.vision_tower.encoder.layers.13.self_attn.out_proj.bias, model.vision_tower.encoder.layers.13.layer_norm2.weight, model.vision_tower.encoder.layers.13.layer_norm2.bias, model.vision_tower.encoder.layers.13.mlp.fc1.weight, model.vision_tower.encoder.layers.13.mlp.fc1.bias, model.vision_tower.encoder.layers.13.mlp.fc2.weight, model.vision_tower.encoder.layers.13.mlp.fc2.bias, model.vision_tower.encoder.layers.14.layer_norm1.weight, model.vision_tower.encoder.layers.14.layer_norm1.bias, model.vision_tower.encoder.layers.14.self_attn.k_proj.weight, model.vision_tower.encoder.layers.14.self_attn.k_proj.bias, model.vision_tower.encoder.layers.14.self_attn.v_proj.weight, model.vision_tower.encoder.layers.14.self_attn.v_proj.bias, model.vision_tower.encoder.layers.14.self_attn.q_proj.weight, model.vision_tower.encoder.layers.14.self_attn.q_proj.bias, model.vision_tower.encoder.layers.14.self_attn.out_proj.weight, model.vision_tower.encoder.layers.14.self_attn.out_proj.bias, model.vision_tower.encoder.layers.14.layer_norm2.weight, model.vision_tower.encoder.layers.14.layer_norm2.bias, model.vision_tower.encoder.layers.14.mlp.fc1.weight, 
model.vision_tower.encoder.layers.14.mlp.fc1.bias, model.vision_tower.encoder.layers.14.mlp.fc2.weight, model.vision_tower.encoder.layers.14.mlp.fc2.bias, model.vision_tower.encoder.layers.15.layer_norm1.weight, model.vision_tower.encoder.layers.15.layer_norm1.bias, model.vision_tower.encoder.layers.15.self_attn.k_proj.weight, model.vision_tower.encoder.layers.15.self_attn.k_proj.bias, model.vision_tower.encoder.layers.15.self_attn.v_proj.weight, model.vision_tower.encoder.layers.15.self_attn.v_proj.bias, model.vision_tower.encoder.layers.15.self_attn.q_proj.weight, model.vision_tower.encoder.layers.15.self_attn.q_proj.bias, model.vision_tower.encoder.layers.15.self_attn.out_proj.weight, model.vision_tower.encoder.layers.15.self_attn.out_proj.bias, model.vision_tower.encoder.layers.15.layer_norm2.weight, model.vision_tower.encoder.layers.15.layer_norm2.bias, model.vision_tower.encoder.layers.15.mlp.fc1.weight, model.vision_tower.encoder.layers.15.mlp.fc1.bias, model.vision_tower.encoder.layers.15.mlp.fc2.weight, model.vision_tower.encoder.layers.15.mlp.fc2.bias, model.vision_tower.encoder.layers.16.layer_norm1.weight, model.vision_tower.encoder.layers.16.layer_norm1.bias, model.vision_tower.encoder.layers.16.self_attn.k_proj.weight, model.vision_tower.encoder.layers.16.self_attn.k_proj.bias, model.vision_tower.encoder.layers.16.self_attn.v_proj.weight, model.vision_tower.encoder.layers.16.self_attn.v_proj.bias, model.vision_tower.encoder.layers.16.self_attn.q_proj.weight, model.vision_tower.encoder.layers.16.self_attn.q_proj.bias, model.vision_tower.encoder.layers.16.self_attn.out_proj.weight, model.vision_tower.encoder.layers.16.self_attn.out_proj.bias, model.vision_tower.encoder.layers.16.layer_norm2.weight, model.vision_tower.encoder.layers.16.layer_norm2.bias, model.vision_tower.encoder.layers.16.mlp.fc1.weight, model.vision_tower.encoder.layers.16.mlp.fc1.bias, model.vision_tower.encoder.layers.16.mlp.fc2.weight, 
model.vision_tower.encoder.layers.16.mlp.fc2.bias, model.vision_tower.encoder.layers.17.layer_norm1.weight, model.vision_tower.encoder.layers.17.layer_norm1.bias, model.vision_tower.encoder.layers.17.self_attn.k_proj.weight, model.vision_tower.encoder.layers.17.self_attn.k_proj.bias, model.vision_tower.encoder.layers.17.self_attn.v_proj.weight, model.vision_tower.encoder.layers.17.self_attn.v_proj.bias, model.vision_tower.encoder.layers.17.self_attn.q_proj.weight, model.vision_tower.encoder.layers.17.self_attn.q_proj.bias, model.vision_tower.encoder.layers.17.self_attn.out_proj.weight, model.vision_tower.encoder.layers.17.self_attn.out_proj.bias, model.vision_tower.encoder.layers.17.layer_norm2.weight, model.vision_tower.encoder.layers.17.layer_norm2.bias, model.vision_tower.encoder.layers.17.mlp.fc1.weight, model.vision_tower.encoder.layers.17.mlp.fc1.bias, model.vision_tower.encoder.layers.17.mlp.fc2.weight, model.vision_tower.encoder.layers.17.mlp.fc2.bias, model.vision_tower.encoder.layers.18.layer_norm1.weight, model.vision_tower.encoder.layers.18.layer_norm1.bias, model.vision_tower.encoder.layers.18.self_attn.k_proj.weight, model.vision_tower.encoder.layers.18.self_attn.k_proj.bias, model.vision_tower.encoder.layers.18.self_attn.v_proj.weight, model.vision_tower.encoder.layers.18.self_attn.v_proj.bias, model.vision_tower.encoder.layers.18.self_attn.q_proj.weight, model.vision_tower.encoder.layers.18.self_attn.q_proj.bias, model.vision_tower.encoder.layers.18.self_attn.out_proj.weight, model.vision_tower.encoder.layers.18.self_attn.out_proj.bias, model.vision_tower.encoder.layers.18.layer_norm2.weight, model.vision_tower.encoder.layers.18.layer_norm2.bias, model.vision_tower.encoder.layers.18.mlp.fc1.weight, model.vision_tower.encoder.layers.18.mlp.fc1.bias, model.vision_tower.encoder.layers.18.mlp.fc2.weight, model.vision_tower.encoder.layers.18.mlp.fc2.bias, model.vision_tower.encoder.layers.19.layer_norm1.weight, 
model.vision_tower.encoder.layers.19.layer_norm1.bias, model.vision_tower.encoder.layers.19.self_attn.k_proj.weight, model.vision_tower.encoder.layers.19.self_attn.k_proj.bias, model.vision_tower.encoder.layers.19.self_attn.v_proj.weight, model.vision_tower.encoder.layers.19.self_attn.v_proj.bias, model.vision_tower.encoder.layers.19.self_attn.q_proj.weight, model.vision_tower.encoder.layers.19.self_attn.q_proj.bias, model.vision_tower.encoder.layers.19.self_attn.out_proj.weight, model.vision_tower.encoder.layers.19.self_attn.out_proj.bias, model.vision_tower.encoder.layers.19.layer_norm2.weight, model.vision_tower.encoder.layers.19.layer_norm2.bias, model.vision_tower.encoder.layers.19.mlp.fc1.weight, model.vision_tower.encoder.layers.19.mlp.fc1.bias, model.vision_tower.encoder.layers.19.mlp.fc2.weight, model.vision_tower.encoder.layers.19.mlp.fc2.bias, model.vision_tower.encoder.layers.20.layer_norm1.weight, model.vision_tower.encoder.layers.20.layer_norm1.bias, model.vision_tower.encoder.layers.20.self_attn.k_proj.weight, model.vision_tower.encoder.layers.20.self_attn.k_proj.bias, model.vision_tower.encoder.layers.20.self_attn.v_proj.weight, model.vision_tower.encoder.layers.20.self_attn.v_proj.bias, model.vision_tower.encoder.layers.20.self_attn.q_proj.weight, model.vision_tower.encoder.layers.20.self_attn.q_proj.bias, model.vision_tower.encoder.layers.20.self_attn.out_proj.weight, model.vision_tower.encoder.layers.20.self_attn.out_proj.bias, model.vision_tower.encoder.layers.20.layer_norm2.weight, model.vision_tower.encoder.layers.20.layer_norm2.bias, model.vision_tower.encoder.layers.20.mlp.fc1.weight, model.vision_tower.encoder.layers.20.mlp.fc1.bias, model.vision_tower.encoder.layers.20.mlp.fc2.weight, model.vision_tower.encoder.layers.20.mlp.fc2.bias, model.vision_tower.encoder.layers.21.layer_norm1.weight, model.vision_tower.encoder.layers.21.layer_norm1.bias, model.vision_tower.encoder.layers.21.self_attn.k_proj.weight, 
model.vision_tower.encoder.layers.21.self_attn.k_proj.bias, model.vision_tower.encoder.layers.21.self_attn.v_proj.weight, model.vision_tower.encoder.layers.21.self_attn.v_proj.bias, model.vision_tower.encoder.layers.21.self_attn.q_proj.weight, model.vision_tower.encoder.layers.21.self_attn.q_proj.bias, model.vision_tower.encoder.layers.21.self_attn.out_proj.weight, model.vision_tower.encoder.layers.21.self_attn.out_proj.bias, model.vision_tower.encoder.layers.21.layer_norm2.weight, model.vision_tower.encoder.layers.21.layer_norm2.bias, model.vision_tower.encoder.layers.21.mlp.fc1.weight, model.vision_tower.encoder.layers.21.mlp.fc1.bias, model.vision_tower.encoder.layers.21.mlp.fc2.weight, model.vision_tower.encoder.layers.21.mlp.fc2.bias, model.vision_tower.encoder.layers.22.layer_norm1.weight, model.vision_tower.encoder.layers.22.layer_norm1.bias, model.vision_tower.encoder.layers.22.self_attn.k_proj.weight, model.vision_tower.encoder.layers.22.self_attn.k_proj.bias, model.vision_tower.encoder.layers.22.self_attn.v_proj.weight, model.vision_tower.encoder.layers.22.self_attn.v_proj.bias, model.vision_tower.encoder.layers.22.self_attn.q_proj.weight, model.vision_tower.encoder.layers.22.self_attn.q_proj.bias, model.vision_tower.encoder.layers.22.self_attn.out_proj.weight, model.vision_tower.encoder.layers.22.self_attn.out_proj.bias, model.vision_tower.encoder.layers.22.layer_norm2.weight, model.vision_tower.encoder.layers.22.layer_norm2.bias, model.vision_tower.encoder.layers.22.mlp.fc1.weight, model.vision_tower.encoder.layers.22.mlp.fc1.bias, model.vision_tower.encoder.layers.22.mlp.fc2.weight, model.vision_tower.encoder.layers.22.mlp.fc2.bias, model.vision_tower.encoder.layers.23.layer_norm1.weight, model.vision_tower.encoder.layers.23.layer_norm1.bias, model.vision_tower.encoder.layers.23.self_attn.k_proj.weight, model.vision_tower.encoder.layers.23.self_attn.k_proj.bias, model.vision_tower.encoder.layers.23.self_attn.v_proj.weight, 
model.vision_tower.encoder.layers.23.self_attn.v_proj.bias, model.vision_tower.encoder.layers.23.self_attn.q_proj.weight, model.vision_tower.encoder.layers.23.self_attn.q_proj.bias, model.vision_tower.encoder.layers.23.self_attn.out_proj.weight, model.vision_tower.encoder.layers.23.self_attn.out_proj.bias, model.vision_tower.encoder.layers.23.layer_norm2.weight, model.vision_tower.encoder.layers.23.layer_norm2.bias, model.vision_tower.encoder.layers.23.mlp.fc1.weight, model.vision_tower.encoder.layers.23.mlp.fc1.bias, model.vision_tower.encoder.layers.23.mlp.fc2.weight, model.vision_tower.encoder.layers.23.mlp.fc2.bias, model.vision_tower.encoder.layers.24.layer_norm1.weight, model.vision_tower.encoder.layers.24.layer_norm1.bias, model.vision_tower.encoder.layers.24.self_attn.k_proj.weight, model.vision_tower.encoder.layers.24.self_attn.k_proj.bias, model.vision_tower.encoder.layers.24.self_attn.v_proj.weight, model.vision_tower.encoder.layers.24.self_attn.v_proj.bias, model.vision_tower.encoder.layers.24.self_attn.q_proj.weight, model.vision_tower.encoder.layers.24.self_attn.q_proj.bias, model.vision_tower.encoder.layers.24.self_attn.out_proj.weight, model.vision_tower.encoder.layers.24.self_attn.out_proj.bias, model.vision_tower.encoder.layers.24.layer_norm2.weight, model.vision_tower.encoder.layers.24.layer_norm2.bias, model.vision_tower.encoder.layers.24.mlp.fc1.weight, model.vision_tower.encoder.layers.24.mlp.fc1.bias, model.vision_tower.encoder.layers.24.mlp.fc2.weight, model.vision_tower.encoder.layers.24.mlp.fc2.bias, model.vision_tower.encoder.layers.25.layer_norm1.weight, model.vision_tower.encoder.layers.25.layer_norm1.bias, model.vision_tower.encoder.layers.25.self_attn.k_proj.weight, model.vision_tower.encoder.layers.25.self_attn.k_proj.bias, model.vision_tower.encoder.layers.25.self_attn.v_proj.weight, model.vision_tower.encoder.layers.25.self_attn.v_proj.bias, model.vision_tower.encoder.layers.25.self_attn.q_proj.weight, 
model.vision_tower.encoder.layers.25.self_attn.q_proj.bias, model.vision_tower.encoder.layers.25.self_attn.out_proj.weight, model.vision_tower.encoder.layers.25.self_attn.out_proj.bias, model.vision_tower.encoder.layers.25.layer_norm2.weight, model.vision_tower.encoder.layers.25.layer_norm2.bias, model.vision_tower.encoder.layers.25.mlp.fc1.weight, model.vision_tower.encoder.layers.25.mlp.fc1.bias, model.vision_tower.encoder.layers.25.mlp.fc2.weight, model.vision_tower.encoder.layers.25.mlp.fc2.bias, model.vision_tower.encoder.layers.26.layer_norm1.weight, model.vision_tower.encoder.layers.26.layer_norm1.bias, model.vision_tower.encoder.layers.26.self_attn.k_proj.weight, model.vision_tower.encoder.layers.26.self_attn.k_proj.bias, model.vision_tower.encoder.layers.26.self_attn.v_proj.weight, model.vision_tower.encoder.layers.26.self_attn.v_proj.bias, model.vision_tower.encoder.layers.26.self_attn.q_proj.weight, model.vision_tower.encoder.layers.26.self_attn.q_proj.bias, model.vision_tower.encoder.layers.26.self_attn.out_proj.weight, model.vision_tower.encoder.layers.26.self_attn.out_proj.bias, model.vision_tower.encoder.layers.26.layer_norm2.weight, model.vision_tower.encoder.layers.26.layer_norm2.bias, model.vision_tower.encoder.layers.26.mlp.fc1.weight, model.vision_tower.encoder.layers.26.mlp.fc1.bias, model.vision_tower.encoder.layers.26.mlp.fc2.weight, model.vision_tower.encoder.layers.26.mlp.fc2.bias, model.vision_tower.post_layernorm.weight, model.vision_tower.post_layernorm.bias, model.multi_modal_projector.mm_input_projection_weight, model.multi_modal_projector.mm_soft_emb_norm.weight, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.self_attn.q_norm.weight, model.language_model.layers.0.self_attn.k_norm.weight, 
model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.0.pre_feedforward_layernorm.weight, model.language_model.layers.0.post_feedforward_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.self_attn.q_norm.weight, model.language_model.layers.1.self_attn.k_norm.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.layers.1.pre_feedforward_layernorm.weight, model.language_model.layers.1.post_feedforward_layernorm.weight, model.language_model.layers.2.self_attn.q_proj.weight, model.language_model.layers.2.self_attn.k_proj.weight, model.language_model.layers.2.self_attn.v_proj.weight, model.language_model.layers.2.self_attn.o_proj.weight, model.language_model.layers.2.self_attn.q_norm.weight, model.language_model.layers.2.self_attn.k_norm.weight, model.language_model.layers.2.mlp.gate_proj.weight, model.language_model.layers.2.mlp.up_proj.weight, model.language_model.layers.2.mlp.down_proj.weight, model.language_model.layers.2.input_layernorm.weight, model.language_model.layers.2.post_attention_layernorm.weight, model.language_model.layers.2.pre_feedforward_layernorm.weight, model.language_model.layers.2.post_feedforward_layernorm.weight, model.language_model.layers.3.self_attn.q_proj.weight, model.language_model.layers.3.self_attn.k_proj.weight, 
model.language_model.layers.3.self_attn.v_proj.weight, model.language_model.layers.3.self_attn.o_proj.weight, model.language_model.layers.3.self_attn.q_norm.weight, model.language_model.layers.3.self_attn.k_norm.weight, model.language_model.layers.3.mlp.gate_proj.weight, model.language_model.layers.3.mlp.up_proj.weight, model.language_model.layers.3.mlp.down_proj.weight, model.language_model.layers.3.input_layernorm.weight, model.language_model.layers.3.post_attention_layernorm.weight, model.language_model.layers.3.pre_feedforward_layernorm.weight, model.language_model.layers.3.post_feedforward_layernorm.weight, model.language_model.layers.4.self_attn.q_proj.weight, model.language_model.layers.4.self_attn.k_proj.weight, model.language_model.layers.4.self_attn.v_proj.weight, model.language_model.layers.4.self_attn.o_proj.weight, model.language_model.layers.4.self_attn.q_norm.weight, model.language_model.layers.4.self_attn.k_norm.weight, model.language_model.layers.4.mlp.gate_proj.weight, model.language_model.layers.4.mlp.up_proj.weight, model.language_model.layers.4.mlp.down_proj.weight, model.language_model.layers.4.input_layernorm.weight, model.language_model.layers.4.post_attention_layernorm.weight, model.language_model.layers.4.pre_feedforward_layernorm.weight, model.language_model.layers.4.post_feedforward_layernorm.weight, model.language_model.layers.5.self_attn.q_proj.weight, model.language_model.layers.5.self_attn.k_proj.weight, model.language_model.layers.5.self_attn.v_proj.weight, model.language_model.layers.5.self_attn.o_proj.weight, model.language_model.layers.5.self_attn.q_norm.weight, model.language_model.layers.5.self_attn.k_norm.weight, model.language_model.layers.5.mlp.gate_proj.weight, model.language_model.layers.5.mlp.up_proj.weight, model.language_model.layers.5.mlp.down_proj.weight, model.language_model.layers.5.input_layernorm.weight, model.language_model.layers.5.post_attention_layernorm.weight, 
model.language_model.layers.5.pre_feedforward_layernorm.weight, model.language_model.layers.5.post_feedforward_layernorm.weight, model.language_model.layers.6.self_attn.q_proj.weight, model.language_model.layers.6.self_attn.k_proj.weight, model.language_model.layers.6.self_attn.v_proj.weight, model.language_model.layers.6.self_attn.o_proj.weight, model.language_model.layers.6.self_attn.q_norm.weight, model.language_model.layers.6.self_attn.k_norm.weight, model.language_model.layers.6.mlp.gate_proj.weight, model.language_model.layers.6.mlp.up_proj.weight, model.language_model.layers.6.mlp.down_proj.weight, model.language_model.layers.6.input_layernorm.weight, model.language_model.layers.6.post_attention_layernorm.weight, model.language_model.layers.6.pre_feedforward_layernorm.weight, model.language_model.layers.6.post_feedforward_layernorm.weight, model.language_model.layers.7.self_attn.q_proj.weight, model.language_model.layers.7.self_attn.k_proj.weight, model.language_model.layers.7.self_attn.v_proj.weight, model.language_model.layers.7.self_attn.o_proj.weight, model.language_model.layers.7.self_attn.q_norm.weight, model.language_model.layers.7.self_attn.k_norm.weight, model.language_model.layers.7.mlp.gate_proj.weight, model.language_model.layers.7.mlp.up_proj.weight, model.language_model.layers.7.mlp.down_proj.weight, model.language_model.layers.7.input_layernorm.weight, model.language_model.layers.7.post_attention_layernorm.weight, model.language_model.layers.7.pre_feedforward_layernorm.weight, model.language_model.layers.7.post_feedforward_layernorm.weight, model.language_model.layers.8.self_attn.q_proj.weight, model.language_model.layers.8.self_attn.k_proj.weight, model.language_model.layers.8.self_attn.v_proj.weight, model.language_model.layers.8.self_attn.o_proj.weight, model.language_model.layers.8.self_attn.q_norm.weight, model.language_model.layers.8.self_attn.k_norm.weight, model.language_model.layers.8.mlp.gate_proj.weight, 
model.language_model.layers.8.mlp.up_proj.weight, model.language_model.layers.8.mlp.down_proj.weight, model.language_model.layers.8.input_layernorm.weight, model.language_model.layers.8.post_attention_layernorm.weight, model.language_model.layers.8.pre_feedforward_layernorm.weight, model.language_model.layers.8.post_feedforward_layernorm.weight, model.language_model.layers.9.self_attn.q_proj.weight, model.language_model.layers.9.self_attn.k_proj.weight, model.language_model.layers.9.self_attn.v_proj.weight, model.language_model.layers.9.self_attn.o_proj.weight, model.language_model.layers.9.self_attn.q_norm.weight, model.language_model.layers.9.self_attn.k_norm.weight, model.language_model.layers.9.mlp.gate_proj.weight, model.language_model.layers.9.mlp.up_proj.weight, model.language_model.layers.9.mlp.down_proj.weight, model.language_model.layers.9.input_layernorm.weight, model.language_model.layers.9.post_attention_layernorm.weight, model.language_model.layers.9.pre_feedforward_layernorm.weight, model.language_model.layers.9.post_feedforward_layernorm.weight, model.language_model.layers.10.self_attn.q_proj.weight, model.language_model.layers.10.self_attn.k_proj.weight, model.language_model.layers.10.self_attn.v_proj.weight, model.language_model.layers.10.self_attn.o_proj.weight, model.language_model.layers.10.self_attn.q_norm.weight, model.language_model.layers.10.self_attn.k_norm.weight, model.language_model.layers.10.mlp.gate_proj.weight, model.language_model.layers.10.mlp.up_proj.weight, model.language_model.layers.10.mlp.down_proj.weight, model.language_model.layers.10.input_layernorm.weight, model.language_model.layers.10.post_attention_layernorm.weight, model.language_model.layers.10.pre_feedforward_layernorm.weight, model.language_model.layers.10.post_feedforward_layernorm.weight, model.language_model.layers.11.self_attn.q_proj.weight, model.language_model.layers.11.self_attn.k_proj.weight, model.language_model.layers.11.self_attn.v_proj.weight, 
model.language_model.layers.11.self_attn.o_proj.weight, model.language_model.layers.11.self_attn.q_norm.weight, model.language_model.layers.11.self_attn.k_norm.weight, model.language_model.layers.11.mlp.gate_proj.weight, model.language_model.layers.11.mlp.up_proj.weight, model.language_model.layers.11.mlp.down_proj.weight, model.language_model.layers.11.input_layernorm.weight, model.language_model.layers.11.post_attention_layernorm.weight, model.language_model.layers.11.pre_feedforward_layernorm.weight, model.language_model.layers.11.post_feedforward_layernorm.weight, model.language_model.layers.12.self_attn.q_proj.weight, model.language_model.layers.12.self_attn.k_proj.weight, model.language_model.layers.12.self_attn.v_proj.weight, model.language_model.layers.12.self_attn.o_proj.weight, model.language_model.layers.12.self_attn.q_norm.weight, model.language_model.layers.12.self_attn.k_norm.weight, model.language_model.layers.12.mlp.gate_proj.weight, model.language_model.layers.12.mlp.up_proj.weight, model.language_model.layers.12.mlp.down_proj.weight, model.language_model.layers.12.input_layernorm.weight, model.language_model.layers.12.post_attention_layernorm.weight, model.language_model.layers.12.pre_feedforward_layernorm.weight, model.language_model.layers.12.post_feedforward_layernorm.weight, model.language_model.layers.13.self_attn.q_proj.weight, model.language_model.layers.13.self_attn.k_proj.weight, model.language_model.layers.13.self_attn.v_proj.weight, model.language_model.layers.13.self_attn.o_proj.weight, model.language_model.layers.13.self_attn.q_norm.weight, model.language_model.layers.13.self_attn.k_norm.weight, model.language_model.layers.13.mlp.gate_proj.weight, model.language_model.layers.13.mlp.up_proj.weight, model.language_model.layers.13.mlp.down_proj.weight, model.language_model.layers.13.input_layernorm.weight, model.language_model.layers.13.post_attention_layernorm.weight, model.language_model.layers.13.pre_feedforward_layernorm.weight, 
model.language_model.layers.13.post_feedforward_layernorm.weight, model.language_model.layers.14.self_attn.q_proj.weight, model.language_model.layers.14.self_attn.k_proj.weight, model.language_model.layers.14.self_attn.v_proj.weight, model.language_model.layers.14.self_attn.o_proj.weight, model.language_model.layers.14.self_attn.q_norm.weight, model.language_model.layers.14.self_attn.k_norm.weight, model.language_model.layers.14.mlp.gate_proj.weight, model.language_model.layers.14.mlp.up_proj.weight, model.language_model.layers.14.mlp.down_proj.weight, model.language_model.layers.14.input_layernorm.weight, model.language_model.layers.14.post_attention_layernorm.weight, model.language_model.layers.14.pre_feedforward_layernorm.weight, model.language_model.layers.14.post_feedforward_layernorm.weight, model.language_model.layers.15.self_attn.q_proj.weight, model.language_model.layers.15.self_attn.k_proj.weight, model.language_model.layers.15.self_attn.v_proj.weight, model.language_model.layers.15.self_attn.o_proj.weight, model.language_model.layers.15.self_attn.q_norm.weight, model.language_model.layers.15.self_attn.k_norm.weight, model.language_model.layers.15.mlp.gate_proj.weight, model.language_model.layers.15.mlp.up_proj.weight, model.language_model.layers.15.mlp.down_proj.weight, model.language_model.layers.15.input_layernorm.weight, model.language_model.layers.15.post_attention_layernorm.weight, model.language_model.layers.15.pre_feedforward_layernorm.weight, model.language_model.layers.15.post_feedforward_layernorm.weight, model.language_model.layers.16.self_attn.q_proj.weight, model.language_model.layers.16.self_attn.k_proj.weight, model.language_model.layers.16.self_attn.v_proj.weight, model.language_model.layers.16.self_attn.o_proj.weight, model.language_model.layers.16.self_attn.q_norm.weight, model.language_model.layers.16.self_attn.k_norm.weight, model.language_model.layers.16.mlp.gate_proj.weight, model.language_model.layers.16.mlp.up_proj.weight, 
model.language_model.layers.16.mlp.down_proj.weight, model.language_model.layers.16.input_layernorm.weight, model.language_model.layers.16.post_attention_layernorm.weight, model.language_model.layers.16.pre_feedforward_layernorm.weight, model.language_model.layers.16.post_feedforward_layernorm.weight, model.language_model.layers.17.self_attn.q_proj.weight, model.language_model.layers.17.self_attn.k_proj.weight, model.language_model.layers.17.self_attn.v_proj.weight, model.language_model.layers.17.self_attn.o_proj.weight, model.language_model.layers.17.self_attn.q_norm.weight, model.language_model.layers.17.self_attn.k_norm.weight, model.language_model.layers.17.mlp.gate_proj.weight, model.language_model.layers.17.mlp.up_proj.weight, model.language_model.layers.17.mlp.down_proj.weight, model.language_model.layers.17.input_layernorm.weight, model.language_model.layers.17.post_attention_layernorm.weight, model.language_model.layers.17.pre_feedforward_layernorm.weight, model.language_model.layers.17.post_feedforward_layernorm.weight, model.language_model.layers.18.self_attn.q_proj.weight, model.language_model.layers.18.self_attn.k_proj.weight, model.language_model.layers.18.self_attn.v_proj.weight, model.language_model.layers.18.self_attn.o_proj.weight, model.language_model.layers.18.self_attn.q_norm.weight, model.language_model.layers.18.self_attn.k_norm.weight, model.language_model.layers.18.mlp.gate_proj.weight, model.language_model.layers.18.mlp.up_proj.weight, model.language_model.layers.18.mlp.down_proj.weight, model.language_model.layers.18.input_layernorm.weight, model.language_model.layers.18.post_attention_layernorm.weight, model.language_model.layers.18.pre_feedforward_layernorm.weight, model.language_model.layers.18.post_feedforward_layernorm.weight, model.language_model.layers.19.self_attn.q_proj.weight, model.language_model.layers.19.self_attn.k_proj.weight, model.language_model.layers.19.self_attn.v_proj.weight, 
model.language_model.layers.19.self_attn.o_proj.weight, model.language_model.layers.19.self_attn.q_norm.weight, model.language_model.layers.19.self_attn.k_norm.weight, model.language_model.layers.19.mlp.gate_proj.weight, model.language_model.layers.19.mlp.up_proj.weight, model.language_model.layers.19.mlp.down_proj.weight, model.language_model.layers.19.input_layernorm.weight, model.language_model.layers.19.post_attention_layernorm.weight, model.language_model.layers.19.pre_feedforward_layernorm.weight, model.language_model.layers.19.post_feedforward_layernorm.weight, model.language_model.layers.20.self_attn.q_proj.weight, model.language_model.layers.20.self_attn.k_proj.weight, model.language_model.layers.20.self_attn.v_proj.weight, model.language_model.layers.20.self_attn.o_proj.weight, model.language_model.layers.20.self_attn.q_norm.weight, model.language_model.layers.20.self_attn.k_norm.weight, model.language_model.layers.20.mlp.gate_proj.weight, model.language_model.layers.20.mlp.up_proj.weight, model.language_model.layers.20.mlp.down_proj.weight, model.language_model.layers.20.input_layernorm.weight, model.language_model.layers.20.post_attention_layernorm.weight, model.language_model.layers.20.pre_feedforward_layernorm.weight, model.language_model.layers.20.post_feedforward_layernorm.weight, model.language_model.layers.21.self_attn.q_proj.weight, model.language_model.layers.21.self_attn.k_proj.weight, model.language_model.layers.21.self_attn.v_proj.weight, model.language_model.layers.21.self_attn.o_proj.weight, model.language_model.layers.21.self_attn.q_norm.weight, model.language_model.layers.21.self_attn.k_norm.weight, model.language_model.layers.21.mlp.gate_proj.weight, model.language_model.layers.21.mlp.up_proj.weight, model.language_model.layers.21.mlp.down_proj.weight, model.language_model.layers.21.input_layernorm.weight, model.language_model.layers.21.post_attention_layernorm.weight, model.language_model.layers.21.pre_feedforward_layernorm.weight, 
model.language_model.layers.21.post_feedforward_layernorm.weight, model.language_model.layers.22.self_attn.q_proj.weight, model.language_model.layers.22.self_attn.k_proj.weight, model.language_model.layers.22.self_attn.v_proj.weight, model.language_model.layers.22.self_attn.o_proj.weight, model.language_model.layers.22.self_attn.q_norm.weight, model.language_model.layers.22.self_attn.k_norm.weight, model.language_model.layers.22.mlp.gate_proj.weight, model.language_model.layers.22.mlp.up_proj.weight, model.language_model.layers.22.mlp.down_proj.weight, model.language_model.layers.22.input_layernorm.weight, model.language_model.layers.22.post_attention_layernorm.weight, model.language_model.layers.22.pre_feedforward_layernorm.weight, model.language_model.layers.22.post_feedforward_layernorm.weight, model.language_model.layers.23.self_attn.q_proj.weight, model.language_model.layers.23.self_attn.k_proj.weight, model.language_model.layers.23.self_attn.v_proj.weight, model.language_model.layers.23.self_attn.o_proj.weight, model.language_model.layers.23.self_attn.q_norm.weight, model.language_model.layers.23.self_attn.k_norm.weight, model.language_model.layers.23.mlp.gate_proj.weight, model.language_model.layers.23.mlp.up_proj.weight, model.language_model.layers.23.mlp.down_proj.weight, model.language_model.layers.23.input_layernorm.weight, model.language_model.layers.23.post_attention_layernorm.weight, model.language_model.layers.23.pre_feedforward_layernorm.weight, model.language_model.layers.23.post_feedforward_layernorm.weight, model.language_model.layers.24.self_attn.q_proj.weight, model.language_model.layers.24.self_attn.k_proj.weight, model.language_model.layers.24.self_attn.v_proj.weight, model.language_model.layers.24.self_attn.o_proj.weight, model.language_model.layers.24.self_attn.q_norm.weight, model.language_model.layers.24.self_attn.k_norm.weight, model.language_model.layers.24.mlp.gate_proj.weight, model.language_model.layers.24.mlp.up_proj.weight, 
model.language_model.layers.24.mlp.down_proj.weight, model.language_model.layers.24.input_layernorm.weight, model.language_model.layers.24.post_attention_layernorm.weight, model.language_model.layers.24.pre_feedforward_layernorm.weight, model.language_model.layers.24.post_feedforward_layernorm.weight, model.language_model.layers.25.self_attn.q_proj.weight, model.language_model.layers.25.self_attn.k_proj.weight, model.language_model.layers.25.self_attn.v_proj.weight, model.language_model.layers.25.self_attn.o_proj.weight, model.language_model.layers.25.self_attn.q_norm.weight, model.language_model.layers.25.self_attn.k_norm.weight, model.language_model.layers.25.mlp.gate_proj.weight, model.language_model.layers.25.mlp.up_proj.weight, model.language_model.layers.25.mlp.down_proj.weight, model.language_model.layers.25.input_layernorm.weight, model.language_model.layers.25.post_attention_layernorm.weight, model.language_model.layers.25.pre_feedforward_layernorm.weight, model.language_model.layers.25.post_feedforward_layernorm.weight, model.language_model.layers.26.self_attn.q_proj.weight, model.language_model.layers.26.self_attn.k_proj.weight, model.language_model.layers.26.self_attn.v_proj.weight, model.language_model.layers.26.self_attn.o_proj.weight, model.language_model.layers.26.self_attn.q_norm.weight, model.language_model.layers.26.self_attn.k_norm.weight, model.language_model.layers.26.mlp.gate_proj.weight, model.language_model.layers.26.mlp.up_proj.weight, model.language_model.layers.26.mlp.down_proj.weight, model.language_model.layers.26.input_layernorm.weight, model.language_model.layers.26.post_attention_layernorm.weight, model.language_model.layers.26.pre_feedforward_layernorm.weight, model.language_model.layers.26.post_feedforward_layernorm.weight, model.language_model.layers.27.self_attn.q_proj.weight, model.language_model.layers.27.self_attn.k_proj.weight, model.language_model.layers.27.self_attn.v_proj.weight, 
model.language_model.layers.27.self_attn.o_proj.weight, model.language_model.layers.27.self_attn.q_norm.weight, model.language_model.layers.27.self_attn.k_norm.weight, model.language_model.layers.27.mlp.gate_proj.weight, model.language_model.layers.27.mlp.up_proj.weight, model.language_model.layers.27.mlp.down_proj.weight, model.language_model.layers.27.input_layernorm.weight, model.language_model.layers.27.post_attention_layernorm.weight, model.language_model.layers.27.pre_feedforward_layernorm.weight, model.language_model.layers.27.post_feedforward_layernorm.weight, model.language_model.layers.28.self_attn.q_proj.weight, model.language_model.layers.28.self_attn.k_proj.weight, model.language_model.layers.28.self_attn.v_proj.weight, model.language_model.layers.28.self_attn.o_proj.weight, model.language_model.layers.28.self_attn.q_norm.weight, model.language_model.layers.28.self_attn.k_norm.weight, model.language_model.layers.28.mlp.gate_proj.weight, model.language_model.layers.28.mlp.up_proj.weight, model.language_model.layers.28.mlp.down_proj.weight, model.language_model.layers.28.input_layernorm.weight, model.language_model.layers.28.post_attention_layernorm.weight, model.language_model.layers.28.pre_feedforward_layernorm.weight, model.language_model.layers.28.post_feedforward_layernorm.weight, model.language_model.layers.29.self_attn.q_proj.weight, model.language_model.layers.29.self_attn.k_proj.weight, model.language_model.layers.29.self_attn.v_proj.weight, model.language_model.layers.29.self_attn.o_proj.weight, model.language_model.layers.29.self_attn.q_norm.weight, model.language_model.layers.29.self_attn.k_norm.weight, model.language_model.layers.29.mlp.gate_proj.weight, model.language_model.layers.29.mlp.up_proj.weight, model.language_model.layers.29.mlp.down_proj.weight, model.language_model.layers.29.input_layernorm.weight, model.language_model.layers.29.post_attention_layernorm.weight, model.language_model.layers.29.pre_feedforward_layernorm.weight, 
model.language_model.layers.29.post_feedforward_layernorm.weight, model.language_model.layers.30.self_attn.q_proj.weight, model.language_model.layers.30.self_attn.k_proj.weight, model.language_model.layers.30.self_attn.v_proj.weight, model.language_model.layers.30.self_attn.o_proj.weight, model.language_model.layers.30.self_attn.q_norm.weight, model.language_model.layers.30.self_attn.k_norm.weight, model.language_model.layers.30.mlp.gate_proj.weight, model.language_model.layers.30.mlp.up_proj.weight, model.language_model.layers.30.mlp.down_proj.weight, model.language_model.layers.30.input_layernorm.weight, model.language_model.layers.30.post_attention_layernorm.weight, model.language_model.layers.30.pre_feedforward_layernorm.weight, model.language_model.layers.30.post_feedforward_layernorm.weight, model.language_model.layers.31.self_attn.q_proj.weight, model.language_model.layers.31.self_attn.k_proj.weight, model.language_model.layers.31.self_attn.v_proj.weight, model.language_model.layers.31.self_attn.o_proj.weight, model.language_model.layers.31.self_attn.q_norm.weight, model.language_model.layers.31.self_attn.k_norm.weight, model.language_model.layers.31.mlp.gate_proj.weight, model.language_model.layers.31.mlp.up_proj.weight, model.language_model.layers.31.mlp.down_proj.weight, model.language_model.layers.31.input_layernorm.weight, model.language_model.layers.31.post_attention_layernorm.weight, model.language_model.layers.31.pre_feedforward_layernorm.weight, model.language_model.layers.31.post_feedforward_layernorm.weight, model.language_model.layers.32.self_attn.q_proj.weight, model.language_model.layers.32.self_attn.k_proj.weight, model.language_model.layers.32.self_attn.v_proj.weight, model.language_model.layers.32.self_attn.o_proj.weight, model.language_model.layers.32.self_attn.q_norm.weight, model.language_model.layers.32.self_attn.k_norm.weight, model.language_model.layers.32.mlp.gate_proj.weight, model.language_model.layers.32.mlp.up_proj.weight, 
model.language_model.layers.32.mlp.down_proj.weight, model.language_model.layers.32.input_layernorm.weight, model.language_model.layers.32.post_attention_layernorm.weight, model.language_model.layers.32.pre_feedforward_layernorm.weight, model.language_model.layers.32.post_feedforward_layernorm.weight, model.language_model.layers.33.self_attn.q_proj.weight, model.language_model.layers.33.self_attn.k_proj.weight, model.language_model.layers.33.self_attn.v_proj.weight, model.language_model.layers.33.self_attn.o_proj.weight, model.language_model.layers.33.self_attn.q_norm.weight, model.language_model.layers.33.self_attn.k_norm.weight, model.language_model.layers.33.mlp.gate_proj.weight, model.language_model.layers.33.mlp.up_proj.weight, model.language_model.layers.33.mlp.down_proj.weight, model.language_model.layers.33.input_layernorm.weight, model.language_model.layers.33.post_attention_layernorm.weight, model.language_model.layers.33.pre_feedforward_layernorm.weight, model.language_model.layers.33.post_feedforward_layernorm.weight, model.language_model.norm.weight, lm_head.weight",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.vision_tower.embeddings.patch_embedding.weight, model.vision_tower.embeddings.patch_embedding.bias, model.vision_tower.embeddings.position_embedding.weight, model.vision_tower.encoder.layers.0.layer_norm1.weight, model.vision_tower.encoder.layers.0.layer_norm1.bias, model.vision_tower.encoder.layers.0.self_attn.k_proj.weight, model.vision_tower.encoder.layers.0.self_attn.k_proj.bias, model.vision_tower.encoder.layers.0.self_attn.v_proj.weight, model.vision_tower.encoder.layers.0.self_attn.v_proj.bias, model.vision_tower.encoder.layers.0.self_attn.q_proj.weight, model.vision_tower.encoder.layers.0.self_attn.q_proj.bias, model.vision_tower.encoder.layers.0.self_attn.out_proj.weight, model.vision_tower.encoder.layers.0.self_attn.out_proj.bias, model.vision_tower.encoder.layers.0.layer_norm2.weight, model.vision_tower.encoder.layers.0.layer_norm2.bias, model.vision_tower.encoder.layers.0.mlp.fc1.weight, model.vision_tower.encoder.layers.0.mlp.fc1.bias, model.vision_tower.encoder.layers.0.mlp.fc2.weight, model.vision_tower.encoder.layers.0.mlp.fc2.bias, model.vision_tower.encoder.layers.1.layer_norm1.weight, model.vision_tower.encoder.layers.1.layer_norm1.bias, model.vision_tower.encoder.layers.1.self_attn.k_proj.weight, model.vision_tower.encoder.layers.1.self_attn.k_proj.bias, model.vision_tower.encoder.layers.1.self_attn.v_proj.weight, model.vision_tower.encoder.layers.1.self_attn.v_proj.bias, model.vision_tower.encoder.layers.1.self_attn.q_proj.weight, model.vision_tower.encoder.layers.1.self_attn.q_proj.bias, model.vision_tower.encoder.layers.1.self_attn.out_proj.weight, model.vision_tower.encoder.layers.1.self_attn.out_proj.bias, model.vision_tower.encoder.layers.1.layer_norm2.weight, model.vision_tower.encoder.layers.1.layer_norm2.bias, model.vision_tower.encoder.layers.1.mlp.fc1.weight, 
model.vision_tower.encoder.layers.1.mlp.fc1.bias, model.vision_tower.encoder.layers.1.mlp.fc2.weight, model.vision_tower.encoder.layers.1.mlp.fc2.bias, model.vision_tower.encoder.layers.2.layer_norm1.weight, model.vision_tower.encoder.layers.2.layer_norm1.bias, model.vision_tower.encoder.layers.2.self_attn.k_proj.weight, model.vision_tower.encoder.layers.2.self_attn.k_proj.bias, model.vision_tower.encoder.layers.2.self_attn.v_proj.weight, model.vision_tower.encoder.layers.2.self_attn.v_proj.bias, model.vision_tower.encoder.layers.2.self_attn.q_proj.weight, model.vision_tower.encoder.layers.2.self_attn.q_proj.bias, model.vision_tower.encoder.layers.2.self_attn.out_proj.weight, model.vision_tower.encoder.layers.2.self_attn.out_proj.bias, model.vision_tower.encoder.layers.2.layer_norm2.weight, model.vision_tower.encoder.layers.2.layer_norm2.bias, model.vision_tower.encoder.layers.2.mlp.fc1.weight, model.vision_tower.encoder.layers.2.mlp.fc1.bias, model.vision_tower.encoder.layers.2.mlp.fc2.weight, model.vision_tower.encoder.layers.2.mlp.fc2.bias, model.vision_tower.encoder.layers.3.layer_norm1.weight, model.vision_tower.encoder.layers.3.layer_norm1.bias, model.vision_tower.encoder.layers.3.self_attn.k_proj.weight, model.vision_tower.encoder.layers.3.self_attn.k_proj.bias, model.vision_tower.encoder.layers.3.self_attn.v_proj.weight, model.vision_tower.encoder.layers.3.self_attn.v_proj.bias, model.vision_tower.encoder.layers.3.self_attn.q_proj.weight, model.vision_tower.encoder.layers.3.self_attn.q_proj.bias, model.vision_tower.encoder.layers.3.self_attn.out_proj.weight, model.vision_tower.encoder.layers.3.self_attn.out_proj.bias, model.vision_tower.encoder.layers.3.layer_norm2.weight, model.vision_tower.encoder.layers.3.layer_norm2.bias, model.vision_tower.encoder.layers.3.mlp.fc1.weight, model.vision_tower.encoder.layers.3.mlp.fc1.bias, model.vision_tower.encoder.layers.3.mlp.fc2.weight, model.vision_tower.encoder.layers.3.mlp.fc2.bias, 
model.vision_tower.encoder.layers.4.layer_norm1.weight, model.vision_tower.encoder.layers.4.layer_norm1.bias, model.vision_tower.encoder.layers.4.self_attn.k_proj.weight, model.vision_tower.encoder.layers.4.self_attn.k_proj.bias, model.vision_tower.encoder.layers.4.self_attn.v_proj.weight, model.vision_tower.encoder.layers.4.self_attn.v_proj.bias, model.vision_tower.encoder.layers.4.self_attn.q_proj.weight, model.vision_tower.encoder.layers.4.self_attn.q_proj.bias, model.vision_tower.encoder.layers.4.self_attn.out_proj.weight, model.vision_tower.encoder.layers.4.self_attn.out_proj.bias, model.vision_tower.encoder.layers.4.layer_norm2.weight, model.vision_tower.encoder.layers.4.layer_norm2.bias, model.vision_tower.encoder.layers.4.mlp.fc1.weight, model.vision_tower.encoder.layers.4.mlp.fc1.bias, model.vision_tower.encoder.layers.4.mlp.fc2.weight, model.vision_tower.encoder.layers.4.mlp.fc2.bias, model.vision_tower.encoder.layers.5.layer_norm1.weight, model.vision_tower.encoder.layers.5.layer_norm1.bias, model.vision_tower.encoder.layers.5.self_attn.k_proj.weight, model.vision_tower.encoder.layers.5.self_attn.k_proj.bias, model.vision_tower.encoder.layers.5.self_attn.v_proj.weight, model.vision_tower.encoder.layers.5.self_attn.v_proj.bias, model.vision_tower.encoder.layers.5.self_attn.q_proj.weight, model.vision_tower.encoder.layers.5.self_attn.q_proj.bias, model.vision_tower.encoder.layers.5.self_attn.out_proj.weight, model.vision_tower.encoder.layers.5.self_attn.out_proj.bias, model.vision_tower.encoder.layers.5.layer_norm2.weight, model.vision_tower.encoder.layers.5.layer_norm2.bias, model.vision_tower.encoder.layers.5.mlp.fc1.weight, model.vision_tower.encoder.layers.5.mlp.fc1.bias, model.vision_tower.encoder.layers.5.mlp.fc2.weight, model.vision_tower.encoder.layers.5.mlp.fc2.bias, model.vision_tower.encoder.layers.6.layer_norm1.weight, model.vision_tower.encoder.layers.6.layer_norm1.bias, model.vision_tower.encoder.layers.6.self_attn.k_proj.weight, 
model.vision_tower.encoder.layers.6.self_attn.k_proj.bias, model.vision_tower.encoder.layers.6.self_attn.v_proj.weight, model.vision_tower.encoder.layers.6.self_attn.v_proj.bias, model.vision_tower.encoder.layers.6.self_attn.q_proj.weight, model.vision_tower.encoder.layers.6.self_attn.q_proj.bias, model.vision_tower.encoder.layers.6.self_attn.out_proj.weight, model.vision_tower.encoder.layers.6.self_attn.out_proj.bias, model.vision_tower.encoder.layers.6.layer_norm2.weight, model.vision_tower.encoder.layers.6.layer_norm2.bias, model.vision_tower.encoder.layers.6.mlp.fc1.weight, model.vision_tower.encoder.layers.6.mlp.fc1.bias, model.vision_tower.encoder.layers.6.mlp.fc2.weight, model.vision_tower.encoder.layers.6.mlp.fc2.bias, model.vision_tower.encoder.layers.7.layer_norm1.weight, model.vision_tower.encoder.layers.7.layer_norm1.bias, model.vision_tower.encoder.layers.7.self_attn.k_proj.weight, model.vision_tower.encoder.layers.7.self_attn.k_proj.bias, model.vision_tower.encoder.layers.7.self_attn.v_proj.weight, model.vision_tower.encoder.layers.7.self_attn.v_proj.bias, model.vision_tower.encoder.layers.7.self_attn.q_proj.weight, model.vision_tower.encoder.layers.7.self_attn.q_proj.bias, model.vision_tower.encoder.layers.7.self_attn.out_proj.weight, model.vision_tower.encoder.layers.7.self_attn.out_proj.bias, model.vision_tower.encoder.layers.7.layer_norm2.weight, model.vision_tower.encoder.layers.7.layer_norm2.bias, model.vision_tower.encoder.layers.7.mlp.fc1.weight, model.vision_tower.encoder.layers.7.mlp.fc1.bias, model.vision_tower.encoder.layers.7.mlp.fc2.weight, model.vision_tower.encoder.layers.7.mlp.fc2.bias, model.vision_tower.encoder.layers.8.layer_norm1.weight, model.vision_tower.encoder.layers.8.layer_norm1.bias, model.vision_tower.encoder.layers.8.self_attn.k_proj.weight, model.vision_tower.encoder.layers.8.self_attn.k_proj.bias, model.vision_tower.encoder.layers.8.self_attn.v_proj.weight, model.vision_tower.encoder.layers.8.self_attn.v_proj.bias, 
model.vision_tower.encoder.layers.8.self_attn.q_proj.weight, model.vision_tower.encoder.layers.8.self_attn.q_proj.bias, model.vision_tower.encoder.layers.8.self_attn.out_proj.weight, model.vision_tower.encoder.layers.8.self_attn.out_proj.bias, model.vision_tower.encoder.layers.8.layer_norm2.weight, model.vision_tower.encoder.layers.8.layer_norm2.bias, model.vision_tower.encoder.layers.8.mlp.fc1.weight, model.vision_tower.encoder.layers.8.mlp.fc1.bias, model.vision_tower.encoder.layers.8.mlp.fc2.weight, model.vision_tower.encoder.layers.8.mlp.fc2.bias, model.vision_tower.encoder.layers.9.layer_norm1.weight, model.vision_tower.encoder.layers.9.layer_norm1.bias, model.vision_tower.encoder.layers.9.self_attn.k_proj.weight, model.vision_tower.encoder.layers.9.self_attn.k_proj.bias, model.vision_tower.encoder.layers.9.self_attn.v_proj.weight, model.vision_tower.encoder.layers.9.self_attn.v_proj.bias, model.vision_tower.encoder.layers.9.self_attn.q_proj.weight, model.vision_tower.encoder.layers.9.self_attn.q_proj.bias, model.vision_tower.encoder.layers.9.self_attn.out_proj.weight, model.vision_tower.encoder.layers.9.self_attn.out_proj.bias, model.vision_tower.encoder.layers.9.layer_norm2.weight, model.vision_tower.encoder.layers.9.layer_norm2.bias, model.vision_tower.encoder.layers.9.mlp.fc1.weight, model.vision_tower.encoder.layers.9.mlp.fc1.bias, model.vision_tower.encoder.layers.9.mlp.fc2.weight, model.vision_tower.encoder.layers.9.mlp.fc2.bias, model.vision_tower.encoder.layers.10.layer_norm1.weight, model.vision_tower.encoder.layers.10.layer_norm1.bias, model.vision_tower.encoder.layers.10.self_attn.k_proj.weight, model.vision_tower.encoder.layers.10.self_attn.k_proj.bias, model.vision_tower.encoder.layers.10.self_attn.v_proj.weight, model.vision_tower.encoder.layers.10.self_attn.v_proj.bias, model.vision_tower.encoder.layers.10.self_attn.q_proj.weight, model.vision_tower.encoder.layers.10.self_attn.q_proj.bias, 
model.vision_tower.encoder.layers.10.self_attn.out_proj.weight, model.vision_tower.encoder.layers.10.self_attn.out_proj.bias, model.vision_tower.encoder.layers.10.layer_norm2.weight, model.vision_tower.encoder.layers.10.layer_norm2.bias, model.vision_tower.encoder.layers.10.mlp.fc1.weight, model.vision_tower.encoder.layers.10.mlp.fc1.bias, model.vision_tower.encoder.layers.10.mlp.fc2.weight, model.vision_tower.encoder.layers.10.mlp.fc2.bias, model.vision_tower.encoder.layers.11.layer_norm1.weight, model.vision_tower.encoder.layers.11.layer_norm1.bias, model.vision_tower.encoder.layers.11.self_attn.k_proj.weight, model.vision_tower.encoder.layers.11.self_attn.k_proj.bias, model.vision_tower.encoder.layers.11.self_attn.v_proj.weight, model.vision_tower.encoder.layers.11.self_attn.v_proj.bias, model.vision_tower.encoder.layers.11.self_attn.q_proj.weight, model.vision_tower.encoder.layers.11.self_attn.q_proj.bias, model.vision_tower.encoder.layers.11.self_attn.out_proj.weight, model.vision_tower.encoder.layers.11.self_attn.out_proj.bias, model.vision_tower.encoder.layers.11.layer_norm2.weight, model.vision_tower.encoder.layers.11.layer_norm2.bias, model.vision_tower.encoder.layers.11.mlp.fc1.weight, model.vision_tower.encoder.layers.11.mlp.fc1.bias, model.vision_tower.encoder.layers.11.mlp.fc2.weight, model.vision_tower.encoder.layers.11.mlp.fc2.bias, model.vision_tower.encoder.layers.12.layer_norm1.weight, model.vision_tower.encoder.layers.12.layer_norm1.bias, model.vision_tower.encoder.layers.12.self_attn.k_proj.weight, model.vision_tower.encoder.layers.12.self_attn.k_proj.bias, model.vision_tower.encoder.layers.12.self_attn.v_proj.weight, model.vision_tower.encoder.layers.12.self_attn.v_proj.bias, model.vision_tower.encoder.layers.12.self_attn.q_proj.weight, model.vision_tower.encoder.layers.12.self_attn.q_proj.bias, model.vision_tower.encoder.layers.12.self_attn.out_proj.weight, model.vision_tower.encoder.layers.12.self_attn.out_proj.bias, 
model.vision_tower.encoder.layers.12.layer_norm2.weight, model.vision_tower.encoder.layers.12.layer_norm2.bias, model.vision_tower.encoder.layers.12.mlp.fc1.weight, model.vision_tower.encoder.layers.12.mlp.fc1.bias, model.vision_tower.encoder.layers.12.mlp.fc2.weight, model.vision_tower.encoder.layers.12.mlp.fc2.bias, model.vision_tower.encoder.layers.13.layer_norm1.weight, model.vision_tower.encoder.layers.13.layer_norm1.bias, model.vision_tower.encoder.layers.13.self_attn.k_proj.weight, model.vision_tower.encoder.layers.13.self_attn.k_proj.bias, model.vision_tower.encoder.layers.13.self_attn.v_proj.weight, model.vision_tower.encoder.layers.13.self_attn.v_proj.bias, model.vision_tower.encoder.layers.13.self_attn.q_proj.weight, model.vision_tower.encoder.layers.13.self_attn.q_proj.bias, model.vision_tower.encoder.layers.13.self_attn.out_proj.weight, model.vision_tower.encoder.layers.13.self_attn.out_proj.bias, model.vision_tower.encoder.layers.13.layer_norm2.weight, model.vision_tower.encoder.layers.13.layer_norm2.bias, model.vision_tower.encoder.layers.13.mlp.fc1.weight, model.vision_tower.encoder.layers.13.mlp.fc1.bias, model.vision_tower.encoder.layers.13.mlp.fc2.weight, model.vision_tower.encoder.layers.13.mlp.fc2.bias, model.vision_tower.encoder.layers.14.layer_norm1.weight, model.vision_tower.encoder.layers.14.layer_norm1.bias, model.vision_tower.encoder.layers.14.self_attn.k_proj.weight, model.vision_tower.encoder.layers.14.self_attn.k_proj.bias, model.vision_tower.encoder.layers.14.self_attn.v_proj.weight, model.vision_tower.encoder.layers.14.self_attn.v_proj.bias, model.vision_tower.encoder.layers.14.self_attn.q_proj.weight, model.vision_tower.encoder.layers.14.self_attn.q_proj.bias, model.vision_tower.encoder.layers.14.self_attn.out_proj.weight, model.vision_tower.encoder.layers.14.self_attn.out_proj.bias, model.vision_tower.encoder.layers.14.layer_norm2.weight, model.vision_tower.encoder.layers.14.layer_norm2.bias, 
model.vision_tower.encoder.layers.14.mlp.fc1.weight, model.vision_tower.encoder.layers.14.mlp.fc1.bias, model.vision_tower.encoder.layers.14.mlp.fc2.weight, model.vision_tower.encoder.layers.14.mlp.fc2.bias, model.vision_tower.encoder.layers.15.layer_norm1.weight, model.vision_tower.encoder.layers.15.layer_norm1.bias, model.vision_tower.encoder.layers.15.self_attn.k_proj.weight, model.vision_tower.encoder.layers.15.self_attn.k_proj.bias, model.vision_tower.encoder.layers.15.self_attn.v_proj.weight, model.vision_tower.encoder.layers.15.self_attn.v_proj.bias, model.vision_tower.encoder.layers.15.self_attn.q_proj.weight, model.vision_tower.encoder.layers.15.self_attn.q_proj.bias, model.vision_tower.encoder.layers.15.self_attn.out_proj.weight, model.vision_tower.encoder.layers.15.self_attn.out_proj.bias, model.vision_tower.encoder.layers.15.layer_norm2.weight, model.vision_tower.encoder.layers.15.layer_norm2.bias, model.vision_tower.encoder.layers.15.mlp.fc1.weight, model.vision_tower.encoder.layers.15.mlp.fc1.bias, model.vision_tower.encoder.layers.15.mlp.fc2.weight, model.vision_tower.encoder.layers.15.mlp.fc2.bias, model.vision_tower.encoder.layers.16.layer_norm1.weight, model.vision_tower.encoder.layers.16.layer_norm1.bias, model.vision_tower.encoder.layers.16.self_attn.k_proj.weight, model.vision_tower.encoder.layers.16.self_attn.k_proj.bias, model.vision_tower.encoder.layers.16.self_attn.v_proj.weight, model.vision_tower.encoder.layers.16.self_attn.v_proj.bias, model.vision_tower.encoder.layers.16.self_attn.q_proj.weight, model.vision_tower.encoder.layers.16.self_attn.q_proj.bias, model.vision_tower.encoder.layers.16.self_attn.out_proj.weight, model.vision_tower.encoder.layers.16.self_attn.out_proj.bias, model.vision_tower.encoder.layers.16.layer_norm2.weight, model.vision_tower.encoder.layers.16.layer_norm2.bias, model.vision_tower.encoder.layers.16.mlp.fc1.weight, model.vision_tower.encoder.layers.16.mlp.fc1.bias, 
model.vision_tower.encoder.layers.16.mlp.fc2.weight, model.vision_tower.encoder.layers.16.mlp.fc2.bias, model.vision_tower.encoder.layers.17.layer_norm1.weight, model.vision_tower.encoder.layers.17.layer_norm1.bias, model.vision_tower.encoder.layers.17.self_attn.k_proj.weight, model.vision_tower.encoder.layers.17.self_attn.k_proj.bias, model.vision_tower.encoder.layers.17.self_attn.v_proj.weight, model.vision_tower.encoder.layers.17.self_attn.v_proj.bias, model.vision_tower.encoder.layers.17.self_attn.q_proj.weight, model.vision_tower.encoder.layers.17.self_attn.q_proj.bias, model.vision_tower.encoder.layers.17.self_attn.out_proj.weight, model.vision_tower.encoder.layers.17.self_attn.out_proj.bias, model.vision_tower.encoder.layers.17.layer_norm2.weight, model.vision_tower.encoder.layers.17.layer_norm2.bias, model.vision_tower.encoder.layers.17.mlp.fc1.weight, model.vision_tower.encoder.layers.17.mlp.fc1.bias, model.vision_tower.encoder.layers.17.mlp.fc2.weight, model.vision_tower.encoder.layers.17.mlp.fc2.bias, model.vision_tower.encoder.layers.18.layer_norm1.weight, model.vision_tower.encoder.layers.18.layer_norm1.bias, model.vision_tower.encoder.layers.18.self_attn.k_proj.weight, model.vision_tower.encoder.layers.18.self_attn.k_proj.bias, model.vision_tower.encoder.layers.18.self_attn.v_proj.weight, model.vision_tower.encoder.layers.18.self_attn.v_proj.bias, model.vision_tower.encoder.layers.18.self_attn.q_proj.weight, model.vision_tower.encoder.layers.18.self_attn.q_proj.bias, model.vision_tower.encoder.layers.18.self_attn.out_proj.weight, model.vision_tower.encoder.layers.18.self_attn.out_proj.bias, model.vision_tower.encoder.layers.18.layer_norm2.weight, model.vision_tower.encoder.layers.18.layer_norm2.bias, model.vision_tower.encoder.layers.18.mlp.fc1.weight, model.vision_tower.encoder.layers.18.mlp.fc1.bias, model.vision_tower.encoder.layers.18.mlp.fc2.weight, model.vision_tower.encoder.layers.18.mlp.fc2.bias, 
model.vision_tower.encoder.layers.19.layer_norm1.weight, model.vision_tower.encoder.layers.19.layer_norm1.bias, model.vision_tower.encoder.layers.19.self_attn.k_proj.weight, model.vision_tower.encoder.layers.19.self_attn.k_proj.bias, model.vision_tower.encoder.layers.19.self_attn.v_proj.weight, model.vision_tower.encoder.layers.19.self_attn.v_proj.bias, model.vision_tower.encoder.layers.19.self_attn.q_proj.weight, model.vision_tower.encoder.layers.19.self_attn.q_proj.bias, model.vision_tower.encoder.layers.19.self_attn.out_proj.weight, model.vision_tower.encoder.layers.19.self_attn.out_proj.bias, model.vision_tower.encoder.layers.19.layer_norm2.weight, model.vision_tower.encoder.layers.19.layer_norm2.bias, model.vision_tower.encoder.layers.19.mlp.fc1.weight, model.vision_tower.encoder.layers.19.mlp.fc1.bias, model.vision_tower.encoder.layers.19.mlp.fc2.weight, model.vision_tower.encoder.layers.19.mlp.fc2.bias, model.vision_tower.encoder.layers.20.layer_norm1.weight, model.vision_tower.encoder.layers.20.layer_norm1.bias, model.vision_tower.encoder.layers.20.self_attn.k_proj.weight, model.vision_tower.encoder.layers.20.self_attn.k_proj.bias, model.vision_tower.encoder.layers.20.self_attn.v_proj.weight, model.vision_tower.encoder.layers.20.self_attn.v_proj.bias, model.vision_tower.encoder.layers.20.self_attn.q_proj.weight, model.vision_tower.encoder.layers.20.self_attn.q_proj.bias, model.vision_tower.encoder.layers.20.self_attn.out_proj.weight, model.vision_tower.encoder.layers.20.self_attn.out_proj.bias, model.vision_tower.encoder.layers.20.layer_norm2.weight, model.vision_tower.encoder.layers.20.layer_norm2.bias, model.vision_tower.encoder.layers.20.mlp.fc1.weight, model.vision_tower.encoder.layers.20.mlp.fc1.bias, model.vision_tower.encoder.layers.20.mlp.fc2.weight, model.vision_tower.encoder.layers.20.mlp.fc2.bias, model.vision_tower.encoder.layers.21.layer_norm1.weight, model.vision_tower.encoder.layers.21.layer_norm1.bias, 
model.vision_tower.encoder.layers.21.self_attn.k_proj.weight, model.vision_tower.encoder.layers.21.self_attn.k_proj.bias, model.vision_tower.encoder.layers.21.self_attn.v_proj.weight, model.vision_tower.encoder.layers.21.self_attn.v_proj.bias, model.vision_tower.encoder.layers.21.self_attn.q_proj.weight, model.vision_tower.encoder.layers.21.self_attn.q_proj.bias, model.vision_tower.encoder.layers.21.self_attn.out_proj.weight, model.vision_tower.encoder.layers.21.self_attn.out_proj.bias, model.vision_tower.encoder.layers.21.layer_norm2.weight, model.vision_tower.encoder.layers.21.layer_norm2.bias, model.vision_tower.encoder.layers.21.mlp.fc1.weight, model.vision_tower.encoder.layers.21.mlp.fc1.bias, model.vision_tower.encoder.layers.21.mlp.fc2.weight, model.vision_tower.encoder.layers.21.mlp.fc2.bias, model.vision_tower.encoder.layers.22.layer_norm1.weight, model.vision_tower.encoder.layers.22.layer_norm1.bias, model.vision_tower.encoder.layers.22.self_attn.k_proj.weight, model.vision_tower.encoder.layers.22.self_attn.k_proj.bias, model.vision_tower.encoder.layers.22.self_attn.v_proj.weight, model.vision_tower.encoder.layers.22.self_attn.v_proj.bias, model.vision_tower.encoder.layers.22.self_attn.q_proj.weight, model.vision_tower.encoder.layers.22.self_attn.q_proj.bias, model.vision_tower.encoder.layers.22.self_attn.out_proj.weight, model.vision_tower.encoder.layers.22.self_attn.out_proj.bias, model.vision_tower.encoder.layers.22.layer_norm2.weight, model.vision_tower.encoder.layers.22.layer_norm2.bias, model.vision_tower.encoder.layers.22.mlp.fc1.weight, model.vision_tower.encoder.layers.22.mlp.fc1.bias, model.vision_tower.encoder.layers.22.mlp.fc2.weight, model.vision_tower.encoder.layers.22.mlp.fc2.bias, model.vision_tower.encoder.layers.23.layer_norm1.weight, model.vision_tower.encoder.layers.23.layer_norm1.bias, model.vision_tower.encoder.layers.23.self_attn.k_proj.weight, model.vision_tower.encoder.layers.23.self_attn.k_proj.bias, 
model.vision_tower.encoder.layers.23.self_attn.v_proj.weight, model.vision_tower.encoder.layers.23.self_attn.v_proj.bias, model.vision_tower.encoder.layers.23.self_attn.q_proj.weight, model.vision_tower.encoder.layers.23.self_attn.q_proj.bias, model.vision_tower.encoder.layers.23.self_attn.out_proj.weight, model.vision_tower.encoder.layers.23.self_attn.out_proj.bias, model.vision_tower.encoder.layers.23.layer_norm2.weight, model.vision_tower.encoder.layers.23.layer_norm2.bias, model.vision_tower.encoder.layers.23.mlp.fc1.weight, model.vision_tower.encoder.layers.23.mlp.fc1.bias, model.vision_tower.encoder.layers.23.mlp.fc2.weight, model.vision_tower.encoder.layers.23.mlp.fc2.bias, model.vision_tower.encoder.layers.24.layer_norm1.weight, model.vision_tower.encoder.layers.24.layer_norm1.bias, model.vision_tower.encoder.layers.24.self_attn.k_proj.weight, model.vision_tower.encoder.layers.24.self_attn.k_proj.bias, model.vision_tower.encoder.layers.24.self_attn.v_proj.weight, model.vision_tower.encoder.layers.24.self_attn.v_proj.bias, model.vision_tower.encoder.layers.24.self_attn.q_proj.weight, model.vision_tower.encoder.layers.24.self_attn.q_proj.bias, model.vision_tower.encoder.layers.24.self_attn.out_proj.weight, model.vision_tower.encoder.layers.24.self_attn.out_proj.bias, model.vision_tower.encoder.layers.24.layer_norm2.weight, model.vision_tower.encoder.layers.24.layer_norm2.bias, model.vision_tower.encoder.layers.24.mlp.fc1.weight, model.vision_tower.encoder.layers.24.mlp.fc1.bias, model.vision_tower.encoder.layers.24.mlp.fc2.weight, model.vision_tower.encoder.layers.24.mlp.fc2.bias, model.vision_tower.encoder.layers.25.layer_norm1.weight, model.vision_tower.encoder.layers.25.layer_norm1.bias, model.vision_tower.encoder.layers.25.self_attn.k_proj.weight, model.vision_tower.encoder.layers.25.self_attn.k_proj.bias, model.vision_tower.encoder.layers.25.self_attn.v_proj.weight, model.vision_tower.encoder.layers.25.self_attn.v_proj.bias, 
model.vision_tower.encoder.layers.25.self_attn.q_proj.weight, model.vision_tower.encoder.layers.25.self_attn.q_proj.bias, model.vision_tower.encoder.layers.25.self_attn.out_proj.weight, model.vision_tower.encoder.layers.25.self_attn.out_proj.bias, model.vision_tower.encoder.layers.25.layer_norm2.weight, model.vision_tower.encoder.layers.25.layer_norm2.bias, model.vision_tower.encoder.layers.25.mlp.fc1.weight, model.vision_tower.encoder.layers.25.mlp.fc1.bias, model.vision_tower.encoder.layers.25.mlp.fc2.weight, model.vision_tower.encoder.layers.25.mlp.fc2.bias, model.vision_tower.encoder.layers.26.layer_norm1.weight, model.vision_tower.encoder.layers.26.layer_norm1.bias, model.vision_tower.encoder.layers.26.self_attn.k_proj.weight, model.vision_tower.encoder.layers.26.self_attn.k_proj.bias, model.vision_tower.encoder.layers.26.self_attn.v_proj.weight, model.vision_tower.encoder.layers.26.self_attn.v_proj.bias, model.vision_tower.encoder.layers.26.self_attn.q_proj.weight, model.vision_tower.encoder.layers.26.self_attn.q_proj.bias, model.vision_tower.encoder.layers.26.self_attn.out_proj.weight, model.vision_tower.encoder.layers.26.self_attn.out_proj.bias, model.vision_tower.encoder.layers.26.layer_norm2.weight, model.vision_tower.encoder.layers.26.layer_norm2.bias, model.vision_tower.encoder.layers.26.mlp.fc1.weight, model.vision_tower.encoder.layers.26.mlp.fc1.bias, model.vision_tower.encoder.layers.26.mlp.fc2.weight, model.vision_tower.encoder.layers.26.mlp.fc2.bias, model.vision_tower.post_layernorm.weight, model.vision_tower.post_layernorm.bias, model.multi_modal_projector.mm_input_projection_weight, model.multi_modal_projector.mm_soft_emb_norm.weight, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.self_attn.q_norm.weight, 
model.language_model.layers.0.self_attn.k_norm.weight, model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.0.pre_feedforward_layernorm.weight, model.language_model.layers.0.post_feedforward_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.self_attn.q_norm.weight, model.language_model.layers.1.self_attn.k_norm.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.layers.1.pre_feedforward_layernorm.weight, model.language_model.layers.1.post_feedforward_layernorm.weight, model.language_model.layers.2.self_attn.q_proj.weight, model.language_model.layers.2.self_attn.k_proj.weight, model.language_model.layers.2.self_attn.v_proj.weight, model.language_model.layers.2.self_attn.o_proj.weight, model.language_model.layers.2.self_attn.q_norm.weight, model.language_model.layers.2.self_attn.k_norm.weight, model.language_model.layers.2.mlp.gate_proj.weight, model.language_model.layers.2.mlp.up_proj.weight, model.language_model.layers.2.mlp.down_proj.weight, model.language_model.layers.2.input_layernorm.weight, model.language_model.layers.2.post_attention_layernorm.weight, model.language_model.layers.2.pre_feedforward_layernorm.weight, model.language_model.layers.2.post_feedforward_layernorm.weight, model.language_model.layers.3.self_attn.q_proj.weight, 
model.language_model.layers.3.self_attn.k_proj.weight, model.language_model.layers.3.self_attn.v_proj.weight, model.language_model.layers.3.self_attn.o_proj.weight, model.language_model.layers.3.self_attn.q_norm.weight, model.language_model.layers.3.self_attn.k_norm.weight, model.language_model.layers.3.mlp.gate_proj.weight, model.language_model.layers.3.mlp.up_proj.weight, model.language_model.layers.3.mlp.down_proj.weight, model.language_model.layers.3.input_layernorm.weight, model.language_model.layers.3.post_attention_layernorm.weight, model.language_model.layers.3.pre_feedforward_layernorm.weight, model.language_model.layers.3.post_feedforward_layernorm.weight, model.language_model.layers.4.self_attn.q_proj.weight, model.language_model.layers.4.self_attn.k_proj.weight, model.language_model.layers.4.self_attn.v_proj.weight, model.language_model.layers.4.self_attn.o_proj.weight, model.language_model.layers.4.self_attn.q_norm.weight, model.language_model.layers.4.self_attn.k_norm.weight, model.language_model.layers.4.mlp.gate_proj.weight, model.language_model.layers.4.mlp.up_proj.weight, model.language_model.layers.4.mlp.down_proj.weight, model.language_model.layers.4.input_layernorm.weight, model.language_model.layers.4.post_attention_layernorm.weight, model.language_model.layers.4.pre_feedforward_layernorm.weight, model.language_model.layers.4.post_feedforward_layernorm.weight, model.language_model.layers.5.self_attn.q_proj.weight, model.language_model.layers.5.self_attn.k_proj.weight, model.language_model.layers.5.self_attn.v_proj.weight, model.language_model.layers.5.self_attn.o_proj.weight, model.language_model.layers.5.self_attn.q_norm.weight, model.language_model.layers.5.self_attn.k_norm.weight, model.language_model.layers.5.mlp.gate_proj.weight, model.language_model.layers.5.mlp.up_proj.weight, model.language_model.layers.5.mlp.down_proj.weight, model.language_model.layers.5.input_layernorm.weight, 
model.language_model.layers.5.post_attention_layernorm.weight, model.language_model.layers.5.pre_feedforward_layernorm.weight, model.language_model.layers.5.post_feedforward_layernorm.weight, model.language_model.layers.6.self_attn.q_proj.weight, model.language_model.layers.6.self_attn.k_proj.weight, model.language_model.layers.6.self_attn.v_proj.weight, model.language_model.layers.6.self_attn.o_proj.weight, model.language_model.layers.6.self_attn.q_norm.weight, model.language_model.layers.6.self_attn.k_norm.weight, model.language_model.layers.6.mlp.gate_proj.weight, model.language_model.layers.6.mlp.up_proj.weight, model.language_model.layers.6.mlp.down_proj.weight, model.language_model.layers.6.input_layernorm.weight, model.language_model.layers.6.post_attention_layernorm.weight, model.language_model.layers.6.pre_feedforward_layernorm.weight, model.language_model.layers.6.post_feedforward_layernorm.weight, model.language_model.layers.7.self_attn.q_proj.weight, model.language_model.layers.7.self_attn.k_proj.weight, model.language_model.layers.7.self_attn.v_proj.weight, model.language_model.layers.7.self_attn.o_proj.weight, model.language_model.layers.7.self_attn.q_norm.weight, model.language_model.layers.7.self_attn.k_norm.weight, model.language_model.layers.7.mlp.gate_proj.weight, model.language_model.layers.7.mlp.up_proj.weight, model.language_model.layers.7.mlp.down_proj.weight, model.language_model.layers.7.input_layernorm.weight, model.language_model.layers.7.post_attention_layernorm.weight, model.language_model.layers.7.pre_feedforward_layernorm.weight, model.language_model.layers.7.post_feedforward_layernorm.weight, model.language_model.layers.8.self_attn.q_proj.weight, model.language_model.layers.8.self_attn.k_proj.weight, model.language_model.layers.8.self_attn.v_proj.weight, model.language_model.layers.8.self_attn.o_proj.weight, model.language_model.layers.8.self_attn.q_norm.weight, model.language_model.layers.8.self_attn.k_norm.weight, 
model.language_model.layers.8.mlp.gate_proj.weight, model.language_model.layers.8.mlp.up_proj.weight, model.language_model.layers.8.mlp.down_proj.weight, model.language_model.layers.8.input_layernorm.weight, model.language_model.layers.8.post_attention_layernorm.weight, model.language_model.layers.8.pre_feedforward_layernorm.weight, model.language_model.layers.8.post_feedforward_layernorm.weight, model.language_model.layers.9.self_attn.q_proj.weight, model.language_model.layers.9.self_attn.k_proj.weight, model.language_model.layers.9.self_attn.v_proj.weight, model.language_model.layers.9.self_attn.o_proj.weight, model.language_model.layers.9.self_attn.q_norm.weight, model.language_model.layers.9.self_attn.k_norm.weight, model.language_model.layers.9.mlp.gate_proj.weight, model.language_model.layers.9.mlp.up_proj.weight, model.language_model.layers.9.mlp.down_proj.weight, model.language_model.layers.9.input_layernorm.weight, model.language_model.layers.9.post_attention_layernorm.weight, model.language_model.layers.9.pre_feedforward_layernorm.weight, model.language_model.layers.9.post_feedforward_layernorm.weight, model.language_model.layers.10.self_attn.q_proj.weight, model.language_model.layers.10.self_attn.k_proj.weight, model.language_model.layers.10.self_attn.v_proj.weight, model.language_model.layers.10.self_attn.o_proj.weight, model.language_model.layers.10.self_attn.q_norm.weight, model.language_model.layers.10.self_attn.k_norm.weight, model.language_model.layers.10.mlp.gate_proj.weight, model.language_model.layers.10.mlp.up_proj.weight, model.language_model.layers.10.mlp.down_proj.weight, model.language_model.layers.10.input_layernorm.weight, model.language_model.layers.10.post_attention_layernorm.weight, model.language_model.layers.10.pre_feedforward_layernorm.weight, model.language_model.layers.10.post_feedforward_layernorm.weight, model.language_model.layers.11.self_attn.q_proj.weight, model.language_model.layers.11.self_attn.k_proj.weight, 
model.language_model.layers.11.self_attn.v_proj.weight, model.language_model.layers.11.self_attn.o_proj.weight, model.language_model.layers.11.self_attn.q_norm.weight, model.language_model.layers.11.self_attn.k_norm.weight, model.language_model.layers.11.mlp.gate_proj.weight, model.language_model.layers.11.mlp.up_proj.weight, model.language_model.layers.11.mlp.down_proj.weight, model.language_model.layers.11.input_layernorm.weight, model.language_model.layers.11.post_attention_layernorm.weight, model.language_model.layers.11.pre_feedforward_layernorm.weight, model.language_model.layers.11.post_feedforward_layernorm.weight, model.language_model.layers.12.self_attn.q_proj.weight, model.language_model.layers.12.self_attn.k_proj.weight, model.language_model.layers.12.self_attn.v_proj.weight, model.language_model.layers.12.self_attn.o_proj.weight, model.language_model.layers.12.self_attn.q_norm.weight, model.language_model.layers.12.self_attn.k_norm.weight, model.language_model.layers.12.mlp.gate_proj.weight, model.language_model.layers.12.mlp.up_proj.weight, model.language_model.layers.12.mlp.down_proj.weight, model.language_model.layers.12.input_layernorm.weight, model.language_model.layers.12.post_attention_layernorm.weight, model.language_model.layers.12.pre_feedforward_layernorm.weight, model.language_model.layers.12.post_feedforward_layernorm.weight, model.language_model.layers.13.self_attn.q_proj.weight, model.language_model.layers.13.self_attn.k_proj.weight, model.language_model.layers.13.self_attn.v_proj.weight, model.language_model.layers.13.self_attn.o_proj.weight, model.language_model.layers.13.self_attn.q_norm.weight, model.language_model.layers.13.self_attn.k_norm.weight, model.language_model.layers.13.mlp.gate_proj.weight, model.language_model.layers.13.mlp.up_proj.weight, model.language_model.layers.13.mlp.down_proj.weight, model.language_model.layers.13.input_layernorm.weight, model.language_model.layers.13.post_attention_layernorm.weight, 
model.language_model.layers.13.pre_feedforward_layernorm.weight, model.language_model.layers.13.post_feedforward_layernorm.weight, model.language_model.layers.14.self_attn.q_proj.weight, model.language_model.layers.14.self_attn.k_proj.weight, model.language_model.layers.14.self_attn.v_proj.weight, model.language_model.layers.14.self_attn.o_proj.weight, model.language_model.layers.14.self_attn.q_norm.weight, model.language_model.layers.14.self_attn.k_norm.weight, model.language_model.layers.14.mlp.gate_proj.weight, model.language_model.layers.14.mlp.up_proj.weight, model.language_model.layers.14.mlp.down_proj.weight, model.language_model.layers.14.input_layernorm.weight, model.language_model.layers.14.post_attention_layernorm.weight, model.language_model.layers.14.pre_feedforward_layernorm.weight, model.language_model.layers.14.post_feedforward_layernorm.weight, model.language_model.layers.15.self_attn.q_proj.weight, model.language_model.layers.15.self_attn.k_proj.weight, model.language_model.layers.15.self_attn.v_proj.weight, model.language_model.layers.15.self_attn.o_proj.weight, model.language_model.layers.15.self_attn.q_norm.weight, model.language_model.layers.15.self_attn.k_norm.weight, model.language_model.layers.15.mlp.gate_proj.weight, model.language_model.layers.15.mlp.up_proj.weight, model.language_model.layers.15.mlp.down_proj.weight, model.language_model.layers.15.input_layernorm.weight, model.language_model.layers.15.post_attention_layernorm.weight, model.language_model.layers.15.pre_feedforward_layernorm.weight, model.language_model.layers.15.post_feedforward_layernorm.weight, model.language_model.layers.16.self_attn.q_proj.weight, model.language_model.layers.16.self_attn.k_proj.weight, model.language_model.layers.16.self_attn.v_proj.weight, model.language_model.layers.16.self_attn.o_proj.weight, model.language_model.layers.16.self_attn.q_norm.weight, model.language_model.layers.16.self_attn.k_norm.weight, 
model.language_model.layers.16.mlp.gate_proj.weight, model.language_model.layers.16.mlp.up_proj.weight, model.language_model.layers.16.mlp.down_proj.weight, model.language_model.layers.16.input_layernorm.weight, model.language_model.layers.16.post_attention_layernorm.weight, model.language_model.layers.16.pre_feedforward_layernorm.weight, model.language_model.layers.16.post_feedforward_layernorm.weight, model.language_model.layers.17.self_attn.q_proj.weight, model.language_model.layers.17.self_attn.k_proj.weight, model.language_model.layers.17.self_attn.v_proj.weight, model.language_model.layers.17.self_attn.o_proj.weight, model.language_model.layers.17.self_attn.q_norm.weight, model.language_model.layers.17.self_attn.k_norm.weight, model.language_model.layers.17.mlp.gate_proj.weight, model.language_model.layers.17.mlp.up_proj.weight, model.language_model.layers.17.mlp.down_proj.weight, model.language_model.layers.17.input_layernorm.weight, model.language_model.layers.17.post_attention_layernorm.weight, model.language_model.layers.17.pre_feedforward_layernorm.weight, model.language_model.layers.17.post_feedforward_layernorm.weight, model.language_model.layers.18.self_attn.q_proj.weight, model.language_model.layers.18.self_attn.k_proj.weight, model.language_model.layers.18.self_attn.v_proj.weight, model.language_model.layers.18.self_attn.o_proj.weight, model.language_model.layers.18.self_attn.q_norm.weight, model.language_model.layers.18.self_attn.k_norm.weight, model.language_model.layers.18.mlp.gate_proj.weight, model.language_model.layers.18.mlp.up_proj.weight, model.language_model.layers.18.mlp.down_proj.weight, model.language_model.layers.18.input_layernorm.weight, model.language_model.layers.18.post_attention_layernorm.weight, model.language_model.layers.18.pre_feedforward_layernorm.weight, model.language_model.layers.18.post_feedforward_layernorm.weight, model.language_model.layers.19.self_attn.q_proj.weight, 
model.language_model.layers.19.self_attn.k_proj.weight, model.language_model.layers.19.self_attn.v_proj.weight, model.language_model.layers.19.self_attn.o_proj.weight, model.language_model.layers.19.self_attn.q_norm.weight, model.language_model.layers.19.self_attn.k_norm.weight, model.language_model.layers.19.mlp.gate_proj.weight, model.language_model.layers.19.mlp.up_proj.weight, model.language_model.layers.19.mlp.down_proj.weight, model.language_model.layers.19.input_layernorm.weight, model.language_model.layers.19.post_attention_layernorm.weight, model.language_model.layers.19.pre_feedforward_layernorm.weight, model.language_model.layers.19.post_feedforward_layernorm.weight, model.language_model.layers.20.self_attn.q_proj.weight, model.language_model.layers.20.self_attn.k_proj.weight, model.language_model.layers.20.self_attn.v_proj.weight, model.language_model.layers.20.self_attn.o_proj.weight, model.language_model.layers.20.self_attn.q_norm.weight, model.language_model.layers.20.self_attn.k_norm.weight, model.language_model.layers.20.mlp.gate_proj.weight, model.language_model.layers.20.mlp.up_proj.weight, model.language_model.layers.20.mlp.down_proj.weight, model.language_model.layers.20.input_layernorm.weight, model.language_model.layers.20.post_attention_layernorm.weight, model.language_model.layers.20.pre_feedforward_layernorm.weight, model.language_model.layers.20.post_feedforward_layernorm.weight, model.language_model.layers.21.self_attn.q_proj.weight, model.language_model.layers.21.self_attn.k_proj.weight, model.language_model.layers.21.self_attn.v_proj.weight, model.language_model.layers.21.self_attn.o_proj.weight, model.language_model.layers.21.self_attn.q_norm.weight, model.language_model.layers.21.self_attn.k_norm.weight, model.language_model.layers.21.mlp.gate_proj.weight, model.language_model.layers.21.mlp.up_proj.weight, model.language_model.layers.21.mlp.down_proj.weight, model.language_model.layers.21.input_layernorm.weight, 
model.language_model.layers.21.post_attention_layernorm.weight, model.language_model.layers.21.pre_feedforward_layernorm.weight, model.language_model.layers.21.post_feedforward_layernorm.weight, model.language_model.layers.22.self_attn.q_proj.weight, model.language_model.layers.22.self_attn.k_proj.weight, model.language_model.layers.22.self_attn.v_proj.weight, model.language_model.layers.22.self_attn.o_proj.weight, model.language_model.layers.22.self_attn.q_norm.weight, model.language_model.layers.22.self_attn.k_norm.weight, model.language_model.layers.22.mlp.gate_proj.weight, model.language_model.layers.22.mlp.up_proj.weight, model.language_model.layers.22.mlp.down_proj.weight, model.language_model.layers.22.input_layernorm.weight, model.language_model.layers.22.post_attention_layernorm.weight, model.language_model.layers.22.pre_feedforward_layernorm.weight, model.language_model.layers.22.post_feedforward_layernorm.weight, model.language_model.layers.23.self_attn.q_proj.weight, model.language_model.layers.23.self_attn.k_proj.weight, model.language_model.layers.23.self_attn.v_proj.weight, model.language_model.layers.23.self_attn.o_proj.weight, model.language_model.layers.23.self_attn.q_norm.weight, model.language_model.layers.23.self_attn.k_norm.weight, model.language_model.layers.23.mlp.gate_proj.weight, model.language_model.layers.23.mlp.up_proj.weight, model.language_model.layers.23.mlp.down_proj.weight, model.language_model.layers.23.input_layernorm.weight, model.language_model.layers.23.post_attention_layernorm.weight, model.language_model.layers.23.pre_feedforward_layernorm.weight, model.language_model.layers.23.post_feedforward_layernorm.weight, model.language_model.layers.24.self_attn.q_proj.weight, model.language_model.layers.24.self_attn.k_proj.weight, model.language_model.layers.24.self_attn.v_proj.weight, model.language_model.layers.24.self_attn.o_proj.weight, model.language_model.layers.24.self_attn.q_norm.weight, 
model.language_model.layers.24.self_attn.k_norm.weight, model.language_model.layers.24.mlp.gate_proj.weight, model.language_model.layers.24.mlp.up_proj.weight, model.language_model.layers.24.mlp.down_proj.weight, model.language_model.layers.24.input_layernorm.weight, model.language_model.layers.24.post_attention_layernorm.weight, model.language_model.layers.24.pre_feedforward_layernorm.weight, model.language_model.layers.24.post_feedforward_layernorm.weight, model.language_model.layers.25.self_attn.q_proj.weight, model.language_model.layers.25.self_attn.k_proj.weight, model.language_model.layers.25.self_attn.v_proj.weight, model.language_model.layers.25.self_attn.o_proj.weight, model.language_model.layers.25.self_attn.q_norm.weight, model.language_model.layers.25.self_attn.k_norm.weight, model.language_model.layers.25.mlp.gate_proj.weight, model.language_model.layers.25.mlp.up_proj.weight, model.language_model.layers.25.mlp.down_proj.weight, model.language_model.layers.25.input_layernorm.weight, model.language_model.layers.25.post_attention_layernorm.weight, model.language_model.layers.25.pre_feedforward_layernorm.weight, model.language_model.layers.25.post_feedforward_layernorm.weight, model.language_model.layers.26.self_attn.q_proj.weight, model.language_model.layers.26.self_attn.k_proj.weight, model.language_model.layers.26.self_attn.v_proj.weight, model.language_model.layers.26.self_attn.o_proj.weight, model.language_model.layers.26.self_attn.q_norm.weight, model.language_model.layers.26.self_attn.k_norm.weight, model.language_model.layers.26.mlp.gate_proj.weight, model.language_model.layers.26.mlp.up_proj.weight, model.language_model.layers.26.mlp.down_proj.weight, model.language_model.layers.26.input_layernorm.weight, model.language_model.layers.26.post_attention_layernorm.weight, model.language_model.layers.26.pre_feedforward_layernorm.weight, model.language_model.layers.26.post_feedforward_layernorm.weight, 
model.language_model.layers.27.self_attn.q_proj.weight, model.language_model.layers.27.self_attn.k_proj.weight, model.language_model.layers.27.self_attn.v_proj.weight, model.language_model.layers.27.self_attn.o_proj.weight, model.language_model.layers.27.self_attn.q_norm.weight, model.language_model.layers.27.self_attn.k_norm.weight, model.language_model.layers.27.mlp.gate_proj.weight, model.language_model.layers.27.mlp.up_proj.weight, model.language_model.layers.27.mlp.down_proj.weight, model.language_model.layers.27.input_layernorm.weight, model.language_model.layers.27.post_attention_layernorm.weight, model.language_model.layers.27.pre_feedforward_layernorm.weight, model.language_model.layers.27.post_feedforward_layernorm.weight, model.language_model.layers.28.self_attn.q_proj.weight, model.language_model.layers.28.self_attn.k_proj.weight, model.language_model.layers.28.self_attn.v_proj.weight, model.language_model.layers.28.self_attn.o_proj.weight, model.language_model.layers.28.self_attn.q_norm.weight, model.language_model.layers.28.self_attn.k_norm.weight, model.language_model.layers.28.mlp.gate_proj.weight, model.language_model.layers.28.mlp.up_proj.weight, model.language_model.layers.28.mlp.down_proj.weight, model.language_model.layers.28.input_layernorm.weight, model.language_model.layers.28.post_attention_layernorm.weight, model.language_model.layers.28.pre_feedforward_layernorm.weight, model.language_model.layers.28.post_feedforward_layernorm.weight, model.language_model.layers.29.self_attn.q_proj.weight, model.language_model.layers.29.self_attn.k_proj.weight, model.language_model.layers.29.self_attn.v_proj.weight, model.language_model.layers.29.self_attn.o_proj.weight, model.language_model.layers.29.self_attn.q_norm.weight, model.language_model.layers.29.self_attn.k_norm.weight, model.language_model.layers.29.mlp.gate_proj.weight, model.language_model.layers.29.mlp.up_proj.weight, model.language_model.layers.29.mlp.down_proj.weight, 
model.language_model.layers.29.input_layernorm.weight, model.language_model.layers.29.post_attention_layernorm.weight, model.language_model.layers.29.pre_feedforward_layernorm.weight, model.language_model.layers.29.post_feedforward_layernorm.weight, model.language_model.layers.30.self_attn.q_proj.weight, model.language_model.layers.30.self_attn.k_proj.weight, model.language_model.layers.30.self_attn.v_proj.weight, model.language_model.layers.30.self_attn.o_proj.weight, model.language_model.layers.30.self_attn.q_norm.weight, model.language_model.layers.30.self_attn.k_norm.weight, model.language_model.layers.30.mlp.gate_proj.weight, model.language_model.layers.30.mlp.up_proj.weight, model.language_model.layers.30.mlp.down_proj.weight, model.language_model.layers.30.input_layernorm.weight, model.language_model.layers.30.post_attention_layernorm.weight, model.language_model.layers.30.pre_feedforward_layernorm.weight, model.language_model.layers.30.post_feedforward_layernorm.weight, model.language_model.layers.31.self_attn.q_proj.weight, model.language_model.layers.31.self_attn.k_proj.weight, model.language_model.layers.31.self_attn.v_proj.weight, model.language_model.layers.31.self_attn.o_proj.weight, model.language_model.layers.31.self_attn.q_norm.weight, model.language_model.layers.31.self_attn.k_norm.weight, model.language_model.layers.31.mlp.gate_proj.weight, model.language_model.layers.31.mlp.up_proj.weight, model.language_model.layers.31.mlp.down_proj.weight, model.language_model.layers.31.input_layernorm.weight, model.language_model.layers.31.post_attention_layernorm.weight, model.language_model.layers.31.pre_feedforward_layernorm.weight, model.language_model.layers.31.post_feedforward_layernorm.weight, model.language_model.layers.32.self_attn.q_proj.weight, model.language_model.layers.32.self_attn.k_proj.weight, model.language_model.layers.32.self_attn.v_proj.weight, model.language_model.layers.32.self_attn.o_proj.weight, 
model.language_model.layers.32.self_attn.q_norm.weight, model.language_model.layers.32.self_attn.k_norm.weight, model.language_model.layers.32.mlp.gate_proj.weight, model.language_model.layers.32.mlp.up_proj.weight, model.language_model.layers.32.mlp.down_proj.weight, model.language_model.layers.32.input_layernorm.weight, model.language_model.layers.32.post_attention_layernorm.weight, model.language_model.layers.32.pre_feedforward_layernorm.weight, model.language_model.layers.32.post_feedforward_layernorm.weight, model.language_model.layers.33.self_attn.q_proj.weight, model.language_model.layers.33.self_attn.k_proj.weight, model.language_model.layers.33.self_attn.v_proj.weight, model.language_model.layers.33.self_attn.o_proj.weight, model.language_model.layers.33.self_attn.q_norm.weight, model.language_model.layers.33.self_attn.k_norm.weight, model.language_model.layers.33.mlp.gate_proj.weight, model.language_model.layers.33.mlp.up_proj.weight, model.language_model.layers.33.mlp.down_proj.weight, model.language_model.layers.33.input_layernorm.weight, model.language_model.layers.33.post_attention_layernorm.weight, model.language_model.layers.33.pre_feedforward_layernorm.weight, model.language_model.layers.33.post_feedforward_layernorm.weight, model.language_model.norm.weight, lm_head.weight",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_generate_multi_accelerator_causal_mask",
      "trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.visual.patch_embed.proj.weight, model.visual.blocks.0.norm1.weight, model.visual.blocks.0.norm1.bias, model.visual.blocks.0.norm2.weight, model.visual.blocks.0.norm2.bias, model.visual.blocks.0.attn.qkv.weight, model.visual.blocks.0.attn.qkv.bias, model.visual.blocks.0.attn.proj.weight, model.visual.blocks.0.attn.proj.bias, model.visual.blocks.0.mlp.fc1.weight, model.visual.blocks.0.mlp.fc1.bias, model.visual.blocks.0.mlp.fc2.weight, model.visual.blocks.0.mlp.fc2.bias, model.visual.blocks.1.norm1.weight, model.visual.blocks.1.norm1.bias, model.visual.blocks.1.norm2.weight, model.visual.blocks.1.norm2.bias, model.visual.blocks.1.attn.qkv.weight, model.visual.blocks.1.attn.qkv.bias, model.visual.blocks.1.attn.proj.weight, model.visual.blocks.1.attn.proj.bias, model.visual.blocks.1.mlp.fc1.weight, model.visual.blocks.1.mlp.fc1.bias, model.visual.blocks.1.mlp.fc2.weight, model.visual.blocks.1.mlp.fc2.bias, model.visual.merger.ln_q.weight, model.visual.merger.ln_q.bias, model.visual.merger.mlp.0.weight, model.visual.merger.mlp.0.bias, model.visual.merger.mlp.2.weight, model.visual.merger.mlp.2.bias, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.q_proj.bias, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.k_proj.bias, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.v_proj.bias, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, 
model.language_model.layers.1.self_attn.q_proj.bias, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.k_proj.bias, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.v_proj.bias, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.norm.weight",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1618)  ValueError: The device_map provided does not give any device for the following parameters: model.visual.patch_embed.proj.weight, model.visual.blocks.0.norm1.weight, model.visual.blocks.0.norm1.bias, model.visual.blocks.0.norm2.weight, model.visual.blocks.0.norm2.bias, model.visual.blocks.0.attn.qkv.weight, model.visual.blocks.0.attn.qkv.bias, model.visual.blocks.0.attn.proj.weight, model.visual.blocks.0.attn.proj.bias, model.visual.blocks.0.mlp.fc1.weight, model.visual.blocks.0.mlp.fc1.bias, model.visual.blocks.0.mlp.fc2.weight, model.visual.blocks.0.mlp.fc2.bias, model.visual.blocks.1.norm1.weight, model.visual.blocks.1.norm1.bias, model.visual.blocks.1.norm2.weight, model.visual.blocks.1.norm2.bias, model.visual.blocks.1.attn.qkv.weight, model.visual.blocks.1.attn.qkv.bias, model.visual.blocks.1.attn.proj.weight, model.visual.blocks.1.attn.proj.bias, model.visual.blocks.1.mlp.fc1.weight, model.visual.blocks.1.mlp.fc1.bias, model.visual.blocks.1.mlp.fc2.weight, model.visual.blocks.1.mlp.fc2.bias, model.visual.merger.ln_q.weight, model.visual.merger.ln_q.bias, model.visual.merger.mlp.0.weight, model.visual.merger.mlp.0.bias, model.visual.merger.mlp.2.weight, model.visual.merger.mlp.2.bias, model.language_model.embed_tokens.weight, model.language_model.layers.0.self_attn.q_proj.weight, model.language_model.layers.0.self_attn.q_proj.bias, model.language_model.layers.0.self_attn.k_proj.weight, model.language_model.layers.0.self_attn.k_proj.bias, model.language_model.layers.0.self_attn.v_proj.weight, model.language_model.layers.0.self_attn.v_proj.bias, model.language_model.layers.0.self_attn.o_proj.weight, model.language_model.layers.0.mlp.gate_proj.weight, model.language_model.layers.0.mlp.up_proj.weight, model.language_model.layers.0.mlp.down_proj.weight, model.language_model.layers.0.input_layernorm.weight, model.language_model.layers.0.post_attention_layernorm.weight, model.language_model.layers.1.self_attn.q_proj.weight, 
model.language_model.layers.1.self_attn.q_proj.bias, model.language_model.layers.1.self_attn.k_proj.weight, model.language_model.layers.1.self_attn.k_proj.bias, model.language_model.layers.1.self_attn.v_proj.weight, model.language_model.layers.1.self_attn.v_proj.bias, model.language_model.layers.1.self_attn.o_proj.weight, model.language_model.layers.1.mlp.gate_proj.weight, model.language_model.layers.1.mlp.up_proj.weight, model.language_model.layers.1.mlp.down_proj.weight, model.language_model.layers.1.input_layernorm.weight, model.language_model.layers.1.post_attention_layernorm.weight, model.language_model.norm.weight",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_green_red_watermark_generation",
      "trace": "(line 659)  AttributeError: 'dict' object has no attribute 'validate'",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 658)  AttributeError: 'dict' object has no attribute 'validate'",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "import_or_config",
      "big_model": false
    },
    {
      "model": "generation",
      "gpu": "multi",
      "test": "tests/generation/test_utils.py::GenerationIntegrationTests::test_validate_assistant",
      "trace": "(line 1909)  torch.AcceleratorError: CUDA error: device-side assert triggered",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 1909)  torch.AcceleratorError: CUDA error: device-side assert triggered",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "cuda_runtime",
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "multi",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_with_compile_and_higher_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "multi",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_with_compile_and_lower_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "multi",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_without_compile_and_with_higher_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "peft_integration",
      "gpu": "multi",
      "test": "tests/peft_integration/test_peft_integration.py::PeftHotswapIntegrationTest::test_hotswap_without_compile_and_with_lower_rank_works",
      "trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 278)  RuntimeError: You set `ignore_mismatched_sizes` to `False`, thus raising an error. For details look at the above report!",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "other",
      "big_model": false
    },
    {
      "model": "utils",
      "gpu": "multi",
      "test": "tests/utils/test_cache_utils.py::CacheHardIntegrationTest::test_cache_copy",
      "trace": "(line 436)  AssertionError: Lists differ: ['You are a helpful assistant. Help me to [390 chars] is'] != [\"You are a helpful assistant. Help me to [385 chars] is']",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 436)  AssertionError: Lists differ: ['You are a helpful assistant. Help me to [390 chars] is'] != [\"You are a helpful assistant. Help me to [385 chars] is']",
      "first_failure_day": "2026-03-15",
      "last_green_day": "2026-03-14",
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "utils",
      "gpu": "multi",
      "test": "tests/utils/test_cache_utils.py::CacheHardIntegrationTest::test_dynamic_cache_hard",
      "trace": "(line 319)  AssertionError: \"Here[57 chars]ave fur, they have four legs, they have a tail[1045 chars]have\" != \"Here[57 chars]ave four legs, they have a tail, they have a f[1078 chars]They\"",
      "days_seen": 6,
      "first_seen": "2026-05-31",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 319)  AssertionError: \"Here[57 chars]ave fur, they have four legs, they have a tail[1045 chars]have\" != \"Here[57 chars]ave four legs, they have a tail, they have a f[1078 chars]They\"",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "output_mismatch",
      "big_model": false
    },
    {
      "model": "emu3",
      "gpu": "multi",
      "test": "tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_batched",
      "trace": "(line 2397)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 458.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 266.69 MiB is free. Process 586247 has 22.04 GiB memory in use. Of the allocated memory 21.42 GiB is allocated by PyTorch, and 120.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "days_seen": 5,
      "first_seen": "2026-06-01",
      "latest_seen": "2026-06-06",
      "latest_trace": "(line 2397)  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 458.00 MiB. GPU 0 has a total capacity of 22.30 GiB of which 272.69 MiB is free. Process 1082951 has 22.03 GiB memory in use. Of the allocated memory 21.41 GiB is allocated by PyTorch, and 121.19 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf)",
      "first_failure_day": "2026-03-09",
      "last_green_day": null,
      "failure_mode": "OOM",
      "big_model": false
    }
  ],
  "totals": {
    "total": 737,
    "in_clusters": 0,
    "clusters": 0,
    "flaky": 351,
    "unpinned": 386
  },
  "regression_day_buckets": {
    "2026-03-09": 551,
    "2026-03-15": 93,
    "2026-04-06": 22,
    "2026-05-20": 13,
    "2026-03-17": 8,
    "2026-05-27": 6,
    "2026-04-10": 6,
    "2026-04-03": 6,
    "2026-05-21": 4,
    "2026-03-19": 4,
    "2026-04-09": 4,
    "2026-04-21": 4,
    "2026-03-21": 2,
    "2026-03-28": 2,
    "2026-04-29": 2,
    "2026-05-29": 2,
    "2026-04-05": 2,
    "2026-04-01": 2,
    "2026-05-19": 2,
    "2026-03-14": 1,
    "2026-03-16": 1
  },
  "window": {
    "dates": [
      "2026-05-31",
      "2026-06-01",
      "2026-06-02",
      "2026-06-03",
      "2026-06-04",
      "2026-06-05",
      "2026-06-06"
    ],
    "min_days": 5
  },
  "generated_at_utc": "2026-06-07T02:34:01"
}