---
language:
- en
license: other
library_name: transformers
tags:
- heretic
- abliteration
- nlp
- transformers
- safetensors
- mxfp4
- conversational
- 8-bit precision
base_model: openai/gpt-oss-20b
pipeline_tag: text-generation
model_name: gpt-oss-20b-heretic-scannerV1-1
---

# GPT-OSS-20B Heretic (Scanner V1.1)

This is a decensored version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b), made with a version of [Heretic](https://github.com/p-e-w/heretic) that is not currently publicly available.

**Trial 142 Results:**
- **Refusals:** 8/100 (Primary Goal)
- **KL Divergence:** 0.94

## Abliteration Parameters

| Parameter | Value |
| :--- | :--- |
| `direction_index` | 16.60 |
| `attn.o_proj.max_weight` | 1.47 |
| `attn.o_proj.max_weight_position` | 9.62 |
| `attn.o_proj.min_weight` | 1.37 |
| `attn.o_proj.min_weight_distance` | 8.09 |

## Methodology

This model was abliterated with a targeted intervention on the `attn.o_proj` layers, focusing on **layer 10 and above**, where layer scanning identified refusal directions. The `mlp.down_proj` layers were **excluded** from the intervention because the scan indicated that they contributed negligible divergence.
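
For readers unfamiliar with the technique, the sketch below illustrates the general idea of directional ablation on a single output-projection matrix: the component of the matrix's output that lies along a refusal direction `r` is scaled away via `W' = W - weight * r rᵀ W`. It is written in plain Python on a toy 2x2 matrix. The helper names `normalize` and `ablate` are hypothetical, and this is not the exact code used to produce this model.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def ablate(matrix, direction, weight):
    """Return matrix - weight * (r r^T) @ matrix, where r is the unit direction.

    Rows of `matrix` index output (residual stream) dimensions, so the
    projection removes what the matrix writes along `direction`.
    """
    r = normalize(direction)
    n_rows, n_cols = len(matrix), len(matrix[0])
    # r^T @ matrix: how strongly each column writes along the direction.
    projection = [
        sum(r[i] * matrix[i][j] for i in range(n_rows)) for j in range(n_cols)
    ]
    return [
        [matrix[i][j] - weight * r[i] * projection[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy example: remove everything the matrix writes along the first axis.
toy_matrix = [[1.0, 2.0], [3.0, 4.0]]
toy_direction = [1.0, 0.0]
ablated = ablate(toy_matrix, toy_direction, weight=1.0)
print(ablated)
```

With `weight=1.0` the output along the direction is removed entirely. A weight above 1, such as the values in the parameter table, would push the output past zero along that direction under this formulation.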

## Usage

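```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arnomatic/gpt-oss-20b-heretic-scannerV1-1"

# device_map="auto" places the weights on the available GPU(s);
# torch_dtype="auto" keeps the dtype stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Generate a story about..."
# Move the inputs to the same device as the model before generating.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```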