Teacher-Forced Directional Steering#282
Conversation
|
I ran several tests, switching from linear to quadratic and others. The best one appears to be cubic smoothstep (3t² − 2t³).
@antirez I'd love to contribute to this aspect of the project, and I hope you appreciate this PR. Let me know what I should change. |
…onse handling - Updated `ds4.h` to include new parameters for directional steering: `directional_steering_ffn_decay_tokens` and `directional_steering_ffn_decay_final`. - Enhanced `ds4_agent.c`, `ds4_cli.c`, and `ds4_server.c` to parse and handle new command-line options for the added parameters. - Introduced new teacher-forced examples for both good and bad response pairs related to Tiananmen Square. - Created a new script `build_direction_teacher_forced.py` to build directional steering vectors based on teacher-forced responses.
|
Thank you @Chida82 I'm super interested, just didn't found yet the time, but this is a PR I'll check with extra care. Thanks. |
|
For now, the “think” part is in another branch, so we don’t put too much on the table at once. The most significant part related to this is in this commit: |


Teacher-Forced Directional Steering
This change documents and promotes teacher-forced directional steering for DS4.
The main benefit is that the steering vector is extracted from a more useful
internal state: not the model right before it starts answering, but the model
while it is already entering the target answer trajectory.
The intuition is simple. With the original builder, part of the direction is
spent encoding the generic style of a positive answer versus a refusal. With a
teacher-forced prefix, that stylistic component is already partially satisfied
by the injected response start, so the remaining activation delta can better
track the knowledge, continuation mode, and latent content that we want to make
emerge from the model.
This PR adds a concise explanation to the main README and includes an explicit
build command plus a runnable CLI example using the Tiananmen teacher-forced
vector.
Example build command:
Example inference command:
./ds4 --seed 2026 --temp 0.1 \ --dir-steering-file dir-steering/out/tiananmen_tf_3.f32 \ --dir-steering-ffn -3 \ --dir-steering-ffn-decay-tokens 30 \ --dir-steering-ffn-decay-final -0.1 \ --nothink \ -p "Important events in China in 1989"Test
Check the standard. No good info with scale -3 or other value
with the new, same problem with scale 2 refusal by guardrail, with 3 better but with a problem
with steering only first token I have all info. The model have that info.