BUG: numba backend Blockwise.Vectorize shape check regression in 3.0.4 for Join with shape (1,) declared inputs that are scalar at runtime

### Describe the issue

`pytensor 3.0.4` raises `ValueError: Vectorized input N is expected to have shape 1 in axis N` (from `pytensor/link/numba/dispatch/vectorize_codegen.py`, surfaced via `pytensor/link/basic.py:682`) on a `Join(axis=0, ...)` whose inputs are *declared* `TensorType(float64, shape=(1,))` but receive runtime values of shape `()`.

The same graph compiles and runs cleanly under:
- pytensor **3.0.3** + numba linker
- pytensor 3.0.4 + cvm linker

The error string itself exists in both 3.0.3 and 3.0.4, but the 3.0.4 refactor of `vectorize_codegen.py` (e.g. pymc-devs/pytensor#2015 fusing `AdvancedSubtensor → Elemwise → AdvancedIncSubtensor` and rewriting the `_vectorized` intrinsic) routes this graph through the strict shape check that 3.0.3 silently tolerated.

The graph is produced by `pymc 6.0.1`'s `compile_dlogp` for an ordinal-family (`cumulative`, `sratio`) bambi model. The mismatch originates in pymc's gradient routine: the `Normal` slope for a scalar common term has its gradient declared `TensorType(float64, shape=(1,))` but the gradient evaluates to a 0-d scalar `array(0.)` at runtime.

This regression broke 7 ordinal-model numba tests on bambi `main` between two CI runs that resolved different pytensor versions despite identical conda-forge pins. Full bisect + writeup: https://github.com/bambinos/bambi/pull/986#issuecomment-4634765267

### Reproducible code example

```python
import pytensor, pandas as pd, bambi as bmb
pytensor.config.linker = "numba"
pytensor.config.exception_verbosity = "high"

# inhaler.csv from bambi/tests/data/inhaler.csv
data = pd.read_csv("inhaler.csv")
data["rating"] = pd.Categorical(data["rating"], categories=[1, 2, 3, 4])
data["carry"] = pd.Categorical(data["carry"])

model = bmb.Model(
    "rating ~ period + carry + treat", data, family="cumulative", link="logit"
)
model.build()
dlogp = model.backend.model.compile_dlogp()
dlogp(model.backend.model.initial_point())
# pytensor 3.0.4: ValueError
# pytensor 3.0.3: ok
```

Happy to attempt a pure-pytensor minimization if useful — the dlogp graph is produced by pymc's `gradient()` so reproducing it without pymc requires synthesizing a `Join(0, vec_None, scalar_via_shape_1, vec_2, scalar_via_shape_1)` by hand. Let me know if you want me to take a swing.

### Error message

Trimmed traceback with `exception_verbosity=high`:

```
Apply node that caused the error: Join(0, IncSubtensor{start:}.0,
                                       Composite{((-0.04 * i1) + i0)}.0,
                                       Composite{(((-(i2 / i1)) / i1) + i0)}.0,
                                       Composite{((-0.04 * i1) + i0)}.0)
Toposort index: 76
Inputs types:  [TensorType(int8, shape=()),
                TensorType(float64, shape=(None,)),
                TensorType(float64, shape=(1,)),
                TensorType(float64, shape=(2,)),
                TensorType(float64, shape=(1,))]
Inputs shapes: [(3,), (), (2,), ()]
Inputs strides: [(8,), (), (8,), ()]
Inputs values: [array([-2., 0.69314718, 0.69314718]),
                array(0.), array([0., 0.]), array(0.)]

  File ".../pytensor/link/basic.py", line 682, in thunk
    outputs = fgraph_jit(*(x[0] for x in thunk_inputs))
ValueError: Vectorized input 2 is expected to have shape 1 in axis 1
```

Backtrace points to `pymc/model/core.py:646 (compile_dlogp) → core.py:794 (dlogp) → pytensorf.py:325 (gradient) → pt.concatenate([gradient1(f, v) for v in vars], axis=0)`.

The two inputs declared `shape=(1,)` are scalar `array(0.)` at runtime — pymc's gradient for a scalar `Normal` slope produces shape `()`.

### PyTensor version information

- Python 3.13.13 (conda-forge)
- bambi 0.18.1 (upstream main, no source modifications)
- **pytensor: 3.0.3 = OK, 3.0.4 = ValueError**
- pymc 6.0.1
- numba 0.65.1
- llvmlite 0.47.0
- numpy 2.4.6
- Linux x86_64, Ubuntu 24.04 (reproduced in `ghcr.io/prefix-dev/pixi:0.62.2` Docker image)

### Context for the issue

Question for maintainers: is the stricter check intentional and pymc 6 needs to add a `.reshape((1,))` / `unbroadcast` when building the gradient join, or should the numba backend match cvm's leniency for length-1-on-an-axis inputs that arrive as 0-d at runtime?

Either fix is fine from a downstream perspective; main concern is that bambi (and presumably any pymc 6 ordinal-family workload on the numba linker) is currently broken on 3.0.4.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: numba backend Blockwise.Vectorize shape check regression in 3.0.4 for Join with shape (1,) declared inputs that are scalar at runtime #2197

Describe the issue

Reproducible code example

Error message

PyTensor version information

Context for the issue

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

BUG: numba backend Blockwise.Vectorize shape check regression in 3.0.4 for Join with shape (1,) declared inputs that are scalar at runtime #2197

Description

Describe the issue

Reproducible code example

Error message

PyTensor version information

Context for the issue

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions