Hi and thanks for the great package!
While working with juliacall + PythonCall.jl, I ran into two issues related to pandas.Categorical handling.
(In all examples below, jl refers to Main from juliacall, i.e., from juliacall import Main as jl.)
DataFrame conversion ignores Categorical columns
When passing a pandas.DataFrame with categorical columns (i.e., dtype='category'), those columns are silently converted to Int64 vectors in Julia (presumably the .codes). The CategoricalArray semantics are lost as a result, so a categorical term in a Julia formula interaction such as id & η1 is treated as numeric instead of being expanded into dummy variables.
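To illustrate what I think is happening, here is a pandas-only sketch of what a categorical column holds. The frame and its column names (id, y) are made up for illustration, and it does not involve PythonCall at all. It only shows that the integer codes on their own no longer carry the labels, which would match the Int64 vectors I see on the Julia side if my guess about .codes is right.

```python
import pandas as pd

# Hypothetical example frame; "id" and "y" are illustrative names only.
df = pd.DataFrame({
    "id": pd.Categorical(["a", "b", "a", "c"]),
    "y": [1.0, 2.0, 3.0, 4.0],
})

# A categorical column is stored as integer codes plus a table of labels.
codes = df["id"].cat.codes        # integer position of each value in the label table
labels = df["id"].cat.categories  # the label table itself

print(df["id"].dtype)  # category
print(list(codes))     # [0, 1, 0, 2]
print(list(labels))    # ['a', 'b', 'c']
```

If only the codes cross the language boundary, Julia receives plain integers and has no way to tell that they index a set of labels.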
jl.convert() can’t convert pandas.Categorical to any Julia type
I tried using jl.convert(CategoricalArray, col) directly on a pandas.Series with categorical dtype, but got a MethodError. It appears PythonCall doesn’t yet support converting pandas.Categorical to any Julia-native type.
To work around this, I convert the column to str in Python (so it arrives as a Vector{String}), then manually wrap it in categorical(...) on the Julia side. This works, but it's not ideal for type fidelity or automatic translation.
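For reference, the Python half of that workaround looks roughly like the sketch below. The helper name stringify_categoricals and the example frame are mine, not part of pandas or juliacall, and the Julia half is only described in a comment because it depends on how the frame is handed over.

```python
import pandas as pd


def stringify_categoricals(df):
    """Return a copy of df with categorical columns cast to str,
    plus the names of the columns that were converted."""
    cat_cols = [c for c in df.columns
                if isinstance(df[c].dtype, pd.CategoricalDtype)]
    out = df.copy()
    for c in cat_cols:
        out[c] = out[c].astype(str)
    return out, cat_cols


# Hypothetical example frame; "id" and "y" are illustrative names only.
df = pd.DataFrame({
    "id": pd.Categorical(["a", "b", "a", "c"]),
    "y": [1.0, 2.0, 3.0, 4.0],
})
plain_df, cat_cols = stringify_categoricals(df)
print(cat_cols)  # ['id']

# On the Julia side (after `using CategoricalArrays`), each column named in
# cat_cols is then wrapped again with categorical(...), as described above.
```

Returning the column names keeps track of which vectors need re-wrapping in Julia. One limitation of going through strings: a custom category order set in pandas is not carried over, so it has to be restored by hand on the Julia side.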
Let me know if there's a cleaner workaround, or if you'd be open to a PR to improve automatic CategoricalArray support.
Thanks again!