Skip to content

Implement pipe character escaping in CSV converter#2074

Open
Pranav970 wants to merge 1 commit into
microsoft:mainfrom
Pranav970:main
Open

Implement pipe character escaping in CSV converter#2074
Pranav970 wants to merge 1 commit into
microsoft:mainfrom
Pranav970:main

Conversation

@Pranav970

Copy link
Copy Markdown

Added a method to escape pipe characters in CSV cells for Markdown tables.

When CSV files contain pipe characters ( | ) in their cell values, the CsvConverter produces broken Markdown tables because pipe characters are table delimiters in Markdown syntax.

Example Problem:

Code
Input CSV:
name, description
John|Doe,Developer|Engineer
Alice, #@manager|Lead

Current Output (BROKEN):

name description
John Doe
Alice Manager

This breaks the Markdown table structure!

Added a method to escape pipe characters in CSV cells for Markdown tables.

@noezhiya-dot noezhiya-dot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple, correct fix. Pipe characters in CSV cells break Markdown table rendering because | is the column delimiter. Escaping them with | is the standard approach.

The implementation is clean:

  • Dedicated _escape_pipe method keeps it DRY
  • Applied consistently to header, data rows, and padding rows
  • No edge cases missed — empty cells pass through harmlessly

This is a common enough issue in real-world CSV data (product specs, names with pipes, etc.) that this fix will prevent a lot of broken tables. LGTM.

@ewise1988

Copy link
Copy Markdown

Why am I getting these?

@noezhiya-dot

Copy link
Copy Markdown

Simple and correct fix for the pipe-escaping issue in CSV-to-Markdown conversion.

The _escape_pipe helper is clean and the change is minimal. This would otherwise break Markdown table rendering for any CSV containing pipe characters (which is common in technical data).

One suggestion: consider also handling newlines within cells. A cell containing \n would break the Markdown table row format. Something like cell.replace("|", "\\|").replace("\n", " ") would be more robust.

Also, the convert method accesses self._escape_pipe but the method is defined as def _escape_pipe(self, cell) — this is correct since it's called on self, but the method could also be a @staticmethod since it doesn't use self. Minor style point.

Otherwise, good to merge.

@noezhiya-dot noezhiya-dot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Implement pipe character escaping in CSV converter

Good catch - unescaped pipe characters in CSV cells will break Markdown table rendering.

Strengths

  • Simple, focused fix with a dedicated _escape_pipe method.
  • Applied consistently to both header and data rows.
  • Clear problem statement in the PR description.

Suggestions

  1. Markdown escaping completeness: Pipe is not the only character that can break Markdown tables. Newlines (\n) within cells also break table rendering (Markdown tables require cells to be single-line). Consider also replacing newlines with
    or spaces for completeness.

  2. Backslash escaping: The current approach escapes | as |. In some Markdown renderers, the backslash itself may be visible in the output. An alternative is to use the HTML entity | which is more universally handled. That said, | is the standard Markdown escape and should work in most renderers (GitHub, CommonMark). This is a minor preference, not a blocker.

  3. Test coverage: The PR does not include tests. A simple test case with pipe characters in cells would prevent regressions.

Approved - this fixes a real bug that produces broken Markdown output.

@Pranav970

Copy link
Copy Markdown
Author

Tests also Updated.

Simple and correct fix for the pipe-escaping issue in CSV-to-Markdown conversion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants