Skip to content

Conversation

@A-nony-mous
Copy link

Description

This PR addresses a bug where Excel files generated from templates could become corrupted or throw an "XML error" upon opening.

Problem

When using MiniExcel.SaveAsByTemplate() with Excel templates containing SharedStrings, the generated files would fail to open in Excel with the error "We found a problem with some content in 'xxx.xlsx'".
image

template:
GachaLog.xlsx

demo error file:
1.xslx

logic code:

    private async Task ExportAsExcelAsync(long uid, string output)
    {
        using var dapper = DatabaseService.CreateConnection();
        var list = GetGachaLogItemEx(uid);
        var template = Path.Combine(AppContext.BaseDirectory, @"Assets\Template\GachaLog.xlsx");
        if (File.Exists(template))
        {
            await MiniExcel.SaveAsByTemplateAsync(output, template, new { list });
        }
    }

Root Causes

  1. Invalid cell type for SharedStrings: The code was setting t="str" with <v> element, but OpenXML requires t="inlineStr" with <is><t>...</t></is> structure for inline strings.

  2. Incorrect XML element order: Elements were written in wrong order. Per ECMA-376 specification, the correct order after </sheetData> must be: autoFiltermergeCellsphoneticPrconditionalFormatting.

  3. Duplicate namespace declarations: phoneticPr, conditionalFormatting, and autoFilter elements contained redundant xmlns="..." declarations when written to the output.

  4. Number cells with wrong type attribute: Numeric cells were incorrectly getting t="inlineStr" attribute, which should be omitted for numbers.

Related Issue

issue1
issue2

Changes

  • ReplaceSharedStringsToStr: Convert SharedString references to proper inlineStr format with <is><t> structure
  • New SetCellType method: Centralized cell type handling - uses inlineStr for strings, removes t attribute for numbers
  • WriteSheetXmlAsync: Fixed element order per ECMA-376 spec, extract and rewrite autoFilter in correct position
  • CleanXml: Added cleanup for xmlns="..." declarations
  • GenerateCellValuesAsync: Clean up empty <v></v> tags, support <is><t> structure in value lookup

Testing

  • Tested with template containing 3650 rows of data
  • Generated file opens successfully in Microsoft Excel without repair prompts
  • Verified XML structure compliance with ECMA-376 specification

Breaking Changes

None. These are bug fixes that produce valid OpenXML output.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @A-nony-mous, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves critical issues causing Excel files generated from templates to become corrupted or unreadable. The changes primarily focus on correcting the OpenXML structure, cell type handling, and XML element ordering to strictly comply with the ECMA-376 specification, thereby ensuring the reliability and integrity of exported Excel documents.

Highlights

  • XML Structure Compliance: Reordered XML elements (autoFilter, mergeCells, phoneticPr, conditionalFormatting) in WriteSheetXmlAsync to adhere to ECMA-376 specification, preventing file corruption.
  • Shared String Handling: Converted SharedString references to the correct inlineStr format (...) and introduced a SetCellType method for consistent cell type management.
  • XML Cleanup and Validation: Implemented CleanXml to remove redundant namespace declarations and GenerateCellValuesAsync to clean up empty tags, ensuring valid OpenXML output.
  • Numeric Cell Type Correction: Ensured numeric cells no longer incorrectly receive t="inlineStr" attributes, with the t attribute now being omitted for numbers as per spec.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request provides a comprehensive fix for a bug causing corrupted Excel file generation from templates. The changes correctly address multiple root causes related to OpenXML specification compliance, including element ordering, cell type representation for strings, and handling of shared strings. The introduction of the SetCellType method is an excellent refactoring that centralizes and standardizes cell type management. The code is well-structured, and the fixes are thorough. I have a couple of suggestions to further improve robustness and code style.

Comment on lines 1146 to 1147
if (type == "b")
c.SetAttribute("t", "b");
else
c.RemoveAttribute("t");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with the rest of the file and to prevent potential bugs, it's recommended to use braces {} for if-else blocks, even for single-line statements. The indentation is also inconsistent here.

            if (type == "b")
            {
                c.SetAttribute("t", "b");
            }
            else
            {
                c.RemoveAttribute("t");
            }

- Change SharedString conversion to use inlineStr format with <is><t> structure
- Add SetCellType method to properly handle cell types (inlineStr for strings, remove t attr for numbers)
- Fix XML element order per ECMA-376 spec (autoFilter before mergeCells/phoneticPr)
- Clean duplicate xmlns declarations from phoneticPr, conditionalFormatting, autoFilter
- Remove empty <v></v> tags that cause invalid XML
- Support <is><t> structure in value node lookup

Fixes template SaveAs generating files that Excel cannot open
@michelebastione
Copy link
Contributor

Hello @A-nony-mous, we appreciate your contribution. Unfortunately 8 tests are failing after your modifications, so until you investigate why that happens and fix it this PR cannot be accepted.

@michelebastione michelebastione marked this pull request as draft January 25, 2026 18:21
@A-nony-mous
Copy link
Author

@michelebastione I am sorry for previous bugs I made and I have fixed them.

Besides, I updated

public void TestIssueI4DQUN()
{
var templatePath = PathHelper.GetFile("xlsx/TestIssueI4DQUN.xlsx");
using var path = AutoDeletingPath.Create();
var value = new Dictionary<string, object>
{
{ "Title", "Hello & World < , > , \" , '" },
{ "Details", new[] { new { Value = "Hello & Value < , > , \" , '" } } },
};
_excelTemplater.ApplyTemplate(path.ToString(), templatePath, value);
var sheetXml = SheetHelper.GetZipFileContent(path.ToString(), "xl/worksheets/sheet1.xml");
Assert.Contains("<v>Hello &amp; World &lt; , &gt; , \" , '</v>", sheetXml);
Assert.Contains("<v>Hello &amp; Value &lt; , &gt; , \" , '</v>", sheetXml);
}
Since this PR changes the output format of string cells from SharedString-like [<v>] tags to InlineString [<is><t>] tags (to satisfy ECMA-376 and avoid corruption), the assertion in this test needed to be updated to verify the new [<t>] tag format instead of the old [<v>] tag.

I've verified that all tests passed locally.

image Green text shows "Successful" in Chinese.

I will leave this as a draft for your review. If you are happy with this please let me know and make it ready for another test?

@A-nony-mous
Copy link
Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request provides a comprehensive fix for a bug causing corrupted Excel file generation from templates. The changes are well-structured and address multiple root causes, including incorrect XML element order, invalid cell types for shared strings, and redundant namespace declarations. The introduction of the SetCellType method is a great refactoring for centralizing cell type handling. I've added a few suggestions to further improve code robustness and maintainability.

Fixing potential bug with inline string tag not being correctly identified when <v> tag is not null but also not matching the regex condition, and added GeneratedRegexAttribute for optimizing empty <v> tag regex replace function
@michelebastione
Copy link
Contributor

Looks good, I just want to make a couple more checks but I'll probably merge it tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants