ram: simplification of word slice creation#9690
ram: simplification of word slice creation#9690braydenlouie wants to merge 14 commits intoThe-OpenROAD-Project:masterfrom
Conversation
Signed-off-by: braydenl9988 <braydenl9988@gmail.com>
Signed-off-by: Brayden Louie <braydenl9988@gmail.com>
Signed-off-by: Brayden Louie <braydenl9988@gmail.com>
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request refactors the RAM generation logic to build word slices row by row, which improves code readability and simplifies indexing. The changes are generally good and achieve the stated goal. I've found one bug in the layout generation logic that could lead to incorrect behavior, and a couple of minor issues related to code clarity and consistency. Once these are addressed, the changes will be a solid improvement.
Note: Security Review did not run due to the size of the PR.
src/ram/src/layout.cpp
Outdated
| for (int size = layouts_.size(); size <= tracks; ++size) { | ||
| if (orientation_ == odb::horizontal) { | ||
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | ||
| } else { | ||
| layouts_.push_back(std::make_unique<Layout>(odb::horizontal)); | ||
| } | ||
| } |
There was a problem hiding this comment.
There is an off-by-one error in this loop condition. The loop for (int size = layouts_.size(); size <= tracks; ++size) will execute one more time than intended, creating an extra Layout object. For example, if layouts_ is empty and tracks is 10, this loop will run for size from 0 to 10 inclusive, creating 11 layouts instead of 10. The condition should be size < tracks to create the correct number of layouts.
| for (int size = layouts_.size(); size <= tracks; ++size) { | |
| if (orientation_ == odb::horizontal) { | |
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | |
| } else { | |
| layouts_.push_back(std::make_unique<Layout>(odb::horizontal)); | |
| } | |
| } | |
| for (int size = layouts_.size(); size < tracks; ++size) { | |
| if (orientation_ == odb::horizontal) { | |
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | |
| } else { | |
| layouts_.push_back(std::make_unique<Layout>(odb::horizontal)); | |
| } | |
| } |
src/ram/include/ram/ram.h
Outdated
| void makeCellByte(int start_idx, | ||
| int row_idx, |
There was a problem hiding this comment.
The first parameter of makeCellByte is named start_idx here in the header, but it's named byte_idx in the implementation file (src/ram/src/ram.cpp). Using byte_idx is more descriptive as it represents the index of the byte within a word. For consistency and clarity, it would be best to use the same name in both the declaration and definition.
| void makeCellByte(int start_idx, | |
| int row_idx, | |
| void makeCellByte(int byte_idx, | |
| int row_idx, |
src/ram/src/ram.cpp
Outdated
| int start_idx | ||
| = byte * bits_per_byte; // eventually replace with bits per byte |
There was a problem hiding this comment.
The comment // eventually replace with bits per byte is confusing because the variable bits_per_byte is already being used in the expression on the same line. This appears to be a stale comment from a previous stage of refactoring and should be removed to avoid confusion.
int start_idx = byte * bits_per_byte;
src/ram/include/ram/ram.h
Outdated
| int byte_number, | ||
| const std::string& prefix, | ||
| int read_ports, | ||
| void makeCellByte(int start_idx, |
There was a problem hiding this comment.
warning: function 'ram::RamGen::makeCellByte' has a definition with different parameter names [readability-inconsistent-declaration-parameter-name]
void makeCellByte(int start_idx,
^Additional context
src/ram/src/ram.cpp:139: the definition seen here
void RamGen::makeCellByte(const int byte_idx,
^src/ram/include/ram/ram.h:121: differing parameters are named here: ('start_idx'), in definition: ('byte_idx')
void makeCellByte(int start_idx,
^
src/ram/include/ram/ram.h
Outdated
| const std::vector<std::vector<odb::dbBTerm*>>& data_output); | ||
|
|
||
| void makeWordSlice( | ||
| const int bytes_per_word, |
There was a problem hiding this comment.
warning: parameter 'bytes_per_word' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]
| const int bytes_per_word, | |
| int bytes_per_word, |
src/ram/include/ram/ram.h
Outdated
|
|
||
| void makeWordSlice( | ||
| const int bytes_per_word, | ||
| const int row_idx, |
There was a problem hiding this comment.
warning: parameter 'row_idx' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]
| const int row_idx, | |
| int row_idx, |
src/ram/include/ram/ram.h
Outdated
| void makeWordSlice( | ||
| const int bytes_per_word, | ||
| const int row_idx, | ||
| const int read_ports, |
There was a problem hiding this comment.
warning: parameter 'read_ports' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]
| const int read_ports, | |
| int read_ports, |
| vector<vector<dbBTerm*>> byte_outputs(read_ports, | ||
| vector<dbBTerm*>(bits_per_byte)); | ||
| for (int port = 0; port < read_ports; ++port) { | ||
| std::copy_n(data_output[port].begin() + start_idx, |
There was a problem hiding this comment.
warning: no header providing "std::copy_n" is directly included [misc-include-cleaner]
src/ram/src/ram.cpp:5:
- #include <array>
+ #include <algorithm>
+ #include <array>4bccd78 to
d1d7a1d
Compare
|
@rovinski Here are the code simplification changes. At the moment the ok files are still not lining up, even when I tested it with the cmake build so still need to look into that. The differences are related to two of the pins and I believe some routing. I don't know if that's an issue with the changes I made or if they are a product of any changes to the other tools. The rest of the source files should be fixed barring linting and formatting changes. |
b894a5d to
0193870
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
rovinski
left a comment
There was a problem hiding this comment.
@braydenlouie let's start with this first round of review. All of the "byte" notation needs to be changed over to slice. After that I can take a look again.
src/ram/src/layout.cpp
Outdated
| if (orientation_ == odb::horizontal) { | ||
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | ||
| } else { | ||
| layouts_.push_back(std::make_unique<Layout>(odb::horizontal)); | ||
| } |
There was a problem hiding this comment.
Does this work?
| if (orientation_ == odb::horizontal) { | |
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | |
| } else { | |
| layouts_.push_back(std::make_unique<Layout>(odb::horizontal)); | |
| } | |
| layouts_.push_back(std::make_unique<Layout>(orientation_.turn_90())); |
| public: | ||
| Grid(odb::Orientation2D orientation); | ||
|
|
||
| Grid(odb::Orientation2D orientation, int tracks); |
There was a problem hiding this comment.
Can we eliminate all use of the term "tracks"? A "track" has a specific meaning which is associated with metal tracks in global/detailed routing.
Should this be "rows" instead?
src/ram/include/ram/ram.h
Outdated
There was a problem hiding this comment.
Are these declarations necessary? They should already exist by including layout.h.
| odb::dbDatabase* db, | ||
| utl::Logger* logger, | ||
| pdn::PdnGen* pdngen, | ||
| ppl::IOPlacer* ioPlacer, | ||
| dpl::Opendp* opendp, | ||
| grt::GlobalRouter* global_router, | ||
| drt::TritonRoute* detailed_router) |
There was a problem hiding this comment.
There is a mismatch here between the argument names here and in the header file (some args have underscores some don't). Please standardize on one.
src/ram/src/ram.cpp
Outdated
| void RamGen::makeCellByte(Grid& ram_grid, | ||
| const int byte_idx, | ||
| const std::string& prefix, | ||
| void RamGen::makeCellByte(const int byte_idx, |
There was a problem hiding this comment.
There should no longer be a concept of byte. All bytes should now be slices.
Remember how we discussed that we want each slice to be able to contain an arbitrary number of bits.
e.g. makeCellByte --> makeSlice (no reason to indicate "Cell" here)
byte_idx --> slice_idx
etc.
src/ram/src/ram.cpp
Outdated
|
|
||
| int num_sites = ram_grid.getRowWidth() / db_sites->getWidth(); | ||
| int num_sites = ram_grid_.getRowWidth() / db_sites->getWidth(); | ||
| for (int i = 0; i <= word_count; ++i) { // extra for the layer of buffers |
There was a problem hiding this comment.
Please make this comment on a separate line and make it more explanatory. e.g.
// One extra row is added at the top of the array for placing input buffers
There was a problem hiding this comment.
Actually, better yet, add the following
// One extra row is added at the top of the array for placing input buffers
const int rows = word_count + 1;
for (int i = 0; i < rows; ++i) {
//...
src/ram/src/ram.cpp
Outdated
|
|
||
| int max_y_coord = ram_grid.getHeight() * (word_count + 1); | ||
| int max_x_coord = ram_grid.getRowWidth(); | ||
| int max_y_coord = ram_grid_.getHeight() * (word_count + 1); |
There was a problem hiding this comment.
After the previous suggestion:
| int max_y_coord = ram_grid_.getHeight() * (word_count + 1); | |
| int max_y_coord = ram_grid_.getHeight() * rows; |
This makes it self-explanatory where the +1 comes from.
There was a problem hiding this comment.
I'm confused about why the header file is copied here? Also how is this working - the old one is not deleted?
There was a problem hiding this comment.
It's moved over here now that I have the grid as a member of the RamGen class, that way we're not constantly passing the ram_grid variable every single time, since we're only working with one grid anyway. The ram header file needs access to this file. Also, I'm considering renaming this to grid since the class that's being used in the cpp file is the grid. The old one should've been deleted. I must've missed it, but it's been fixed.
src/ram/src/ram.cpp
Outdated
| data_input[local_bit], | ||
| outs), | ||
| physical_col_idx); | ||
| ram_grid_.addCell(makeCellBit(name, |
There was a problem hiding this comment.
Any reason to name this makeCellBit and not just makeBit?
There was a problem hiding this comment.
All of the make methods should be just converted to make as opposed to makeCell. It was just the naming scheme that I had when I was setting up the grid class initially.
689e5ba to
f6e4f31
Compare
Signed-off-by: Brayden Louie <braydenl9988@gmail.com> Signed-off-by: braydenlouie <braydenl9988@gmail.com>
Signed-off-by: Brayden Louie <braydenl9988@gmail.com> Signed-off-by: braydenlouie <braydenl9988@gmail.com>
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
f6e4f31 to
d65b604
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
|
clang-tidy review says "All clean, LGTM! 👍" |
rovinski
left a comment
There was a problem hiding this comment.
Marked all the byte locations. Should be slice and should not be hard-coded.
As we discussed in the meeting, it would be best to have a user flag -mask_size which represents the number of bits per slice. The -bytes_per_word flag should change to -word_size which is the number of bits per word. Return an error "Word size must be a multiple of mask size." if it's not an integer multiple.
| if (orientation_ == odb::horizontal) { | ||
| layouts_.push_back(std::make_unique<Layout>(odb::vertical)); | ||
| } | ||
| layouts_.push_back(std::make_unique<Layout>(orientation_.turn_90())); |
There was a problem hiding this comment.
Is this the desired behavior? Note that before it wasn't pushing a value if orientation_ == odb::vertical. Now it does.
There was a problem hiding this comment.
Yes, I think I never added a value for orientation == odb::vertical because I wasn't sure how accommodating this grid class should be. The current implementation stores the layouts as columns but keeping this could give us flexibility if we need to change to storing the layouts as rows (or if in the future someone has an additional use for this class)
| {{storage_cell_->findMTerm("CLK") ? "CLK" : "GATE", clock}, | ||
| {"D", data_input}, | ||
| {"Q", storage_net}}); |
There was a problem hiding this comment.
Let's make a note to fix the hard-coded pin names in the next PR.
There was a problem hiding this comment.
This was a change made in relation to the switch from using sky130's dlxtp to dfxtpfor generating a behavioral model. I can change it so that it's just hard-coded to CLK for now until the next PR if that's preferable.
There was a problem hiding this comment.
What I meant was in a future PR, the dbSta interface should be used to find the clock port without needing knowing the name, such as in this PR: https://github.com/The-OpenROAD-Project/OpenROAD/pull/9680/changes#diff-98551916db74d39411a745c9ee3ff871d2b6473e09af929c8f78252dc85ddeafR119-R142
Although I'm not convinced that that PR does it in the simplest way. There's probably a getClockPort() function or something rather than iterating.
src/ram/src/ram.cpp
Outdated
| const vector<dbNet*>& selects, | ||
| const array<dbNet*, 8>& data_input, | ||
| const vector<array<dbBTerm*, 8>>& data_output) | ||
| void RamGen::makeSlice(const int byte_idx, |
src/ram/src/ram.cpp
Outdated
| const int bits_per_byte = 8; | ||
| const int start_bit_idx = byte_idx * bits_per_byte; |
src/ram/src/ram.cpp
Outdated
| for (int local_bit = 0; local_bit < bits_per_byte; ++local_bit) { | ||
| auto name = fmt::format("{}.bit{}", prefix, start_bit_idx + local_bit); |
src/ram/src/ram.cpp
Outdated
| const int bits_per_byte = 8; | ||
|
|
||
| for (int byte = 0; byte < bytes_per_word; ++byte) { | ||
| int start_idx = byte * bits_per_byte; | ||
|
|
||
| vector<dbNet*> byte_inputs(data_input.begin() + start_idx, | ||
| data_input.begin() + start_idx + bits_per_byte); | ||
| vector<vector<dbBTerm*>> byte_outputs(read_ports, | ||
| vector<dbBTerm*>(bits_per_byte)); |
src/ram/src/ram.cpp
Outdated
| bits_per_byte, | ||
| byte_outputs[port].begin()); | ||
| } | ||
|
|
||
| makeSlice(byte, | ||
| row_idx, | ||
| read_ports, | ||
| clock, | ||
| write_enable[byte]->getNet(), | ||
| selects, | ||
| byte_inputs, | ||
| byte_outputs); |
src/ram/src/ram.cpp
Outdated
| const int bits_per_byte = 8; | ||
| const int bits_per_word = bytes_per_word * bits_per_byte; |
src/ram/src/ram.cpp
Outdated
| q_outputs_); | ||
| } | ||
| for (int row = 0; row < word_count; ++row) { | ||
| makeWord(bytes_per_word, |
src/ram/src/ram.cpp
Outdated
| } | ||
|
|
||
| for (int bit = 0; bit < 8; ++bit) { | ||
| for (int byte = 0; byte < bytes_per_word; ++byte) { |
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
rovinski
left a comment
There was a problem hiding this comment.
Just a few last things and also fixing the unit test
src/ram/src/ram.cpp
Outdated
| // cell for WE AND gate/inverter | ||
| // extra column is for decoder cells | ||
| int col_cell_count = bytes_per_word * 9; | ||
| int col_cell_count = slices_per_word * 9; |
There was a problem hiding this comment.
Is this math and the comment above it still correct?
src/ram/src/ram.cpp
Outdated
| } | ||
|
|
||
| const int word_size_bit = bytes_per_word * 8; | ||
| const int word_size_bit = slices_per_word * 8; |
There was a problem hiding this comment.
Is this math still correct?
Signed-off-by: braydenlouie <braydenl9988@gmail.com>
|
I'll update the unit testing as soon as I get the chance to rebuild on a |
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
Signed-off-by: Brayden Louie <braydenl9988@gmail.com>
cb46b90 to
f31396c
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
rovinski
left a comment
There was a problem hiding this comment.
@braydenlouie after these clarity/simplification items this looks good to me. Please mark as ready to review and tag Matt.
|
|
||
| int numLayouts() const; | ||
|
|
||
| void setNumLayouts(int tracks); |
There was a problem hiding this comment.
tracks --> rows unless there's a better term for it
| if (idx == layouts_.size()) { | ||
| layouts_.push_back(std::move(layout)); | ||
| } else if (index < layouts_.size()) { | ||
| layouts_.insert(layouts_.begin() + index, std::move(layout)); | ||
| } else if (index > layouts_.size()) { | ||
| } else if (idx < layouts_.size()) { | ||
| layouts_.insert(layouts_.begin() + idx, std::move(layout)); | ||
| } else if (idx > layouts_.size()) { | ||
| return false; | ||
| } | ||
| return true; |
There was a problem hiding this comment.
This can be simplified, because an insertion at the end is equivalent to a push_back
| if (idx == layouts_.size()) { | |
| layouts_.push_back(std::move(layout)); | |
| } else if (index < layouts_.size()) { | |
| layouts_.insert(layouts_.begin() + index, std::move(layout)); | |
| } else if (index > layouts_.size()) { | |
| } else if (idx < layouts_.size()) { | |
| layouts_.insert(layouts_.begin() + idx, std::move(layout)); | |
| } else if (idx > layouts_.size()) { | |
| return false; | |
| } | |
| return true; | |
| if (idx > static_cast<int>(layouts_.size())) { | |
| return false; | |
| } | |
| layouts_.insert(layouts_.begin() + idx, std::move(layout)); | |
| return true; |
| if (num_layouts > layouts_.size()) { | ||
| for (int size = layouts_.size(); size < num_layouts; ++size) { | ||
| layouts_.push_back(std::make_unique<Layout>(orientation_.turn_90())); | ||
| } | ||
| } |
There was a problem hiding this comment.
You don't need the if condition because the for loop will already exit on the first check if num_layouts > layouts_.size()
| if (num_layouts > layouts_.size()) { | |
| for (int size = layouts_.size(); size < num_layouts; ++size) { | |
| layouts_.push_back(std::make_unique<Layout>(orientation_.turn_90())); | |
| } | |
| } | |
| for (int size = layouts_.size(); size < num_layouts; ++size) { | |
| layouts_.push_back(std::make_unique<Layout>(orientation_.turn_90())); | |
| } |
| } | ||
|
|
||
| if { $word_size % $mask_size != 0 } { | ||
| utl::error RAM 26 "The -word_size argument must be divisible by the -mask_size argument." |
There was a problem hiding this comment.
For clarity
| utl::error RAM 26 "The -word_size argument must be divisible by the -mask_size argument." | |
| utl::error RAM 26 "The -word_size ($word_size) must be divisible by the -mask_size ($mask_size)." |
|
@gemini-code-assist /review |
There was a problem hiding this comment.
Code Review
This pull request is a significant and valuable refactoring of the RAM generation logic, moving from a fixed byte-oriented approach to a more flexible slice-based implementation. The changes improve code clarity and configurability by introducing parameters like mask_size and word_size. The code is cleaner and more readable overall.
I've found one critical bug in the TCL script where an incorrect variable is used, which could lead to generating RAMs with the wrong dimensions. I've also included a suggestion to improve the clarity and efficiency of a section of C++ code in makeWord.
| -bytes_per_word $keys(-bytes_per_word) \ | ||
| -word_count $keys(-word_count)] | ||
| -mask_size $keys(-mask_size) \ | ||
| -word_size $keys(-mask_size) \ |
There was a problem hiding this comment.
There appears to be a copy-paste error here. The -word_size argument for generate_ram_netlist is being set with the value of -mask_size ($keys(-mask_size)) instead of its own value ($keys(-word_size)). This will cause the RAM to be generated with an incorrect word size if -word_size and -mask_size are different.
-word_size $keys(-word_size) \
| vector<vector<dbBTerm*>> slice_outputs(read_ports, | ||
| vector<dbBTerm*>(mask_size)); | ||
| for (int port = 0; port < read_ports; ++port) { | ||
| std::copy_n(data_output[port].begin() + start_idx, | ||
| mask_size, | ||
| slice_outputs[port].begin()); | ||
| } |
There was a problem hiding this comment.
The creation of slice_outputs can be made more concise and potentially more efficient by constructing the inner vectors directly from iterators instead of default-constructing and then copying. This would also improve readability, which aligns with the goals of this refactoring.
| vector<vector<dbBTerm*>> slice_outputs(read_ports, | |
| vector<dbBTerm*>(mask_size)); | |
| for (int port = 0; port < read_ports; ++port) { | |
| std::copy_n(data_output[port].begin() + start_idx, | |
| mask_size, | |
| slice_outputs[port].begin()); | |
| } | |
| std::vector<std::vector<odb::dbBTerm*>> slice_outputs; | |
| slice_outputs.reserve(read_ports); | |
| for (int port = 0; port < read_ports; ++port) { | |
| const auto& port_outputs = data_output[port]; | |
| slice_outputs.emplace_back(port_outputs.begin() + start_idx, | |
| port_outputs.begin() + start_idx + mask_size); | |
| } |
Changes in RAM creation order