Skip to content

inonitz/tree

Repository files navigation

Contributors Forks Stargazers MIT

Treelib

AVL Tree Implementation in C & C++

About The Project

A Long while ago I had to write a Virtual Address Space Manager for primOS -

At the time, reading forum posts, some educational materials and some of the linux kernel documentation
led me to believe that Searching the Address Space would be way more common than insertion/deletion.
Moreover, since AVL Trees Do keep their height to a minimum boundary, I found it most fitting to pick an AVL Tree as my first choice

I did eventually find a working, open-source implementation online that fit the bill (See Acknowledgements Section),
But I never had the time to research the inner-workings of AVL Trees.

This project Implements Generic AVL Trees in C & C++, with recursive & iterative implementations available.
Specifically, the C++ interface offers both Iterative & recursive versions, while C only offers an iterative version
All relevant data structures in this project are fully working, tested and Benchmarked



Project Structure

The project has the same structure as my other project util2, except that I added benchmarks & Proper Unit Testing

Getting Started

Prerequisites

Building & (Maybe) Running

Downloading

If you're only interested in the library itself, i.e no Testing/Benchmarking:

git clone https://github.com/inonitz/tree/tree.git --branch onlylibrary desired_folder_path_from_cwd
cd desired_folder_path_from_cwd/tree

If you want to build Tests/Benchmarks Too:

git clone --recurse-submodules https://github.com/inonitz/tree/tree.git --branch master desired_folder_path_from_cwd
cd desired_folder_path_from_cwd/tree



Configuring

Because This project is somewhat big and building manually is cumbersome,
I Wrote build scripts build.sh, build.ps1 for Linux & Windows Respectively

By Default, The project will try to build EVERYTHING - If you do not want that,
add the following flags to your cmake invocation (Or If Building with the scripts, disable in your platforms' Script):

  • Project Specific:

    • -DENABLE_SANITIZER_ADDRESS=OFF
    • -DENABLE_SANITIZER_UNDEFINED=OFF
    • -DENABLE_SANITIZER_MEMORY=OFF
    • -DENABLE_LINK_TIME_OPTIMIZATION=OFF
    • -DTREELIB_BUILD_TESTS=OFF
  • Dependencies:

    • -DCMAKE_C_COMPILER=clang
    • -DCMAKE_CXX_COMPILER=clang++
    • -DCMAKE_EXPORT_COMPILE_COMMANDS=1
    • -DCMAKE_COLOR_DIAGNOSTICS=ON
    • -DBUILD_GMOCK=OFF
    • -DINSTALL_GTEST=OFF
    • -DBENCHMARK_ENABLE_INSTALL=OFF
    • -DBENCHMARK_INSTALL_DOCS=OFF
    • -DBENCHMARK_INSTALL_TOOLS=OFF
    • -DBENCHMARK_DOWNLOAD_DEPENDENCIES=OFF
    • -DBENCHMARK_ENABLE_TESTING=OFF
    • -DBENCHMARK_ENABLE_GTEST_TESTS=OFF
    • -DBENCHMARK_USE_BUNDLED_GTEST=OFF

Windows:

.\build.ps1 -Help
.\build.ps1 -BuildType release -LinkType shared -Action configure
.\build.ps1 -BuildType release -LinkType shared -Action build
.\build.ps1 -BuildType release -LinkType shared -Action test/benchmark

Linux:

./build.sh --help
./build.sh release static configure
./build.sh release static build
./build.sh release static test/benchmark



Usage

In Source Build

In your CMakeLists.txt:

add_subdirectory(your_directory/tree)

Also, Don't forget to link to the library:

target_link_libraries(your_target_executable/library PRIVATE TREELIB::treelib)

Out-Of-Source (Submodule/etc...) Build

git submodule add https://github.com/inonitz/tree your_dependency_folder/tree
git submodule init
git submodule update

In your CMakeLists.txt:

add_subdirectory(your_dependency_folder/tree)

Also, Don't forget to link to the library:

target_link_libraries(your_target_executable/library PRIVATE TREELIB::treelib)



Roadmap/TODO

  • Optimize binaryTree::AVLDeleteIterative in AVLTreeGeneric.tpp, similarly to AVLInsertIterative in AVLTree.c
  • Add Automatic Testing Matrix For architecture/OS/Build Type (Shared/Dynamic/Static, Debug/Release, ...)
    • (involves CI/CD Pipelines which I'm not very familiar with/not interested in currently)



Contributing

If you have a suggestion, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

License

Distributed under the MIT License. See LICENSE file for more information.



Benchmarks

A few notes before the graphs

The following benchmarks were tested on my i7-8750H Laptop with 16GBx2 2666Mhz DDR4 Ram

  • Search, Deletion & Insertion were Tested on 6 Different Data Structures

    1. std::set (clang-18)
    2. std::unordered_map (clang-18)
    3. C Implementation of an AVL Tree
    4. C++ Implementation Of an Iterative AVL Tree
    5. C++ Implementation Of a Recursive AVL Tree
    6. C++ Implementation Of a 'Flat' Iterative AVL Tree, specifically:
      • Specialized Freelist Node-Allocator
      • Reusing of freed nodes using a local Stack
      • Metadata packing to 8 bytes, enabling a maximum of ~258 Million Elements
        This was done to reduce memory footprint, memory chasing & improve
        cache utilization
  • Each Data Structure was specialized to 3 Base Types

    1. u64 (trivial plain old data type)
    2. DummyRecord (Multiple Cache-Lines in Size, but still POD)
    3. std::string (Non-Trivial Type with complex construction, copying and destruction)
  • Benchmarking was done concurrently to save time (Approx ~5 Hours Serially)

Finally, the benchmarks!

Searching for a Node

Inserting Nodes

Deleting Nodes

Conclusions/Lessons Learned

  • Recursion is pretty good for small N, but gets out of hand pretty quickly
  • Looking at the benchmarks, I can confidently assume that FlatAVLTree had way better cache utilization,
    due to the spatial locality of the internal nodes being placed in a big buffer,
    as compared with the usual implementation requiring pointer chasing and using malloc/free/new/delete
  • Premature Optimization is still the root of all evil, but indeed a very convenient learning tool
  • Never assume anything until you benchmark it
  • The Standard Library folks really do know what they're doing
  • If you think a task will require an allocated Time N, it will probably take 2N, if not more
  • Testing is required to ensure growing complexity in a system/project doesn't completely overwhelm and impede the rate of development,
    more formally known as Lehmans' Law of Software Evolution (literally just found this out now lol)

Acknowledgements

References

Releases

No releases published

Packages

 
 
 

Contributors