So far we have discussed:

* running on N batches, chosen by the user
* selecting the correct batch size depending on the amount of GPU memory
* defaulting to CPU if no GPU is available
* supporting flexible array types (ndarray, torch tensor, cupy array)
* model-specific input processor, possibly using `transformers`, see #18
* selecting the correct set of metrics for semantic segmentation (initially, and then other downstream tasks)
* adding dependencies for torchmetrics, possibly torcheval if needed for semantic segmentation (definitely for object detection, COCO mAP)
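To make the device, batch-size, and array-type points concrete, here is a rough sketch of how they might fit together. All function names (`pick_device`, `free_memory_bytes`, `pick_batch_size`, `to_tensor`) and the `safety` and `max_batch_size` parameters are hypothetical, and `bytes_per_sample` is assumed to be an estimate supplied by the caller. It is meant as a starting point for discussion, not a proposed final API.

```python
import importlib.util
from typing import Optional


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def free_memory_bytes(device: str) -> Optional[int]:
    """Free GPU memory in bytes, or None when running on CPU."""
    if device != "cuda":
        return None
    import torch

    free, _total = torch.cuda.mem_get_info()
    return free


def pick_batch_size(free_bytes: int, bytes_per_sample: int,
                    max_batch_size: int = 64, safety: float = 0.8) -> int:
    """Largest batch size that fits in a fraction of the free memory.

    bytes_per_sample is a caller-supplied estimate (an assumption of this
    sketch); safety leaves headroom for allocator overhead.
    """
    if bytes_per_sample <= 0:
        raise ValueError("bytes_per_sample must be positive")
    fits = int(free_bytes * safety) // bytes_per_sample
    # Never go below 1, never above the user-chosen cap.
    return max(1, min(max_batch_size, fits))


def to_tensor(x, device: str):
    """Convert an ndarray, torch tensor, or CuPy array to a tensor on device."""
    import torch

    if isinstance(x, torch.Tensor):
        return x.to(device)
    if hasattr(x, "__cuda_array_interface__"):
        # CuPy arrays already live on the GPU; wrap them there first.
        return torch.as_tensor(x, device="cuda").to(device)
    return torch.as_tensor(x).to(device)
```

On CPU, `free_memory_bytes` returns `None`, so the caller would fall back to a user-chosen batch size. The torch imports are kept inside the functions so that the module can still be imported when torch is missing.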