tumourkit.utils.preprocessing

Module with utility functions for preprocessing.

Copyright (C) 2023 Jose Pérez Cano

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.

Contact information: joseperez2000@hotmail.es

Functions

add_node

Given a dictionary with vectorial features, converts them into one column per dimension and adds them to the global dictionary.
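The flattening described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `flatten_vector_features`, its argument order, and the `<name><dimension>` column naming are hypothetical and are not taken from the tumourkit source.

```python
def flatten_vector_features(global_dict, vector_features):
    """Append each vector component to its own column in global_dict.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    for name, vector in vector_features.items():
        for dim, value in enumerate(vector):
            column = f"{name}{dim}"  # assumed column naming scheme
            global_dict.setdefault(column, []).append(value)
    return global_dict


# Two nodes, each with a 3-dimensional and a 1-dimensional feature.
columns = {}
flatten_vector_features(columns, {"color": [0.1, 0.2, 0.3], "size": [12.0]})
flatten_vector_features(columns, {"color": [0.4, 0.5, 0.6], "size": [9.0]})
```

After both calls, `columns` holds one list per dimension (for example `color0`, `color1`, `color2`, `size0`), each with one entry per node.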

apply_mask

Applies a binary mask to an RGB image.
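A minimal pure-Python sketch of binary masking, assuming that pixels outside the mask are set to black. The name `mask_rgb` and the nested-list image representation are hypothetical; the library itself most likely operates on NumPy arrays.

```python
def mask_rgb(image, mask):
    """Keep pixels where mask is truthy and blank the rest.

    Hypothetical stand-in; blanking to (0, 0, 0) is an assumption.
    """
    return [
        [pixel if keep else (0, 0, 0) for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (9, 9, 9)]]
mask = [[1, 0], [0, 1]]
masked = mask_rgb(image, mask)
```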

compute_perimeter

Computes the perimeter of a given contour.
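One common way to obtain a contour perimeter is to sum the Euclidean distances between consecutive points, including the closing edge. The sketch below assumes a closed contour given as a sequence of (x, y) points; `polygon_perimeter` is a hypothetical name, and the library may compute the value differently.

```python
import math


def polygon_perimeter(points):
    """Sum edge lengths of a closed polygon given as (x, y) points.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    n = len(points)
    # The modulo wraps the last point back to the first to close the contour.
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
perimeter = polygon_perimeter(rectangle)  # 4 + 3 + 4 + 3
```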

create_dir

Creates a directory at the specified path if it does not already exist.
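The "create if missing" behaviour can be reproduced with the standard library alone. The sketch below uses a hypothetical helper `ensure_dir`; whether the library also creates missing parent directories is an assumption.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and any missing parents) without failing if it exists.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    os.makedirs(path, exist_ok=True)


base = tempfile.mkdtemp()
target = os.path.join(base, "graphs", "train")
ensure_dir(target)
ensure_dir(target)  # the second call is a no-op rather than an error
```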

create_geojson

Converts a list of contours and their labels to a list of dictionaries containing GeoJSON-formatted data.
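As an illustration of the output shape, the sketch below builds one GeoJSON `Feature` with a `Polygon` geometry per contour. The helper name `contours_to_features` and the `label` property key are hypothetical; the properties the library actually writes may differ.

```python
def contours_to_features(contours, labels):
    """Build one GeoJSON Feature dict per (contour, label) pair.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    features = []
    for contour, label in zip(contours, labels):
        ring = [[float(x), float(y)] for x, y in contour]
        # GeoJSON polygon rings must end on the point they start from.
        if ring and ring[0] != ring[-1]:
            ring.append(list(ring[0]))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": {"label": label},  # assumed property key
        })
    return features


features = contours_to_features([[(0, 0), (2, 0), (2, 2)]], ["tumour"])
```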

extract_features

Extracts features from a given RGB bounding box of a cell and its mask.

format_contour

Formats a contour in cv2.findContours format to an array of shape (N,2).
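`cv2.findContours` returns each contour with shape (N, 1, 2), that is, every point is wrapped in an extra singleton dimension. The sketch below shows the reshaping on plain nested lists so that it runs without OpenCV; `squeeze_contour` is a hypothetical name, and with NumPy arrays the same result is usually obtained with `contour.reshape(-1, 2)`.

```python
def squeeze_contour(contour):
    """Turn an (N, 1, 2) nested list into an (N, 2) nested list.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    return [[point[0][0], point[0][1]] for point in contour]


# Three points in the layout produced by cv2.findContours.
raw = [[[10, 20]], [[30, 20]], [[30, 40]]]
flat = squeeze_contour(raw)
```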

get_centroid_by_id

Given an image and an id representing a component, returns the centroid of the component as a tuple (x, y).
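A centroid of a labelled component can be taken as the mean position of its pixels. The sketch below assumes that x is the column index and y is the row index, and it uses nested lists instead of arrays; `centroid_of_component` is a hypothetical name.

```python
def centroid_of_component(label_image, component_id):
    """Return the mean (x, y) position of pixels equal to component_id.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    xs, ys = [], []
    for y, row in enumerate(label_image):
        for x, value in enumerate(row):
            if value == component_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        raise ValueError(f"component {component_id} not found")
    return (sum(xs) / len(xs), sum(ys) / len(ys))


labels = [
    [0, 2, 2, 2],
    [0, 2, 2, 2],
    [0, 0, 0, 0],
]
centroid = centroid_of_component(labels, 2)
```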

get_mask

Given a segmentation mask with indices as pixel values, returns the mask corresponding to the given index.
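Selecting one index from a label image amounts to an element-wise equality test. The sketch below returns 1 where the pixel equals the index and 0 elsewhere; `mask_for_index` is a hypothetical name, and the 0/1 encoding of the result is an assumption.

```python
def mask_for_index(label_image, index):
    """Return a 0/1 mask marking the pixels equal to index.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    return [[1 if value == index else 0 for value in row] for row in label_image]


segmentation = [
    [0, 1, 1],
    [2, 2, 1],
]
cell_mask = mask_for_index(segmentation, 1)
```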

get_names

Returns a list of all files in the directory at <path> whose names contain the substring <pattern>.
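Substring filtering of a directory listing can be sketched with `os.listdir`. The helper name `list_matching` is hypothetical, and sorting the result is an addition made here for reproducible output rather than documented behaviour.

```python
import os
import tempfile


def list_matching(path, pattern):
    """Return the names in path that contain pattern, sorted.

    Hypothetical stand-in, not the tumourkit implementation.
    """
    return sorted(name for name in os.listdir(path) if pattern in name)


# Build a throwaway directory with three files to filter.
folder = tempfile.mkdtemp()
for filename in ("a.class.csv", "b.class.csv", "a.centroids.csv"):
    open(os.path.join(folder, filename), "w").close()

matches = list_matching(folder, ".class.csv")
```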

parse_path

Parses a file path and checks for a trailing slash.

read_centroids

Reads a CSV file from the specified directory containing centroids and returns their contents as a NumPy array.

read_csv

Reads a CSV file from the specified directory and returns its contents as a Pandas DataFrame.

read_graph

Reads a graph in CSV format from the specified directory and returns it as a Pandas DataFrame.

read_gson

Reads the GeoJSON file at the specified path with the given name.

read_image

Given an image name (without extension) and a folder path, returns an array of RGB pixel values.

read_json

Reads a Hovernet JSON file from the specified path and returns the nuclei information as a dictionary.

read_labels

Reads a PNG and CSV file from the specified directories and returns their contents as a tuple.

read_names

Returns a list of file names in the specified path that contain the given pattern.

read_png

Reads a PNG file from the specified directory and returns its image data as a NumPy array.

save_centroids

Saves the given centroids to the specified directory with the specified name and a '.centroids.csv' extension.

save_csv

Saves the given CSV file to the specified directory with the specified name and a '.class.csv' extension.

save_geojson

Saves a list of GeoJSON features to a file with the given name at the specified path.
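Writing a feature list to disk can be sketched with the standard `json` module. The helper name `write_features` and the `.geojson` extension are assumptions made for illustration; the extension and top-level layout used by the library are not stated here.

```python
import json
import os
import tempfile


def write_features(features, path, name):
    """Serialize a list of feature dicts to <path>/<name>.geojson.

    Hypothetical stand-in; the file extension is an assumption.
    """
    file_path = os.path.join(path, name + ".geojson")
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(features, handle)
    return file_path


feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
    "properties": {},
}
out_dir = tempfile.mkdtemp()
saved = write_features([feature], out_dir, "slide_01")

# Read the file back to confirm the round trip.
with open(saved, encoding="utf-8") as handle:
    loaded = json.load(handle)
```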

save_graph

Saves the given graph as a CSV file to the specified path.

save_png

Saves a PNG image to a file.

save_pngcsv

Saves the given PNG and CSV files to the specified directories with the specified name.