NeuroCAAS | Analysis Intro

BarDensr

Analysis description

BarDensr (BARcode DEmixing through Non-negative Spatial Regression) demixes the molecular signal in a set of images obtained from spatial transcriptomics data. The input images are generated over multiple detection rounds, in which nucleotides from the target barcodes are detected and the corresponding signals are imaged sequentially using multiple laser channels. In the NeuroCAAS implementation, BarDensr runs the model on spatial patches of the full image uploaded by the user, which makes the process scalable to large images with a large number of barcodes. The demixed output image for each barcode is a compressed, sparse image that facilitates downstream blob detection, which users can carry out with the blob detection algorithm of their choice. This implementation also returns quality analysis results for the detected spots, computed using singular value decomposition on the cleaned images, to help assess how well the model fits the uploaded images.

Useful links

BarDensr Paper Link

BarDensr Github Repo Link

BarDensr Bash Script Link

BarDensr Demo Link

How to use this analysis

The NeuroCAAS implementation of BarDensr works with the newest version of BarDensr (currently 0.10). Look for live logging in bardensr_out.txt in addition to standard DATASET_NAME files.

Args.
Input (hdf5): an hdf5 file whose image names specify the round and channel of each image. Each image name should be formatted as ‘round_r, channel_c’, where r and c start from 1 and are the round and channel indices of that image. Each image should be stored as a 3D array (e.g., 2048x2048x1 for a 2D image with 2048 pixels in the X and Y directions), and the images must be properly registered across rounds and channels before they are uploaded to this platform. The images are also assumed to be preprocessed (e.g., background removed by filtering, spots deconvolved, etc.).
Config (yaml): a file that contains the parameters used for the analysis and specifies the codebook that will be used. Note that users also need to specify in this file whether any ‘unused’ barcodes are included in the codebook.
Details on the parameters are as follows:
- `blur_level`: the expected size of the rolonies. Example: (2, 2, 0).
- `downsample_level`: the downsampling scale for the coarsening process (see the coarsening section in the paper for more detail). A larger downsampling level gives faster computation. Example: (5, 5, 1).
- `tile_size`: the size of the tiles used for patching. Each tile is analyzed independently, and the barcodes that are absent from a tile are removed from the later processing of that tile (see the barcode sparsifying and coarsening acceleration section in the paper for detail). Example: (100, 100, 1).
- `detection_thre`: a value between 0 and 1. This controls the threshold for removing the absent barcodes in each tile, as well as the threshold for detecting the blobs. A higher threshold gives a sparser result and fewer detected blobs. Example: 0.9.
- `lam`: sparsity penalty. A larger `lam` yields a sparser result. Example: 1.
- `codebook_name`: the name of the uploaded codebook file (see below). Example: ‘codebook5nt.csv’.
- `unused_included` and `num_unused`: these two arguments specify whether unused barcodes are included in the uploaded codebook, and how many. Unused barcodes are needed by the analysis to determine the threshold for detecting the absent barcodes in each tile, as well as for detecting blobs. For example, if `unused_included` is yes and `num_unused` is 3, BarDensr assumes that the unused barcodes are the last three barcodes in the codebook. If `unused_included` is no and `num_unused` is 2, BarDensr will generate two unused barcodes for the analysis.
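For orientation, the example values listed above can be collected into one parameter set. The sketch below is written in Python rather than YAML and makes no claim about the exact layout of the config file that NeuroCAAS expects. The `check_params` helper is hypothetical: it only encodes the constraints stated above (a `detection_thre` between 0 and 1, a codebook name pointing at a csv file) plus a few assumptions taken from the examples, namely that the three size parameters have three entries each, that `lam` is non-negative, and that `unused_included` can be read as a boolean.

```python
# Example values copied from the parameter descriptions above.
# NOTE: the dict layout and check_params are illustrative assumptions,
# not the schema that BarDensr or NeuroCAAS actually parses.
example_params = {
    "blur_level": (2, 2, 0),
    "downsample_level": (5, 5, 1),
    "tile_size": (100, 100, 1),
    "detection_thre": 0.9,
    "lam": 1,
    "codebook_name": "codebook5nt.csv",
    "unused_included": True,  # "yes" in the description above
    "num_unused": 3,
}


def check_params(params):
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    for key in ("blur_level", "downsample_level", "tile_size"):
        value = params.get(key)
        if not (isinstance(value, (tuple, list)) and len(value) == 3):
            problems.append(f"{key} should have three entries")
    thre = params.get("detection_thre")
    if not (isinstance(thre, (int, float)) and 0 <= thre <= 1):
        problems.append("detection_thre should be between 0 and 1")
    lam = params.get("lam")
    if not (isinstance(lam, (int, float)) and lam >= 0):
        problems.append("lam should be a non-negative number")
    if not str(params.get("codebook_name", "")).endswith(".csv"):
        problems.append("codebook_name should name a csv file")
    num_unused = params.get("num_unused")
    if not (isinstance(num_unused, int) and num_unused >= 0):
        problems.append("num_unused should be a non-negative integer")
    return problems


print(check_params(example_params))
```

Running a check of this kind locally before uploading is one way to catch a malformed value early, since each tile is processed with the same parameters.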
Codebook (csv): a binary codebook. The first column should contain the names of the barcodes. The second column is the binary barcode, in channel-major order (i.e., r1_c1, r1_c2, …, r2_c1, r2_c2, …). Note that users do not need to select the codebook, but they do need to specify the uploaded codebook in the config file.
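To make the ordering of the binary barcode and the handling of unused barcodes concrete, here is a minimal sketch using only the Python standard library. Several things in it are assumptions rather than documented behaviour: that the second column holds the bits as a single string of 0s and 1s, that the file has no header row, and the helper names `flatten_barcode` and `split_unused`, which are hypothetical. The toy barcode names and on/off patterns are made up for illustration.

```python
import csv
import io


def flatten_barcode(active, n_rounds, n_channels):
    """Build the binary string for one barcode.

    `active` is a set of 1-based (round, channel) pairs that are "on".
    Bits follow the order r1_c1, r1_c2, ..., r2_c1, r2_c2, ...
    (the channel index varies fastest).
    """
    bits = []
    for r in range(1, n_rounds + 1):
        for c in range(1, n_channels + 1):
            bits.append("1" if (r, c) in active else "0")
    return "".join(bits)


def split_unused(names, unused_included, num_unused):
    """Split barcode names into (real, unused) following the rule above.

    If unused barcodes are included, they are taken to be the last
    `num_unused` rows of the codebook. Otherwise none of the uploaded
    rows are unused (BarDensr is described as generating them itself).
    """
    if unused_included and num_unused > 0:
        return list(names[:-num_unused]), list(names[-num_unused:])
    return list(names), []


# Hypothetical toy codebook: 2 rounds x 3 channels, made-up names.
toy = {
    "geneA": {(1, 1), (2, 3)},
    "geneB": {(1, 2), (2, 2)},
    "unused1": {(1, 3), (2, 1)},
}
rows = [(name, flatten_barcode(active, 2, 3)) for name, active in toy.items()]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)  # assumption: no header row
codebook_csv = buffer.getvalue()
print(codebook_csv)

real, unused = split_unused([name for name, _ in rows], True, 1)
print(real, unused)
```

Because the unused barcodes are identified purely by position, it is worth checking that they really are the last rows of the file before setting `unused_included` to yes.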

Outputs
The NeuroCAAS implementation returns three outputs:
The first output is a .csv file named “output_table_DATASET_NAME.csv”, in which each row stores the information for one pixel with a non-zero value. The column names are ‘barcode_ID’, ‘barcode_name’, ‘x’, ‘y’, ‘z’, and ‘intensity’, where x, y, z give the coordinates of the pixel, and barcode_name and barcode_ID are the name (if provided) and ID of the barcode at that pixel.
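As a rough illustration of how this table might be consumed downstream, the sketch below reads rows with the standard `csv` module and counts non-zero pixels per barcode. The sample rows are made up and only mimic the documented column names; the sketch also assumes the file starts with a header row carrying exactly those names, and `summarize` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Made-up rows that only mimic the documented column layout;
# a real table would be opened from output_table_DATASET_NAME.csv.
sample_table = io.StringIO(
    "barcode_ID,barcode_name,x,y,z,intensity\n"
    "0,geneA,10,12,0,0.8\n"
    "0,geneA,11,12,0,0.5\n"
    "1,geneB,40,7,0,1.2\n"
)


def summarize(table_file, min_intensity=0.0):
    """Count pixels and sum intensity per barcode name."""
    pixel_counts = Counter()
    total_intensity = Counter()
    for row in csv.DictReader(table_file):
        intensity = float(row["intensity"])
        if intensity < min_intensity:
            continue  # skip pixels below the chosen cutoff
        pixel_counts[row["barcode_name"]] += 1
        total_intensity[row["barcode_name"]] += intensity
    return pixel_counts, total_intensity


counts, totals = summarize(sample_table)
print(dict(counts))
```

Note that these are pixel counts, not spot counts: a single rolony can cover several non-zero pixels, so blob detection is still needed to obtain per-barcode spot numbers.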
The second output is an hdf5 file containing datasets named ‘barcodes’, ‘imgs’, ‘spatials’, ‘temporals’, ‘r2s’, and ‘coords’. Each of them is a list of the same length S (the number of spots detected in the uploaded image using the `local_peak_max` algorithm). ‘barcodes’ gives the barcode ID for each spot; ‘imgs’ is the original spot image on the rolony density; ‘spatials’ and ‘temporals’ are the two vectors given by the top component of the Singular Value Decomposition (SVD) of the cleaned images; ‘r2s’ quantifies the similarity between the reconstruction from these two SVD vectors and the original cleaned image; and ‘coords’ gives the spot’s x, y, z coordinates in the original image.
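The idea behind ‘spatials’, ‘temporals’, and ‘r2s’ can be sketched as a rank-one approximation. The code below is not the BarDensr implementation. It assumes that a spot's cleaned image is arranged as a matrix with spatial positions as rows and round/channel frames as columns, and that the similarity score is one minus the ratio of squared residual to squared signal; both are assumptions made for illustration. It uses a plain power iteration so that it runs with the standard library alone; in practice a routine such as `numpy.linalg.svd` would be the usual choice.

```python
import math


def top_singular_triplet(matrix, n_iter=200):
    """Power iteration for the leading singular triplet (u, s, v).

    `matrix` is a list of rows. The all-ones start vector suits
    non-negative images, where it cannot be orthogonal to the top
    right singular vector.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    u, s = [0.0] * n_rows, 0.0
    for _ in range(n_iter):
        # u <- X v, normalized to unit length
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            return [0.0] * n_rows, 0.0, v
        u = [x / norm_u for x in u]
        # v <- X^T u; its length converges to the top singular value
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        s = math.sqrt(sum(x * x for x in v))
        if s == 0.0:
            return u, 0.0, v
        v = [x / s for x in v]
    return u, s, v


def rank_one_r2(matrix):
    """Assumed score: 1 - ||X - s u v^T||^2 / ||X||^2."""
    u, s, v = top_singular_triplet(matrix)
    total = sum(x * x for row in matrix for x in row)
    if total == 0.0:
        return 0.0
    residual = sum(
        (matrix[i][j] - s * u[i] * v[j]) ** 2
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
    )
    return 1.0 - residual / total


# A perfectly separable (rank-one) toy matrix: one spatial profile
# multiplied by one on/off pattern across frames.
clean_spot = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [3.0, 0.0, 6.0]]
print(rank_one_r2(clean_spot))
```

Under these assumptions, a spot whose signal is a single spatial profile scaled by a single pattern across frames scores close to 1, while a spot that mixes several patterns scores lower.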
The third output is a png file that helps users assess the quality of the results. For each detected rolony, we estimate the quality of the evidence for that rolony by computing a correlation coefficient, and the png file shows a histogram of these quality measures over all spots.
