The online analysis accepts only a .csv or .h5ad file containing an expression matrix, with cells as rows and gene symbols as columns, or the reverse. For .csv files, a raw count matrix is expected, as this reduces the file size and the upload burden. For .h5ad files, a log-normalised expression matrix (normalised to 10,000 counts per cell) is expected, that is, a raw-count AnnData object (adata) processed by scanpy.pp.normalize_total(target_sum=1e4) followed by scanpy.pp.log1p.
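To make the expected .h5ad normalisation concrete, the sketch below reproduces the arithmetic on a tiny hand-made matrix using only the Python standard library. It is an illustration of the transformation, not scanpy's implementation; the function name log_normalise and the toy counts are hypothetical, and leaving all-zero cells unchanged is an assumption made here to avoid dividing by zero.

```python
import math


def log_normalise(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x)."""
    normalised = []
    for cell in counts:
        total = sum(cell)
        # Assumption: a cell with no counts is left as all zeros
        # so that we never divide by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalised.append([math.log1p(value * scale) for value in cell])
    return normalised


# Toy raw count matrix: cells as rows, genes as columns (hypothetical values).
raw = [
    [1, 3, 0, 6],  # cell 1: 10 counts in total
    [0, 0, 5, 5],  # cell 2: 10 counts in total
]
log_norm = log_normalise(raw)

# Undoing the log transform recovers about 10,000 counts per cell.
totals = [sum(math.expm1(value) for value in cell) for cell in log_norm]
```

On a real dataset the same two steps are carried out by the scanpy functions named above, applied in that order to the raw-count adata before it is saved as .h5ad.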
Detailed model information can be found here. For immune cell types, we recommend that users start with the default model (Immune_All_Low.pkl).
Majority voting refines the prediction result within each local cell cluster by assigning the dominant cell type label to the cells in that cluster. It may increase the runtime, especially for large datasets, because of the over-clustering step. This approach usually improves the cell annotation because voting is conducted in small subclusters derived from over-clustering, so cells belonging to a given cell type are assigned the same label regardless of potential batch effects separating them.
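The voting step described above can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the tool's actual implementation: the function name majority_vote, the toy labels, and the cluster assignments are hypothetical, and the cluster ids are assumed to come from a separate over-clustering step that is not shown.

```python
from collections import Counter, defaultdict


def majority_vote(predicted_labels, clusters):
    """Give every cell the most frequent predicted label within its cluster."""
    labels_by_cluster = defaultdict(list)
    for label, cluster in zip(predicted_labels, clusters):
        labels_by_cluster[cluster].append(label)

    # Dominant label per cluster; on a tie, Counter keeps the label seen first.
    dominant = {
        cluster: Counter(labels).most_common(1)[0][0]
        for cluster, labels in labels_by_cluster.items()
    }
    return [dominant[cluster] for cluster in clusters]


# Per-cell predictions and over-clustering assignments (hypothetical values).
predicted = ["T cell", "T cell", "B cell", "B cell", "B cell", "T cell"]
clusters = [0, 0, 0, 1, 1, 1]
refined = majority_vote(predicted, clusters)
# refined == ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]
```

In this toy case the single outlier in each subcluster is overruled by its neighbours, which is the effect the refinement relies on when subclusters are small and homogeneous.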
The prediction result, consisting of three tables, will be sent to the email address provided by the user.